K Number
K223133
Device Name
VisiRad XR
Manufacturer
Date Cleared
2023-08-03 (304 days)

Product Code
Regulation Number
892.2070
Reference & Predicate Devices
Predicate For
N/A
Intended Use

VisiRad XR is a computer-aided detection (CADe) device intended to identify and mark regions of interest that may be suspicious for lung nodules and masses on chest radiographs. It identifies features associated with pulmonary nodules and masses from 6-60 mm in size. Detection of suspicious findings by VisiRad XR is intended as an aid only after the physician has performed an initial interpretation; it is not intended to replace the review by a qualified radiologist and is not intended to be used for triage or to make or confirm a diagnosis. The intended patient population for VisiRad XR consists of patients >21 years of age on whom chest radiographs have been acquired in an outpatient or emergency department setting.

Device Description

VisiRad XR is a computer-aided detection (CADe) software as a medical device (SaMD) product intended to detect lung nodules and masses from 6-60 mm in chest radiographs. VisiRad XR takes DICOM images as input, uses machine learning algorithms to detect suspicious regions, and outputs a secondary DICOM with annotated regions of interest (ROIs). The output secondary DICOM includes text indicating that the image was analyzed by VisiRad XR and a link to the user manual. If no ROIs are detected, the returned secondary DICOM states "No Nodules/Masses Found". VisiRad XR is intended to be used as a second read only after the clinician has performed their initial interpretation. The secondary DICOM does not overwrite or replace the primary radiograph; it is returned such that it hangs behind the primary image under standard DICOM hanging protocols.
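
The summary gives no implementation details, but the input/output behavior it describes maps onto a standard DICOM secondary-capture workflow. Below is a minimal sketch of that step using pydicom; the function name, ROI format, and attribute choices are assumptions for illustration, not VisiRad XR's actual code.

```python
# Hypothetical sketch of the secondary-DICOM output step described above.
# Assumes pydicom and a detector that yields bounding boxes; illustrative only.
import pydicom
from pydicom.dataset import Dataset
from pydicom.uid import SecondaryCaptureImageStorage, generate_uid

def build_secondary_dicom(primary_path: str, rois: list) -> Dataset:
    """Return a new secondary-capture dataset annotated with detected ROIs.

    rois: list of (x, y, width, height) boxes; an empty list means no findings.
    """
    primary = pydicom.dcmread(primary_path)

    secondary = Dataset()
    # Copy identifiers so the new image files into the same study as the primary.
    for keyword in ("PatientID", "PatientName", "StudyInstanceUID", "StudyID"):
        if keyword in primary:
            setattr(secondary, keyword, getattr(primary, keyword))

    secondary.SOPClassUID = SecondaryCaptureImageStorage
    secondary.SOPInstanceUID = generate_uid()
    secondary.SeriesInstanceUID = generate_uid()
    # A higher series number lets the annotated image hang *behind* the
    # primary radiograph under a typical hanging protocol, as described.
    secondary.SeriesNumber = int(primary.get("SeriesNumber", 1) or 1) + 1

    if rois:
        # In practice the boxes would be burned into copied pixel data here.
        secondary.ImageComments = f"Analyzed by CADe; {len(rois)} ROI(s) marked"
    else:
        secondary.ImageComments = "No Nodules/Masses Found"
    return secondary
```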

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details for VisiRad XR based on the provided document:

Acceptance Criteria and Reported Device Performance

| Acceptance Criteria (Endpoint) | Reported Device Performance (VisiRad XR) |
| --- | --- |
| Standalone sensitivity | 0.83 (95% CI: 0.81-0.84) |
| Standalone false positives/image | 1.5 |
| Standalone AUC | 0.73 (95% CI: 0.71-0.74) |
| Aided vs. unaided AUC | Average improvement across both sites: 0.027 (Site I: 0.035, 95% CI: 0.021-0.048; Site II: 0.018, 95% CI: 0.005-0.031); statistically significant |
| Aided vs. unaided sensitivity | Average increase across all readers: 0.076 (Site I: 0.097; Site II: 0.053) |
| Aided vs. unaided specificity | Average decrease across all readers: 0.086 (Site I: 0.114; Site II: 0.06) |

Note on Acceptance Criteria: The document states that the primary endpoint for the standalone test was sensitivity, and that the primary endpoint for the clinical study was superiority of aided over unaided AUC. The other metrics served as secondary endpoints. The acceptance criteria themselves are implicitly defined by achieving "superiority" and demonstrating "safety and effectiveness" comparable to the predicate.
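
The standalone endpoints in the table follow the usual CADe definitions: per-lesion sensitivity and false-positive marks per image. As a reference for how those quantities relate, here is a minimal sketch; the per-case data structure and the box-matching rule it presumes are illustrative assumptions, not the study's protocol.

```python
# Illustrative computation of the standalone endpoints reported above.
# Each case records lesion counts after some box-matching rule has been
# applied; that rule is an assumption, not taken from the 510(k) summary.
def standalone_metrics(cases: list[dict]) -> tuple[float, float]:
    """cases: dicts with 'n_lesions', 'n_detected', 'n_false_pos' per image."""
    total_lesions = sum(c["n_lesions"] for c in cases)
    total_detected = sum(c["n_detected"] for c in cases)
    total_false_pos = sum(c["n_false_pos"] for c in cases)

    sensitivity = total_detected / total_lesions     # reported: 0.83
    fp_per_image = total_false_pos / len(cases)      # reported: 1.5
    return sensitivity, fp_per_image
```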

Study Details

  1. Sample size used for the test set and the data provenance:

    • Standalone Test Set: Not explicitly stated as a single number; the set consisted of data from three sources: the National Lung Screening Trial (NLST) and two independent data sites. The independent sites were a Level II trauma center in rural Montana and a Level I trauma center in metropolitan Colorado. Data were acquired from each site's emergency department between 2016 and 2021. The NLST is described as a high-quality, outpatient dataset of current or former heavy smokers with geographic and demographic representation across the country.
    • Clinical Performance Test Set: 600 total patient images (300 per site). The data was retrospective chest radiographs from patients in emergency department and outpatient settings. The patient population was from across the United States (Colorado, Ohio, New Jersey, South Carolina, Iowa, Wisconsin) and represented a range of age, racial, ethnic groups, and geographic diversity. 56% were women, and 47% of those who disclosed racial data identified as a racial group other than white or Caucasian.
  2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

    • The document mentions that the clinical study ground truth was established by a "reference standard" against which both unaided and aided reader performance was compared. However, it does not explicitly state the number of experts or their qualifications used to establish this reference standard for either the standalone or clinical test sets.
  3. Adjudication method (e.g. 2+1, 3+1, none) for the test set:

    • The document does not explicitly state the adjudication method used to establish the ground truth for the test set.
  4. Whether a multi-reader, multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI versus without AI assistance (a computational sketch follows this list):

    • Yes, a fully-crossed MRMC retrospective reader study was performed.
    • Effect Size: The average reader improvement in overall average AUC for both sites was 0.027.
      • Site I demonstrated an average AUC improvement of 0.035 (95% CI: 0.021, 0.048).
      • Site II demonstrated an average AUC improvement of 0.018 (95% CI: 0.005, 0.031).
    • Average sensitivity across all readers increased by 0.076.
    • Average specificity across all readers decreased by 0.086.
  5. Whether a standalone study (i.e., algorithm-only performance, without a human in the loop) was done:

    • Yes, a standalone performance test was executed on VisiRad XR.
  6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

    • For the standalone test, the document says performance was assessed on a "broad, representative dataset" but does not explicitly state the type of ground truth (e.g., expert consensus, pathology, follow-up).
    • For the clinical performance test, reader performance was compared "as compared to the reference standard." The nature of this "reference standard" (e.g., expert consensus, pathology, follow-up) is not explicitly defined.
  7. The sample size for the training set:

    • The document does not provide information regarding the sample size used for the training set.
  8. How the ground truth for the training set was established:

    • The document does not provide information on how the ground truth for the training set was established.
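
For item 4 above, the reported effect size is the mean over readers of the aided-minus-unaided AUC difference, pooled across sites. A minimal sketch of that arithmetic follows; the per-reader inputs are hypothetical placeholders, and only the site-level means come from the summary. The pooled figure of 0.027 is consistent with averaging the two site-level improvements (0.035 and 0.018).

```python
# Illustrative MRMC effect-size arithmetic for item 4 above. Per-reader AUC
# lists are hypothetical placeholders; only the site-level means (0.035,
# 0.018) are taken from the 510(k) summary.
def mean_auc_improvement(aided: list[float], unaided: list[float]) -> float:
    """Mean over readers of (aided AUC - unaided AUC), readers in same order."""
    deltas = [a - u for a, u in zip(aided, unaided)]
    return sum(deltas) / len(deltas)

site_means = [0.035, 0.018]                 # Site I and Site II, as reported
pooled = sum(site_means) / len(site_means)  # ~0.027, matching the summary
```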

§ 892.2070 Medical image analyzer.

(a) Identification. Medical image analyzers, including computer-assisted/aided detection (CADe) devices for mammography breast cancer, ultrasound breast lesions, radiograph lung nodules, and radiograph dental caries detection, is a prescription device that is intended to identify, mark, highlight, or in any other manner direct the clinicians' attention to portions of a radiology image that may reveal abnormalities during interpretation of patient radiology images by the clinicians. This device incorporates pattern recognition and data analysis capabilities and operates on previously acquired medical images. This device is not intended to replace the review by a qualified radiologist, and is not intended to be used for triage, or to recommend diagnosis.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the image analysis algorithms including a description of the algorithm inputs and outputs, each major component or block, and algorithm limitations.

(ii) A detailed description of pre-specified performance testing methods and dataset(s) used to assess whether the device will improve reader performance as intended and to characterize the standalone device performance. Performance testing includes one or more standalone tests, side-by-side comparisons, or a reader study, as applicable.

(iii) Results from performance testing that demonstrate that the device improves reader performance in the intended use population when used in accordance with the instructions for use. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, predictive value, and diagnostic likelihood ratio). The test dataset must contain a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.

(iv) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results; and cybersecurity).

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use.

(ii) A detailed description of the intended reading protocol.

(iii) A detailed description of the intended user and user training that addresses appropriate reading protocols for the device.

(iv) A detailed description of the device inputs and outputs.

(v) A detailed description of compatible imaging hardware and imaging protocols.

(vi) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.

(vii) Device operating instructions.

(viii) A detailed summary of the performance testing, including: test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.