Imbio IQ-UIP is computer-aided software indicated for use in passively notifying specialists associated with interstitial lung disease (ILD) centers of radiological findings suggestive of radiological usual interstitial pneumonia (UIP) in non-contrast chest CT scans of adults. Imbio IQ-UIP uses an artificial intelligence algorithm to analyze images and identify positive findings on a worklist application separate from, and in parallel to, the standard-of-care radiological image interpretation. Identification of positive findings includes summary reports, meant for informational purposes only, with a clinical guideline reference for the definition of the UIP pattern. The device does not alter the original medical image and is not intended to be used as a diagnostic device.
The results of Imbio IQ-UIP are used to notify specialists at an ILD center of radiological findings that may be consistent with UIP. These specialists are qualified clinicians experienced in evaluating chest CTs for ILD. Input images originate from within the same hospital network associated with the ILD center. The results of Imbio IQ-UIP are intended to be used in conjunction with additional patient information and based on the user's professional judgment, to assist with the review of medical images. Notified clinicians are responsible for viewing full image series and making final clinical determinations.
The development of the deep learning inference model utilized anonymized, multi-center, retrospective, volumetric chest CT scans from several private and public data sources, including multiple hospitals, clinical imaging centers, and imaging databases. Chest CT datasets were identified such that each dataset represented an individual subject and acquisition. Data were subdivided roughly 80%:20% into "bins" for the two stages of model development: 1) model training and validation (i.e., hyper-parameter tuning) and 2) model testing (i.e., performance assessment). Site independence was maintained for several of the databases with clinical location labels by randomly assigning each clinic location an integer value between 1 and 1000. Then, proceeding from the lowest to the highest random integer value, all datasets from a given clinic location were assigned to the training bin until 80% of the total number of datasets from that database had been assigned; the remainder were assigned to the testing bin. The testing dataset was locked and quarantined from the datasets used in the device's model training and validation.
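The site-independent split described above can be sketched as follows. This is a minimal illustration, not the manufacturer's implementation; the data structure (a list of dataset-ID/clinic-location pairs) and function name are assumptions.

```python
import random
from collections import defaultdict

def site_independent_split(datasets, train_fraction=0.8, seed=0):
    """Split datasets into training and testing bins while keeping every
    scan from a given clinic location in the same bin (site independence).

    `datasets` is a list of (dataset_id, clinic_location) pairs; the field
    names here are illustrative, not taken from the 510(k) summary.
    """
    rng = random.Random(seed)
    # Group datasets by clinic location.
    by_site = defaultdict(list)
    for dataset_id, site in datasets:
        by_site[site].append(dataset_id)
    # Assign each clinic a random integer in [1, 1000] and order sites by it.
    order = sorted(by_site, key=lambda _site: rng.randint(1, 1000))
    train, test = [], []
    target = train_fraction * len(datasets)
    for site in order:
        # Fill the training bin until ~80% of datasets are assigned,
        # then send all remaining sites to the testing bin.
        bin_ = train if len(train) < target else test
        bin_.extend(by_site[site])
    return train, test
```

Because whole sites are assigned at once, the realized split is only approximately 80%:20%, which matches the "roughly" qualifier in the summary.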
This document details the acceptance criteria and the study that proves the device (Imbio IQ-UIP) meets these criteria, based on the provided FDA 510(k) summary.
Device Name: Imbio IQ-UIP
Intended Use: Computer-aided software indicated for passively notifying specialists associated with interstitial lung disease (ILD) centers of radiological findings suggestive of radiological usual interstitial pneumonia (UIP) in non-contrast, chest CT scans of adults. It uses an AI algorithm to analyze images and identify positive findings on a worklist application, separate from and in parallel to standard-of-care radiological image interpretation. The device does not alter the original medical image and is not intended to be used as a diagnostic device.
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria are not explicitly stated as quantitative thresholds in the provided document. However, the study focuses on evaluating the device's performance metrics (AUC ROC, PPV, Specificity, Sensitivity) in identifying radiological UIP patterns. The "acceptance" is implied by the reported performance figures that demonstrate the device's ability to meet its intended purpose of identifying findings "suggestive of radiological usual interstitial pneumonia."
| Performance Metric | Reported Device Performance |
|---|---|
| AUC ROC | 96.6 [95.4, 97.7] |
| PPV | 77.9 [73.3, 82.8] |
| Specificity | 91.5 [89.2, 93.7] |
| Sensitivity | 90.2 [86.2, 94.3] |
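The PPV, specificity, and sensitivity point estimates above derive from a standard 2x2 confusion matrix. A minimal sketch of those definitions (as percentages, matching the table) is below; the function name and the example counts in the usage are hypothetical, and the bracketed confidence intervals in the summary would require the underlying per-case data, which is not reproduced here.

```python
def binary_metrics(tp, fp, tn, fn):
    """Point estimates (as percentages) from a 2x2 confusion matrix:
    tp/fp/tn/fn = true/false positives and negatives."""
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    ppv = 100.0 * tp / (tp + fp)           # positive predictive value
    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "ppv": ppv}
```

Note that AUC ROC cannot be computed from a single confusion matrix; it summarizes performance across all decision thresholds.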
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for Test Set: 804 individual patient images.
- Data Provenance: Anonymized, multi-center, retrospective, volumetric chest CT scans from several private and public data sources, including multiple hospitals, clinical imaging centers, and imaging databases. The country of origin is not explicitly stated but can be inferred to be primarily the United States given the use of U.S. board-certified radiologists for ground truthing.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: Five experts (referred to as "truthers").
- Qualifications of Experts:
- U.S. board-certified radiologists.
- Practicing within the United States.
- Minimum of five years of experience evaluating chest CTs for ILDs.
- Clinical familiarity with using the ATS/ERS/JRS/ALAT diagnostic categories for UIP pattern.
- None involved in the development of the algorithm/device, ensuring independence.
4. Adjudication Method for the Test Set
The document does not explicitly state the adjudication method (e.g., 2+1, 3+1). It only mentions that five experts "performed ground truthing" of the performance datasets. Therefore, the specific method for resolving disagreements or arriving at a consensus ground truth amongst the five experts is not detailed.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
The provided information does not indicate that an MRMC comparative effectiveness study was done to compare human readers with AI assistance vs. without AI assistance. The study focuses on a standalone performance assessment of the AI algorithm.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Was Done
Yes, a standalone performance assessment was done. The reported performance metrics (AUC ROC, PPV, Specificity, Sensitivity) are from the device's independent analysis of images, without human intervention during the assessment. The document explicitly calls this "standalone performance assessment."
7. The Type of Ground Truth Used
The ground truth used was expert consensus based on the evaluation by five U.S. board-certified radiologists with specific experience in ILD and UIP pattern diagnosis using established clinical guidelines (ATS/ERS/JRS/ALAT diagnostic categories).
8. The Sample Size for the Training Set
The document states that data was subdivided into "bins" for model development, with roughly 80% assigned to model training and validation (i.e., hyper-parameter tuning) and 20% for model testing (performance assessment). Since the test set was 804 images, the total number of unique datasets used for both training/validation and testing would be approximately 804 / 0.20 = 4020.
Therefore, the training set sample size would be approximately 3216 datasets (80% of 4020).
9. How the Ground Truth for the Training Set Was Established
The document states that, for model development, the data consisted of "anonymized, multi-center, retrospective, volumetric chest CT scans from several different, private and public data sources including multiple hospitals, clinical imaging centers, and imaging databases." It does not explicitly detail how ground truth was established for the training set. However, given the nature of AI/ML model development for medical imaging, it is highly probable that the training data was also annotated or labeled by experts, or derived from clinical records/diagnoses that implicitly represent ground truth. The emphasis on independent "truthers" for the test set suggests a rigorous approach to testing, but the specifics of training data labeling are not provided in this summary.