IDx-DR is indicated for use by health care providers to automatically detect more than mild diabetic retinopathy (mtmDR) in adults diagnosed with diabetes who have not been previously diagnosed with diabetic retinopathy. IDx-DR is indicated for use with the Topcon NW400.
The IDx-DR consists of several components. A fundus camera is attached to a computer on which the IDx-DR Client is installed. The Client allows the user to interact with the server-based analysis software over a secure internet connection. Using the Client, users identify two fundus images per eye to be dispatched to IDx-Service, which is installed on a server hosted at a secure datacenter. IDx-DR Analysis, which runs inside IDx-Service, processes the fundus images and returns information on the image quality and the presence or absence of mtmDR to IDx-Service. IDx-Service then returns the results to the IDx-DR Client.
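The client-to-service round trip described above can be sketched as a minimal data flow. All names here (`Submission`, `Result`, `analyze`) are hypothetical illustrations of the described architecture, not the actual, proprietary IDx-DR interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical message types -- the real IDx-DR interfaces are proprietary.
@dataclass
class Submission:
    patient_id: str
    images: Dict[str, List[str]]   # e.g. {"OD": [img1, img2], "OS": [img1, img2]}

@dataclass
class Result:
    sufficient_quality: bool
    mtmdr_detected: Optional[bool]  # None when image quality is insufficient

def analyze(sub: Submission) -> Result:
    """Stand-in for the server-side IDx-DR Analysis step (hypothetical)."""
    for eye, imgs in sub.images.items():
        if len(imgs) != 2:           # the workflow dispatches two images per eye
            return Result(sufficient_quality=False, mtmdr_detected=None)
    # A real service would run the retinal analysis here; this stub
    # simply returns a fixed negative (no mtmDR) disease call.
    return Result(sufficient_quality=True, mtmdr_detected=False)

# Client side: dispatch two fundus images per eye and receive the result.
sub = Submission("anon-001", {"OD": ["od_1.jpg", "od_2.jpg"],
                              "OS": ["os_1.jpg", "os_2.jpg"]})
res = analyze(sub)
```

The point of the sketch is the shape of the exchange: the Client sends image references per eye, and the service replies with both an image-quality判定 replaced by a boolean here and a disease call only when quality is sufficient.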
Acceptance Criteria and Device Performance for IDx-DR
This document details the acceptance criteria for the IDx-DR device and summarizes the study conducted to demonstrate its performance.
1. Acceptance Criteria and Reported Device Performance
The primary outcomes for the IDx-DR study were sensitivity and specificity for detecting more than mild diabetic retinopathy (mtmDR). Pre-defined performance thresholds were established, and the study results demonstrate the device met these criteria.
| Metric | Acceptance Criteria (Threshold) | Reported Device Performance (Full Analyzable Set) | 95% Confidence Interval (Reported) |
|---|---|---|---|
| Sensitivity | 85.0% | 87.4% | 81.9% - 92.9% |
| Specificity | 82.5% | 89.5% | 86.9% - 93.1% |
| Imageability | Not explicitly stated | 96.1% | 94.0% - 96.8% |
| Positive Predictive Value (PPV) | Not explicitly stated | 72.7% (173/238) | Not reported |
| Negative Predictive Value (NPV) | Not explicitly stated | 95.7% (556/581) | Not reported |
Note: The reported performance also includes enrichment-corrected sensitivity and specificity, which were also high and met the thresholds.
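The point estimates above can be reconstructed from the 2x2 confusion table implied by the reported fractions (PPV = 173/238, NPV = 556/581), which gives TP = 173, FP = 65, FN = 25, TN = 556 over the 819 fully analyzable participants. The sketch below recomputes the metrics and a standard Wilson 95% interval; note that the published intervals were computed with the study's own (enrichment-aware) methods and will differ slightly from this textbook calculation.

```python
import math

# Counts inferred from the reported fractions (PPV 173/238, NPV 556/581);
# they sum to the 819 fully analyzable participants.
TP, FP, FN, TN = 173, 65, 25, 556

def wilson_ci(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

sensitivity = TP / (TP + FN)   # 173/198, about 87.4%
specificity = TN / (TN + FP)   # 556/621, about 89.5%
ppv = TP / (TP + FP)           # 173/238, about 72.7%
npv = TN / (TN + FN)           # 556/581, about 95.7%
sens_ci = wilson_ci(TP, TP + FN)
```

The inferred counts are internally consistent: 198 reference-positive plus 621 reference-negative participants equals the 819 analyzable participants reported in the study.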
2. Sample Size and Data Provenance for Test Set
- Sample Size (Test Set): 819 participants were fully analyzable in the pivotal clinical study.
- Data Provenance: The data were collected prospectively from 10 primary care sites across the United States. The target population was adults diagnosed with diabetes who had not been previously diagnosed with diabetic retinopathy. The study population was enriched by targeting enrollment of subjects with elevated hemoglobin A1c (HbA1c) levels.
3. Number and Qualifications of Experts for Ground Truth (Test Set)
- Number of Experts: Three experienced and validated readers.
- Qualifications of Experts: The readers were certified by the Fundus Photography Reading Center (FPRC) and had expertise in evaluating the severity of retinopathy and diabetic macular edema (DME) according to the Early Treatment for Diabetic Retinopathy Study (ETDRS) scale and Diabetic Retinopathy Clinical Research Network (DRCR) grading paradigm.
4. Adjudication Method for the Test Set
The adjudication method used for establishing the ground truth from the FPRC readers was a majority voting paradigm for the four widefield stereo image pairs.
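A majority-vote adjudication over three readers can be sketched as below. This is a simplification: the actual FPRC protocol operates on graded severity scales across the four widefield stereo image pairs, while here each reader's grade is reduced to a single mtmDR-positive/negative call.

```python
from collections import Counter

def majority_vote(grades):
    """Adjudicate one case from three reader grades.

    Each grade is a boolean: True if the reader judged the case
    mtmDR-positive. With three readers a strict majority always exists,
    so no tie-breaking rule is needed.
    """
    counts = Counter(grades)
    return counts[True] >= 2

# Three readers grade a case; two call it positive, so the reference
# standard for this case is mtmDR-positive.
assert majority_vote([True, True, False]) is True
assert majority_vote([False, True, False]) is False
```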
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No explicit MRMC comparative effectiveness study involving human readers' improvement with AI vs. without AI assistance was reported. The study focused on the standalone performance of the IDx-DR device against an expert-derived reference standard.
6. Standalone (Algorithm Only) Performance
Yes, a standalone performance study was conducted. The reported sensitivity, specificity, PPV, and NPV values are for the IDx-DR algorithm operating autonomously, without human-in-the-loop assistance during the diagnostic process.
7. Type of Ground Truth Used (Test Set)
The ground truth used was expert consensus based on comprehensive ophthalmic imaging (dilated four widefield stereo color fundus photography and macular optical coherence tomography (OCT) imaging) read by three experienced and validated readers at the Fundus Photography Reading Center (FPRC). The severity of retinopathy and DME was determined according to the ETDRS scale and DRCR grading paradigm, using a majority voting paradigm.
8. Sample Size for the Training Set
The document does not explicitly state the sample size used for the training set. It describes the clinical study as a pivotal clinical study with 900 enrolled patients, which formed the basis for evaluating the device's performance, but it does not specify what portion (if any) of this dataset was used for training or validation during the development phase. The language focuses on the "analyzable fraction" of participants for the primary outcomes, implying this was the test set.
9. How the Ground Truth for the Training Set Was Established
The document does not provide details on how the ground truth was established for the training set. It primarily describes the methodology for establishing the ground truth for the test set used in the pivotal clinical study. It mentions that IDx has provided a full characterization of the technical parameters of the software, including a description of the algorithms, and that IDx will make future algorithm improvements under a consistent medically relevant framework. However, the details of training data ground truth establishment are not discussed.
§ 886.1100 Retinal diagnostic software device.
(a) Identification. A retinal diagnostic software device is a prescription software device that incorporates an adaptive algorithm to evaluate ophthalmic images for diagnostic screening to identify retinal diseases or conditions.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Software verification and validation documentation, based on a comprehensive hazard analysis, must fulfill the following:
(i) Software documentation must provide a full characterization of technical parameters of the software, including algorithm(s).
(ii) Software documentation must describe the expected impact of applicable image acquisition hardware characteristics on performance and associated minimum specifications.
(iii) Software documentation must include a cybersecurity vulnerability and management process to assure software functionality.
(iv) Software documentation must include mitigation measures to manage failure of any subsystem components with respect to incorrect patient reports and operator failures.
(2) Clinical performance data supporting the indications for use must be provided, including the following:
(i) Clinical performance testing must evaluate sensitivity, specificity, positive predictive value, and negative predictive value for each endpoint reported for the indicated disease or condition across the range of available device outcomes.
(ii) Clinical performance testing must evaluate performance under anticipated conditions of use.
(iii) Statistical methods must include the following:
(A) Where multiple samples from the same patient are used, statistical analysis must not assume statistical independence without adequate justification.
(B) Statistical analysis must provide confidence intervals for each performance metric.
(iv) Clinical data must evaluate the variability in output performance due to both the user and the image acquisition device used.
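Special control (2)(iii)(A) addresses the fact that multiple samples from one patient (e.g., both eyes) are correlated, so eye-level data cannot be treated as independent without justification. One common way to honor this, sketched below with synthetic data, is a patient-level (clustered) bootstrap: whole patients are resampled, never individual eyes.

```python
import random

# Synthetic per-patient records: (true-positive eyes, diseased eyes).
# In a real analysis these would come from per-eye device results
# grouped by patient identifier.
random.seed(0)
patients = [(random.choice([1, 2]), 2) for _ in range(200)]

def sensitivity(sample):
    tp = sum(p[0] for p in sample)
    pos = sum(p[1] for p in sample)
    return tp / pos

def cluster_bootstrap_ci(patients, n_boot=2000, alpha=0.05):
    """Percentile CI that resamples whole patients, not individual eyes,
    so the within-patient correlation between eyes is preserved."""
    stats = []
    for _ in range(n_boot):
        resample = [random.choice(patients) for _ in patients]
        stats.append(sensitivity(resample))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

point = sensitivity(patients)
lo, hi = cluster_bootstrap_ci(patients)
```

A naive eye-level bootstrap would treat the 400 eyes as 400 independent observations and typically produce an interval that is too narrow; resampling at the patient level keeps the effective sample size honest.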
(3) A training program with instructions on how to acquire and process quality images must be provided.
(4) Human factors validation testing that evaluates the effect of the training program on user performance must be provided.
(5) A protocol must be developed that describes the level of change in device technical specifications that could significantly affect the safety or effectiveness of the device.
(6) Labeling must include:
(i) Instructions for use, including a description of how to obtain quality images and how device performance is affected by user interaction and user training;
(ii) The type of imaging data used, what the device outputs to the user, and whether the output is qualitative or quantitative;
(iii) Warnings regarding image acquisition factors that affect image quality;
(iv) Warnings regarding interpretation of the provided outcomes, including:
(A) A warning that the device is not to be used to screen for the presence of diseases or conditions beyond its indicated uses;
(B) A warning that the device provides a screening diagnosis only and that it is critical that the patient be advised to receive follow-up care; and
(C) A warning that the device does not treat the screened disease;
(v) A summary of the clinical performance of the device for each output, with confidence intervals; and
(vi) A summary of the clinical performance testing conducted with the device, including a description of the patient population and clinical environment under which it was evaluated.