K Number
K213037
Device Name
IDx-DR v2.3
Date Cleared
2022-06-17 (269 days)

Product Code
Regulation Number
886.1100
Panel
OP
Reference & Predicate Devices
IDx-DR v2.0 (predicate)
Predicate For
N/A
Intended Use

IDx-DR is indicated for use by healthcare providers to automatically detect more than mild diabetic retinopathy (mtmDR) in adults diagnosed with diabetes who have not been previously diagnosed with diabetic retinopathy. IDx-DR is indicated for use with the Topcon NW400.

Device Description

The IDx-DR device is an autonomous, artificial intelligence (AI)-based system for the automated detection of more than mild diabetic retinopathy (mtmDR). It consists of several component parts: IDx-DR Analysis, IDx-DR Client, and IDx-DR Service. The IDx-DR Analysis software analyzes patient images and determines exam quality and the presence/absence of mtmDR. The IDx-DR Client is a software application running on a computer connected to the fundus camera, allowing users to transfer images and receive results. The IDx-DR Service comprises a general exam analysis service delivery software package with a webserver front-end, database, and logging system, and is responsible for device cybersecurity. The system workflow involves image acquisition using the Topcon NW400, transfer to IDx-DR Service, analysis by IDx-DR Analysis System, and display of results on the IDx-DR Client.
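To make this workflow concrete, below is a minimal, hypothetical sketch of the client-to-service round trip. The endpoint paths, field names, and polling scheme are illustrative assumptions, not the proprietary IDx-DR interface.

```python
# Hypothetical sketch of the acquire -> submit -> analyze -> display loop
# described above. SERVICE_URL, the /exams endpoints, and the JSON fields
# are invented for illustration; the actual IDx-DR interface is proprietary.
import time
import requests

SERVICE_URL = "https://idx-service.example.com"  # placeholder host

def submit_exam(image_paths: list) -> str:
    """Upload the fundus images for one exam; return a server-assigned exam id."""
    files = [("images", open(p, "rb")) for p in image_paths]
    resp = requests.post(f"{SERVICE_URL}/exams", files=files, timeout=30)
    resp.raise_for_status()
    return resp.json()["exam_id"]

def poll_result(exam_id: str, interval: float = 2.0) -> dict:
    """Poll the service until the analysis result for the exam is ready."""
    while True:
        resp = requests.get(f"{SERVICE_URL}/exams/{exam_id}", timeout=30)
        resp.raise_for_status()
        body = resp.json()
        # Assumed outcomes mirroring the device outputs: mtmDR detected,
        # mtmDR not detected, or insufficient exam quality (re-acquire images).
        if body["status"] == "complete":
            return body["result"]
        time.sleep(interval)
```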

AI/ML Overview

The provided text describes a 510(k) submission for IDx-DR v2.3, a diabetic retinopathy detection device. The submission aims to demonstrate substantial equivalence to a predicate device (IDx-DR v2.0).

Here's an analysis of the acceptance criteria and the study that proves the device meets them:

1. A table of acceptance criteria and the reported device performance

The document implicitly uses the performance of the predicate device (IDx-DR v2.0) as the acceptance criteria for the new version (IDx-DR v2.3). The study's goal is to show that IDx-DR v2.3 performs comparably to or better than IDx-DR v2.0. The primary endpoints are sensitivity, specificity, and "diagnosability." Secondary endpoints are positive predictive value (PPV) and negative predictive value (NPV).

Here's a table comparing the performance of the subject device (IDx-DR v2.3) and the predicate device (IDx-DR v2.0) based on "final submission" images, which are the most relevant for diagnostic performance. The document presents performance as ranges; for clarity, I've used the point estimates presented in the tables for both devices. The values in parentheses are 95% confidence intervals.

| Characteristic | Predicate Device (IDx-DR v2.0) | Subject Device (IDx-DR v2.3) |
| --- | --- | --- |
| Primary Endpoints | | |
| Diagnosability (final submission) | 96.35% (94.86%, 97.51%) | 95.18% (93.51%, 96.52%) |
| Sensitivity | 87.37% (81.93%, 91.66%) | 87.69% (82.24%, 91.95%) |
| Specificity | 89.53% (86.85%, 91.83%) | 90.07% (87.42%, 92.32%) |
| Secondary Endpoints | | |
| Positive Predictive Value | 72.69% (66.56%, 78.25%) | 73.71% (67.55%, 79.25%) |
| Negative Predictive Value | 95.70% (93.71%, 97.20%) | 95.84% (93.87%, 97.32%) |

The document concludes that "The results of the clinical study support a determination of substantial equivalence between IDx-DR v2.3 and IDx-DR v2.0." This implies that the observed performance of IDx-DR v2.3 falls within an acceptable range, demonstrating non-inferiority or similar performance to the predicate device. Specific numerical acceptance thresholds (e.g., "must be at least X%") are not explicitly stated, but the comparison to the existing cleared device acts as the benchmark.
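As context for the intervals above: the submission does not state which interval method was used, but two-sided 95% binomial CIs of this kind are commonly exact (Clopper–Pearson) intervals. A minimal sketch under that assumption, with hypothetical counts chosen so the point estimate lands near the reported 87.69% sensitivity:

```python
# Exact (Clopper-Pearson) two-sided binomial CI for a proportion such as
# sensitivity. The method and the counts below are assumptions for
# illustration; the submission reports only percentages and intervals.
from scipy.stats import beta

def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Return (lower, upper) bounds of the exact 100*(1-alpha)% binomial CI."""
    lower = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes else 0.0
    upper = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lower, upper

# Hypothetical: 171 of 195 reference-positive exams called positive -> ~87.7%
print(clopper_pearson(171, 195))  # approx (0.82, 0.92)
```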

2. Sample size used for the test set and the data provenance

  • Sample Size for Test Set: Data from 892 participants from the pivotal study of the predicate device were used. Of these, images from 850 participants were available for analysis and were diagnosable by the clinical reference standard, making them evaluable for performance.
  • Data Provenance: The data was retrospectively collected from the pivotal study of the predicate device ("IDx-DR v2.0"; Abràmoff et al. npj Digital Medicine 2018;1:39). The country of origin is not explicitly stated in the provided text.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

The document refers to a "clinical reference standard" and states that IDx-DR has "the ability to perform analysis on the specific disease features that are important to a retina specialist for diagnostic screening of diabetic retinopathy." However, the exact number of experts, their specific qualifications (e.g., number of years of experience, board certification), and their role in establishing the ground truth for the test set are not explicitly detailed in the provided text. It mentions an article by Abràmoff et al. (2018), which likely describes the ground truth establishment for the original pivotal study.

4. Adjudication method for the test set

The adjudication method used to establish the clinical reference standard for the test set is not explicitly stated in the provided text. It mentions a "clinical reference standard" but does not detail how it was established (e.g., 2+1, 3+1, etc.).

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance

No, a multi-reader multi-case (MRMC) comparative effectiveness study was not conducted. The study evaluated the standalone performance of the algorithm (IDx-DR v2.3) by comparing it against the clinical reference standard, and then comparing its performance to the predicate algorithm (IDx-DR v2.0). There is no mention of human readers assisting the AI, nor is there any data on human reader improvement with AI assistance.

6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done

Yes, a standalone (algorithm only) performance study was conducted. The study assesses the ability of IDx-DR v2.3 to automatically detect more than mild diabetic retinopathy (mtmDR) and compares its sensitivity, specificity, and diagnosability to the predicate device's standalone performance.
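A minimal sketch of what such a standalone evaluation computes, assuming a simple per-exam record structure (the field names and the diagnosability definition below are illustrative, not taken from the submission):

```python
# Sketch of algorithm-only scoring against a clinical reference standard.
# The record layout and the diagnosability definition (share of exams with a
# gradable device output) are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class ExamResult:
    device_output: str     # "positive", "negative", or "ungradable"
    reference_mtmdr: bool  # clinical reference standard: mtmDR present?

def score(results):
    gradable = [r for r in results if r.device_output != "ungradable"]
    tp = sum(r.device_output == "positive" and r.reference_mtmdr for r in gradable)
    fp = sum(r.device_output == "positive" and not r.reference_mtmdr for r in gradable)
    fn = sum(r.device_output == "negative" and r.reference_mtmdr for r in gradable)
    tn = sum(r.device_output == "negative" and not r.reference_mtmdr for r in gradable)
    return {
        "diagnosability": len(gradable) / len(results),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```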

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)

The ground truth used for the test set is referred to as a "clinical reference standard" by which participants were "diagnosable." This strongly implies expert consensus by retina specialists, as suggested by the mention of the algorithm identifying "specific disease features that are important to a retina specialist." However, the exact methodology is not detailed within this document.

8. The sample size for the training set

The document does not provide information regarding the sample size of the training set used for IDx-DR v2.3. The provided study is a retrospective validation of the modified algorithm using a pre-existing dataset.

9. How the ground truth for the training set was established

The document does not provide information on how the ground truth for the training set was established. It focuses solely on the clinical performance testing (validation) of the device using a pre-existing test set.

§ 886.1100 Retinal diagnostic software device.

(a) Identification. A retinal diagnostic software device is a prescription software device that incorporates an adaptive algorithm to evaluate ophthalmic images for diagnostic screening to identify retinal diseases or conditions.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Software verification and validation documentation, based on a comprehensive hazard analysis, must fulfill the following:
(i) Software documentation must provide a full characterization of technical parameters of the software, including algorithm(s).
(ii) Software documentation must describe the expected impact of applicable image acquisition hardware characteristics on performance and associated minimum specifications.
(iii) Software documentation must include a cybersecurity vulnerability and management process to assure software functionality.
(iv) Software documentation must include mitigation measures to manage failure of any subsystem components with respect to incorrect patient reports and operator failures.
(2) Clinical performance data supporting the indications for use must be provided, including the following:
(i) Clinical performance testing must evaluate sensitivity, specificity, positive predictive value, and negative predictive value for each endpoint reported for the indicated disease or condition across the range of available device outcomes.
(ii) Clinical performance testing must evaluate performance under anticipated conditions of use.
(iii) Statistical methods must include the following:
(A) Where multiple samples from the same patient are used, statistical analysis must not assume statistical independence without adequate justification.
(B) Statistical analysis must provide confidence intervals for each performance metric.
(iv) Clinical data must evaluate the variability in output performance due to both the user and the image acquisition device used.
(3) A training program with instructions on how to acquire and process quality images must be provided.
(4) Human factors validation testing that evaluates the effect of the training program on user performance must be provided.
(5) A protocol must be developed that describes the level of change in device technical specifications that could significantly affect the safety or effectiveness of the device.
(6) Labeling must include:
(i) Instructions for use, including a description of how to obtain quality images and how device performance is affected by user interaction and user training;
(ii) The type of imaging data used, what the device outputs to the user, and whether the output is qualitative or quantitative;
(iii) Warnings regarding image acquisition factors that affect image quality;
(iv) Warnings regarding interpretation of the provided outcomes, including:
(A) A warning that the device is not to be used to screen for the presence of diseases or conditions beyond its indicated uses;
(B) A warning that the device provides a screening diagnosis only and that it is critical that the patient be advised to receive followup care; and
(C) A warning that the device does not treat the screened disease;
(v) A summary of the clinical performance of the device for each output, with confidence intervals; and
(vi) A summary of the clinical performance testing conducted with the device, including a description of the patient population and clinical environment under which it was evaluated.
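As an aside on special control (2)(iii): when several images or eyes come from the same patient, a patient-level (cluster) bootstrap is one standard way to report a confidence interval per metric without assuming statistical independence. A minimal sketch with hypothetical data (not from the submission):

```python
# Sketch of a patient-level (cluster) bootstrap CI for sensitivity, one way to
# satisfy special controls (2)(iii)(A)-(B) when several samples come from the
# same patient. The data layout is a hypothetical example.
import random
from collections import defaultdict

# samples: (patient_id, device_positive, reference_positive)
samples = [("p1", True, True), ("p1", True, True), ("p2", False, True),
           ("p3", True, False), ("p4", False, False), ("p5", True, True)]

by_patient = defaultdict(list)
for pid, dev, ref in samples:
    by_patient[pid].append((dev, ref))

def sensitivity(groups):
    flat = [s for g in groups for s in g]
    pos = [dev for dev, ref in flat if ref]
    return sum(pos) / len(pos)

patients = list(by_patient.values())
boot = []
for _ in range(2000):
    draw = [random.choice(patients) for _ in patients]  # resample whole patients
    try:
        boot.append(sensitivity(draw))
    except ZeroDivisionError:  # a draw may contain no reference positives
        continue
boot.sort()
lo, hi = boot[int(0.025 * len(boot))], boot[int(0.975 * len(boot))]
print(f"sensitivity {sensitivity(patients):.2%}, 95% CI ({lo:.2%}, {hi:.2%})")
```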