The AEYE-DS is indicated for use by health care providers to automatically detect more than mild diabetic retinopathy (mtmDR) in adults diagnosed with diabetes who have not been previously diagnosed with diabetic retinopathy. The AEYE-DS is indicated for use with the Topcon NW400 camera and the Optomed Aurora camera.
AEYE-DS is a retinal diagnostic software device that incorporates an algorithm to evaluate retinal images for diagnostic screening to identify retinal diseases or conditions. Specifically, the AEYE-DS is designed to perform diagnostic screening for the condition of more-than-mild diabetic retinopathy (mtmDR).
The AEYE-DS comprises five software components: (1) Client; (2) Service; (3) Analytics; (4) Reporting and Archiving; and (5) System Security.
The AEYE-DS operates as follows. A fundus camera is used to obtain retinal images. The fundus camera is attached to a computer on which the Client module/software is installed. The Client module/software guides the user through image acquisition and enables the user to interact with the server-based analysis software over a secure internet connection. Using the Client module/software, users identify the fundus images per eye to be dispatched to the Service module/software. The Service module/software, installed on a server hosted at a secure datacenter, receives the fundus images and transfers them to the Analytics module/software. The Analytics module/software, which runs alongside the Service module/software, processes the fundus images and returns information on image quality and the presence or absence of mtmDR to the Service module/software. The Service module/software then returns the results to the Client module/software.
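As an illustration only, the data flow described above can be sketched in Python. All class and method names here are hypothetical, since the document does not publish the actual AEYE-DS interfaces:

```python
from dataclasses import dataclass

@dataclass
class AnalysisResult:
    image_quality_ok: bool  # image quality sufficient for analysis
    mtmdr_detected: bool    # mtmDR present (meaningful only when quality is ok)

class AnalyticsModule:
    """Runs alongside the Service module; grades image quality and detects mtmDR."""

    def analyze(self, fundus_images: list[bytes]) -> AnalysisResult:
        quality_ok = all(len(img) > 0 for img in fundus_images)  # stand-in quality check
        mtmdr = quality_ok and self._detect_mtmdr(fundus_images)
        return AnalysisResult(quality_ok, mtmdr)

    def _detect_mtmdr(self, fundus_images: list[bytes]) -> bool:
        return False  # placeholder for the proprietary detection algorithm

class ServiceModule:
    """Server-side module: receives fundus images from the Client, delegates to Analytics."""

    def __init__(self, analytics: AnalyticsModule):
        self.analytics = analytics

    def handle_submission(self, images_per_eye: dict[str, list[bytes]]) -> dict[str, AnalysisResult]:
        # One result per eye is returned to the Client over the secure connection.
        return {eye: self.analytics.analyze(imgs) for eye, imgs in images_per_eye.items()}
```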
Below is a breakdown of the acceptance criteria and the studies demonstrating that the AEYE-DS device meets them, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The document primarily focuses on establishing substantial equivalence to a predicate device (AEYE-DS, K221183) rather than explicitly listing pre-defined, quantitative acceptance criteria for each metric, as one might find in a clinical trial protocol. However, the implied acceptance criteria can be inferred from the predicate device's performance, against which the subject device's results from two studies are compared. The table below presents the key performance metrics reported for the subject device (AEYE-DS, K240058, with the Optomed Aurora camera) alongside the criteria implied by the predicate device (AEYE-DS, K221183, with the Topcon NW400 camera).
| Metric | Acceptance Criteria (Implied by Predicate Performance) | AEYE-DS (K240058) with Optomed Aurora, Study 1 | AEYE-DS (K240058) with Optomed Aurora, Study 2 |
|---|---|---|---|
| Sensitivity | ≥ 93% | 92% [79%; 97%] (fundus-based and multi-modality-based) | 93% [80%; 97%] (fundus-based); 90% [77%; 96%] (multi-modality-based) |
| Specificity | ≥ 91% | 94% [90%; 96%] (fundus-based and multi-modality-based) | 89% [85%; 92%] (fundus-based and multi-modality-based) |
| Imageability | ≥ 99% | 99% [98%; 100%] | 99% [97%; 100%] |
| PPV | ≥ 60% | 68% [54%; 79%] | 53% [41%; 64%] |
| NPV | ≥ 99% | 99% [96%; 100%] | 99% [97%; 100%] (fundus-based); 98% [96%; 99%] (multi-modality-based) |
Note: The PPV in Study 2 (53%) falls below the predicate's performance (60%). The document attributes this to the prevalence of mtmDR+ patients in that study's diabetic population (12%): PPV necessarily falls as disease prevalence falls, even at constant sensitivity and specificity. It argues that the similar PPV and NPV results across both studies, despite this, demonstrate the robustness of the studies. The overall conclusion is substantial equivalence.
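To make the prevalence dependence concrete, here is a minimal sketch (Python, not from the submission) that derives PPV and NPV from sensitivity, specificity, and prevalence via Bayes' rule. Plugging in the Study 2 fundus-based point estimates reproduces a PPV close to the reported 53%:

```python
def ppv_npv(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Predictive values via Bayes' rule; PPV depends strongly on prevalence."""
    tp = sensitivity * prevalence              # P(test+, disease+)
    fp = (1 - specificity) * (1 - prevalence)  # P(test+, disease-)
    tn = specificity * (1 - prevalence)        # P(test-, disease-)
    fn = (1 - sensitivity) * prevalence        # P(test-, disease+)
    return tp / (tp + fp), tn / (tn + fn)

# Study 2 fundus-based point estimates: 93% sensitivity, 89% specificity,
# ~12% mtmDR+ prevalence in the enrolled diabetic population.
ppv, npv = ppv_npv(0.93, 0.89, 0.12)
print(f"PPV ~ {ppv:.0%}, NPV ~ {npv:.0%}")  # PPV ~ 54%, NPV ~ 99%
```

The small gap between the computed 54% and the reported 53% reflects rounding of the published sensitivity and specificity; the exact study counts are not given in the document.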
2. Sample Sizes Used for the Test Set and Data Provenance
- Study 1 Sample Size: 317 subjects
- Study 2 Sample Size: 362 subjects
- Data Provenance: Both studies were prospective, multi-center, single-arm, blinded studies conducted at study sites in the United States.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
The ground truth was established by an independent reading center. The exact number of experts (readers) is not specified, but their role in determining retinopathy severity and clinically significant diabetic macular edema (DME) according to the Early Treatment Diabetic Retinopathy Study (ETDRS) severity scale implies a high level of expertise, typical of ophthalmic specialists or certified graders.
4. Adjudication Method for the Test Set
The document states that the "Reading Center diagnostic results formed the reference standard (ground truth) for the study." It does not explicitly describe an adjudication method (e.g., 2+1, 3+1) among multiple readers within the reading center. It implies a single, definitive determination by the reading center.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No multi-reader multi-case (MRMC) comparative effectiveness study was done. The studies were designed to evaluate the standalone performance of the AEYE-DS device, not to compare its performance in assisting human readers. The device is intended to "automatically detect" mtmDR.
6. Standalone (Algorithm Only) Performance
Yes, a standalone (algorithm only) performance evaluation was done. The reported sensitivity, specificity, PPV, and NPV values are for the AEYE-DS device's automated detection of mtmDR.
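For reference, the four reported metrics follow directly from the device's standalone 2x2 confusion matrix against the reading-center ground truth. A minimal sketch (the counts below are hypothetical, not the study's data):

```python
def standalone_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Point estimates of the four reported metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # mtmDR+ subjects correctly flagged
        "specificity": tn / (tn + fp),  # mtmDR- subjects correctly cleared
        "ppv": tp / (tp + fp),          # flagged results that are truly mtmDR+
        "npv": tn / (tn + fn),          # cleared results that are truly mtmDR-
    }

# Hypothetical counts for illustration only:
print(standalone_metrics(tp=42, fp=20, tn=280, fn=3))
```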
7. Type of Ground Truth Used
The ground truth used was expert consensus / standardized clinical assessment based on:
- Dilated four-widefield color fundus images
- Lens photography for media opacity assessment
- Macular optical coherence tomography (OCT) imaging
- Severity determination according to the Early Treatment Diabetic Retinopathy Study (ETDRS) scale by an independent reading center.
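The document does not state the exact operational cut-off, but a common operationalization in the DR-screening literature treats a case as mtmDR+ when the ETDRS severity level is 35 or higher and/or clinically significant DME is present. A sketch under that assumption:

```python
def mtmdr_label(etdrs_level: int, has_csdme: bool, threshold: int = 35) -> bool:
    # Assumed definition: mtmDR+ if the ETDRS severity level meets the threshold
    # and/or clinically significant DME is present. The 35 cut-off is an
    # assumption drawn from common usage, not stated in this document.
    return etdrs_level >= threshold or has_csdme
```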
8. Sample Size for the Training Set
The document does not explicitly state the sample size for the training set. The clinical studies (Study 1 and Study 2) are described as the basis for the performance evaluation of the device (i.e., the test set performance). The training of the AI model would have occurred prior to these validation studies.
9. How the Ground Truth for the Training Set Was Established
The document does not explicitly describe how the ground truth for the training set was established. However, it is standard practice for AI models in medical imaging to be trained on large datasets where ground truth is established by experienced clinical experts (e.g., ophthalmologists, retina specialists) thoroughly reviewing and annotating images, often with consensus protocols, similar to the method described for the test set's ground truth (ETDRS grading by a reading center). Given the device's predicate status and the detailed description of the ground truth for the test sets, it is highly probable that a rigorous, expert-based process was applied to the training data as well.
§ 886.1100 Retinal diagnostic software device.
(a) Identification. A retinal diagnostic software device is a prescription software device that incorporates an adaptive algorithm to evaluate ophthalmic images for diagnostic screening to identify retinal diseases or conditions.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Software verification and validation documentation, based on a comprehensive hazard analysis, must fulfill the following:
(i) Software documentation must provide a full characterization of technical parameters of the software, including algorithm(s).
(ii) Software documentation must describe the expected impact of applicable image acquisition hardware characteristics on performance and associated minimum specifications.
(iii) Software documentation must include a cybersecurity vulnerability and management process to assure software functionality.
(iv) Software documentation must include mitigation measures to manage failure of any subsystem components with respect to incorrect patient reports and operator failures.
(2) Clinical performance data supporting the indications for use must be provided, including the following:
(i) Clinical performance testing must evaluate sensitivity, specificity, positive predictive value, and negative predictive value for each endpoint reported for the indicated disease or condition across the range of available device outcomes.
(ii) Clinical performance testing must evaluate performance under anticipated conditions of use.
(iii) Statistical methods must include the following:
(A) Where multiple samples from the same patient are used, statistical analysis must not assume statistical independence without adequate justification.
(B) Statistical analysis must provide confidence intervals for each performance metric.
(iv) Clinical data must evaluate the variability in output performance due to both the user and the image acquisition device used.
(3) A training program with instructions on how to acquire and process quality images must be provided.
(4) Human factors validation testing that evaluates the effect of the training program on user performance must be provided.
(5) A protocol must be developed that describes the level of change in device technical specifications that could significantly affect the safety or effectiveness of the device.
(6) Labeling must include:
(i) Instructions for use, including a description of how to obtain quality images and how device performance is affected by user interaction and user training;
(ii) The type of imaging data used, what the device outputs to the user, and whether the output is qualitative or quantitative;
(iii) Warnings regarding image acquisition factors that affect image quality;
(iv) Warnings regarding interpretation of the provided outcomes, including:
(A) A warning that the device is not to be used to screen for the presence of diseases or conditions beyond its indicated uses;
(B) A warning that the device provides a screening diagnosis only and that it is critical that the patient be advised to receive followup care; and
(C) A warning that the device does not treat the screened disease;
(v) A summary of the clinical performance of the device for each output, with confidence intervals; and
(vi) A summary of the clinical performance testing conducted with the device, including a description of the patient population and clinical environment under which it was evaluated.
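Special control (2)(iii)(B) is why every metric in the performance table above carries a bracketed confidence interval. The document does not state which interval method the sponsor used; one standard choice for binomial proportions is the Wilson score interval, sketched below with hypothetical counts:

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# Hypothetical counts, not the study's actual data:
lo, hi = wilson_ci(successes=42, n=45)
print(f"sensitivity {42/45:.0%} [{lo:.0%}; {hi:.0%}]")  # 93% [82%; 98%]
```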