EyeArt is indicated for use by healthcare providers to automatically detect more than mild diabetic retinopathy and vision-threatening diabetic retinopathy (severe non-proliferative diabetic retinopathy or proliferative diabetic retinopathy and/or diabetic macular edema) in eyes of adults diagnosed with diabetes who have not been previously diagnosed with more than mild diabetic retinopathy. EyeArt is indicated for use with Canon CR-2 AF and Canon CR-2 Plus AF cameras in both primary care and eye care settings.
EyeArt is a Software as a Medical Device (SaMD) that consists of three components: the Client, the Server, and the Analysis Computation Engine. A retinal fundus camera, used to capture retinal fundus images of the patient, is connected to a computer on which the EyeArt Client software is installed. The Client provides a graphical user interface (GUI) through which the EyeArt operator transfers the appropriate fundus images to, and receives results from, the remote EyeArt Analysis Computation Engine via the EyeArt Server. The Analysis Computation Engine is installed on remote computer(s) in a secure data center and uses artificial intelligence algorithms to analyze the fundus images and return results. EyeArt is intended to be used with color fundus images of resolution 1.69 megapixels or higher captured using one of the indicated color fundus cameras (Canon CR-2 AF and Canon CR-2 Plus AF) with a 45-degree field of view. EyeArt is specified for use with two color fundus images per eye: one optic nerve head (ONH)-centered and one macula-centered.
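The component split described above can be modeled with a minimal in-process sketch. All class names, the result shape, and the direct method calls here are assumptions for illustration only; the real system communicates over a network to a secure data center.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisEngine:
    """Stand-in for the remote AI analysis component."""

    def analyze(self, images):
        # Placeholder result; the real engine returns per-eye DR findings.
        return {"mtmDR": "negative", "vtDR": "negative", "n_images": len(images)}


@dataclass
class Server:
    """Relays submissions from the Client to the Analysis Computation Engine."""

    engine: AnalysisEngine = field(default_factory=AnalysisEngine)

    def submit(self, images):
        return self.engine.analyze(images)


@dataclass
class Client:
    """Operator-facing component; sends images and receives results."""

    server: Server

    def screen_eye(self, onh_image, macula_image):
        # Two 45-degree fields per eye: ONH-centered and macula-centered.
        return self.server.submit([onh_image, macula_image])
```

A call like `Client(Server()).screen_eye(onh_bytes, macula_bytes)` mirrors the two-image-per-eye workflow the indication specifies.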
For each patient eye, the EyeArt results separately indicate whether more than mild diabetic retinopathy (mtmDR) and vision-threatening diabetic retinopathy (vtDR) are detected. More than mild diabetic retinopathy is defined as the presence of moderate non-proliferative diabetic retinopathy or worse on the International Clinical Diabetic Retinopathy (ICDR) severity scale and/or the presence of diabetic macular edema. Vision-threatening diabetic retinopathy is defined as the presence of severe non-proliferative diabetic retinopathy or proliferative diabetic retinopathy on the ICDR severity scale and/or the presence of diabetic macular edema.
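The two case definitions reduce to a simple decision rule over the ICDR grade and a DME flag. A minimal sketch, assuming an ordered encoding of the ICDR scale; the grade names and the `classify_eye` function are illustrative, not part of EyeArt:

```python
# Ordered ICDR severity levels, least to most severe.
ICDR_LEVELS = ["none", "mild_npdr", "moderate_npdr", "severe_npdr", "pdr"]


def classify_eye(icdr_grade: str, dme_present: bool) -> dict:
    """Map an ICDR grade plus a DME flag to mtmDR/vtDR flags per the definitions above."""
    severity = ICDR_LEVELS.index(icdr_grade)
    # mtmDR: moderate NPDR or worse, and/or diabetic macular edema.
    mtmdr = severity >= ICDR_LEVELS.index("moderate_npdr") or dme_present
    # vtDR: severe NPDR or PDR, and/or diabetic macular edema.
    vtdr = severity >= ICDR_LEVELS.index("severe_npdr") or dme_present
    return {"mtmDR": mtmdr, "vtDR": vtdr}
```

Note that DME alone makes an eye positive for both outputs, so every vtDR-positive eye is also mtmDR-positive.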
Here's a breakdown of the EyeArt device's acceptance criteria and the study that proves it meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
Device: EyeArt (v2.1.0)
Indication for Use: To automatically detect more than mild diabetic retinopathy (mtmDR) and vision-threatening diabetic retinopathy (vtDR) in adults diagnosed with diabetes who have not been previously diagnosed with more than mild diabetic retinopathy.
| Metric | Acceptance Criteria (implicit; no numerical thresholds stated) | Reported Device Performance (worst case across cohorts/outcomes) |
|---|---|---|
| Sensitivity (mtmDR) | High (e.g., above 90%) | 92.9% (Enrichment-permitted, Primary Care) |
| Specificity (mtmDR) | High (e.g., above 85%) | 85.2% (Enrichment-permitted, Ophthalmology) |
| Sensitivity (vtDR) | High (e.g., near 90% or 100% for smaller groups) | 88.9% (Sequential, Ophthalmology) |
| Specificity (vtDR) | High (e.g., above 89%) | 89.8% (Enrichment-permitted, Ophthalmology) |
| Imageability | High (e.g., above 95%) | 96.5% (Enrichment-permitted, Ophthalmology, Sequence P1/P2/P3) |
| Intra-operator Repeatability (OA, mtmDR) | High (e.g., above 90%) | 93.5% (Cohort P2) |
| Intra-operator Repeatability (OA, vtDR) | High (e.g., above 96%) | 96.8% (Cohort P2) |
| Inter-operator Reproducibility (OA, mtmDR) | High (e.g., above 90%) | 90.3% (Cohort P1) |
| Inter-operator Reproducibility (OA, vtDR) | High (e.g., above 96%) | 96.8% (Cohort P1) |
Note: The document does not explicitly state numerical acceptance criteria thresholds. The reported performance values are the actual outcomes from the clinical study, which are implicitly considered acceptable for substantial equivalence based on the FDA's clearance. The "worst case" reported here refers to the lowest performance observed across the different cohorts for each metric. Confidence intervals are provided in the tables and are generally tight, indicating reliability.
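The document does not state which interval method was used for those confidence intervals. As an illustration only, a Wilson score interval (a common choice for screening sensitivity/specificity proportions) can be computed as follows; the counts below are purely hypothetical, not taken from the submission:

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)


# Hypothetical example: a sensitivity of 92.9% observed as 209 of 225
# reference-positive eyes.
lo, hi = wilson_interval(209, 225)
```

Unlike the naive Wald interval, the Wilson interval stays inside [0, 1] and behaves sensibly when the observed proportion is near 100%, which is common for screening sensitivity.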
Study Details Proving Device Meets Acceptance Criteria
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Size:
- Clinical Efficacy Study: 655 participants (after exclusions from an initial 942 screened), comprising 1,290 eyes; analyses are eye-level, and since the total is slightly below two eyes per subject, not every participant contributed both eyes. These 655 participants were divided into two main cohorts:
- Sequential Enrollment: 235 subjects (45 in primary care, 190 in ophthalmology)
- Enrichment-Permitted Enrollment: 420 subjects (335 in primary care, 85 in ophthalmology)
- Precision (Repeatability/Reproducibility) Study: 62 subjects (31 subjects each at 2 US primary care sites), resulting in 186 pairs of images for repeatability analysis (Cohort P1) and 62 subjects for Cohort P2 (3 repeats each).
- Data Provenance: Prospective, multi-center pivotal clinical trial conducted across 11 US study sites (primary care centers and general ophthalmology centers).
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Number of Experts: At least 2 independent graders and an additional adjudication grader (more experienced) were used for each subject's images.
- Qualifications of Experts: Experienced and certified graders at the Fundus Photography Reading Center (FPRC), certified to grade according to the Early Treatment Diabetic Retinopathy Study (ETDRS) severity scale. Specific experience levels (e.g., "10 years of experience") are not detailed beyond "experienced and certified."
4. Adjudication Method for the Test Set
- Method: 2+1 adjudication. Each subject's images were independently graded by 2 experienced and certified graders. In case of significant differences (determined using prespecified significance levels) between the two independent gradings, a more experienced adjudication grader graded the same images to establish the final ground truth.
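The 2+1 flow can be sketched as a small control function. The grade representation and the disagreement test below are placeholders; the actual prespecified significance levels are not given in the document.

```python
def adjudicate(grade_a, grade_b, adjudicator, disagree):
    """Return the final ground-truth grade under a 2+1 adjudication scheme.

    grade_a, grade_b : independent grades from two certified graders
    adjudicator      : callable returning the senior grader's grade
    disagree         : callable implementing the prespecified disagreement test
    """
    if not disagree(grade_a, grade_b):
        # Graders agree within the prespecified tolerance; either grade stands.
        return grade_a
    # Significant disagreement: the more experienced grader decides.
    return adjudicator()
```

For example, with exact-match as the (hypothetical) disagreement test, `adjudicate(35, 53, senior_grader, lambda a, b: a != b)` would defer to the senior grader, while matching grades would be accepted without adjudication.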
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done? No, a comparative effectiveness study evaluating how much human readers improve with AI vs. without AI assistance was not reported in this document. The study primarily evaluated the standalone performance of the EyeArt device against expert human grading (FPRC reference standard).
6. Standalone (i.e., algorithm only without human-in-the-loop performance)
- Was it done? Yes, the entire clinical testing section (Section D) describes the standalone performance of the EyeArt algorithm. The EyeArt results (positive, negative, or ungradable for mtmDR and vtDR) for each eye were compared directly to the clinical reference standard established by FPRC grading.
7. Type of Ground Truth Used
- Type: Expert consensus grading (adjudicated) from the Fundus Photography Reading Center (FPRC). This grading was based on dilated 4-widefield stereo fundus imaging and applied the Early Treatment Diabetic Retinopathy Study (ETDRS) severity scale.
- mtmDR Ground Truth: Positive if ETDRS level was 35 or greater (but not equal to 90) or clinically significant macular edema (CSME) grade was CSME present. Negative if ETDRS levels were 10-20 and CSME grade was CSME absent.
- vtDR Ground Truth: Positive if ETDRS level was 53 or greater (but not equal to 90) or CSME grade was CSME present. Negative if ETDRS levels were 10-47 and CSME grade was CSME absent.
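The two reference-standard rules above can be expressed directly as functions. This is a sketch only: the function names are illustrative, and the `None` branch covers eyes falling outside both the positive and negative definitions (e.g., ETDRS level 90), which the document does not say how to label.

```python
def mtmdr_ground_truth(etdrs: int, csme_present: bool):
    """mtmDR reference standard: ETDRS >= 35 (but not 90) and/or CSME present."""
    if (etdrs >= 35 and etdrs != 90) or csme_present:
        return True          # mtmDR positive
    if 10 <= etdrs <= 20 and not csme_present:
        return False         # mtmDR negative
    return None              # neither definition applies


def vtdr_ground_truth(etdrs: int, csme_present: bool):
    """vtDR reference standard: ETDRS >= 53 (but not 90) and/or CSME present."""
    if (etdrs >= 53 and etdrs != 90) or csme_present:
        return True          # vtDR positive
    if 10 <= etdrs <= 47 and not csme_present:
        return False         # vtDR negative
    return None
```

As with the device outputs, CSME alone makes an eye positive for both endpoints, so the vtDR-positive set is a subset of the mtmDR-positive set.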
8. Sample Size for the Training Set
- The document does not specify the sample size for the training set. It mentions that the EyeArt Analysis Computation Engine uses "an ensemble of clinically aligned machine learning (deep learning) algorithms" but provides no details on their training data.
9. How the Ground Truth for the Training Set Was Established
- The document does not specify how the ground truth for the training set was established. While the "clinically aligned framework" is mentioned, the specific methodology for annotating or establishing ground truth for the training data is not detailed in this submission summary.
§ 886.1100 Retinal diagnostic software device.
(a) Identification. A retinal diagnostic software device is a prescription software device that incorporates an adaptive algorithm to evaluate ophthalmic images for diagnostic screening to identify retinal diseases or conditions.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Software verification and validation documentation, based on a comprehensive hazard analysis, must fulfill the following:
(i) Software documentation must provide a full characterization of technical parameters of the software, including algorithm(s).
(ii) Software documentation must describe the expected impact of applicable image acquisition hardware characteristics on performance and associated minimum specifications.
(iii) Software documentation must include a cybersecurity vulnerability and management process to assure software functionality.
(iv) Software documentation must include mitigation measures to manage failure of any subsystem components with respect to incorrect patient reports and operator failures.
(2) Clinical performance data supporting the indications for use must be provided, including the following:
(i) Clinical performance testing must evaluate sensitivity, specificity, positive predictive value, and negative predictive value for each endpoint reported for the indicated disease or condition across the range of available device outcomes.
(ii) Clinical performance testing must evaluate performance under anticipated conditions of use.
(iii) Statistical methods must include the following:
(A) Where multiple samples from the same patient are used, statistical analysis must not assume statistical independence without adequate justification.
(B) Statistical analysis must provide confidence intervals for each performance metric.
(iv) Clinical data must evaluate the variability in output performance due to both the user and the image acquisition device used.
(3) A training program with instructions on how to acquire and process quality images must be provided.
(4) Human factors validation testing that evaluates the effect of the training program on user performance must be provided.
(5) A protocol must be developed that describes the level of change in device technical specifications that could significantly affect the safety or effectiveness of the device.
(6) Labeling must include:
(i) Instructions for use, including a description of how to obtain quality images and how device performance is affected by user interaction and user training;
(ii) The type of imaging data used, what the device outputs to the user, and whether the output is qualitative or quantitative;
(iii) Warnings regarding image acquisition factors that affect image quality;
(iv) Warnings regarding interpretation of the provided outcomes, including:
(A) A warning that the device is not to be used to screen for the presence of diseases or conditions beyond its indicated uses;
(B) A warning that the device provides a screening diagnosis only and that it is critical that the patient be advised to receive follow-up care; and
(C) A warning that the device does not treat the screened disease;
(v) A summary of the clinical performance of the device for each output, with confidence intervals; and
(vi) A summary of the clinical performance testing conducted with the device, including a description of the patient population and clinical environment under which it was evaluated.
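The statistical special controls in (2)(iii) above, namely no unjustified independence assumption across multiple samples from the same patient and a confidence interval for every metric, are often addressed in practice with a cluster bootstrap that resamples patients rather than eyes. A hedged sketch under that assumption; the data layout, function name, and counts are all hypothetical:

```python
import random


def patient_bootstrap_ci(patients, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap CI for eye-level sensitivity, resampling by PATIENT so the
    two eyes of one patient are never treated as independent.

    patients: list of lists; each inner list holds per-eye (truth, pred) pairs.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample whole patients (clusters) with replacement.
        sample = [patients[rng.randrange(len(patients))] for _ in patients]
        eyes = [pair for patient in sample for pair in patient]
        tp = sum(1 for truth, pred in eyes if truth and pred)
        pos = sum(1 for truth, _ in eyes if truth)
        if pos:
            stats.append(tp / pos)
    stats.sort()
    k = len(stats)
    # Percentile interval at the requested alpha level.
    return stats[int(k * alpha / 2)], stats[min(k - 1, int(k * (1 - alpha / 2)))]
```

Because whole patients are resampled, the interval correctly widens relative to a naive eye-level bootstrap when the two eyes of a patient tend to agree.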