(275 days)
Aurora is a Software as a Medical Device (SaMD) that establishes sleep quality. Aurora automatically analyzes, displays, and summarizes Photoplethysmogram (PPG) data collected during sleep using compatible devices. Aurora is intended for use by and by order of a healthcare professional to aid in the diagnosis of sleep disorders including sleep apnea in adults.
The Aurora output, including automatically detected respiratory events and parameters, may be displayed and edited by a qualified healthcare professional. The Aurora output is not intended to be interpreted or clinical action taken without consultation of a qualified healthcare professional.
Aurora is not intended for use with polysomnography devices.
Aurora is a Class II Software as a Medical Device (SaMD), intended to aid in the evaluation of sleep disorders, where it may inform or drive clinical management. Aurora is a software application that is indicated for use on a general-purpose computing platform. It is a cloud-based software-as-a-medical-device (SaMD) with a user interface that runs in a web browser.
Aurora automatically analyzes and displays photoplethysmography (PPG) signal data, including SpO2 and pulse/heart rate, only from compatible FDA-cleared medical-purpose pulse oximeters that meet Aurora's data acquisition requirements for sampling rate, digital resolution, measurement range, and accuracy range.
Following upload of a compatible PPG study to the cloud software, the algorithm verifies minimum signal quality, study length, and technical adequacy requirements; preprocesses the data, including normalization, digital filtering, and artifact detection/rejection; and applies machine learning algorithms, including multiple deep neural network models, together with statistical signal processing analyses (time-domain and time-frequency-domain analyses over multiple time and resolution scales) and other analyses. These steps output a detected set of events and derived signals for the PPG study, which are post-processed and logically filtered according to algorithm rules based on the American Academy of Sleep Medicine (AASM) recommended event scoring, desaturation, and association rules. Aurora algorithm outputs, including scored respiratory events, sleep stages, Aurora Apnea-Hypopnea Index (eAHI), Total Sleep Time (TST), Sleep Efficiency (SE), Sleep Latency (SL), Wake After Sleep Onset (WASO), and Oxygen Desaturation Events Index (ODI) measures, are stored and made available for display, editing, and review in Aurora by qualified healthcare professionals.
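The summary does not give the formulas Aurora uses for its index outputs. As a rough illustration only, the sketch below applies the conventional sleep-medicine definitions: events per hour of sleep for an apnea-hypopnea or desaturation index, and sleep time divided by time in bed for sleep efficiency. The `SleepSummary` type, its field names, and the choice of total sleep time as the denominator are assumptions, not details taken from the Aurora documentation.

```python
from dataclasses import dataclass


@dataclass
class SleepSummary:
    """Hypothetical per-study totals; field names are illustrative."""
    time_in_bed_min: float        # recording period intended for sleep
    total_sleep_time_min: float   # TST
    respiratory_events: int       # scored apneas plus hypopneas
    desaturation_events: int      # scored oxygen desaturation events


def events_per_hour(count: int, sleep_minutes: float) -> float:
    """Convert an event count into a per-hour-of-sleep index."""
    if sleep_minutes <= 0:
        raise ValueError("sleep time must be positive")
    return count / (sleep_minutes / 60.0)


def summarize(s: SleepSummary) -> dict:
    """Derive index-style outputs using the conventional definitions."""
    if s.time_in_bed_min <= 0:
        raise ValueError("time in bed must be positive")
    return {
        "AHI": events_per_hour(s.respiratory_events, s.total_sleep_time_min),
        "ODI": events_per_hour(s.desaturation_events, s.total_sleep_time_min),
        "SE": s.total_sleep_time_min / s.time_in_bed_min,  # as a fraction
    }


example = summarize(SleepSummary(480, 420, 35, 28))
```

With 7 hours of sleep in an 8-hour recording, 35 respiratory events give an index of 5 events per hour, which is the cutoff used in the performance evaluation of eAHI.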
Aurora reports results of the automated data analysis based on AASM guidelines, including the Aurora output Apnea-Hypopnea Index (eAHI) and total sleep time (TST). The algorithm outputs are graphical and numerical displays and reports of sleep latency, sleep quality, and sleep pathologies including sleep disordered breathing. The Aurora displays and reports are for the order of physicians, trained technicians, or other healthcare professionals to evaluate sleep disorders where it may inform or drive clinical management taking into consideration other factors that normally are considered for clinical management of sleep disorders for adults.
The clinician can view raw data for interpretation, edit events, write clinical notes, and customize sleep reports for the patient.
Aurora output is not intended to be interpreted or clinical action taken without consultation of a qualified healthcare professional.
The document provides detailed information about the performance evaluation of the Aurora device, a Software as a Medical Device (SaMD) intended to aid in the diagnosis of sleep disorders.
Here's a breakdown of the acceptance criteria and the study that proves the device meets them:
1. Acceptance Criteria and Reported Device Performance
The acceptance criteria for Aurora are implied by the performance metrics reported, which demonstrate its accuracy in estimating the Apnea-Hypopnea Index (eAHI) and performing sleep staging against polysomnography (PSG) ground truth. While explicit numerical "acceptance criteria" tables are not provided, the reported sensitivity, specificity, and regression/Bland-Altman statistics serve as the evidence of meeting performance expectations for substantial equivalence.
Table of Performance Data (Implied Acceptance Criteria)
Metric | Acceptance Criteria (Implied) | Reported Device Performance (Aurora) |
---|---|---|
Apnea Hypopnea Index (eAHI) - 3% Desaturation | High Sensitivity and Specificity at AHI >= 5 cutoff, comparable to predicate. | Sensitivity: 92.6% (87.2%, 97.2%); Specificity: 71.6% (59.2%, 83.7%) |
Apnea Hypopnea Index (eAHI) - 4% Desaturation | High Sensitivity and Specificity at AHI >= 5 cutoff, comparable to predicate. | Sensitivity: 89.4% (81.6%, 96.1%); Specificity: 76.8% (67.1%, 85.4%) |
Sleep Staging - Wake | High Sensitivity and Specificity for Wake epoch detection. | Sensitivity: 86.7% (86.5%, 87.0%); Specificity: 93.5% (93.4%, 93.7%) |
Sleep Staging - Light NREM | High Sensitivity and Specificity for Light NREM epoch detection. | Sensitivity: 80.9% (80.6%, 81.2%); Specificity: 85.5% (85.2%, 85.7%) |
Sleep Staging - Deep NREM | Reasonably high Sensitivity and Specificity for Deep NREM epoch detection, balancing known challenges in this stage. | Sensitivity: 63.4% (62.4%, 64.3%); Specificity: 95.9% (95.7%, 96.0%) |
Sleep Staging - REM | High Sensitivity and Specificity for REM epoch detection. | Sensitivity: 83.6% (83.0%, 84.2%); Specificity: 97.5% (97.4%, 97.5%) |
Sleep Profile & Oxygen Saturation Accuracy (eAHI 3%) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 0.936 (0.853, 1.033), Intercept: 0.023 (-1.185, 1.122); Bland-Altman: Mean Difference: 1.000 (0.630, 1.367), ULOA: 14.575 (13.779, 15.363), LLOA: -12.574 (-13.371, -11.786) |
Sleep Profile & Oxygen Saturation Accuracy (eAHI 4%) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 0.982 (0.903, 1.130), Intercept: 1.219 (0.116, 1.985); Bland-Altman: Mean Difference: -1.039 (-1.326, -0.749), ULOA: 9.307 (8.692, 9.931), LLOA: -11.386 (-12.001, -10.763) |
Sleep Profile & Oxygen Saturation Accuracy (TST) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 1.159 (1.035, 1.318), Intercept: -0.695 (-1.576, -0.005); Bland-Altman: Mean Difference: -0.093 (-0.132, -0.059), ULOA: 1.145 (1.060, 1.216), LLOA: -1.330 (-1.414, -1.259) |
Sleep Profile & Oxygen Saturation Accuracy (SE) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 1.154 (1.031, 1.317), Intercept: -0.088 (-0.205, 0.003); Bland-Altman: Mean Difference: -0.011 (-0.017, -0.007), ULOA: 0.163 (0.151, 0.173), LLOA: -0.185 (-0.198, -0.176) |
Sleep Profile & Oxygen Saturation Accuracy (SL) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 1.114 (0.997, 1.290), Intercept: -0.023 (-0.185, 0.090); Bland-Altman: Mean Difference: -0.129 (-0.154, -0.089), ULOA: 0.884 (0.831, 0.970), LLOA: -1.143 (-1.196, -1.057) |
Sleep Profile & Oxygen Saturation Accuracy (WASO) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 1.073 (0.938, 1.219), Intercept: -0.271 (-0.436, -0.121); Bland-Altman: Mean Difference: 0.167 (0.140, 0.196), ULOA: 1.131 (1.073, 1.193), LLOA: -0.797 (-0.855, -0.735) |
Sleep Profile & Oxygen Saturation Accuracy (ODI) | Deming Regression slope near 1, intercept near 0; Bland-Altman Mean Difference near 0, narrow limits. | Deming Regression: Slope: 0.962 (0.896, 1.056), Intercept: 1.667 (0.330, 2.847); Bland-Altman: Mean Difference: -1.046 (-1.417, -0.677), ULOA: 13.223 (12.426, 14.015), LLOA: -15.315 (-16.111, -14.522) |
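
For readers unfamiliar with the agreement statistics in the table, the following is a minimal sketch of how a Deming regression line and Bland-Altman limits of agreement are commonly computed from paired device and reference values. It is not the analysis code used in the submission: the function names, the error-variance ratio `delta=1.0`, and the 1.96 multiplier for the limits are assumptions, and the sketch does not reproduce the confidence intervals shown in parentheses, whose method the document does not state.

```python
import math
import statistics


def deming_regression(x, y, delta=1.0):
    """Fit y = intercept + slope * x allowing measurement error in both x and y.

    delta is the assumed ratio of the y-error variance to the x-error
    variance (1.0 gives orthogonal regression).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept


def bland_altman(device, reference):
    """Return (mean difference, lower limit, upper limit) of agreement."""
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Toy paired values (not study data): reference on x, device on y.
slope, intercept = deming_regression([1, 2, 3, 4], [3, 5, 7, 9])
bias, lloa, uloa = bland_altman([1, 2, 3], [0, 2, 4])
```

A slope near 1 and an intercept near 0 indicate that the device tracks the reference across its range, while the Bland-Altman mean difference captures systematic bias and the limits (LLOA, ULOA) bound the spread of individual disagreements.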
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size:
  - For eAHI performance (sensitivity/specificity): 158 adult patients.
  - For Sleep Staging:
    - Wake: 52,622 epochs
    - Light NREM: 69,438 epochs
    - Deep NREM: 10,195 epochs
    - REM: 14,459 epochs
- Data Provenance: The document does not explicitly state the country of origin but implies clinical settings where PSG (Polysomnography) and HSAT (Home Sleep Apnea Test) recordings are collected. The study involved simultaneous PSG and HSAT recordings, suggesting a prospective collection of data for testing purposes to facilitate direct comparison.
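
The sensitivity and specificity figures are reported per patient for eAHI (at the AHI >= 5 cutoff) and per epoch for sleep staging. The sketch below shows one common way to compute both from paired reference and device values: a threshold comparison for the index, and a one-vs-rest comparison for each sleep stage. The function names, stage labels, and toy values are illustrative assumptions; the document does not describe the exact computation used.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) from paired boolean labels."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = fn = tn = fp = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1
        elif t and not p:
            fn += 1
        elif not t and not p:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def at_cutoff(reference_ahi, device_ahi, cutoff=5.0):
    """Per-patient classification: index at or above the cutoff is positive."""
    return sensitivity_specificity(
        [r >= cutoff for r in reference_ahi],
        [d >= cutoff for d in device_ahi],
    )


def one_vs_rest(reference_stages, device_stages, stage):
    """Per-epoch classification: the target stage against all other stages."""
    return sensitivity_specificity(
        [r == stage for r in reference_stages],
        [d == stage for d in device_stages],
    )


# Toy values only (not study data).
ahi_result = at_cutoff([2.0, 6.0, 12.0, 4.0, 30.0], [3.0, 7.5, 4.0, 6.0, 25.0])
wake_result = one_vs_rest(
    ["W", "L", "L", "D", "R", "W"],
    ["W", "L", "D", "D", "R", "L"],
    "W",
)
```

Because the staging metrics are computed over tens of thousands of epochs rather than 158 patients, their confidence intervals in the table are much narrower than those for eAHI.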
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Three registered polysomnographic technologists were used for manual scoring, and one board-certified sleep physician reviewed each PSG.
- Qualifications of Experts:
  - Scorers: Registered polysomnographic technologists.
  - Reviewer/Confirmer: Board-certified sleep physician.
4. Adjudication Method for the Test Set
- Adjudication Method: A 2+1 consensus method. For an event to be officially scored or reported, at least two of the three scorers had to agree on it. In addition, each PSG was reviewed by a board-certified sleep physician to provide clinical confirmation of scoring and technical adequacy, serving as a final adjudication layer.
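
The two-of-three agreement rule can be sketched as a simple majority vote. The example assumes the three scorers' labels are already aligned item by item (for instance, one label per epoch); how events were matched across scorers is not described in the document, and the physician review step is not modeled. The function names and labels are hypothetical.

```python
from collections import Counter


def consensus_label(labels, min_agree=2):
    """Return the label that at least min_agree scorers chose, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


def consensus_scoring(scorer_a, scorer_b, scorer_c):
    """Apply the two-of-three rule across aligned label sequences."""
    if not (len(scorer_a) == len(scorer_b) == len(scorer_c)):
        raise ValueError("all scorers must label the same number of items")
    return [consensus_label(trio) for trio in zip(scorer_a, scorer_b, scorer_c)]


# Toy labels only: the third item has no two-of-three agreement.
result = consensus_scoring(
    ["N2", "REM", "W"],
    ["N2", "REM", "N1"],
    ["N3", "REM", "REM"],
)
```

Items that come back as `None` have no majority and would need a separate resolution step, which this sketch leaves open.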
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- The document does not indicate that a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done to assess how much human readers improve with AI vs. without AI assistance. The study focuses on the standalone performance of the Aurora algorithm against expert-scored ground truth. The device output may be displayed and edited by a qualified healthcare professional, suggesting a human-in-the-loop workflow, but the study described does not quantify the effect of AI assistance on human reader performance.
6. Standalone Performance Study
- Yes, a standalone performance study was done. The reported sensitivity, specificity, Deming regression, and Bland-Altman analyses directly evaluate the algorithm's performance (Aurora) against the expert-scored PSG as ground truth, without a human in the loop for the performance metrics themselves.
7. Type of Ground Truth Used
- The ground truth was expert consensus from manual scoring of polysomnography (PSG) data. PSG recordings were manually scored by three registered polysomnographic technologists under scoring guidelines that follow the 3% desaturation guidance, and the scoring was then reviewed and confirmed by a board-certified sleep physician.
8. Sample Size for the Training Set
- The document does not specify the sample size for the training set. The provided details pertain exclusively to the test set used for performance validation.
9. How the Ground Truth for the Training Set Was Established
- The document does not specify how the ground truth for the training set was established. Information regarding the training data, its collection, or annotation methods is not included in this summary.
§ 868.2375 Breathing frequency monitor.
(a) Identification. A breathing (ventilatory) frequency monitor is a device intended to measure or monitor a patient's respiratory rate. The device may provide an audible or visible alarm when the respiratory rate, averaged over time, is outside operator settable alarm limits. This device does not include the apnea monitor classified in § 868.2377.
(b) Classification. Class II (performance standards).