(254 days)
DeepRESP is an aid in the diagnosis of various sleep disorders where subjects are often evaluated during the initiation or follow-up of treatment of various sleep disorders. The recordings to be analyzed by DeepRESP can be performed in a hospital, patient home, or an ambulatory setting. It is indicated for use with adults (22 years and above) in a clinical environment by or on the order of a medical professional.
DeepRESP is intended to mark sleep study signals to aid in the identification of events and annotation of traces; automatically calculate measures obtained from recorded signals (e.g., magnitude, time, frequency, and statistical measures of marked events); infer sleep staging with arousals with EEG and in the absence of EEG. All output is subject to verification by a medical professional.
DeepRESP is a cloud-based software as a medical device (SaMD), designed to perform analysis of sleep study recordings, with and without EEG signals, providing data for the assessment and diagnosis of sleep-related disorders. Its algorithmic framework provides the derivation of sleep staging including arousals, scoring of respiratory events and key parameters such as the Apnea-Hypopnea Index (AHI).
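As an illustration of the AHI parameter mentioned above, here is a minimal sketch using the standard definition of the index: the number of apneas and hypopneas per hour of sleep. The function names and example counts are hypothetical, and this is not DeepRESP's implementation. The thresholds of 5 and 15 events per hour are the severity cut-offs used in the performance table of this summary.

```python
def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int,
                         total_sleep_time_hours: float) -> float:
    """Return apneas plus hypopneas per hour of sleep (standard AHI definition)."""
    if total_sleep_time_hours <= 0:
        raise ValueError("total sleep time must be positive")
    return (n_apneas + n_hypopneas) / total_sleep_time_hours


def meets_threshold(ahi: float, threshold: float) -> bool:
    """Binary severity classification, e.g. AHI >= 5 or AHI >= 15."""
    return ahi >= threshold


# Hypothetical recording: 30 apneas and 60 hypopneas over 6 hours of sleep.
ahi = apnea_hypopnea_index(30, 60, 6.0)
print(ahi, meets_threshold(ahi, 5), meets_threshold(ahi, 15))
```

For recordings without EEG, the sleep-time denominator would have to come from an estimate of sleep rather than EEG-based staging. How DeepRESP derives it is not detailed here.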
DeepRESP is hosted on a serverless stack. It consists of:
- A web Application Programming Interface (API) intended to interface with a third-party client application, allowing medical professionals to access DeepRESP's analytical capabilities.
- Predefined sequences called Protocols that run data analyses, including artificial intelligence and rule-based models for the scoring of sleep studies, and a parameter calculation service.
- A Result storage using an object storage service to temporarily store outputs from the DeepRESP Protocols.
Here's a breakdown of the acceptance criteria and the study details for the DeepRESP device, based on the provided FDA 510(k) summary:
1. Table of Acceptance Criteria & Reported Device Performance:
The document doesn't explicitly state "acceptance criteria" as a separate table, but it compares DeepRESP's performance against manual scoring and predicate devices. I've extracted the performance metrics that effectively serve as acceptance criteria given the "non-inferiority" and "superiority" claims against established devices.
Metric (Against Manual Scoring) | DeepRESP Performance (95% CI) | Equivalent Predicate Performance (Nox Sleep System K192469) (95% CI) | Superiority/Non-inferiority Claim | Relevant Study Type |
---|---|---|---|---|
Severity Classification (AHI ≥ 5) | | | | |
PPA% | 87.5 [86.2, 89.0] | 73.6 [PPA% reported for predicate] | Superiority | Type I/II |
NPA% | 91.9 [87.4, 95.8] | 65.8 [NPA% reported for predicate] | Non-inferiority | Type I/II |
OPA% | 87.9 [86.6, 89.3] | 73.0 [OPA% reported for predicate] | Superiority | Type I/II |
Severity Classification (AHI ≥ 15) | | | | |
PPA% | 74.1 [72.0, 76.5] | 54.5 [PPA% reported for predicate] | Superiority | Type I/II |
NPA% | 94.7 [93.2, 96.2] | 89.8 [NPA% reported for predicate] | Non-inferiority | Type I/II |
OPA% | 81.5 [79.9, 83.3] | 67.2 [OPA% reported for predicate] | Superiority | Type I/II |
Respiratory Events | | | | |
PPA% | 72.0 [70.9, 73.2] | 58.5 [PPA% reported for predicate] | Non-inferiority (Superiority for OPA claimed) | Type I/II |
NPA% | 94.2 [94.0, 94.5] | 95.4 [NPA% reported for predicate] | Non-inferiority | Type I/II |
OPA% | 87.2 [86.8, 87.5] | 81.7 [OPA% reported for predicate] | Superiority | Type I/II |
Sleep State Estimation (Wake) | | | | |
PPA% | 95.4 [95.1, 95.6] | 56.7 [PPA% reported for predicate] | Non-inferiority | Type I/II |
NPA% | 94.6 [94.4, 94.9] | 98.1 [NPA% reported for predicate] | Non-inferiority | Type I/II |
OPA% | 94.8 [94.6, 95.0] | 89.8 [OPA% reported for predicate] | Non-inferiority | Type I/II |
Arousal Events | | | | |
ArI ICC (against Sleepware G3 K202142) | 0.63 [ArI ICC] | 0.794 [ArI ICC for additional predicate] | Non-inferiority | Type I/II |
PPA% | 62.2 [61.2, 63.1] | N/A (Manual for primary predicate) | N/A | Type I/II |
NPA% | 89.3 [88.8, 89.7] | N/A (Manual for primary predicate) | N/A | Type I/II |
OPA% | 81.4 [81.1, 81.7] | N/A (Manual for primary predicate) | N/A | Type I/II |
Type III Severity Classification (AHI ≥ 5) | | | | |
PPA% | 93.1 [92.2, 93.9] | 82.4 [PPA% reported for predicate] | Superiority | Type III |
NPA% | 81.1 [75.1, 86.6] | 56.6 [NPA% reported for predicate] | Non-inferiority | Type III |
OPA% | 92.5 [91.7, 93.3] | 81.1 [OPA% reported for predicate] | Non-inferiority | Type III |
Type III Respiratory Events | | | | |
PPA% | 75.4 [74.6, 76.1] | 58.5 [PPA% reported for predicate] | Superiority | Type III |
NPA% | 87.8 [87.4, 88.1] | 95.4 [NPA% reported for predicate] | Non-inferiority | Type III |
OPA% | 83.7 [83.4, 84.0] | 81.7 [OPA% reported for predicate] | Superiority | Type III |
Type III Arousal Events | | | | |
ArI ICC (against Sleepware G3 K202142) | 0.76 [ArI ICC] | 0.73 [ArI ICC for additional predicate] | Non-inferiority | Type III |
2. Sample Size Used for the Test Set and Data Provenance:
- Type I/II Studies (EEG present): 2,224 sleep recordings
- Type III Studies (No EEG): 3,488 sleep recordings (including 2,213 Type I recordings and 1,275 Type II recordings, processed to utilize only Type III relevant signals).
- Provenance: Retrospective study. Data originated from sleep clinics in the United States, collected as part of routine clinical work for patients suspected of sleep disorders. The patient population showed diversity in age, BMI, and race/ethnicity (Caucasian or White, Black or African American, Other, Not Reported) and was considered representative of patients seeking medical services for sleep disorders in the United States.
3. Number of Experts and Qualifications for Ground Truth:
The document explicitly states that the studies used "manually scored sleep recordings" but does not specify the number of experts or their specific qualifications (e.g., "radiologist with 10 years of experience"). It implicitly relies on the quality of "manual scoring" from routine clinical work in US sleep clinics as the ground truth.
4. Adjudication Method for the Test Set:
The document does not describe any specific adjudication method (e.g., 2+1, 3+1). It refers to "manual scoring" as the established ground truth.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
No, an MRMC comparative effectiveness study was not reported. The study design was a retrospective data analysis comparing the algorithm's performance against existing manual scoring (ground truth) and established predicate devices. The document gives no information on whether human readers perform better with AI assistance than without it. The device is intended to provide automatic scoring subject to verification by a medical professional.
6. Standalone (Algorithm Only) Performance:
Yes, the study report describes the standalone performance of the DeepRESP algorithm. The reported PPA, NPA, OPA percentages, and ICC values represent the agreement of the automated scoring by DeepRESP compared to the manual ground truth. The device produces output "subject to verification by a medical professional," but the performance metrics provided are for the algorithmic output itself.
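To make the agreement metrics concrete, the sketch below computes positive, negative, and overall percent agreement (PPA, NPA, OPA) from paired binary labels, treating manual scoring as the reference. These are the standard definitions. The summary does not describe the unit of comparison (for example, per epoch or per recording) or how the confidence intervals were obtained, so the function name, the `Agreement` type, and the ten-item example are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Agreement:
    ppa: float  # positive percent agreement
    npa: float  # negative percent agreement
    opa: float  # overall percent agreement


def percent_agreement(manual: list[bool], automatic: list[bool]) -> Agreement:
    """Compare automatic labels against manual (reference) labels."""
    if len(manual) != len(automatic) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(manual, automatic))
    tp = sum(1 for m, a in pairs if m and a)          # both positive
    tn = sum(1 for m, a in pairs if not m and not a)  # both negative
    fp = sum(1 for m, a in pairs if not m and a)      # automatic only
    fn = sum(1 for m, a in pairs if m and not a)      # manual only
    ppa = 100 * tp / (tp + fn) if (tp + fn) else math.nan
    npa = 100 * tn / (tn + fp) if (tn + fp) else math.nan
    opa = 100 * (tp + tn) / len(pairs)
    return Agreement(ppa, npa, opa)


# Hypothetical labels for ten scored units (True = event present).
manual = [True, True, True, True, False, False, False, False, False, False]
automatic = [True, True, True, False, False, False, False, False, False, True]
result = percent_agreement(manual, automatic)
print(result)
```

In this example the automatic scoring matches three of four manual positives and five of six manual negatives, so PPA is 75%, NPA is about 83.3%, and OPA is 80%.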
7. Type of Ground Truth Used:
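The ground truth was manual scoring. Because the document specifies neither the number of scorers nor an adjudication method, it is unclear whether this scoring reflects a consensus among multiple experts. The document states "It used manually scored sleep recordings... The studies were done by evaluating the agreement in scoring and clinical indices resulting from the automatic scoring by DeepRESP compared to manual scoring."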
8. Sample Size for the Training Set:
The document does not explicitly state the sample size used for the training set. The clinical validation study is described as a "retrospective study" used for validation, but details about the training data are not provided in this summary.
9. How the Ground Truth for the Training Set Was Established:
The document does not specify how the ground truth for the training set was established. It only describes the ground truth for the validation sets as "manually scored sleep recordings" from routine clinical work.
§ 882.1400 Electroencephalograph.
(a) Identification. An electroencephalograph is a device used to measure and record the electrical activity of the patient's brain obtained by placing two or more electrodes on the head.
(b) Classification. Class II (performance standards).