Found 3 results

510(k) Data Aggregation

    K Number
    K242094
    Device Name
    Dreem 3S
    Date Cleared
    2024-11-22

    (128 days)

    Product Code
    Regulation Number
    882.1400
    Reference & Predicate Devices
    Applicant Name (Manufacturer):

    Beacon Biosignals, Inc.

    AI/ML · SaMD · IVD (In Vitro Diagnostic) · Therapeutic · Diagnostic · is PCCP Authorized · Third-party · Expedited review
    Intended Use

    The Dreem 3S is intended for prescription use to measure, record, display, transmit and analyze the electrical activity of the brain to assess sleep and awake in the home or healthcare environment. The Dreem 3S can also output a hypnogram of sleep scoring by 30-second epoch and summary of sleep metrics derived from this hypnogram.

    The Dreem 3S is used for the assessment of sleep on adult individuals (22 to 65 years old). The Dreem 3S allows for the generation of user/predefined reports based on the subject's data.

    Device Description

    The Dreem 3S headband contains microelectronics, within a flexible case made of plastic, foam, and fabric. It includes 6 EEG electrodes and a 3D accelerometer sensor.

    The EEG signal is measured by two electrodes at the front of the head (frontal position) and two at the back of the head (occipital position), along with one reference electrode and one ground electrode.

    The 3D accelerometer is embedded in the top of the headband to ensure accurate measurements of the wearer's head movement during the night. The raw EEG and accelerometer data are transferred to Dreem's servers for further analysis after the night is over.

    The device includes a bone-conduction speaker with volume control to provide notifications to the wearer, and a power button circled by a multicolor LED light.

    The device generates a sleep report that includes a sleep staging for each 30-second epoch during the night. This output is produced using an algorithm that analyzes data from the headband EEG and accelerometer sensors. A raw data file is also available in EDF format.
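    As an illustration of how summary sleep metrics can be derived from such a 30-second-epoch hypnogram, here is a minimal Python sketch; the metric names and formulas (total sleep time, sleep efficiency, per-stage minutes) are standard sleep-medicine conventions, not the manufacturer's published implementation:

```python
# Hypothetical sketch: derive summary metrics from a hypnogram of
# 30-second epochs. Stage labels follow the AASM convention.
EPOCH_MIN = 0.5  # each epoch is 30 seconds = 0.5 minutes

def sleep_metrics(hypnogram):
    """Compute total sleep time, sleep efficiency, and per-stage minutes."""
    total = len(hypnogram) * EPOCH_MIN              # time in bed (minutes)
    sleep_epochs = [s for s in hypnogram if s != "W"]
    tst = len(sleep_epochs) * EPOCH_MIN             # total sleep time (minutes)
    efficiency = tst / total if total else 0.0
    per_stage = {s: hypnogram.count(s) * EPOCH_MIN for s in ("N1", "N2", "N3", "R")}
    return {"TST_min": tst, "efficiency": efficiency, **per_stage}
```

    A full-night recording is simply a longer list of the same labels; the raw signals themselves would live in the accompanying EDF file.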

    AI/ML Overview

    The provided text is a 510(k) summary for the Dreem 3S device. It does not contain a comprehensive study detailing acceptance criteria and device performance. Instead, it states that no new testing was performed because the current submission is primarily for the inclusion of a Predetermined Change Control Plan (PCCP). It relies on the performance characteristics previously reported for the predicate device (K223539).

    Therefore, I cannot provide a table of acceptance criteria with reported performance, or details about the sample sizes and ground truth for a new study, as none was conducted or reported in this document.

    However, based on the information for the predicate device, and the intent behind the PCCP, I can infer and summarize what would typically be expected for such a device and what the PCCP aims to maintain:

    Inferred Acceptance Criteria based on Predicate Device (K223539) and PCCP:
    The document states, "clinical performance validation will also be repeated, and will require that the performance of any modification to Dreem 3S to be non-inferior to the all previously released versions of the Dreem 3S device." This indicates that the primary acceptance criterion for any future algorithmic updates under the PCCP is non-inferiority to the performance established in the original clearance (K223539). While the specific metrics are not detailed in this current summary, for a sleep staging device, these would typically include accuracy metrics like Cohen's Kappa, Sensitivity, Specificity, and overall accuracy for differentiating sleep stages (Wake, NREM1, NREM2, NREM3, REM).
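    The non-inferiority criterion described above can be sketched numerically. The following Python snippet is illustrative only: the margin value, the normal-approximation confidence interval, and the function name are all assumptions, since the actual PCCP margins are derived from measured inter-scorer variability:

```python
import math

# Hedged sketch of a non-inferiority check on an agreement proportion.
# Non-inferiority holds when the lower bound of the CI on the difference
# (new minus reference) stays above the negative of the chosen margin.
def noninferior(agree_new, n_new, agree_ref, n_ref, margin, z=1.96):
    p_new, p_ref = agree_new / n_new, agree_ref / n_ref
    diff = p_new - p_ref
    # Normal-approximation standard error of a difference of proportions.
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    lower = diff - z * se
    return lower > -margin
```

    With this framing, a modified algorithm passes even if its point estimate dips slightly below the reference, as long as the plausible loss stays inside the margin.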

    Regarding Study Information (based on the original clearance of K223539, not detailed here):

    Since the provided document explicitly states, "No bench testing, animal testing, or clinical testing was performed to support this submission," I cannot fill in the details for a new study. The performance information relates to the predicate device (K223539).

    However, based on the Predetermined Change Control Plan (PCCP) section, which outlines how future algorithmic modifications will be validated, I can describe the methodology for future performance validation under that plan:


    Inferred Acceptance Criteria and Future Performance Validation Methodology (based on PCCP)

    1. Table of Acceptance Criteria and Reported Device Performance:

    Acceptance Criterion (Inferred from PCCP) | Reported Device Performance (From K223539; not detailed in this document)
    Non-inferiority of sleep staging performance to previously cleared versions | Specific performance metrics (e.g., Kappa, Accuracy, Sensitivity, Specificity for sleep stages) measured in K223539.
    Maintain performance across specific sleep stages (Wake, N1, N2, N3, REM) | Specific performance metrics for each stage from K223539.
    Robustness to signal preprocessing, ML model, and postprocessing updates | Performance maintained within non-inferiority margins after updates.

    Note: The actual numerical performance metrics for the predicate device (K223539) are not provided in this document. They would have been part of the original K223539 submission. The PCCP ensures that future algorithmic changes meet these same (or non-inferior) performance levels.
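    For reference, Cohen's Kappa, one of the chance-corrected agreement metrics mentioned above, can be computed as follows. This is a generic sketch, not code from either submission:

```python
from collections import Counter

# Cohen's kappa: observed agreement corrected for the agreement
# expected by chance given each rater's label frequencies.
def cohens_kappa(a, b):
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in ca) / (n * n)     # chance agreement
    return (po - pe) / (1 - pe)
```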

    2. Sample Size Used for the Test Set and Data Provenance:

    • For future updates under PCCP: The PCCP states, "Recordings that are used for any purpose (e.g., training, tuning, failure analysis, etc.) that might lead to direct or indirect insight regarding the performance of a modified sleep staging algorithm on this recording, other than execution of the clinical performance validation per the methods specified in the PCCP, are excluded from the test dataset." This implies that a new, independent test set will be used for each validation under the PCCP.
    • Sample Size: Not specified for future PCCP validations, but it is stated that "Quality checks will ensure that the test data are sufficiently high quality and representative of the intended use population."
    • Data Provenance: Not explicitly stated, but for sleep studies, typically involves polysomnography (PSG) data. The "human variability estimated from comparison of expert scoring from 284 American Academy of Sleep Medicine (AASM) compliant polysomnography recordings" suggests a U.S. or internationally recognized standard for data interpretation. The fact that the device assesses adult individuals (22 to 65 years old) means the test set would be composed of data from this age demographic. Retrospective or prospective is not specified, but typically retrospective datasets are used for initial clearances.

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications:

    • For future updates under PCCP: "Non-inferiority margins were selected based on the level of human variability estimated from comparison of expert scoring from 284 American Academy of Sleep Medicine (AASM) compliant polysomnography recordings." This strongly implies that the ground truth for validation (both for K223539 and subsequent PCCP validations) is expert consensus scoring based on AASM guidelines.
    • Number of Experts: Not explicitly stated, but "expert scoring" typically implies one or more certified sleep technologists or sleep physicians. The mention of "human variability" often means comparison between at least two independent expert scorings.
    • Qualifications: "American Academy of Sleep Medicine (AASM) compliant polysomnography recordings" strongly suggests that the experts would be board-certified sleep physicians or registered polysomnographic technologists (RPSGTs) with experience in AASM sleep staging. The number of years of experience is not specified.

    4. Adjudication Method for the Test Set:

    • Not explicitly defined in the provided text. However, for "expert scoring" and estimating "human variability," common adjudication methods include:
      • Consensus: Multiple experts independently score, and a final consensus is reached (e.g., by discussion or a third adjudicator if initial scores differ significantly).
      • Majority vote: If more than two experts, the majority decision prevails.
      • Pairwise agreement: Often used to quantify inter-rater variability for tasks like sleep staging.
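    A 2-of-3 majority consensus of the kind described above can be sketched as follows. This is a hypothetical illustration; the document does not specify the actual adjudication procedure:

```python
from collections import Counter

# Sketch of 2-of-3 majority adjudication for sleep-stage scoring.
def majority_consensus(epoch_scores):
    """Return the consensus stage for one 30-second epoch, or None
    when all scorers disagree (such epochs are typically excluded)."""
    stage, n = Counter(epoch_scores).most_common(1)[0]
    return stage if n >= 2 else None

def build_consensus(all_scores):
    """Adjudicate a night of epochs, counting those with no consensus."""
    consensus, excluded = [], 0
    for epoch in all_scores:
        stage = majority_consensus(epoch)
        if stage is None:
            excluded += 1
        else:
            consensus.append(stage)
    return consensus, excluded
```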

    5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study:

    • The document does not report on an MRMC comparative effectiveness study where human readers improve with AI vs. without AI assistance for this specific submission (K242094). This submission is for a PCCP and relies on the predicate's performance.

    6. Standalone (Algorithm Only) Performance Study:

    • Yes, the document implies that a standalone performance study was conducted for the predicate device (K223539). The algorithm "analyzes data from the headband EEG and accelerometer sensors" and "uses raw EEG data and accelerometer data to provide automatic sleep staging according to the AASM classification." The PCCP is about maintaining and improving this algorithm's standalone performance.
    • The "clinical performance validation will also be repeated, and will require that the performance of any modification to Dreem 3S to be non-inferior" to previous versions. This directly refers to the algorithm's standalone performance.

    7. Type of Ground Truth Used:

    • Expert Consensus: The phrase "automatic sleep staging according to the AASM classification" and "comparison of expert scoring from 284 American Academy of Sleep Medicine (AASM) compliant polysomnography recordings" strongly indicates that the ground truth is established by expert scoring conforming to AASM guidelines. This is the standard for sleep staging.

    8. Sample Size for the Training Set:

    • Not specified in this document. This refers to the original training data used for the predicate device (K223539). For future updates, the PCCP mentions "Retraining with an updated training/tuning dataset" but does not specify the size of these datasets.

    9. How the Ground Truth for the Training Set Was Established:

    • Not explicitly specified for the training set itself, but it is highly probable that the ground truth for the training set was established through expert consensus scoring according to AASM guidelines, similar to how the test set's ground truth is (or will be for PCCP updates) established. This is standard practice for supervised machine learning models in this domain.

    K Number
    K233438
    Device Name
    SleepStageML
    Date Cleared
    2024-03-08

    (147 days)

    Product Code
    Regulation Number
    882.1400
    Reference & Predicate Devices
    Applicant Name (Manufacturer):

    Beacon Biosignals, Inc.

    AI/ML · SaMD · IVD (In Vitro Diagnostic) · Therapeutic · Diagnostic · is PCCP Authorized · Third-party · Expedited review
    Intended Use

    SleepStageML is intended for assisting the diagnostic evaluation by a qualified clinician to assess sleep quality from level 1 polysomnography (PSG) recordings in a clinical environment in patients aged 18 and older.

    SleepStageML is a software-only medical device to be used to analyze physiological signals and automatically score sleep stages. All outputs are subject to review by a qualified clinician.

    Device Description

    SleepStageML is an Artificial Intelligence/Machine Learning (AI/ML)-enabled software-only medical device that analyzes polysomnography (PSG) recordings and automatically scores sleep stages. It is intended for assisting the diagnostic evaluation by a qualified clinician to assess sleep quality in patients aged 18 and older.

    Qualified clinicians (also referred to as clinical users), such as sleep physicians, sleep technicians, or registered PSG technologists (RPSGTs) who are qualified to review PSG studies, provide PSG recordings in European Data Format (EDF) through a secure file transfer system to Beacon Biosignals. SleepStageML automatically analyzes the provided PSG recording and returns an EDF file containing the original recording with software-generated sleep stage annotations (i.e., Wake (W), non-REM 1 (N1), non-REM 2 (N2), non-REM 3 (N3), and REM (R)) back to the clinical user. EDF files containing PSG signals as well as sleep stage annotations are referred to as EDF+. The returned EDF+ files can then be reviewed by the qualified clinicians via the users' PSG viewing software.

    The recordings processed by SleepStageML are level-1 PSG recordings obtained in an attended setting in accordance with American Academy of Sleep Medicine (AASM) recommendations with respect to minimum sampling rate, electroencephalography (EEG) channels, and EEG locations. SleepStageML only uses the EEG signals in the provided PSGs and does not consider electromyography (EMG) or electrooculography (EOG) signals when performing sleep staging. The sleep stage outputs of SleepStageML are intended to be comparable to sleep stages as defined by AASM guidelines. SleepStageML software outputs are subject to a qualified clinician's review.

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) summary for SleepStageML:

    Acceptance Criteria and Reported Device Performance

    Stage | Acceptance Criteria (Predicate Reference: Sleep Profiler, K153412, N=43 subjects) | Reported Device Performance (SleepStageML, N=100 subjects)
    Overall Agreement (OA)
    W | 89% | 96.1% (95% CI: 95.4%, 96.8%)
    N1 | 89% | 94.5% (95% CI: 93.7%, 95.2%)
    N2 | 81% | 87.1% (95% CI: 85.9%, 88.3%)
    N3 | 91% | 92.9% (95% CI: 91.8%, 93.8%)
    R | 95% | 97.3% (95% CI: 96.7%, 97.9%)
    Positive Agreement (PA)
    W | 73% | 88.9% (95% CI: 86.5%, 91.2%)
    N1 | 25% | 58.4% (95% CI: 54.2%, 62.4%)
    N2 | 77% | 79.8% (95% CI: 77.7%, 81.8%)
    N3 | 76% | 93.0% (95% CI: 89.8%, 95.7%)
    R | 74% | 93.1% (95% CI: 91.5%, 94.5%)
    Negative Agreement (NA)
    W | 94% | 98.5% (95% CI: 98.2%, 98.8%)
    N1 | 93% | 96.2% (95% CI: 95.4%, 96.9%)
    N2 | 84% | 94.2% (95% CI: 93.2%, 95.0%)
    N3 | 94% | 92.9% (95% CI: 91.7%, 93.9%)
    R | 97% | 98.0% (95% CI: 97.3%, 98.6%)
    Multi-stage Agreement | Not explicitly stated for predicate in a comparable way, but implied. | 84.02% (calculated from N=100 subjects; 86,983 total epochs, 2,289 with no consensus)
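    For clarity, the overall/positive/negative agreement figures reported in the table are typically defined one-vs-rest per stage against the expert consensus. The following is a generic Python sketch of those definitions, not the sponsor's code:

```python
# One-vs-rest agreement for a single stage: positive agreement is the
# sensitivity for that stage, negative agreement the specificity, and
# overall agreement the fraction of epochs classified consistently.
def stage_agreement(truth, pred, stage):
    pairs = list(zip(truth, pred))
    tp = sum(t == stage and p == stage for t, p in pairs)
    fn = sum(t == stage and p != stage for t, p in pairs)
    fp = sum(t != stage and p == stage for t, p in pairs)
    tn = sum(t != stage and p != stage for t, p in pairs)
    pa = tp / (tp + fn) if tp + fn else float("nan")  # positive agreement
    na = tn / (tn + fp) if tn + fp else float("nan")  # negative agreement
    oa = (tp + tn) / len(pairs)                       # overall agreement
    return pa, na, oa
```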

    Study Details:

    1. Sample sizes used for the test set and data provenance:

      • Test Set Sample Size: 100 patients.
      • Data Provenance: Retrospective pivotal validation study using previously collected clinical polysomnography (PSG) recordings. The recordings were randomly selected from three Level 1 clinical PSG data sources. The document does not specify the country of origin of the data.
    2. Number of experts used to establish the ground truth for the test set and their qualifications:

      • Number of Experts: Three (3) registered PSG technologists (RPSGTs).
      • Qualifications: Each RPSGT had at least 5 years of experience in clinical scoring of sleep studies.
    3. Adjudication method for the test set:

      • Method: 2/3 majority scoring. Expert consensus sleep stages were constructed using the stage per epoch where at least 2 of the 3 experts agreed. Epochs where all 3 RPSGTs disagreed were excluded.
    4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done:

      • No, an MRMC study comparing human readers with AI vs. without AI assistance was not explicitly detailed. The study focused on the standalone performance of the AI algorithm against human expert consensus to demonstrate non-inferiority to a predicate device. The device's indication for use explicitly states, "All outputs are subject to review by a qualified clinician," indicating a human-in-the-loop design, but the described performance study is primarily a standalone evaluation.
    5. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:

      • Yes, the clinical validation test evaluated the SleepStageML software's performance "against the expert consensus sleep stages" in a standalone manner. The device's outputs are intended to be reviewed by a clinician, but the performance metrics reported are for the algorithm's direct output compared to ground truth.
    6. The type of ground truth used:

      • Type: Expert Consensus. The ground truth was established by three RPSGTs, with a 2/3 majority rule for consensus.
    7. The sample size for the training set:

      • The document states, "SleepStageML uses a deep learning algorithm based on convolutional neural networks, which was trained on a large and diverse set of PSG recordings with sleep staging labels." However, a specific sample size for the training set is not provided in the summary.
    8. How the ground truth for the training set was established:

      • The document states the training was on "PSG recordings with sleep staging labels." It does not explicitly detail the method for establishing ground truth for the training set (e.g., if it was also expert consensus, single expert, or another method). However, given the nature of sleep staging, it is highly likely that these labels were also derived from expert annotations, similar to the test set, though possibly not with the same rigorous 3-expert consensus and adjudication for every record.

    K Number
    K223539
    Device Name
    Dreem 3S
    Date Cleared
    2023-08-18

    (268 days)

    Product Code
    Regulation Number
    882.1400
    Reference & Predicate Devices
    Applicant Name (Manufacturer):

    Beacon Biosignals, Inc.

    AI/ML · SaMD · IVD (In Vitro Diagnostic) · Therapeutic · Diagnostic · is PCCP Authorized · Third-party · Expedited review
    Intended Use

    The Dreem 3S is intended for prescription use to measure, record, display, transmit and analyze the electrical activity of the brain to assess sleep and awake in the home or healthcare environment.

    The Dreem 3S can also output a hypnogram of sleep scoring by 30-second epoch and summary of sleep metrics derived from this hypnogram.

    The Dreem 3S is used for the assessment of sleep on adult individuals (22 to 65 years old). The Dreem 3S allows for the generation of user/predefined reports based on the subject's data.

    Device Description

    The Dreem 3S headband contains microelectronics, within a flexible case made of plastic, foam, and fabric. It includes 6 EEG electrodes and a 3D accelerometer sensor.

    The EEG signal is measured by two electrodes at the front of the head (frontal position) and two at the back of the head (occipital position), along with one reference electrode and one ground electrode.

    The 3D accelerometer is embedded in the top of the headband to ensure accurate measurements of the wearer's head movement during the night. The raw EEG and accelerometer data are transferred to Dreem's servers for further analysis after the night is over.

    The device includes a bone-conduction speaker with volume control to provide notifications to the wearer, and a power button circled by a multicolor LED light.

    The device generates a sleep report that includes a sleep staging for each 30-second epoch during the night. This output is produced using an algorithm that analyzes data from the headband EEG and accelerometer sensors. A raw data file is also available in EDF format.

    The algorithm uses raw EEG data and accelerometer data to provide automatic sleep staging according to the AASM classification. The algorithm is implemented with an artificial neural network. Frequency spectrums are computed from raw data and then passed to several neural network layers including recurrent layers and attention layers. The algorithm outputs prediction for several epochs of 30 seconds at the same time, every 30 seconds. The various outputs for a single epoch of 30 seconds are combined to provide robust sleep scoring.
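    The per-epoch combination step described above, where several overlapping predictions for the same 30-second epoch are merged into one robust score, could be sketched as follows. The probability-averaging scheme and all names here are assumptions, since the summary does not specify how the outputs are combined:

```python
# Hypothetical sketch: merge overlapping window predictions. Each model
# pass emits one probability vector per covered 30-second epoch; vectors
# covering the same epoch are averaged, then the argmax stage is taken.
# Assumes every epoch is covered by at least one window.
STAGES = ["W", "N1", "N2", "N3", "R"]

def combine_predictions(window_outputs, n_epochs):
    """window_outputs: list of (start_epoch, [prob_vector, ...])."""
    sums = [[0.0] * len(STAGES) for _ in range(n_epochs)]
    counts = [0] * n_epochs
    for start, probs in window_outputs:
        for offset, vec in enumerate(probs):
            idx = start + offset
            sums[idx] = [s + v for s, v in zip(sums[idx], vec)]
            counts[idx] += 1
    staging = []
    for total, n in zip(sums, counts):
        avg = [v / n for v in total]
        staging.append(STAGES[avg.index(max(avg))])
    return staging
```

    Averaging before the argmax is one simple way to make the final staging less sensitive to any single window's prediction.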

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and study details for the Dreem 3S device based on the provided text:

    Acceptance Criteria and Device Performance

    Acceptance Criteria (Implicit from Study Results) | Reported Device Performance (Dreem 3S vs. Expert-scored PSG)
    Sleep Stage Classification Accuracy:
    Wake Classification Performance | Positive Agreement (PA): 88.5% (85.1%, 91.3% CI)
    N1 Classification Performance | Positive Agreement (PA): 58.0% (52.7%, 63.0% CI)
    N2 Classification Performance | Positive Agreement (PA): 83.4% (80.7%, 85.7% CI)
    N3 Classification Performance | Positive Agreement (PA): 98.2% (96.73%, 99.3% CI)
    REM Classification Performance | Positive Agreement (PA): 91.57% (86.63%, 95.72% CI)
    EEG Data Quality for Manual Scoring | 96.6% of epochs per night of recording were acceptable for manual scoring and sleep staging by at least two out of three reviewers.
    Minimum Scoreable Data | All data recordings reviewed had ≥4 hours of data considered to be scoreable by at least two out of three reviewers.
    Usability in Home Setting | The device could be successfully used and was tolerated by study subjects.

    Note: The document primarily presents performance results rather than explicitly stating pre-defined acceptance criteria with numerical thresholds the device needed to meet. The "acceptance criteria" listed above are inferred from the demonstrated performance that supported substantial equivalence.

    Study Details

    1. Sample size used for the test set and the data provenance:

      • Sample Size: 38 subjects
      • Data Provenance: The study was a "clinical investigation... completed... in a sleep lab setting." Subjects ranged from 23 to 66 years old, equally split between male and female, and included individuals self-identified as White, Black African American, Asian, Hispanic, and some not identified. This suggests a prospective study with diverse participants, likely conducted in a single country, though the specific country of origin is not explicitly stated. The study included a total of 36,447 epochs, corresponding to about 303 hours and 43 minutes of sleep.
    2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

      • The ground truth for sleep staging was established by "expert-scored sleep stages from a cleared device." For EEG data quality, this was assessed by "at least two out of three reviewers qualified to read EEG and/or PSG data." The specific number of experts used for the primary sleep staging ground truth is not explicitly stated (e.g., whether it was one expert or a consensus of multiple). Their exact qualifications (e.g., years of experience scoring sleep studies) are also not detailed beyond "expert-scored" and "qualified to read EEG and/or PSG data."
    3. Adjudication method (e.g., 2+1, 3+1, none) for the test set:

      • For the primary sleep staging (Table 2), the ground truth is referred to as "Consensus from manual staging" or "expert-scored PSG." The specific adjudication method (e.g., 2+1, 3+1) is not detailed.
      • For EEG data quality, acceptability was determined if "at least two out of three reviewers qualified to read EEG and/or PSG data" agreed. This implies a 2-out-of-3 consensus (similar to a 2+1 method if one reviewer was the primary and two others adjudicated).
    4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance:

      • No, a multi-reader multi-case (MRMC) comparative effectiveness study focusing on human reader improvement with AI assistance was not done. This study solely evaluated the standalone performance of the Dreem 3S algorithm against expert-scored PSG.
    5. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:

      • Yes, a standalone study was done. The clinical performance evaluation directly compares the "Dreem 3S (Automated analysis)" output to "Consensus from manual staging" (expert-scored PSG), indicating the performance of the algorithm without human intervention in the loop.
    6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

      • Expert consensus of manual staging from a 510(k)-cleared PSG system.
    7. The sample size for the training set:

      • The sample size for the training set is not specified in the provided document. The document only mentions "The algorithm is implemented with an artificial neural network. Frequency spectrums are computed from raw data and then passed to several neural network layers including recurrent layers and attention layers," which implies a training process, but no details on the training data size are given.
    8. How the ground truth for the training set was established:

      • How the ground truth for the training set was established is not specified in the provided document.
