Search Results
Found 2 results
510(k) Data Aggregation
(90 days)
Brain Electrophysiology Laboratory Company, LLC
Automatic scoring of sleep EEG data to identify stages of sleep according to the American Academy of Sleep Medicine definitions, rules, and guidelines. It is to be used with adult populations.
The Neurosom EEG Assessment Technology (NEAT) is a medical device software application that allows users to perform sleep staging post-EEG acquisition. NEAT allows users to review sleep stages on scored MFF files and perform sleep scoring on unscored MFF files.
NEAT software is designed in a client-server model and comprises a User Interface (UI) that runs on a Chrome web browser in the client computer and a Command Line Interface (CLI) software that runs on a Forward-Looking Operations Workflow (FLOW) server.
The user interacts with the NEAT UI through the FLOW front-end application to initiate the NEAT workflow on unscored MFF files and visualize sleep-scoring results. Sleep stages are scored by the containerized neat-cli software on the FLOW server using the EEG data. The sleep stages are then added to the input MFF file as an event track file in XML format. Once the new event track file is created, the NEAT UI component retrieves the sleep events from the FLOW server and displays a hypnogram (a visual representation of sleep stages over time) on the screen, along with sleep statistics and other subject details. Additionally, a summary of the sleep scoring is automatically generated and attached to the same participant record on the FLOW server as a PDF.
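To make the event-track step concrete, here is a minimal, hypothetical sketch of the pattern described above: turning per-epoch stage labels into an XML event track and plotting a hypnogram. The element and attribute names are invented for illustration and are not the actual MFF event-track schema or NEAT code.

```python
import xml.etree.ElementTree as ET
import matplotlib.pyplot as plt

EPOCH_SEC = 30
stages = ["W", "N1", "N2", "N3", "N2", "R", "W"]   # toy per-epoch labels

# Build a simple XML event track. Element and attribute names here are
# invented for illustration; the real MFF schema differs.
track = ET.Element("eventTrack", name="Sleep Stages")
for i, stage in enumerate(stages):
    ET.SubElement(track, "event",
                  onsetSec=str(i * EPOCH_SEC),
                  durationSec=str(EPOCH_SEC),
                  label=stage)
ET.ElementTree(track).write("sleep_events.xml", xml_declaration=True)

# A hypnogram is just the stage sequence plotted against time.
order = ["N3", "N2", "N1", "R", "W"]               # deep sleep at bottom
level = {s: i for i, s in enumerate(order)}
times = [i * EPOCH_SEC for i in range(len(stages))]
plt.step(times, [level[s] for s in stages], where="post")
plt.yticks(range(len(order)), order)
plt.xlabel("Time (s)")
plt.title("Hypnogram")
plt.show()
```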
The FDA 510(k) Clearance Letter for NEAT 001 provides information about the device's acceptance criteria and the study conducted to prove its performance.
Acceptance Criteria and Device Performance
The core acceptance criteria for NEAT 001, as demonstrated by the comparative clinical study, are based on its ability to classify sleep stages (Wake, N1, N2, N3, REM) with performance comparable to the predicate device, EnsoSleep, and within the variability observed among expert human raters.
Table of Acceptance Criteria and Reported Device Performance
The document does not explicitly state pre-defined numerical "acceptance criteria" for each metric (Sensitivity, Specificity, Overall Agreement) that NEAT 001 had to meet. Instead, the approach was a comparative effectiveness study against a predicate device (EnsoSleep), with the overarching criterion being "substantial equivalence" as interpreted by performance falling within the range of differences expected among expert human raters.
Therefore, the "acceptance criteria" are implied by the findings of substantial equivalence. The "reported device performance" is given in terms of the comparison between NEAT and EnsoSleep, and their differences relative to human agreement variability.
Metric / Sleep Stage | NEAT Performance (vs. Predicate EnsoSleep) | Acceptance Criteria (Implied) |
---|---|---|
Wake (Wa) | Equivalent performance (1-2% difference) | Difference within range of human agreement variability |
REM (R) | EnsoSleep performed better (3-4% difference) | Difference within range of human agreement variability (stated as 3% for CSF dataset) |
N1 (Overall Performance) | EnsoSleep better (4-7%) | Difference within range of human agreement variability (only in BEL data set was this difference bigger than human agreement) |
N1 (Sensitivity) | NEAT substantially better (8-20%) | Not a primary equivalence metric, but noted as an area where NEAT excels. |
N1 (Specificity) | EnsoSleep better (5-9%) | Not a primary equivalence metric, but noted. |
N2 (Overall Performance) | EnsoSleep marginally better (5%) for BEL data set | Difference within range of human agreement variability |
N2 (Sensitivity) | EnsoSleep more sensitive (22%) | Not a primary equivalence metric, but noted. |
N2 (Specificity) | EnsoSleep less specific (9-11%) | Not a primary equivalence metric, but noted. |
N3 (Overall Performance) | Equivalent (1% difference overall) | Difference within range of human agreement variability |
N3 (Sensitivity) | NEAT substantially better (15-39%) | Not a primary equivalence metric, but noted as an area where NEAT excels. |
N3 (Specificity) | EnsoSleep marginally better (3-4%) | Not a primary equivalence metric, but noted. |
General Conclusion | Statistically significant differences, but practically within the range of differences expected among expert human raters. | Substantial equivalence to predicate device. |
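For context, the per-stage sensitivity and specificity figures above follow the usual one-vs-rest definitions over epoch-by-epoch labels, and overall agreement is the fraction of matching epochs. A minimal sketch of those definitions (illustrative only; not the study's analysis code):

```python
def stage_metrics(truth, pred, stage):
    """One-vs-rest sensitivity/specificity for one sleep stage over epochs."""
    tp = sum(t == stage and p == stage for t, p in zip(truth, pred))
    fn = sum(t == stage and p != stage for t, p in zip(truth, pred))
    tn = sum(t != stage and p != stage for t, p in zip(truth, pred))
    fp = sum(t != stage and p == stage for t, p in zip(truth, pred))
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec

def overall_agreement(truth, pred):
    """Fraction of epochs on which the two label sequences agree."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)

gold = ["W", "N1", "N2", "N2", "N3", "R", "W"]   # toy reference labels
algo = ["W", "N2", "N2", "N2", "N3", "R", "N1"]  # toy algorithm labels
for s in ["W", "N1", "N2", "N3", "R"]:
    print(s, stage_metrics(gold, algo, s))
print("overall:", overall_agreement(gold, algo))
```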
Study Details
Here's a breakdown of the study details based on the provided text:
1. Sample Size and Data Provenance
- Test Set Sample Size: The exact number of participants or EEG recordings in the test set is not explicitly stated. The document refers to "two data sets" (the "BEL data set" and the "CSF data set") used to test both NEAT and EnsoSleep. The large resampling number (R = 2000 bootstrap resamples) suggests a dataset large enough to yield small confidence intervals (a minimal bootstrap sketch follows this list).
- Data Provenance:
- Country of Origin: Not explicitly stated.
- Retrospective or Prospective: Not explicitly stated, but the mention of "All data files were scored by EnsoSleep" and "All data files were scored by NEAT" implies these were pre-existing datasets, making them retrospective.
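Bootstrapping of the kind mentioned above resamples the scored epochs (or records) with replacement and recomputes the agreement metric on each resample; the spread of the R = 2000 recomputed values yields the confidence interval. A minimal sketch under those assumptions, with invented toy data:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(truth, pred, R=2000, alpha=0.05):
    """Percentile bootstrap CI for epoch-level overall agreement."""
    truth, pred = np.asarray(truth), np.asarray(pred)
    n = len(truth)
    stats = np.empty(R)
    for r in range(R):
        idx = rng.integers(0, n, size=n)   # resample epochs with replacement
        stats[r] = np.mean(truth[idx] == pred[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return np.mean(truth == pred), (lo, hi)

# Toy example: 1000 epochs, ~85% agreement on average.
truth = rng.integers(0, 5, size=1000)
pred = np.where(rng.random(1000) < 0.85, truth, rng.integers(0, 5, size=1000))
print(bootstrap_ci(truth, pred))
```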
2. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Not explicitly stated. The study refers to "established gold standard" and "human agreement variability" among "expert human raters," implying multiple experts.
- Qualifications of Experts: Not explicitly stated beyond "expert human raters." No details are provided regarding their specific medical background (e.g., neurologists, sleep specialists), years of experience, or board certifications.
3. Adjudication Method for the Test Set
- Adjudication Method: Not explicitly stated. The document simply refers to "the established gold standard." It does not mention whether this gold standard was derived from a single expert, consensus among multiple experts, or a specific adjudication process (like 2+1 or 3+1).
4. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? A direct MRMC comparative effectiveness study comparing human readers with vs. without AI assistance was not explicitly described. The study primarily compares the standalone performance of NEAT (the AI) against the standalone performance of the predicate device (EnsoSleep), and then interprets these differences in the context of human-to-human agreement variability.
- Effect Size of Human Reader Improvement: Since a direct MRMC study with AI-assisted human readers was not detailed, no information is provided on the effect size of how much human readers improve with AI vs. without AI assistance.
5. Standalone Performance (Algorithm Only)
- Was a standalone study done? Yes. The study evaluated the "segment-by-segment" performance of NEAT and EnsoSleep algorithms directly against the "established gold standard." This is a measure of the algorithm's standalone performance without human input during the scoring process.
6. Type of Ground Truth Used
- Type of Ground Truth: The ground truth for the test set was based on an "established gold standard" for sleep stage classification. This strongly implies expert consensus or expert scoring of the EEG data according to American Academy of Sleep Medicine definitions, rules, and guidelines. Pathology or outcomes data were not used for sleep staging ground truth.
7. Training Set Sample Size
- Training Set Sample Size: The sample size for the training set is not explicitly stated in the provided document.
8. How Ground Truth for Training Set Was Established
- How Ground Truth for Training Set Was Established: The document states that `neat-cli` "leverages Python libraries for identifying stages of sleep on MFF files using Machine Learning (ML)." However, it does not explicitly describe how the ground truth for the training set was established. Typically, an ML model's training labels would also be established by expert annotation or consensus, similar to the test set ground truth, but this is not confirmed in the provided text.
(121 days)
Brain Electrophysiology Laboratory Company, LLC
The software is intended for use by a trained/qualified EEG technologist or physician on both adult and pediatric subjects at least 16 years of age for the visualization of human brain function by fusing a variety of EEG information with rendered images of an idealized head model and an idealized MRI image.
Sourcerer is an EEG source localization software that uses EEG and MRI-derived information to estimate and visualize cortex projections of human brain activity. Sourcerer is designed in a client-server model wherein the server components integrate directly with FLOW - BEL's software. Inverse source projections are computed on the server using EEG and MRI data from FLOW using the Electro-magnetic Inverse Module (EMIM API). The inverse results are interactively visualized in the Chrome browser running on the client computer using the Electro-magnetic Functional Anatomy Viewer (EMFAV).
Here's an analysis of the provided text to extract the acceptance criteria and study details:
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria | Reported Device Performance |
---|---|
Algorithmic Testing (HexaFEM) | |
Consistency with analytical solutions for three-layer spherical model | HexaFEM solutions are consistent with the analytical solutions for the three-layer spherical model. |
Consistency with FDM solutions for a realistic head model using the same conductivity values | HexaFEM and FDM solutions are the same for one realistic head model using the same conductivity values. |
Algorithmic Testing (Inverse Model - EMIM Module) | |
LORETA: Localization error distance similar to values reported by its creator. | Average localization error is about 7 mm, similar to what is reported for LORETA by its creator. |
sLORETA: Exact source estimation results for simulated signal sources, replicating creator's reported results. | Source estimation results are exact for the simulated signal sources, fully replicating simulated results reported by sLORETA's creator. |
MSP: Zero localization error for simulated signal sources. | Shows 100% accuracy (zero localization error), as expected. |
Clinical Performance Testing | |
Performance of Sourcerer to be equivalent to GeoSource (Predicate Device). | Performance of Sourcerer was shown to be equivalent to GeoSource (comparison based on Euclidean distance between maximal amplitude location and resected boundary in epileptic patients; a distance sketch follows this table). |
Software Verification and Validation Testing | |
Accuracy of Sourcerer validated through algorithm testing. | Algorithm testing validated the accuracy of Sourcerer. Product deemed fit for clinical use. |
Developed according to FDA's "Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices". | Sourcerer was designed and developed as recommended by the FDA guidance. |
Safety classification set to Class B according to AAMI/ANSI/IEC 62304 Standard. | Sourcerer safety classification set to Class B. |
"Basic Documentation Level" applied. | "Basic Documentation Level" applied to this device. |
2. Sample size used for the test set and the data provenance
The text explicitly mentions:
- Clinical Performance Testing: "The clinical data used in the evaluation is obtained from epileptic patients during standard presurgical evaluation." The sample size for the clinical test set is not stated as a number; the text refers only to "each patient's pre-operative hdEEG recording," implying multiple patients, but the exact count is missing.
- Data Provenance: The clinical data is retrospective ("obtained from epileptic patients during standard presurgical evaluation") and appears to be from a clinical setting, presumably within the country of origin of the device manufacturer (USA, as indicated by the FDA submission).
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Clinical Performance Testing Ground Truth: The ground truth for the clinical test set was established by:
- Resected region (from MRI): This implies surgical and pathological confirmation of the epileptic zone, which would typically involve neurosurgeons and neuropathologists.
- Clinical outcome: This refers to the patient's post-surgical seizure control, indicating the success of the resection.
No specific number of experts or their qualifications (e.g., number of years of experience) are provided in the document.
4. Adjudication method for the test set
The document does not explicitly describe an adjudication method for establishing ground truth, such as 2+1 or 3+1. The ground truth for the clinical performance testing relied on the "resected region (from MRI)" and "clinical outcome," which are objective clinical findings rather than subjective expert interpretations requiring adjudication.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
There is no mention of a multi-reader multi-case (MRMC) comparative effectiveness study. The clinical performance testing compared the device's output (Electrical Source Imaging - ESI) to the predicate device (GeoSource) and the ground truth (resected region, clinical outcome), not improved human reader performance with AI assistance.
6. If standalone (i.e., algorithm-only, without human-in-the-loop) performance testing was done
Yes, extensive standalone (algorithm only) performance testing was done:
- Algorithmic Testing of HexaFEM: Compared HexaFEM solutions to analytical solutions and FDM solutions.
- Algorithmic Testing of Inverse Model (EMIM Module): Tested LORETA, sLORETA, and MSP solvers using "test files with known signal sources." This involved comparing the algorithm's estimated source generator to the known (simulated) source.
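Validating an inverse solver against known simulated sources follows a standard pattern: forward-project a known source through the leadfield, add noise, solve the inverse problem, and measure the localization error. The sketch below uses a generic regularized minimum-norm inverse as a stand-in; it is not the LORETA, sLORETA, or MSP implementation in the EMIM module.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_sources = 64, 500

# Random leadfield standing in for a real forward (e.g., FEM) model.
L = rng.normal(size=(n_sensors, n_sources))
src_xyz = rng.uniform(-70, 70, size=(n_sources, 3))    # source positions (mm)

# Known simulated source: a single active generator, forward-projected.
true_idx = 123
v = L[:, true_idx] + 0.01 * rng.normal(size=n_sensors)  # sensor data + noise

# Regularized minimum-norm inverse: s_hat = L^T (L L^T + lam I)^-1 v.
lam = 1e-2
s_hat = L.T @ np.linalg.solve(L @ L.T + lam * np.eye(n_sensors), v)

est_idx = np.argmax(np.abs(s_hat))
err_mm = np.linalg.norm(src_xyz[est_idx] - src_xyz[true_idx])
print(f"localization error: {err_mm:.1f} mm")
```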
7. The type of ground truth used
- Algorithmic Testing (HexaFEM):
- Mathematical/Analytical Ground Truth: Comparison with "analytical solutions for the three-layer spherical model."
- Comparative Ground Truth: Comparison with "FDM solutions for one realistic head model."
- Algorithmic Testing (Inverse Model - EMIM Module):
- Simulated/Known Ground Truth: "known signal sources" from forward projections were used as ground truth for "recovering the source generator (known)."
- Clinical Performance Testing:
- Outcomes Data/Pathology/Clinical Consensus: "resected region (from MRI)" and "clinical outcome" were used to establish the ground truth for epileptic focus localization.
8. The sample size for the training set
The document does not specify the sample size for the training set. It focuses on verification and validation, but not the training of the underlying algorithms.
9. How the ground truth for the training set was established
Since the document does not specify the training set, it does not describe how its ground truth was established. The ground truth description is primarily for the test/validation sets.