autoSCORE is intended for the review, monitoring and analysis of EEG recordings made by electroencephalogram (EEG) devices using scalp electrodes and to aid neurologists in the assessment of EEG. This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.
The spike detection component of autoSCORE is intended to mark previously acquired sections of the patient's EEG recordings that may correspond to spikes, in order to assist qualified clinical practitioners in the assessment of EEG traces. The spike detection component is intended to be used in patients at least three months old. The autoSCORE component has not been assessed for intracranial recordings.
autoSCORE is intended to assess the probability that previously acquired sections of EEG recordings contain abnormalities, and classifies these into pre-defined types of abnormalities, including epileptiform abnormalities. autoSCORE does not have a user interface. autoSCORE sends this information to the EEG reviewing software to indicate where markers indicating abnormality are to be placed in the EEG. autoSCORE also provides the probability that EEG recordings include abnormalities and the type of abnormalities. The user is required to review the EEG and exercise their clinical judgement to independently make a conclusion supporting or not supporting brain disease.
This device does not provide any diagnostic conclusion about the patient's condition to the user. The device is not intended to detect or classify seizures.

Device Description

autoSCORE is a software-only decision support product intended to be used with compatible electroencephalography (EEG) review software. It is intended to assist the user when reviewing EEG recordings, by assessing the probability that previously acquired sections of EEG recordings contain abnormalities, and classifying these into pre-defined types of abnormality. autoSCORE sends this information to the EEG software to indicate where markers indicating abnormality are to be placed in the EEG. autoSCORE uses an algorithm that has been trained with standard deep learning principles using a large training dataset. autoSCORE also provides an overview of the probability that EEG recordings and sections of EEG recordings include abnormalities, and which type(s) of abnormality they include. This is performed by identifying spikes of epileptiform abnormalities (Focal Epileptiform and Generalized Epileptiform) as well as identifying non-epileptiform abnormalities (Focal Non-epileptiform and Diffuse Non-epileptiform). The user is required to review the EEG and exercise their clinical judgement to independently make a conclusion supporting or not supporting brain disease. autoSCORE cannot detect or classify seizures. The recorded EEG activity is not altered by the information provided by autoSCORE. autoSCORE is not intended to provide information for diagnosis but to assist clinical workflow when using the EEG software.

AI/ML Overview

The FDA 510(k) summary for Holberg EEG AS's autoSCORE device describes the device's acceptance criteria and the studies used to show that it meets them. The details are broken down below.

Acceptance Criteria and Device Performance for autoSCORE

The acceptance criteria for autoSCORE are established by its performance metrics in comparison to human expert assessments and predicate devices. The device is intended to assist medical practitioners in the review, monitoring, and analysis of EEG recordings by identifying and classifying abnormalities, both epileptiform and non-epileptiform. It is not intended to detect or classify seizures.

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are implicitly defined by the performance metrics (Sensitivity, Specificity, PPV, NPV, Correlation Coefficient) shown to be comparable to or exceeding those of human experts or predicate devices. Since specific numeric thresholds for acceptance are not explicitly stated, the reported performance metrics are presented as evidence of meeting acceptable clinical performance.

Table 1: Reported Performance of autoSCORE (Summarized from document)

Metric (Recording-Level)	Normal/Abnormal (All Ages; Part 2, n=100)	Normal/Abnormal (All Ages; Part 1, n=4850)	Normal/Abnormal (All Ages; Part 5, n=1315)	Focal Epi (Part 2, n=100)	Gen Epi (Part 2, n=100)	Diff Non-Epi (Part 2, n=100)	Focal Non-Epi (Part 2, n=100)	Epi (autoSCORE vs. Predicate; Part 3, n=100)	Epi (autoSCORE vs. Predicate; Part 4, n=58)
Sensitivity (%)	100	83.1 [81.3, 84.8]	87.8 [85.0, 90.5]	73.9 [54.5, 91.3]	100 [100, 100]	87.5 [72.7, 100]	61.5 [42.1, 80]	90.0 [77.8, 100]	93.3 [83.3, 100]
Specificity (%)	88.4 [77.8, 97.4]	91.8 [90.8, 92.8]	89.4 [87.2, 91.6]	88.3 [80.8, 94.9]	94.1 [88.6, 98.8]	82.8 [74.0, 90.9]	93.2 [86.8, 98.6]	87.1 [78.8, 94.4]	96.4 [88.0, 100]
PPV (%)	92.0 [84.5, 98.3]	84.9 [83.2, 86.6]	86.0 [83.0, 88.8]	65.4 [45.8, 83.3]	75.1 [54.5, 93.8]	61.7 [44.8, 78.0]	76.1 [56.2, 94.1]	75.0 [60.0, 88.9]	96.6 [88.5, 100]
NPV (%)	100	90.8 [89.8, 91.8]	90.9 [88.8, 92.9]	91.9 [85.1, 97.4]	100 [100, 100]	95.5 [89.7, 100]	87.4 [79.5, 94.1]	95.3 [89.5, 100]	93.1 [82.6, 100]
Correlation Coeff.	0.96	0.99	0.99	0.85	0.83	0.93	0.84	N/A	N/A

Note: For detailed confidence intervals and marker-level performance, refer to Tables 4, 5, 6, 7, and 8 in the original document.
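
The four percentage metrics in Table 1 follow from a 2x2 comparison of the device output against the reference standard. The sketch below shows the standard definitions in Python. The `ConfusionCounts` type, the function names, and the example counts are illustrative assumptions; the summary does not publish the underlying counts, and this is not the submitter's analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Recording-level counts of device output versus the reference standard."""
    tp: int  # device abnormal, reference abnormal
    fp: int  # device abnormal, reference normal
    fn: int  # device normal, reference abnormal
    tn: int  # device normal, reference normal


def _percent(numerator: int, denominator: int) -> float:
    """Return a percentage, or NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": _percent(c.tp, c.tp + c.fn),
        "specificity": _percent(c.tn, c.tn + c.fp),
        "ppv": _percent(c.tp, c.tp + c.fp),
        "npv": _percent(c.tn, c.tn + c.fn),
    }


# Made-up counts for illustration only; not taken from the 510(k) summary.
example = ConfusionCounts(tp=45, fp=5, fn=5, tn=45)
metrics = diagnostic_metrics(example)
```

PPV and NPV depend on how common abnormal recordings are in the test set, so the values in different columns of Table 1 are not directly comparable when the datasets differ in prevalence.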

2. Sample Sizes Used for the Test Set and Data Provenance

The clinical validation was performed across five separate datasets:

Part 1 (Single-Center): 4,850 EEGs. Data provenance not explicitly stated but implied to be from routine EEG assessment in a hospital setting. Retrospective.
Part 2 (Multi-Center): 100 EEGs. Data provenance not explicitly stated but implied to be from routine EEG assessment in various hospital settings. Retrospective.
Part 3 (Direct Comparison to Primary Predicate): Same 100 EEGs as Part 2. Retrospective.
Part 4 (Benchmarking against Primary and Secondary Predicates): 58 EEGs. Data provenance not explicitly stated but implied to be from routine EEG assessment. Retrospective.
Part 5 (Hold-out Dataset, Two Centers): 1,315 EEGs. Data provenance not explicitly stated but implied to be from routine EEG assessment in two hospital settings. Retrospective.

None of the EEGs used in the validation were used for the development of the AI model. The document does not explicitly state the country of origin for the data, but the company address in Norway suggests a European origin.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of those Experts

Part 1 & 5: Ground truth established by multiple Human Experts (HEs), with a single HE reviewer per EEG.
- Part 1: 9 HEs, each assessing more than 1% of the EEGs.
- Part 5: 15 HEs, each assessing more than 1% of the EEGs.
- Qualifications: "Qualified medical practitioners" or "neurologists" who exercise professional judgment. In Parts 1 and 5, their assessments were part of "routine EEG assessment in their respective hospitals," implying they are experienced clinicians.
Part 2 & 3: Ground truth established by HE consensus.
- Part 2 & 3: 11 independent HEs reviewed 100 EEGs.
- Qualifications: "Independent human experts." Implied to be qualified clinical practitioners.
Part 4: Ground truth established by HE consensus.
- Part 4: 3 HEs.
- Qualifications: "HEs." Implied to be qualified clinical practitioners.

4. Adjudication Method for the Test Set

Part 1 & 5 (Recording and Marker Level): Ground truth was established by a single HE reviewer per EEG. While multiple HEs contributed, each EEG had a single "reference standard" HE assessment. This amounts to "none" or "single-reader" adjudication for each individual EEG, even though the dataset as a whole was reviewed by multiple HEs.
Part 2 & 3 (Recording Level): Ground truth was based on HE consensus of 11 HEs, assessing if EEGs were normal/abnormal and contained specific abnormality categories. This implies a form of majority consensus or agreement-based adjudication among the 11 experts. The granularity of probability grouping was 9 percentage points.
Part 4 (Recording and Marker Level): Ground truth was majority consensus scoring of 3 HEs.
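
As an illustration of majority consensus scoring like that described for Part 4, the following Python sketch picks the label chosen by more than half of the raters. The function name, label strings, and EEG identifiers are hypothetical, and the treatment of ties (returning `None`) is an assumption; the summary does not say how cases without a majority were handled.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_label(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by more than half of the raters, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical ratings from three human experts (HEs) for three recordings.
ratings = {
    "eeg_001": ["abnormal", "abnormal", "normal"],
    "eeg_002": ["normal", "normal", "normal"],
    "eeg_003": ["focal_epi", "gen_epi", "normal"],  # no majority
}
consensus = {eeg_id: majority_label(v) for eeg_id, v in ratings.items()}
```

With three raters and a binary normal/abnormal label a strict majority always exists; the `None` case only arises when there are more than two possible labels or an even number of raters.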

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What was the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance?

No MRMC comparative effectiveness study is reported. The study compares autoSCORE's output directly against human experts and predicate devices, so it evaluates the algorithm's standalone performance rather than any improvement in human readers assisted by AI. The document states that autoSCORE is a "decision support product intended to be used with compatible electroencephalography (EEG) review software" and that the "user is required to review the EEG and exercise their clinical judgement to independently make a conclusion," but it does not quantify reader performance with versus without AI assistance, so no effect size is available.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done?

Yes, the study primarily assesses the standalone performance of the autoSCORE algorithm by comparing its outputs directly against human expert assessments (considered the ground truth) and outputs from predicate devices. The tables summarizing sensitivity, specificity, PPV, NPV, and correlation coefficients directly reflect the algorithm's performance.
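
The bracketed ranges reported alongside these standalone metrics are confidence intervals, but the summary does not say how they were computed. One common choice for this kind of metric is a percentile bootstrap over recordings, sketched below in Python. The method, the function names, the number of resamples, and the toy data are all assumptions made for illustration, not a description of the submitter's analysis.

```python
import random


def sensitivity(pairs):
    """Percent of reference-positive cases that the device also flagged."""
    flagged = [pred for pred, ref in pairs if ref]
    if not flagged:
        return float("nan")
    return 100.0 * sum(flagged) / len(flagged)


def bootstrap_ci(pairs, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat over (prediction, reference) pairs."""
    rng = random.Random(seed)
    n = len(pairs)
    values = []
    for _ in range(n_resamples):
        # Resample recordings with replacement, keeping each pair intact.
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        value = stat(sample)
        if value == value:  # skip NaN from resamples with no positive cases
            values.append(value)
    if not values:
        return float("nan"), float("nan")
    values.sort()
    lower = values[int((alpha / 2) * (len(values) - 1))]
    upper = values[int((1 - alpha / 2) * (len(values) - 1))]
    return lower, upper


# Toy data: 20 abnormal recordings (18 detected) and 80 normal recordings.
toy = [(True, True)] * 18 + [(False, True)] * 2 + [(False, False)] * 80
low, high = bootstrap_ci(toy, sensitivity)

# When every abnormal recording is detected, every resample scores 100.
perfect = [(True, True)] * 20 + [(False, False)] * 80
perfect_ci = bootstrap_ci(perfect, sensitivity)
```

The second case shows how a resampling interval can collapse to a single point when no errors are observed, which is one possible reading of an entry such as 100 [100, 100]; it does not establish that this method was used.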

7. The Type of Ground Truth Used (Expert Consensus, Pathology, Outcomes Data, etc.)

The ground truth used for the test sets was primarily human expert assessment (or consensus of human experts).

Parts 1 and 5 used individual HE assessments as the reference standard (routine clinical assessments).
Parts 2, 3, and 4 used expert consensus as the reference standard.
No pathology or outcomes data were used to establish the ground truth.

8. The Sample Size for the Training Set

The document explicitly states that none of the EEGs used in the validation were used in the development of the AI model. However, it does not give the sample size of the training set. It only mentions that "autoSCORE uses an algorithm that has been trained with standard deep learning principles using a large training dataset."

9. How the Ground Truth for the Training Set Was Established

The document does not explicitly describe how the ground truth for the training set was established. It only refers to "standard deep learning principles" and a "large training dataset." It notes that the HEs providing the reference standards for the validation phase (Studies 1, 2, 3, and 4) were different from those who participated in the development portion of the process. This implies that human experts were involved in creating the ground truth for the training data, but the method (e.g., single expert, multi-expert consensus, specific rules) is not detailed.

Summary

January 7, 2024

Image /page/0/Picture/1 description: The image shows the logo of the U.S. Food and Drug Administration (FDA). On the left is the Department of Health & Human Services logo. To the right of that is a blue square with the letters "FDA" in white. To the right of the blue square is the text "U.S. FOOD & DRUG ADMINISTRATION" in blue.

Holberg EEG AS
Smriti Franklin
QA/RA Manager
Fjøsangerveien 70A
5068 Bergen, Norway

Re: K231068

Trade/Device Name: autoSCORE
Regulation Number: 21 CFR 882.1400
Regulation Name: Electroencephalograph
Regulatory Class: Class II
Product Code: OMB
Dated: December 8, 2023
Received: December 8, 2023

Dear Smriti Franklin:

We have reviewed your section 510(k) premarket notification of intent to market the device referenced above and have determined the device is substantially equivalent (for the indications for use stated in the enclosure) to legally marketed predicate devices marketed in interstate commerce prior to May 28, 1976, the enactment date of the Medical Device Amendments, or to devices that have been reclassified in accordance with the provisions of the Federal Food, Drug, and Cosmetic Act (the Act) that do not require approval of a premarket approval application (PMA). You may, therefore, market the device, subject to the general controls provisions of the Act. Although this letter refers to your product as a device, please be aware that some cleared products may instead be combination products. The 510(k) Premarket Notification Database available at https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/pmn.cfm identifies combination product submissions. The general controls provisions of the Act include requirements for annual registration, listing of devices, good manufacturing practice, labeling, and prohibitions against misbranding and adulteration. Please note: CDRH does not evaluate information related to contract liability warranties. We remind you, however, that device labeling must be truthful and not misleading.

If your device is classified (see above) into either class II (Special Controls) or class III (PMA), it may be subject to additional controls. Existing major regulations affecting your device can be found in the Code of Federal Regulations, Title 21, Parts 800 to 898. In addition, FDA may publish further announcements concerning your device in the Federal Register.

Additional information about changes that may require a new premarket notification are provided in the FDA guidance documents entitled "Deciding When to Submit a 510(k) for a Change to an Existing Device" (https://www.fda.gov/media/99812/download) and "Deciding When to Submit a 510(k) for a Software Change to an Existing Device" (https://www.fda.gov/media/99785/download).

Your device is also subject to, among other requirements, the Quality System (QS) regulation (21 CFR Part 820), which includes, but is not limited to, 21 CFR 820.30, Design controls; 21 CFR 820.90, Nonconforming product; and 21 CFR 820.100, Corrective and preventive action. Please note that regardless of whether a change requires premarket review, the QS regulation requires device manufacturers to review and approve changes to device design and production (21 CFR 820.30 and 21 CFR 820.70) and document changes and approvals in the device master record (21 CFR 820.181).

Please be advised that FDA's issuance of a substantial equivalence determination does not mean that FDA has made a determination that your device complies with other requirements of the Act or any Federal statutes and regulations administered by other Federal agencies. You must comply with all the Act's requirements, including, but not limited to: registration and listing (21 CFR Part 807); labeling (21 CFR Part 801); medical device reporting (reporting of medical device-related adverse events) (21 CFR Part 803) for devices or postmarketing safety reporting (21 CFR Part 4, Subpart B) for combination products (see https://www.fda.gov/combination-products/guidance-regulatory-information/postmarketing-safety-reporting-combination-products); good manufacturing practice requirements as set forth in the quality systems (QS) regulation (21 CFR Part 820) for devices or current good manufacturing practices (21 CFR Part 4, Subpart A) for combination products; and, if applicable, the electronic product radiation control provisions (Sections 531-542 of the Act); 21 CFR Parts 1000-1050.

Also, please note the regulation entitled, "Misbranding by reference to premarket notification" (21 CFR 807.97). For questions regarding the reporting of adverse events under the MDR regulation (21 CFR Part 803), please go to https://www.fda.gov/medical-devices/medical-device-safety/medical-device-reporting-mdr-how-report-medical-device-problems.

For comprehensive regulatory information about medical devices and radiation-emitting products, including information about labeling regulations, please see Device Advice (https://www.fda.gov/medical-devices/device-advice-comprehensive-regulatory-assistance) and CDRH Learn (https://www.fda.gov/training-and-continuing-education/cdrh-learn). Additionally, you may contact the Division of Industry and Consumer Education (DICE) to ask a question about a specific regulatory topic. See the DICE website (https://www.fda.gov/medical-devices/device-advice-comprehensive-regulatory-assistance/contact-us-division-industry-and-consumer-education-dice) for more information or contact DICE by email (DICE@fda.hhs.gov) or phone (1-800-638-2041 or 301-796-7100).

Sincerely,

Jay R. Gupta -S

Jay Gupta Assistant Director DHT5A: Division of Neurosurgical, Neurointerventional and Neurodiagnostic Devices OHT5: Office of Neurological and Physical Medicine Devices

Office of Product Evaluation and Quality Center for Devices and Radiological Health

Enclosure

Indications for Use

510(k) Number (if known) K231068

Device Name autoSCORE

Indications for Use (Describe)

autoSCORE is intended for the review, monitoring and analysis of EEG recordings made by electroencephalogram (EEG) devices using scalp electrodes and to aid neurologists in the assessment of EEG. This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.
The spike detection component of autoSCORE is intended to mark previously acquired sections of the patient's EEG recordings that may correspond to spikes, in order to assist qualified clinical practitioners in the assessment of EEG traces. The spike detection component is intended to be used in patients at least three months old. The autoSCORE component has not been assessed for intracranial recordings.
autoSCORE is intended to assess the probability that previously acquired sections of EEG recordings contain abnormalities, and classifies these into pre-defined types of abnormalities, including epileptiform abnormalities. autoSCORE does not have a user interface. autoSCORE sends this information to the EEG reviewing software to indicate where markers indicating abnormality are to be placed in the EEG. autoSCORE also provides the probability that EEG recordings include abnormalities and the type of abnormalities. The user is required to review the EEG and exercise their clinical judgement to independently make a conclusion supporting or not supporting brain disease.
This device does not provide any diagnostic conclusion about the patient's condition to the user. The device is not intended to detect or classify seizures.

Type of Use (Select one or both, as applicable)
☑ Prescription Use (Part 21 CFR 801 Subpart D)	☐ Over-The-Counter Use (21 CFR 801 Subpart C)

CONTINUE ON A SEPARATE PAGE IF NEEDED.

This section applies only to requirements of the Paperwork Reduction Act of 1995.

DO NOT SEND YOUR COMPLETED FORM TO THE PRA STAFF EMAIL ADDRESS BELOW.

The burden time for this collection of information is estimated to average 79 hours per response, including the time to review instructions, search existing data sources, gather and maintain the data needed and complete and review the collection of information. Send comments regarding this burden estimate or any other aspect of this information collection, including suggestions for reducing this burden, to:

Department of Health and Human Services
Food and Drug Administration
Office of Chief Information Officer
Paperwork Reduction Act (PRA) Staff
PRAStaff@fda.hhs.gov

"An agency may not conduct or sponsor, and a person is not required to respond to, a collection of information unless it displays a currently valid OMB number."

Image /page/4/Picture/1 description: The image contains a logo for Holberg EEG. The logo consists of a purple circle with a stylized EEG waveform inside it. To the right of the circle, the text "HOLBERG EEG" is written in a simple, sans-serif font.

510(K) Summary

1. SUBMITTER

Holberg EEG AS
Fjøsangerveien 70A
5068 Bergen, Norway
Phone: +47 926 44 261
Contact Person: Smriti Franklin
Date Prepared: March 23rd, 2023

2. DEVICE IDENTIFICATION

Trade Name: autoSCORE

Common Name: Automatic event detection software for full-montage electroencephalograph
Classification Name and Regulation Number: Electroencephalograph, 21 CFR 882.1400
Regulatory Class: II
Product Code: OMB

3. PREDICATE DEVICES

Primary Predicate Device

Trade/Device Name: encevis, K171720

Additional Predicate Device

Trade/Device Name: Persyst 13, K151929

4. DEVICE DESCRIPTION

autoSCORE also provides an overview of the probability that EEG recordings and sections of EEG recordings include abnormalities, and which type(s) of abnormality they include. This is performed by identifying spikes of epileptiform abnormalities (Focal Epileptiform and Generalized Epileptiform) as well as identifying non-epileptiform abnormalities (Focal Non-epileptiform and Diffuse Non-epileptiform).

The user is required to review the EEG and exercise their clinical judgement to independently make a conclusion supporting or not supporting brain disease.

autoSCORE cannot detect or classify seizures. The recorded EEG activity is not altered by the information provided by autoSCORE. autoSCORE is not intended to provide information for diagnosis but to assist clinical workflow when using the EEG software.

5. INDICATIONS FOR USE/ INTENDED USE

5.1 INDICATIONS FOR USE STATEMENT

1. autoSCORE is intended for the review, monitoring and analysis of EEG recordings made by electroencephalogram (EEG) devices using scalp electrodes and to aid neurologists in the assessment of EEG. This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.
2. The spike detection component of autoSCORE is intended to mark previously acquired sections of the patient's EEG recordings that may correspond to spikes, in order to assist qualified clinical practitioners in the assessment of EEG traces. The spike detection component is intended to be used in patients at least three months old. The autoSCORE component has not been assessed for intracranial recordings.
3. autoSCORE is intended to assess the probability that previously acquired sections of EEG recordings contain abnormalities and classifies these into pre-defined types of abnormalities, including epileptiform and non-epileptiform abnormalities. autoSCORE does not have a user interface. autoSCORE sends this information to the EEG reviewing software to indicate where markers indicating abnormality are to be placed in the EEG. autoSCORE also provides the probability that EEG recordings include abnormalities, and the type of abnormalities. The user is required to review the EEG and exercise their clinical judgement to independently make a conclusion supporting or not supporting brain disease.
4. This device does not provide any diagnostic conclusion about the patient's condition to the user. The device is not intended to detect or classify seizures.

5.2 INTENDED USE ENVIRONMENT

autoSCORE is intended to be used in environments where clinical EEGs are acquired or reviewed by suitably trained and qualified professionals.

autoSCORE is intended to be used for the analysis of EEG that has been recorded in environments suitable for adult and pediatric routine EEG acquisition according to best clinical practice, excluding acquisition environments for ICU and neonatal recordings. autoSCORE is not validated for EEG recorded in a home/ambulatory environment or any non-hospital/EEG laboratory setting.

5.3 INTENDED PATIENT POPULATION

autoSCORE use is restricted to EEG recordings from patients over 3 months of age. autoSCORE cannot be used for EEG recordings from neonatal patients. This restriction applies to all features of autoSCORE. There are no other restrictions regarding the patient population.

6. SUBSTANTIAL EQUIVALENCE DISCUSSION

Table 1 below compares autoSCORE to the predicate devices with respect to intended use, technological characteristics and operating principles. The Comments column provides further information on the determination of substantial equivalence.

		autoSCORE	encevis	Persyst 13	Comments
Device, regulation and sponsor details	510k Reference	Subject device	K171720	K151929	N/A
	Product Code	OMB	OMB	OMB	Identical
	Class	II	II	II	Identical
	Regulation Number	21 CFR 882.1400	21 CFR 882.1400	21 CFR 882.1400	Identical
	Regulation Name	Electroencephalograph	Electroencephalograph	Electroencephalograph	Identical
Device Description and Features	Manufacturer	Holberg EEG AS	AIT Austrian Institute of Technology GmbH	Persyst Development Corporation	N/A
	Device Type	Software-only Device	Software-only Device	Software-only Device	Identical
	General Device Description	EEG Review and Analysis Software	EEG Review and Analysis Software	EEG Review and Analysis Software	Identical
	Identifies Spikes	Yes	Yes	Yes	Identical
	Assessment and categorization of abnormalities including probability in previously acquired sections of EEG	Yes	No	No	Different
	Type of EEG	Scalp EEG	Scalp EEG	Scalp EEG	Identical
	Intended Use Environments	autoSCORE is intended to be used in environments where clinical EEGs are acquired or reviewed by suitably trained and qualified professionals. autoSCORE is intended to be used for the analysis of EEG that has been recorded in environments suitable for adult and pediatric routine EEG acquisition according to best clinical practice, excluding acquisition environments for ICU and neonatal recordings. autoSCORE is not validated for EEG recorded in a home/ambulatory environment or any non-hospital/EEG laboratory setting.	encevis is intended to be used in environments where clinical EEGs are acquired or reviewed by suitably trained and qualified professionals. encevis Spike Detection component is intended to be used in adult patients greater than or equal to 18 years. encevis Spike Detection performance has not been assessed for intracranial recordings.	Persyst 13 is intended to be used in environments where clinical EEGs are acquired or reviewed by suitably trained and qualified professionals. The Spike Detection component is intended to be used in patients at least one month old. Persyst 13 Spike Detection performance has not been assessed for intracranial recordings.	Similar. See Patient age for intended population comparison.
	Population age	> 3 months	Adults (age > 18 years)	> 1 month	Minimum patient age more than the predicate device.
Device Operation	Design Input	Raw EEG Signal	Raw EEG signal		Identical
	Design Input files	Calculation is based on EEG data recorded by external EEG systems. They are read from the EEG data provided by the EEG system	Calculation is based on EEG data recorded by external EEG systems. They are either read from the EEG file provided by the EEG system or can be sent to encevis using the interface provided by AIT (AITInterfaceDLL)	Calculation is based on EEG data recorded by external EEG systems. They are read from the EEG file provided by the EEG system	Identical (No AIT interface)
	Algorithm	Convolutional Neural Network	Convolutional Neural Network	Neural Network	Identical
	User-defined parameters	No parameters in spike detection algorithm can be changed by the user	No parameters in spike detection algorithm can be changed by the user		Similar
	Type of EEG-Analysis	Post-hoc analysis	Post-hoc analysis	Post-hoc analysis	Identical
	Design Output	Spike Detection component makes the results available to the user in form of markers	Spike Detection component makes the results available to the user in form of markers	Spike Detection component makes the results available to the user in form of markers	Identical
	Device Outputs	Identification and categorization of epileptiform and non-epileptiform abnormalities including probability that EEG recordings include abnormalities, and the type of abnormalities. These outputs are given at both recording level and marker level.	Identification of epileptiform abnormalities (spikes). These outputs are given at marker level.	Identification of epileptiform abnormalities (spikes). These outputs are given at marker level.	Some predicate device features are not included in autoSCORE. These include seizure detection, burst suppression, aEEG, rhythmic and periodic patterns and frequency bands.
	Output Files	Results are returned back to the host software after analysis.	Results are stored in a database and/or sent over the interface AITInterfaceDLL to an external EEG system. User output is given by graphical user interfaces.	Results are stored in additional files in the file system placed in the same folder as the EEG file. User output is given by graphical user interfaces.	Similar
	Diagnostic conclusion				
	User	This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.	This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.	This device is intended to be used by qualified medical practitioners who will exercise professional judgment in using the information.	Identical
	Compatible and interoperable Equipment and software	autoSCORE can read and process EEG data from Natus® NeuroWorks® software	encevis can read and process EEG data from several EEG vendors. A list of compatible EEG systems can be found on http://www.encevis.com	Persyst 13 can read and process EEG data from several EEG vendors. A list of compatible EEG systems can be found on http://www.persyst.com/support/supported-formats/	Similar

Table 1: Comparison of autoSCORE against predicate devices.

Color Key
Identical/Similar Characteristics		Different or N/A Characteristics

Comparison of Intended Use/ Indications for Use

The Indications for Use statement for autoSCORE is similar to those of the predicate devices. However, autoSCORE does not contain certain predicate device features, including seizure detection, burst suppression, and other quantitative measures. Points 1, 2, and 4 of the Indications for Use statement are identical to the corresponding parts of the predicate devices' Indications for Use statements. Point 3 describes autoSCORE's technological characteristics, including additional outputs that are different from the predicate devices.

These differences do not alter the intended use of the device, nor do they affect the safety and effectiveness of the device relative to the predicates. Both the subject and predicate devices have the same intended use for analyzing electroencephalograph data, identifying events including spike detection and producing outputs based on analysis of EEG for interpretation by a qualified user.

Comparison of Technological Characteristics

Technological differences between the subject and predicate devices have been highlighted in Table 1 above. There are additional features in the predicate devices, including seizure detection, analysis of additional quantitative features, and a different user interface, which are outside the intended use of the subject device. These features are completely independent functions and do not affect the abnormality detection features of the subject device.

Both the predicate devices and the subject device detect features related to epileptiform abnormalities (e.g. spikes). In addition to detecting epileptiform abnormalities, the subject device also detects non-epileptiform abnormalities. The subject device also provides the probability of the detected abnormality being an epileptiform abnormality, such as a focal epileptiform or generalized epileptiform abnormality, or a non-epileptiform abnormality, such as a focal non-epileptiform or diffuse non-epileptiform abnormality. The identification of additional abnormalities and the categorization of these abnormalities do not affect the intended use of the device and do not pose any additional risks as compared to the predicate devices, as evidenced through performance validation.

7. Performance Validation

Performance validation to evaluate autoSCORE performance was conducted in two parts:

- Non-Clinical Validation – To validate autoSCORE outputs against defined autoSCORE design inputs and user requirements.
- Clinical Validation – To validate autoSCORE performance against independent human experts and predicate devices.

These validations have been summarized below.

7.1 Non-clinical Performance Validation

Software verification and validation testing was conducted and documented in accordance with 2005 FDA Guidance, Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices.

Product Design and Software Requirements Traceability has been documented and verified against verification and validation test results.

Software verification and validation testing included:

1. Code Review
2. Unit level testing
3. System level testing
4. Integration level testing

The software for this device is determined as a "moderate" level of concern because a failure or latent flaw could indirectly result in minor injury to the patient or operator through incorrect information or through the action of a care provider.

Software verification and validation activities demonstrated that the device software meets all software requirements.

7.2 Clinical Performance Validation

7.2.1 Clinical Performance Evaluation

A retrospective non-interventional comprehensive clinical validation was performed using de-identified data to evaluate performance of all autoSCORE features against Human Experts and predicate devices to establish substantial equivalence.

The following performance data have been provided in support of the substantial equivalence determination.

Table 2: Type of software performance test per feature. autoSCORE indicates the EEG as normal if it does not contain epileptiform or non-epileptiform abnormalities, and abnormal if it contains one or both of these abnormalities. Part 1, Part 2, and Part 5 of the clinical study show comparable results against Human experts where an EEG is marked as 'normal' by autoSCORE. Part 3 and 4 of the study include the assessment of presence and absence of epileptiform abnormalities in predicate devices and autoSCORE that feeds into the assessment of a normal or abnormal EEG.

Validation Tests Performed	autoSCORE Features - Identification and categorization of following abnormalities
		Spike Detection - epileptiform abnormalities			Non-epileptiform abnormalities
	Normal EEG	Focal epileptiform	Generalized epileptiform	Focal non-epileptiform	Diffuse non-epileptiform
Direct Comparison against predicate device	x	x	x	Not available in predicate	Not available in predicate
Benchmarking against both predicate devices with external gold standard EEGs	x	x	x	Not available in predicate	Not available in predicate
Comparison with Human Expert Evaluation	x	x	x	x	x

{12}------------------------------------------------


Clinical Performance Evaluations

For the performance evaluation of the autoSCORE spike detection device, the study was conducted to measure outputs of autoSCORE against the assessments from independent human experts as well as the spike detection from the predicate devices – encevis and Persyst 13.

Further, for the autoSCORE performance evaluation of additional technological features (not in the predicate devices), the study was conducted to measure autoSCORE results for the detection of non-epileptiform abnormalities and the categorization of abnormalities against Human Experts (HE).

The clinical validation study was carried out in five parts to compare the performance of autoSCORE with the human experts as well as with the predicate devices:

1. Performance evaluation against human experts (single-center): A single-center dataset of 4,850 EEGs was assessed by 9 human experts, each assessing more than 1% of the EEGs.
2. Performance evaluation against human experts (multi-center): A multi-center dataset of 100 EEGs was assessed by 11 independent human experts.
3. Direct comparison against primary predicate device (encevis): The same dataset of 100 EEGs used in Part 2 was used to evaluate performance against the primary predicate device, encevis.
4. Benchmarking against primary and secondary predicate devices (encevis and Persyst 13): A dataset of 58 EEGs was used to benchmark the performance of the primary predicate device encevis, the predicate device Persyst 13, and autoSCORE against human expert consensus.
5. Performance evaluation against human experts (two centers): A hold-out dataset of 1315 EEGs, not used for training of the AI model and acquired from two centers, was assessed by 15 human experts, each assessing more than 1% of the EEGs.

The validation study was performed across five separate datasets with the following characteristics:

Validation Parts	Number of sites	Sample size	Number of reviewers	Patient gender	EEG Duration min-max	Patient age min-max (median)	Pediatric (P) vs Adult (A)*
1. Performance evaluation against HE (a. Recording level, b. Marker level)	1	4850	9	2527 (M), 2248 (F), 75 (unknown)	14 - 120 minutes	3 months - 106 years (39 years)	P - 1490, A - 3360
2. Performance evaluation against HE; 3. Direct comparison against predicate device	16	100	14	39 (M), 61 (F)	20 - 240 minutes	9 months - 95 years (26 years)	P - 43, A - 57
4. Benchmarking against predicate devices and marker validation against HE consensus (a. Recording level, b. Marker level)	4	58	3	27 (M), 31 (F)	20 - 30 minutes	2 - 77 years (36 years)	P - 13, A - 45
5. Performance evaluation against HE (a. Recording level, b. Marker level)	2	1315	15	636 (M), 642 (F), 37 (unknown)	14 - 240 minutes	3 months - 99 years (38 years)	P - 467, A - 848

Table 3. Summary of study parts used for validation of autoSCORE.
*Pediatric - < 22 years; Adults - > 22 years

7.2.2 Study Population and Reference Standards

None of the EEGs used in the validation were used in the development of the AI model. The HEs providing the reference standards in the validation phase of Study 1, 2, 3, and 4 were different from those who participated in the development portion of the process.

Study 1 – The reference standard was based on 4850 EEGs described by multiple HEs, but a single HE reviewer per EEG. The HEs inserted markers in the EEGs defining if the EEG was abnormal or normal, and if abnormal, the abnormality categories, and served as reference standard both on recording level and marker level. The HE assessments were part of the routine EEG assessment in their respective hospitals, and the HEs had all relevant patient clinical information. Apart from age and gender, all clinical data was removed for this clinical validation to avoid any associated bias.

Study 2 – The reference standard was based on HE consensus of 11 HEs reviewing 100 EEGs. The HEs assessed if the EEGs on recording level were normal or abnormal, and if abnormal if the EEGs contained one or more of the abnormality categories Focal Epi, Gen Epi, Focal Non-Epi, and Diffuse Non-Epi. The HEs were blinded to all patient data except age and gender.

{14}------------------------------------------------


Study 3 – This study uses the same dataset as for study 2, and thus also the same HE consensus reference standard.

Study 4 – The reference standard was obtained by visual assessment of 58 EEGs by 3 HEs. Marker time points for the IEDs were recorded for each EEG and each HE. The reference standard was the majority consensus scoring of the HEs. This served as reference standard both on recording level and marker level for the IEDs. HEs were blinded to all patient data except age and gender.

Study 5 – The reference standard was based on 1315 EEGs described by multiple HEs, but a single HE reviewer per EEG. The HEs inserted markers in the EEGs defining if the EEG was abnormal or normal and, if abnormal, the abnormality categories, and served as reference standard both on recording level and marker level. The HE assessments were part of the routine EEG assessment in their respective hospitals, and the HEs had all relevant patient clinical information. Apart from age and gender, all clinical data was removed for this clinical validation to avoid any associated bias.

All relevant autoSCORE outputs were covered in the above studies.

7.2.3 Analytical Methods

The analytical methods used in this validation have been described in 7.2.3.1 and 7.2.3.2.

Figure 1 below shows a hierarchical representation of autoSCORE recording-level and marker-level outputs and the conditions under which these outputs are presented in the Natus® NeuroWorks® user interface.

{15}------------------------------------------------

autoSCORE recording result	Recording probability	Recording output
Abnormal probability	0.0% - 11.5%	Normal
	11.5% - 50.9%	Probable Normal
	50.9% - 73.5%	Probable Abnormal
	73.5% - 100.0%	Abnormal
Focal Epi probability	0.0% - 53.4%
	53.4% - 85.8%	Probable Focal Epi
	85.8% - 100.0%	Focal Epi
Gen Epi probability	0.0% - 49.2%
	49.2% - 90.2%	Probable Gen Epi
	90.2% - 100.0%	Gen Epi
Focal Non-Epi probability	0.0% - 52.4%
	52.4% - 75.1%	Probable Focal Non-Epi
	75.1% - 100.0%	Focal Non-Epi
Diffuse Non-Epi probability	0.0% - 48.9%
	48.9% - 88.5%	Probable Diffuse Non-Epi
	88.5% - 100.0%	Diffuse Non-Epi

	Marker probability	Marker output
Focal Epi probability	0.0% - 46.5%
	46.5% - 100.0%	Focal Epi Marker(s)
Gen Epi probability	0.0% - 50.7%
	50.7% - 100.0%	Gen Epi Marker(s)
Focal Non-Epi probability	0.0% - 49.4%
	49.4% - 100.0%	Focal Non-Epi Marker(s)
Diffuse Non-Epi probability	0.0% - 47.7%
	47.7% - 100.0%	Diffuse Non-Epi Marker(s)

Figure 1: Hierarchical representation of autoSCORE recording and marker level outputs.
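The recording-level thresholds in Figure 1 can be read as a simple lookup from probability to output label. The following is a minimal illustrative sketch, not the device's implementation. The function and constant names are hypothetical, and the handling of values that fall exactly on a boundary (assigned here to the higher category) is an assumption, since the figure does not say which side a boundary belongs to.

```python
from bisect import bisect_right

# Cut points (percent) for the recording-level "Abnormal probability", from Figure 1.
# Assumption: a value exactly on a cut point falls into the higher category.
ABNORMAL_CUTS = [11.5, 50.9, 73.5]
ABNORMAL_LABELS = ["Normal", "Probable Normal", "Probable Abnormal", "Abnormal"]

# (lower cut, upper cut) in percent for each abnormality type, from Figure 1.
TYPE_CUTS = {
    "Focal Epi": (53.4, 85.8),
    "Gen Epi": (49.2, 90.2),
    "Focal Non-Epi": (52.4, 75.1),
    "Diffuse Non-Epi": (48.9, 88.5),
}


def recording_output(abnormal_probability):
    """Map an abnormal probability in percent (0-100) to a recording output label."""
    if not 0.0 <= abnormal_probability <= 100.0:
        raise ValueError("probability must be within 0-100 percent")
    return ABNORMAL_LABELS[bisect_right(ABNORMAL_CUTS, abnormal_probability)]


def type_output(abnormality_type, probability):
    """Return '<type>', 'Probable <type>', or None when below the lower cut."""
    low, high = TYPE_CUTS[abnormality_type]
    if probability < low:
        return None  # Figure 1 lists no output for this range
    if probability >= high:
        return abnormality_type
    return f"Probable {abnormality_type}"


print(recording_output(60.0))
print(type_output("Gen Epi", 95.0))
```

Keeping the cut points in plain tables rather than in nested conditionals makes it easy to compare them against Figure 1 line by line.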

7.2.3.1 Comparison of performance with HEs

Binary Metrics

The binary metrics given in Table 4 and Table 6 (sensitivity, specificity, PPV and NPV) in the results section were computed independently for each study part and each feature (Normal/Abnormal, Focal Epi, Gen Epi, Focal Non-Epi) with 95% symmetric confidence intervals obtained using bootstrap resampling (n ≥ 10000).
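As an illustration of how a bootstrap confidence interval of this kind can be obtained, the sketch below resamples paired (reference, device) labels with replacement and takes the 2.5th and 97.5th percentiles of the resampled sensitivity. The percentile method, the function names, and the example data are assumptions made for illustration; the text above only states that bootstrap resampling with n ≥ 10000 was used.

```python
import random


def sensitivity(pairs):
    """pairs: list of (reference_positive, device_positive) booleans."""
    tp = sum(1 for ref, dev in pairs if ref and dev)
    fn = sum(1 for ref, dev in pairs if ref and not dev)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def bootstrap_ci(pairs, metric, n_resamples=10000, alpha=0.05, seed=0):
    """Equal-tailed percentile bootstrap interval for `metric` (assumed method)."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = []
    for _ in range(n_resamples):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        value = metric(sample)
        if value == value:  # drop NaN from resamples without reference positives
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper


# Hypothetical data: 100 reference-positive recordings (80 detected), 100 negatives.
example = [(True, True)] * 80 + [(True, False)] * 20 + [(False, False)] * 100
print(sensitivity(example), bootstrap_ci(example, sensitivity, n_resamples=2000))
```

The same `bootstrap_ci` helper can be reused for specificity, PPV or NPV by passing a different metric function.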

The following definitions were used for the binary metrics for the recording level outputs (where HE was used in study 1 and 5 and HE consensus in study 2):

TP – HE or HE consensus indicated that the condition is present and autoSCORE also indicates that the condition is present.

FP - HE or HE consensus indicated that the condition is not present but autoSCORE indicates that the condition is present.

TN - HE or HE consensus indicated that the condition is not present and autoSCORE also indicates that the condition is not present.

FN - HE or HE consensus indicated that the condition is present but autoSCORE indicates that the condition is not present.
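The four recording-level definitions above amount to a contingency count over paired labels. A minimal sketch follows, with hypothetical function and variable names, where each recording contributes one reference label (HE or HE consensus) and one autoSCORE label for a given condition.

```python
def confusion_counts(reference, device):
    """Count TP, FP, TN and FN from paired boolean labels.

    reference[i]: condition present according to HE or HE consensus.
    device[i]:    condition present according to autoSCORE.
    """
    if len(reference) != len(device):
        raise ValueError("label lists must have the same length")
    names = {
        (True, True): "TP",    # both indicate the condition is present
        (False, True): "FP",   # only autoSCORE indicates it is present
        (False, False): "TN",  # both indicate it is not present
        (True, False): "FN",   # only the reference indicates it is present
    }
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for ref, dev in zip(reference, device):
        counts[names[(bool(ref), bool(dev))]] += 1
    return counts


# Hypothetical labels for five recordings.
print(confusion_counts([True, True, False, False, True],
                       [True, False, False, True, True]))
```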

For the marker level outputs the following definitions were used for study 1 and 5:

TP - A 16 second window where HE has indicated that an abnormality is present and autoSCORE also would place a marker for the same type of abnormality.

FP - A 16 second window where autoSCORE would place a marker for a specific type of abnormality and this window is either: 1) randomly extracted from an EEG that HE has assessed as normal or 2) a 16 second window which the HE has assessed as not containing this type of abnormality.

FN - A 16 second window which HE has assessed as containing a specific type of abnormality where autoSCORE would not place a marker for this type of abnormality.

TN - A 16 second window where autoSCORE would not place a marker of a specific type of abnormality and this window is either: 1) randomly extracted from an EEG that the HE has assessed as normal or 2) a 16 second window which the HE has assessed as not containing this type of abnormality.

For the marker level outputs for study 4, validation was based on areas in the EEGs marked by autoSCORE and HE consensus markers. TP, TN, FP, and FN were derived from the resulting segmentation of the recording into areas marked only by autoSCORE, areas marked only by HE consensus and areas marked by both autoSCORE and HE consensus.

In the next step, values from the contingency matrices were used to calculate:

- Sensitivity, also referred to as True Positive Rate or TPR = TP/(TP + FN)
- Specificity, also referred to as True Negative Rate or TNR = TN/(TN + FP)
- PPV = TP/(TP + FP)
- NPV = TN/(TN + FN)
- Prevalence = (TP + FN)/(TP + FP + TN + FN) = (number of true condition positive)/(number of samples)
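The formulas above translate directly into code. The sketch below uses a hypothetical function name and made-up counts; returning NaN for a zero denominator is an illustrative choice, since the text does not say how undefined ratios were handled.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the metrics defined above from contingency counts."""
    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN here.
        return numerator / denominator if denominator else float("nan")

    total = tp + fp + tn + fn
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "prevalence": ratio(tp + fn, total),
    }


# Hypothetical counts: 100 condition-positive and 100 condition-negative samples.
for name, value in binary_metrics(tp=80, fp=10, tn=90, fn=20).items():
    print(f"{name}: {value:.3f}")
```

With these example counts, sensitivity is 80/100, specificity is 90/100, PPV is 80/90, NPV is 90/110 and prevalence is 100/200.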

Probability

To validate the probability output given by autoSCORE, several HE outputs were averaged in order to obtain a probability reference for the HEs. This grouping was done in different ways for study part 1/5 and for study part 2:

- Study part 1 and 5: The large number of EEG recordings allowed their grouping depending on autoSCORE probability values, applicable both for recording level and marker level. The grouping was uniform with 10 bins from 0% to 100%, each of 10 percentage points.

- Study part 2: The large number of HEs involved in the study allowed grouping EEGs depending on probability based on the number of HEs "voting" for the presence of the respective abnormalities, applicable only for recording level. Since each EEG was rated by 11 HEs, the granularity of this grouping was 9 percentage points.

The correlation coefficients given in Table 4 and Table 6 and the associated p-values are Pearson correlations calculated using the Python scipy.stats package, with the mean autoSCORE output and mean HE assessments in each bin as described above. In study part 4, a similar discretization was performed on the markers placed. The ranges here were 100-90%, 90-80%, 80-70%, 70-60%, 60-50% and 50-0%. Only markers above the threshold for user output were included, and a qualitative analysis was made of the number of markers overlapping with the reference (HE consensus or primary predicate device) for each probability range.
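The binning and correlation step for study parts 1 and 5 can be sketched as follows. This is an illustrative reconstruction with hypothetical names and data, not the code used in the study. It groups recordings into 10 uniform bins by autoSCORE probability, takes the mean autoSCORE output and the mean HE assessment per bin, and computes the Pearson coefficient in plain Python. The study itself used scipy.stats, whose `pearsonr` function also returns the p-value, which this sketch does not compute.

```python
from math import sqrt


def bin_means(device_probs, he_labels, n_bins=10):
    """Return (mean autoSCORE probability, mean HE assessment) per non-empty bin.

    device_probs: autoSCORE probabilities in the range 0-1.
    he_labels:    HE assessments, 1 if the abnormality is present, else 0.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, he in zip(device_probs, he_labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        sums[idx][0] += p
        sums[idx][1] += he
        sums[idx][2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for s in sums if s[2]]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data: six recordings with autoSCORE probabilities and HE labels.
probs = [0.05, 0.15, 0.15, 0.55, 0.95, 0.95]
labels = [0, 0, 1, 1, 1, 1]
means = bin_means(probs, labels)
r = pearson_r([m[0] for m in means], [m[1] for m in means])
print(len(means), round(r, 3))
```

For study part 2, the same idea applies with the reference probability taken as the fraction of the 11 HEs voting for an abnormality, which changes in steps of 1/11, or roughly 9 percentage points.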

Levels of abnormalities

Figure 2 below presents the strategy employed to validate categorical outputs. Categories in grey indicate areas that were not considered for determining TP, TN, FP and FN, and consequently, were not used for calculating performance parameters.

{17}------------------------------------------------


Image /page/17/Figure/2 description: The image shows a comparison of high and low probability thresholds for different medical conditions. The left side of the image lists conditions such as 'Normal', 'not Epi Focal', 'not Epi Generalized', 'not Non-Epi Diffuse', and 'not Non-Epi Focal'. The right side of the image lists corresponding probable and abnormal conditions, such as 'Probable Abnormal', 'Abnormal', 'Probable Epi Focal', 'Epi Focal', 'Probable Epi Generalized', 'Epi Generalized', 'Probable Non-Epi Diffuse', 'Non-Epi Diffuse', 'Probable Non-Epi Focal', and 'Non-Epi Focal'.

Figure 2: Schematic representation of strategy employed to validate the levels of abnormalities.

7.2.3.2 Comparison of performance with predicate devices

Binary Metrics - recording level

The binary metrics given in Table 7 (accuracy, sensitivity, specificity, PPV and NPV) in the results section were computed in the same way as described above in the comparison with HE section. To allow comparison with predicate devices encevis (study part 3 and 4) and Persyst (study part 4), a number of assumptions and limitations had to be addressed:

- Only the epileptiform activity abnormalities can be validated against the predicate device. Focal Epi and Gen Epi parts of the autoSCORE output were merged.
- Since predicate devices are not designed to give recording level output, rules of interpretation had to be applied. If at least one spike is generated by the predicate device, the EEG recording is classified as Abnormal.
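The two rules of interpretation above can be expressed in a few lines. This is a minimal sketch with hypothetical function names and inputs. It assumes the predicate device output is available as a list of spike time stamps and that autoSCORE's recording-level Focal Epi and Gen Epi results are available as booleans.

```python
def predicate_recording_abnormal(spike_times_s):
    """Recording-level rule: at least one spike from the predicate device
    classifies the EEG recording as Abnormal."""
    return len(spike_times_s) > 0


def autoscore_epileptiform(focal_epi_present, gen_epi_present):
    """Merge the Focal Epi and Gen Epi outputs into one epileptiform label."""
    return bool(focal_epi_present or gen_epi_present)


# Hypothetical recording: one predicate spike at 12.4 s, autoSCORE reports Gen Epi only.
print(predicate_recording_abnormal([12.4]))
print(autoscore_epileptiform(False, True))
```

Because a single spike is enough to flag a recording under this rule, any false spike detection turns a normal recording into a false positive at the recording level.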

Probability level of abnormality – marker level

Only Focal Epi and Gen Epi markers placed by autoSCORE could be compared with the placement of the encevis spikes in study part 4.

{18}------------------------------------------------

7.2.4 Results of Performance Evaluation

7.2.4.1 Comparison of performance with HEs

Summary results obtained by the abovementioned methods are presented in Table 4 below. Tveit et al. [1] describe in detail the agreement between HEs and autoSCORE in comparison to HE-HE agreement.

Table 4: Performance results for autoSCORE classification of Abnormal EEG at recording-level based on comparison with HE assessment. Reference standards for each study part are discussed in a previous section of this document.

Age Group	Study Part	Sensitivity (%)	Specificity (%)	PPV (%)	NPV (%)	Correlation coefficient (p-value)
All ages	Part 2 (n=100)	100.0 [100.0,100.0]	88.4 [77.8,97.4]	92.0 [84.5,98.3]	100.0 [100.0,100.0]	0.96 (p < 10^-5)
	Part 1 (n=4850)	83.1 [81.3,84.8]	91.8 [90.8,92.8]	84.9 [83.2,86.6]	90.8 [89.8,91.8]	0.99 (p < 10^-5)
	Part 5 (n=1315)	87.8 [85.0,90.5]	89.4 [87.2,91.6]	86.0 [83.0,88.8]	90.9 [88.8,92.9]	0.99 (p < 10^-5)
Adult	Part 2 (n=57)	100.0 [100.0,100.0]	84.0 [68.0,96.4]	88.9 [77.5,97.4]	100.0 [100.0,100.0]	0.95 (p < 10^-5)
	Part 1 (n=3360)	85.0 [83.1,86.8]	89.2 [87.8,90.6]	86.2 [84.4,88.0]	88.2 [86.7,89.6]	0.98 (p < 10^-5)
	Part 5 (n=848)	90.5 [87.6,93.2]	84.6 [81.1,87.9]	85.2 [81.9,88.4]	90.1 [87.0,92.9]	0.99 (p < 10^-5)
Pediatric	Part 2 (n=43)	100.0 [100.0,100.0]	94.4 [81.2,100.0]	96.1 [87.1,100.0]	100.0 [100.0,100.0]	0.94 (p < 10^-5)
	Part 1 (n=1490) A	71.6 [65.9,77.1]	95.8 [94.6,96.8]	76.3 [70.6,81.8]	94.6 [93.4,95.8]	0.96 (p < 10^-5)
	Part 5 (n=467) B	79.7 [72.8,86.2]	95.7 [93.4,97.8]	88.7 [82.7,94.0]	91.8 [88.9,94.6]	0.95 (p < 10^-5)

A: Results can be affected by low prevalence of 16%.

B: Results can be affected by low prevalence of 29.6%.

In addition to the performance metrics reported in Table 4, similar results were obtained to validate all categorical outputs (Normal vs Probable Normal vs Probable Abnormal vs Abnormal). The definition of the thresholds can be found in Figure 1, and the logic followed during validation of each category is presented in Figure 2 in Section 7.2.3, Analytical Methods. The categories shown in light grey in Figure 2 were omitted during validation of the respective performance metrics. It was shown that most of the autoSCORE results will be defined by the most confident Abnormal or Normal classification, with PPV for all age groups in all studies equal to or higher than 90%. For the lower-confidence levels Probable Abnormal and Probable Normal, the corresponding PPVs are lower or equal, ranging from 80.3% to 100%.

{19}------------------------------------------------

Summary results for the four types of abnormalities obtained by the abovementioned methods are presented in Table 5 below. Table 5 does not include results for the different age groups, but rather reports the overall results (adult and pediatric patients combined) for each study part, as no significant age-related differences were found. Tveit et al. [1] describe in detail the agreement between HE assessment and autoSCORE results in comparison to HE-HE agreement.

Table 5: Performance results for autoSCORE classification of abnormality types at recording-level based on comparison with HE assessment. Reference standards for each study part are discussed in a previous section of this document.

Abnormality Type	Study Part	Sensitivity (%)	Specificity (%)	PPV (%)	NPV (%)	Prevalence	Correlation coefficient (p-value)
Focal Epi	Part 2 (n=100)	73.9 [54.5,91.3]	88.3 [80.8,94.9]	65.4 [45.8,83.3]	91.9 [85.1,97.4]	23.0%	0.85 (p = 0.00091)
	Part 1 (n=4850)	62.5 [55.9,69.0]	95.0 [94.3,95.6]	35.8 [30.9,40.8]	98.3 [97.9,98.6]	4.3%	0.88 (p = 0.00089)
	Part 5 (n=1315)	66.5 [59.8,73.1]	93.9 [92.5,95.2]	64.8 [58.1,71.4]	94.3 [92.9,95.6]	14.5%	0.96 (p < 10^-5)
Gen Epi	Part 2 (n=100)	100.0 [100.0,100.0]	94.1 [88.6,98.8]	75.1 [54.5,93.8]	100.0 [100.0,100.0]	15.0%	0.83 (p = 0.003)
	Part 1 (n=4850)	71.2 [62.3,79.8]	98.3 [97.9,98.6]	47.4 [39.7,55.2]	99.4 [99.1,99.6]	2.1%	0.88 (p = 0.00067)
	Part 5 (n=848)	72.3 [62.0,82.2]	96.1 [95.0,97.2]	53.4 [43.8,63.2]	98.3 [97.5,98.9]	5.8%	0.79 (p = 0.0066)
Diff Non-Epi	Part 2 (n=100)	87.5 [72.7,100.0]	82.8 [74.0,90.9]	61.7 [44.8,78.0]	95.5 [89.7,100.0]	24.0%	0.93 (p < 10^-5)
	Part 1 (n=4850)	65.2 [62.5,67.9]	94.5 [93.7,95.2]	79.1 [76.5,81.6]	89.4 [88.4,90.4]	24.3%	0.98 (p < 10^-5)
	Part 5 (n=467)	79.4 [74.5,84.0]	94.0 [92.6,95.4]	77.9 [73.0,82.7]	94.5 [93.1,95.8]	21.0%	0.97 (p < 10^-5)
Focal Non-Epi	Part 2 (n=100)	61.5 [42.1,80.0]	93.2 [86.8,98.6]	76.1 [56.2,94.1]	87.4 [79.5,94.1]	26.0%	0.84 (p = 0.00056)
	Part 1 (n=4850)	65.2 [60.9,69.5]	88.4 [87.4,89.3]	38.3 [35.0,41.7]	95.8 [95.2,96.4]	10.0%	0.97 (p < 10^-5)
	Part 5 (n=467)	73.1 [67.4,78.6]	89.6 [87.7,91.4]	61.2 [55.6,66.9]	93.7 [92.1,95.1]	18.4%	0.98 (p < 10^-5)

In addition to the performance metrics reported in Table 5, similar results were obtained to validate all categorical outputs (abnormality type vs Probable abnormality type). The definition of thresholds can be found in Figure 1, and the logic followed during validation of each category is presented in Figure 2 in Section 7.2.3, Analytical Methods. The categories shown in light grey in Figure 2 were omitted during validation of the respective performance metrics. It was shown that most of the autoSCORE results will be defined by the most confident categories (Focal Epi, Generalized Epi, Diffuse Non-Epi and Focal Non-Epi), with similar or greater PPV than for the less confident categories.

Types of abnormality – Marker level

Marker placement and correct type assignment were evaluated as part of Studies 1, 4, and 5. In study part 4, the overlap of autoSCORE markers with areas indicated by HE consensus as epileptiform discharges was validated. In study parts 1 and 5, it was validated whether autoSCORE places markers in the same areas of the EEG where HEs previously did. HEs also provided information about the type of abnormality assigned to the marker. Performance metrics calculated based on these results are provided in Table 6 together with the prevalence and correlation coefficient. autoSCORE places more markers in EEG recordings (declared as Abnormal on the recording level) than HEs, who typically select the most representative examples instead of all abnormality examples present in the recording. Consequently, the correlation coefficients and performance metric values are lower for marker-level abnormalities than for recording-level abnormalities, with lower agreement between HEs and autoSCORE for markers with assigned probability values below 90%. autoSCORE markers with high probability values indicate high agreement with markers inserted by HEs. Therefore, markers assigned with the highest level of confidence are most likely to indicate presence of each type of abnormality.

Table 6: Performance results for autoSCORE classification of abnormality types at the marker level based on comparison with HE assessment. Reference standards for each study part are discussed in a previous section of this document.

Abnormality Type	Study Part	Sensitivity (%)	Specificity (%)	PPV (%)	NPV (%)	Prevalence	Correlation coefficient (p-value)
Focal Epi + Generalized Epi	Part 4 (n=509)	87.3 [80.4,92.7]	61.3 [56.1,67.2]	38.9 [28.1,50.4]	94.6 [92.1,96.7]	22.0%	Qualitative, not quantitative
Focal Epi	Part 1 (n=179700)	58.0 [57.0,59.0]	98.0 [98.0,98.1]	62.6 [61.6,63.7]	97.6 [97.6,97.7]	5.40%	0.88 (p = 0.0018)
	Part 5 (n=29989)	62.7 [61.1,64.3]	96.6 [96.4,96.9]	71.6 [70.1,73.2]	95.1 [94.8,95.3]	11.90%	0.95 (p < 10^-5)
Generalized Epi	Part 1 (n=168441)	50.4 [49.2,51.8]	99.6 [99.6,99.6]	81.5 [80.2,82.8]	98.3 [98.2,98.4]	3.30%	0.81 (p = 0.0049)
	Part 5 (n=27521)	68.0 [65.3,70.8]	99.4 [99.3,99.5]	82.5 [79.9,84.9]	98.7 [98.5,98.8]	4.00%	0.92 (p = 0.00016)
Diffuse Non-Epi	Part 1 (n=204119)	52.7 [52.2,53.2]	94.2 [94.1,94.3]	69.6 [69.1,70.2]	88.7 [88.5,88.8]	20.20%	0.95 (p < 10^-5)
	Part 5 (n=28986)	68.3 [66.5,70.1]	94.3 [94.0,94.6]	53.7 [52.0,55.4]	96.8 [96.6,97.1]	8.80%	0.89 (p = 0.0005)
Focal Non-Epi	Part 1 (n=179700)	63.0 [62.3,63.8]	93.8 [93.7,93.9]	51.3 [50.6,52.0]	96.1 [96.0,96.2]	9.40%	0.99 (p < 10^-5)
	Part 5 (n=30168)	70.0 [68.5,71.4]	92.5 [92.2,92.8]	57.0 [55.6,58.5]	95.6 [95.3,95.8]	12.40%	0.93 (p = 0.00012)

7.2.4.2 Comparison of performance with predicate devices

Recording level

Summary results for study part 3 and 4 where autoSCORE's performance was compared with predicate devices are presented below in Table 7.

Table 7: Performance results for autoSCORE and predicate device classification of Focal Epi and Gen Epi EEG at the recording level, based on comparison with HE assessment. Reference standards for each study part are discussed in a previous section of this document.

Age Group	Study part	Device	Accuracy (%)	Sensitivity (%)	Specificity (%)	NPV (%)	PPV (%)
All ages	Part 3	autoSCORE	88.0 [81.0,94.0]	90.0 [77.8,100.0]	87.1 [78.8,94.4]	95.3 [89.5,100.0]	75.0 [60.0,88.9]
		encevis	48.0 [38.0,58.0]	96.7 [88.9,100.0]	27.2 [17.1,37.9]	95.0 [83.3,100.0]	36.3 [25.9,46.9]
	Part 4	autoSCORE	94.8 [87.9,100.0]	93.3 [83.3,100.0]	96.4 [88.0,100.0]	93.1 [82.6,100.0]	96.6 [88.5,100.0]
		encevis	56.9 [44.8,69.0]	100.0 [100.0,100.0]	10.7 [0.0,23.3]	100.0 [100.0,100.0]	54.5 [41.1,67.9]
		Persyst	53.5 [41.4,65.5]	100.0 [100.0,100.0]	3.6 [0.0,12.0]	100.0 [100.0,100.0]	52.6 [39.7,65.5]
Adult	Part 3	autoSCORE	86.0 [77.2,94.7]	84.6 [61.5,100.0]	86.4 [75.6,95.6]	95.0 [87.2,100.0]	64.7 [40.0,87.0]
		encevis	40.4 [28.1,52.6]	92.3 [75.0,100.0]	25.0 [12.8,38.5]	91.6 [72.7,100.0]	26.7 [14.3,40.0]
	Part 4	autoSCORE	93.3 [84.4,100.0]	92.6 [81.5,100.0]	94.4 [81.2,100.0]	89.5 [73.7,100.0]	96.1 [87.5,100.0]
		encevis	66.7 [53.3,80.0]	100.0 [100.0,100.0]	16.7 [0.0,35.7]	100.0 [100.0,100.0]	64.3 [50.0,78.6]
Pediatric	Part 3	autoSCORE	90.7 [81.4,97.7]	94.1 [80.0,100.0]	88.5 [75.0,100.0]	95.8 [86.2,100.0]	84.2 [66.7,100.0]
		encevis	58.2 [44.2,72.1]	100.0 [100.0,100.0]	30.8 [13.6,50.0]	100.0 [100.0,100.0]	48.6 [32.3,65.0]
	Part 4	autoSCORE	100.0 [100.0,100.0]	100.0 [100.0,100.0]	100.0 [100.0,100.0]	100.0 [100.0,100.0]	100.0 [100.0,100.0]
		encevis	23.1 [0.0,46.2]	100.0 [100.0,100.0]	0.0 [0.0,0.0]		23.1 [0.0,46.2]


Marker level

An overview of the overlap of autoSCORE's Focal Epi and Gen Epi markers with encevis spikes, as a function of the probability assigned to the autoSCORE marker, is presented below in Table 8. The highest agreement between the two devices was obtained for markers assigned higher probabilities. Therefore, markers assigned with the highest level of confidence are most likely to indicate presence of abnormality.

Table 8: autoSCORE markers classified as Focal Epi and Gen Epi in study 4 and their overlap with spikes generated by encevis.

Age group	Probability range of autoSCORE markers (%)	Number of autoSCORE markers	Number of autoSCORE markers overlapping with encevis spikes	Percentage of autoSCORE markers overlapping with encevis spikes
All ages	0-50	15	2	13%
	50-60	57	11	19%
	60-70	59	19	32%
	70-80	112	37	33%
	80-90	93	50	54%
	90-100	173	149	86%
Adult	0-50	11	1	9%
	50-60	46	8	17%
	60-70	47	17	36%
	70-80	79	29	37%
	80-90	72	38	53%
	90-100	150	128	85%
Pediatric	0-50	4	1	25%
	50-60	11	3	27%
	60-70	12	2	17%
	70-80	33	8	24%
	80-90	21	12	57%
	90-100	23	21	91%
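The percentage column of Table 8 is the number of overlapping markers divided by the number of autoSCORE markers in each probability range. As a quick consistency check, the sketch below recomputes the first group of rows from the counts in the table. The function name is hypothetical, and rounding to the nearest whole percent is an assumption that happens to reproduce the tabulated values.

```python
def overlap_percentage(n_markers, n_overlapping):
    """Share of autoSCORE markers overlapping with encevis spikes, in whole percent."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return round(100 * n_overlapping / n_markers)


# (probability range, number of markers, number overlapping) from the first six rows of Table 8.
FIRST_GROUP = [
    ("0-50", 15, 2),
    ("50-60", 57, 11),
    ("60-70", 59, 19),
    ("70-80", 112, 37),
    ("80-90", 93, 50),
    ("90-100", 173, 149),
]

for prob_range, n_markers, n_overlapping in FIRST_GROUP:
    print(prob_range, f"{overlap_percentage(n_markers, n_overlapping)}%")
```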

8. Biocompatibility, Electrical Safety, Electromagnetic Compatibility (EMC), and Mechanical Safety

autoSCORE is a software-only device. Biocompatibility, electrical safety, electromagnetic compatibility, and mechanical safety are not applicable to this device.

9. Statement of Substantial Equivalence

Since the predicate devices were cleared based in part on the results of clinical studies, and given the differences in device outputs, clinical testing was required to support substantial equivalence.

The non-clinical data support the safety of the device and the software verification and validation demonstrate that the autoSCORE device should perform as intended in the specified use conditions. The clinical data demonstrate that the subject device (autoSCORE) performs as well as the predicate devices that are currently marketed for the same intended use.

Therefore, autoSCORE is substantially equivalent to predicate devices in intended use. autoSCORE has some technological features that differ from predicate devices. Any differences between the subject and predicate device have no significant influence on safety or effectiveness. autoSCORE is at least as safe and effective as the legally marketed predicate devices, as established through performance testing. Therefore, the evidence provided demonstrates that autoSCORE is substantially equivalent to the predicate devices.

10. References

{23}------------------------------------------------


1. Tveit, J., et al., Automated Interpretation of Clinical Electroencephalograms Using Artificial Intelligence. JAMA Neurology, 2023.

Regulation Number and Section

§ 882.1400 Electroencephalograph.

(a) Identification. An electroencephalograph is a device used to measure and record the electrical activity of the patient's brain obtained by placing two or more electrodes on the head.
(b) Classification. Class II (performance standards).