K Number
K220815
Device Name
BrainInsight
Manufacturer
Hyperfine, Inc.
Date Cleared
2022-07-19 (120 days)

Product Code
Regulation Number
892.2050
Panel
RA
Reference & Predicate Devices
Intended Use

BrainInsight is intended for automatic labeling, spatial measurement, and volumetric quantification of brain structures from a set of low-field MR images, and returns annotated and segmented images, color overlays, and reports.

Device Description

BrainInsight is a fully automated MR imaging post-processing medical software that provides image alignment, whole brain segmentation, ventricle segmentation, and midline shift measurements of brain structures from a set of MR images from patients ages 18 years or older. The BrainInsight processing architecture includes a proprietary automated internal pipeline based on machine learning tools. The output annotated and segmented images are provided in standard image format using segmented color overlays and reports that can be displayed on third-party workstations and FDA-cleared Picture Archiving and Communication Systems (PACS).
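
As a rough illustration of the kind of output described above, here is a minimal NumPy sketch that blends a segmentation label map onto a grayscale MR slice as a color overlay. The function name, palette, and label assignments are hypothetical; this is not Hyperfine's implementation.

```python
import numpy as np

def make_overlay(slice_gray, label_map, alpha=0.4):
    """Blend colored segmentation labels onto a grayscale MR slice.

    slice_gray : 2-D float array scaled to [0, 1]
    label_map  : 2-D int array; 0 = background, 1..N = structures
    """
    # Hypothetical lookup table: label index -> RGB color.
    palette = np.array([
        [0.0, 0.0, 0.0],   # 0: background (unused)
        [1.0, 0.0, 0.0],   # 1: e.g. left ventricle
        [0.0, 0.0, 1.0],   # 2: e.g. right ventricle
        [0.0, 1.0, 0.0],   # 3: e.g. whole-brain mask
    ])
    rgb = np.repeat(slice_gray[..., None], 3, axis=-1)  # gray -> RGB
    mask = label_map > 0
    # Alpha-blend the label colors over the anatomy.
    rgb[mask] = (1 - alpha) * rgb[mask] + alpha * palette[label_map[mask]]
    return rgb
```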

The modified BrainInsight described in this submission includes changes to the machine learning models to allow for the processing of AI-reconstructed low-field MR images. The modified device also includes configuration updates and refactoring changes for incremental improvement.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details for the BrainInsight device, based on the provided text:

BrainInsight Acceptance Criteria and Study Details

1. Table of Acceptance Criteria and Reported Device Performance

For Midline Shift:

| Application | Acceptance Criteria (Error Range) | Reported Device Performance (Mean Absolute Error) |
|---|---|---|
| Midline Shift (T1) | "no worse than the average annotator discrepancy" (non-inferiority) | 1.03 mm |
| Midline Shift (T2) | "no worse than the average annotator discrepancy" (non-inferiority) | 0.97 mm |
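
In effect, the acceptance check compares the model's mean absolute error against the annotators' own average disagreement. A minimal sketch, assuming per-case midline-shift annotations in millimeters and taking the consensus mean as ground truth (the submission's exact discrepancy computation is not disclosed):

```python
import numpy as np

def mae(pred_mm, truth_mm):
    """Mean absolute error of predicted midline shift vs. ground truth."""
    return float(np.mean(np.abs(np.asarray(pred_mm) - np.asarray(truth_mm))))

def annotator_discrepancy(annotations_mm):
    """Average absolute deviation of each annotator from the consensus mean.

    annotations_mm : array of shape (n_annotators, n_cases)
    """
    annotations_mm = np.asarray(annotations_mm, dtype=float)
    consensus = annotations_mm.mean(axis=0)  # per-case consensus ground truth
    return float(np.mean(np.abs(annotations_mm - consensus)))

def non_inferior(model_mae, discrepancy):
    """Non-inferiority: model error no worse than annotator disagreement."""
    return model_mae <= discrepancy
```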

For Lateral Ventricles and Whole Brain Segmentation (Dice Overlap):

| Application | Acceptance Criteria (Dice Overlap) | Reported Device Performance (Dice Overlap [%]) |
|---|---|---|
| T1 Left Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 84 |
| T1 Right Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 82 |
| T1 Whole Brain | "no worse than the average annotator discrepancy" (non-inferiority) | 95 |
| T2 Left Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 81 |
| T2 Right Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 79 |
| T2 Whole Brain | "no worse than the average annotator discrepancy" (non-inferiority) | 96 |
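
The Dice overlap in this table is the standard similarity measure between a predicted and a reference binary mask, Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch of the standard definition (the submission's evaluation code is not disclosed):

```python
import numpy as np

def dice_overlap(pred, truth):
    """Dice coefficient between two binary masks, in [0, 1]."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Convention: two empty masks are considered a perfect match.
    return 1.0 if denom == 0 else 2.0 * intersection / denom
```

Multiplying by 100 gives the percentages reported above (e.g., 84% for the T1 left ventricle).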

For Lateral Ventricles and Whole Brain Segmentation (Volume Differences):

| Application | Acceptance Criteria (Volume Differences) | Reported Device Performance (Volume Differences [%]) |
|---|---|---|
| T1 Left Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 8 |
| T1 Right Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 7 |
| T1 Whole Brain | "no worse than the average annotator discrepancy" (non-inferiority) | 3 |
| T2 Left Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 11 |
| T2 Right Ventricle | "no worse than the average annotator discrepancy" (non-inferiority) | 19 |
| T2 Whole Brain | "no worse than the average annotator discrepancy" (non-inferiority) | 5 |
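
Percent volume difference can be computed from voxel counts scaled by the voxel volume. Whether the reported figures are signed or absolute is not stated, so absolute differences are assumed in this sketch:

```python
import numpy as np

def volume_difference_pct(pred, truth, voxel_volume_mm3=1.0):
    """Absolute volume difference of two masks, as a percent of reference."""
    v_pred = np.count_nonzero(pred) * voxel_volume_mm3
    v_truth = np.count_nonzero(truth) * voxel_volume_mm3
    # Assumes a non-empty reference mask (v_truth > 0).
    return abs(v_pred - v_truth) / v_truth * 100.0
```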

2. Sample Size Used for the Test Set and Data Provenance

The document does not explicitly state the numerical sample size for the test set. It reports only the demographic and pathology distribution of the test population:

  • Age: Min: 19, Max: 77
  • Gender: 59% Female / 41% Male
  • Pathology: Stroke (Infarct), Hydrocephalus, Hemorrhage (SAH, SDH, IVH, IPH), Mass/Edema, Tumor, Multiple sclerosis.

Data Provenance: The images were acquired from "multiple sites" using the "FDA cleared Hyperfine Swoop Portable MR imaging system." The study is implied to be retrospective, as data collection occurred before testing. The country of origin is not specified but is likely the US, given the FDA submission.

3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications

The text states that "Ground truth for midline shift was determined based on the average shift distance of all annotators" and "Ground truth for segmentation is calculated using Simultaneous Truth and Performance Level Estimation (STAPLE)." It also mentions that "The datasets were annotated by multiple experts." However, the exact number of experts used for the test set's ground truth and their specific qualifications (e.g., "radiologist with 10 years of experience") are not explicitly stated.

4. Adjudication Method for the Test Set

The ground truth for midline shift was determined by the average shift distance of all annotators. For segmentation, the Simultaneous Truth and Performance Level Estimation (STAPLE) method was used. This implies a form of consensus-based adjudication, but not a strict numerical rule like 2+1 or 3+1. STAPLE is a probabilistic approach to estimate a true segmentation from multiple expert segmentations while simultaneously estimating the performance level of each expert.
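
For readers unfamiliar with STAPLE (Warfield et al., 2004), the sketch below shows the EM iteration for the binary case: it alternates between estimating the hidden true segmentation and each rater's sensitivity and specificity. This is a simplified teaching version under a flat, fixed foreground prior, not the validated implementation used in the study (ITK, for example, ships one as STAPLEImageFilter):

```python
import numpy as np

def staple_binary(decisions, n_iter=50, eps=1e-6):
    """Binary STAPLE via EM.

    decisions : (n_raters, n_voxels) binary array of expert labels
    Returns per-voxel foreground probabilities plus per-rater
    sensitivity (p) and specificity (q).
    """
    D = np.asarray(decisions, dtype=float)
    W = D.mean(axis=0)                 # initial P(voxel is foreground)
    prior = W.mean()                   # scalar foreground prior (fixed)
    p = np.full(D.shape[0], 0.9)       # per-rater sensitivity
    q = np.full(D.shape[0], 0.9)       # per-rater specificity
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly foreground.
        log_a = np.log(prior) + (D * np.log(p[:, None]) +
                                 (1 - D) * np.log(1 - p[:, None])).sum(axis=0)
        log_b = np.log(1 - prior) + (D * np.log(1 - q[:, None]) +
                                     (1 - D) * np.log(q[:, None])).sum(axis=0)
        W = 1.0 / (1.0 + np.exp(log_b - log_a))
        # M-step: re-estimate rater performance given the soft consensus.
        p = np.clip((W * D).sum(axis=1) / W.sum(), eps, 1 - eps)
        q = np.clip(((1 - W) * (1 - D)).sum(axis=1) / (1 - W).sum(), eps, 1 - eps)
    return W, p, q
```

Thresholding W at 0.5 yields a consensus segmentation, while p and q quantify how much each expert is trusted, which is what distinguishes STAPLE from a simple majority vote.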

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

The document describes a standalone performance study of the algorithm against expert annotations, but does not mention a multi-reader multi-case (MRMC) comparative effectiveness study where human readers' performance with and without AI assistance is compared. Therefore, no effect size of human improvement with AI assistance is provided.

6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study

Yes, a standalone performance study was conducted. The device's performance (Midline Shift, Dice Overlap, Volume Differences) was evaluated directly against a ground truth established by annotators, and the results were compared to the average annotator discrepancy to demonstrate non-inferiority. This is a standalone evaluation of the algorithm's performance.

7. Type of Ground Truth Used

The ground truth used was expert consensus.

  • For midline shift, it was based on the "average shift distance of all annotators."
  • For segmentation, it was calculated using "Simultaneous Truth and Performance Level Estimation (STAPLE)" from multiple expert annotations.

8. Sample Size for the Training Set

The document does not explicitly state the numerical sample size for the training set. It only mentions that "Each model was trained using a training dataset to optimize parameters" and "The data collection for the training and validation datasets were done at multiple sites."

9. How the Ground Truth for the Training Set Was Established

The ground truth for the training set was established through expert annotation. The text states: "The datasets were annotated by multiple experts. The entire group of training image sets was divided into segments and each segment was given to a single expert. The expert's determination became the ground truth for each image set in their segment."

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).