K Number
K232305
Date Cleared
2023-10-23

(83 days)

Product Code
Regulation Number
892.2050
Panel
RA
Reference & Predicate Devices
Intended Use

AI-Rad Companion Brain MR is a post-processing image analysis software that assists clinicians in viewing, analyzing, and evaluating MR brain images.

Device Description

AI-Rad Companion Brain MR VA50 is an enhancement to the predicate, AI-Rad Companion Brain MR VA40 (K213706). As in the predicate, the brain morphometry feature of AI-Rad Companion Brain MR addresses the automatic quantification and visual assessment of the volumetric properties of various brain structures based on T1 MPRAGE datasets. From a predefined list of brain structures (e.g., Hippocampus, Caudate, Left Frontal Gray Matter), volumetric properties are calculated as absolute volumes and as volumes normalized with respect to the total intracranial volume. The normalized values are compared against age-matched means and standard deviations obtained from a population of healthy reference subjects. The deviation from this reference population can be visualized as a 3D overlay map or as an out-of-range flag next to the quantitative values.
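The normalization and out-of-range logic described above can be sketched as follows. This is an illustrative sketch only, not the device's implementation: the function names, the z-score threshold of 2.0, and all numeric values are hypothetical assumptions.

```python
def normalized_volume(abs_vol_ml: float, tiv_ml: float) -> float:
    """Structure volume as a fraction of total intracranial volume (TIV)."""
    return abs_vol_ml / tiv_ml

def out_of_range_flag(norm_vol: float, ref_mean: float, ref_std: float,
                      z_threshold: float = 2.0) -> bool:
    """Flag a structure whose normalized volume deviates from the
    age-matched healthy reference mean by more than z_threshold
    standard deviations (threshold value is a hypothetical choice)."""
    z = (norm_vol - ref_mean) / ref_std
    return abs(z) > z_threshold

# Hypothetical hippocampus example: 3.2 mL in a 1500 mL TIV, compared
# against an illustrative reference mean of 0.0025 with SD 0.0003.
nv = normalized_volume(3.2, 1500.0)
flagged = out_of_range_flag(nv, ref_mean=0.0025, ref_std=0.0003)
```

In a real pipeline the reference mean and standard deviation would be looked up per structure and per age bracket from the healthy reference population mentioned above.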

Additionally, and identical to the predicate, the white matter hyperintensities (WMH) feature addresses the automatic quantification and visual assessment of white matter hyperintensities on the basis of T1 MPRAGE and T2-weighted FLAIR datasets. The detected WMH can be visualized as a 3D overlay map, and the report quantifies them by count and volume across four brain regions.
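Counting lesions and summing their volume from a binary segmentation mask is typically done with connected-component labeling. The sketch below is a minimal 2-D illustration (a voxel version would add the z-neighbours); it is an assumption about the general technique, not the device's actual algorithm.

```python
from collections import deque

def label_components(mask):
    """4-connected components of a binary 2-D mask (list of lists).
    Returns (component count, label grid)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                count += 1
                queue = deque([(r, c)])
                labels[r][c] = count
                while queue:  # flood fill this lesion
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels

def lesion_volume_ml(mask, voxel_vol_ml):
    """Total lesion volume = number of positive voxels x voxel volume."""
    return sum(sum(row) for row in mask) * voxel_vol_ml

# Toy mask with three separate lesions (illustrative data only).
mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 0, 1]]
n_lesions, _ = label_components(mask)
```

Per-region counts would then be obtained by intersecting the labeled lesions with an atlas of the four brain regions.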

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details for the AI-Rad Companion Brain MR device, based on the provided FDA 510(k) summary:

Acceptance Criteria and Device Performance

| Metric | Acceptance Criteria | Reported Performance (AVG) | 95% CI | Standard Deviation (STD) |
| --- | --- | --- | --- | --- |
| Volumetric Segmentation Accuracy | PCC >= 0.77 | 0.94 PCC | [0.83, 0.98] | n.a. |
| Voxel-wise Segmentation Accuracy | Mean Dice score >= 0.47 | 0.50 Dice | [0.42, 0.57] | 0.22 |
| WMH Change Region-wise Segmentation Accuracy | Median F1-score >= 0.69 | 0.69 F1-score | [0.633, 0.733] | 0.13 |
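For reference, the three metrics in the table have standard textbook definitions, sketched below in plain Python. These are generic definitions, not the submission's exact evaluation code, and the example values are illustrative.

```python
import math

def pearson_cc(xs, ys):
    """Pearson correlation coefficient (PCC) between, e.g., predicted
    and reference volumes across cases."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def dice(pred_voxels, ref_voxels):
    """Voxel-wise Dice overlap: 2|A intersect B| / (|A| + |B|).
    Arguments are collections of voxel indices."""
    pred, ref = set(pred_voxels), set(ref_voxels)
    return 2 * len(pred & ref) / (len(pred) + len(ref))

def f1(tp, fp, fn):
    """Region-wise F1 from true-positive, false-positive, and
    false-negative lesion detections."""
    return 2 * tp / (2 * tp + fp + fn)
```

A perfectly linear volume relationship gives PCC = 1.0; identical masks give Dice = 1.0; and, for example, 6 true positives with 2 false positives and 2 false negatives give F1 = 0.75.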

Study Details

  1. Sample Size and Data Provenance:

    • Test Set Sample Size: 75 subjects / 150 studies (2 scans per subject).
    • Data Provenance: The data originate from a mix of retrospective and potentially prospective sources, from both the US and Europe:
      • UPenn (US): 15 subjects
      • ADNI (US): 15 subjects
      • Lausanne (EU): 22 subjects
      • Prague (EU): 23 subjects
      • Medical Indication: 60 Multiple Sclerosis (MS) patients, 15 Alzheimer's (AD) patients.
      • Age Range: 25-88 years.
      • Gender Distribution: 56 females, 19 males.
      • Scanner Info: Siemens 3.0T MR scanners, T1w MPRAGE and T2w FLAIR scan protocols.
  2. Number of Experts and Qualifications for Ground Truth:

    • The document states that for each dataset, three sets of ground truth were manually annotated. Each set was annotated by a "disjoint group of annotator, reviewer, and clinical expert."
    • For the initial annotation and review, "in-house annotators" and "in-house reviewers" were used.
    • For final review and correction, a "clinical expert" was used, randomly assigned per case to minimize bias.
    • Specific qualifications (e.g., years of experience, board certification) for these experts are not explicitly stated in the provided text, beyond being "clinical experts."
  3. Adjudication Method for Test Set:

    • The ground truth process involved a multi-step adjudication. For each test dataset:
      1. Three initial annotations by three different in-house annotators.
      2. Each initial annotation was reviewed by an in-house reviewer.
      3. Each initial annotation (after in-house review) was reviewed by a reference clinical expert.
      4. If corrections by the clinical expert were "significant and time-consuming," they were communicated back to the annotator for correction and then re-reviewed.
    • This resembles iterative consensus building with expert adjudication: multiple initial annotations are refined through reviewer and expert input rather than decided by a strict N+1 or N+M voting scheme, and the final decision appears to rest with the clinical expert.
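The review chain in steps 1-4 above can be modeled schematically. This is a purely illustrative sketch of the described workflow; the class, its fields, and the `expert_needs_rework` callable (standing in for the clinical expert's judgment) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    annotator: str
    reviewed_in_house: bool = False
    expert_approved: bool = False
    revision_rounds: int = 0

def adjudicate(ann: Annotation, expert_needs_rework) -> Annotation:
    """One annotation's path through the chain described above:
    in-house review, then expert review; significant expert
    corrections go back to the annotator and are re-reviewed."""
    ann.reviewed_in_house = True
    while expert_needs_rework(ann):   # expert requests a correction
        ann.revision_rounds += 1
        ann.reviewed_in_house = True  # corrected and re-reviewed
    ann.expert_approved = True
    return ann

# Toy run: the expert requests rework twice, then approves.
verdicts = iter([True, True, False])
result = adjudicate(Annotation("annotator-1"), lambda a: next(verdicts))
```

In the actual process this loop would run for each of the three independent annotations per dataset.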
  4. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:

    • No MRMC study was done. The document explicitly states: "The predicate (K213706) was not validated using clinical tests and therefore no clinical tests were conducted to test the performance and functionality of the modifications introduced within AI-Rad Companion Brain MR."
    • The validation focused on standalone algorithmic performance compared to expert-established ground truth and comparison against a "reference device" (icobrain) using equivalent validation methodology for the WMH follow-up feature.
    • Therefore, there's no reported effect size of human readers improving with AI vs. without AI assistance.
  5. Standalone (Algorithm Only) Performance Study:

    • Yes, a standalone performance study was conducted for the WMH follow-up feature. The acceptance criteria and reported performance metrics (PCC, Dice, F1-score) are for the algorithm's performance against the established ground truth.
  6. Type of Ground Truth Used:

    • The ground truth for the White Matter Hyperintensities (WMH) Follow-Up Feature was established through expert consensus and manual annotation. It involved a "disjoint group of annotator, reviewer, and clinical expert" for each ground truth dataset. The clinical expert performed the final review and correction.
  7. Training Set Sample Size:

    • The document states: "The training data used for the fine tuning the hyper parameters of WMH follow-up algorithm is independent of the data used to test the white matter hyperintensity algorithm follow up algorithm."
    • However, the specific sample size for the training set is not provided in the given text. It mentions independent training data but does not quantify it.
  8. How Ground Truth for Training Set was Established:

    • The document mentions that training data was used for "fine tuning the hyper parameters." While this implies the training data also required ground truth, the method for establishing it is not described in the provided text, which states only that the training data was "independent" of the test data. Since the "WMH follow-up algorithm does not include any machine learning component," "training" here may refer to calibration or rule optimization rather than conventional machine learning model training; how ground truth for that calibration was established is likewise not detailed.

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).