K Number
K232535
Date Cleared
2023-12-22

(123 days)

Product Code
Regulation Number
892.1000
Panel
RA
Reference & Predicate Devices
Intended Use

The MAGNETOM system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces transverse, sagittal, coronal and oblique cross sectional images, spectroscopic images and/or spectra, and that displays the internal structure and/or function of the head, body, or extremities. Other physical parameters derived from the images and/or spectra may also be produced. Depending on the region of interest, contrast agents may be used. These images and/or spectra and the physical parameters derived from the images and/or spectra when interpreted by a trained physician yield information that may assist in diagnosis.

The MAGNETOM system may also be used for imaging during interventional procedures when performed with MR compatible devices such as in-room displays and MR Safe biopsy needles.

Device Description

The subject devices, MAGNETOM Sola and MAGNETOM Altea with software syngo MR XA61A, consist of new and modified software and hardware that is similar to what is currently offered on the predicate device, MAGNETOM Sola with syngo MR XA51A (K221733).

A high-level summary of the new and modified hardware and software is provided below:

Hardware
Modified Hardware:

  • Host computers (syngo MR Acquisition Workplace (MRAWP) and syngo MR Workplace (MRWP))
  • MaRS (Measurement and Reconstruction System) computer – for MAGNETOM Sola only
  • myExam 3D Camera

Software
New Features and Applications:

  • GRE_PC
  • Physiologging
  • Deep Resolve Boost HASTE
  • Deep Resolve Boost EPI Diffusion
  • Complex Averaging
  • myExam Implant Suite

Modified Features and Applications:

  • OpenRecon Framework.
  • BEAT_nav (re-naming only).
  • Low SAR Protocols – SAR-adaptive MR protocols to perform knee, spine, heart and brain examinations with 50% of the maximum allowed SAR values in normal mode for head and whole-body SAR.

AI/ML Overview

The provided text describes the Siemens Medical Solutions USA, Inc. MAGNETOM Sola and MAGNETOM Altea with software syngo MR XA61A, which are Magnetic Resonance Diagnostic Devices (MRDD). The submission (K232535) claims substantial equivalence to a predicate device (MAGNETOM Sola with syngo MR XA51A, K221733).

Based on the provided information, the acceptance criteria and study details for the AI features (Deep Resolve Boost and Deep Resolve Sharp) are as follows:

1. Table of Acceptance Criteria and Reported Device Performance

Feature: Deep Resolve Boost
  • Acceptance Criteria (Stated): The impact of the network has been characterized by several quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Most importantly, the performance was evaluated by visual comparisons to evaluate e.g., aliasing artifacts, image sharpness and denoising levels.
  • Reported Device Performance and Metrics: Performance was evaluated by visual comparisons to evaluate aliasing artifacts, image sharpness, and denoising levels, in addition to quantitative metrics like PSNR and SSIM. The document states, "The results from each set of tests demonstrate that the devices perform as intended and are thus substantially equivalent to the predicate device to which it has been compared," implying these metrics met the internal acceptance criteria for substantial equivalence. No specific numerical thresholds are provided.

Feature: Deep Resolve Sharp
  • Acceptance Criteria (Stated): The impact of the network has been characterized by several quality metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and perceptual loss. In addition, the feature has been verified and validated by in-house tests. These tests include visual rating and an evaluation of image sharpness by intensity profile comparisons of reconstructions with and without Deep Resolve Sharp.
  • Reported Device Performance and Metrics: Performance was evaluated by visual rating and intensity profile comparisons for image sharpness, along with quantitative metrics like PSNR, SSIM, and perceptual loss. The document states, "The results from each set of tests demonstrate that the devices perform as intended and are thus substantially equivalent to the predicate device to which it has been compared," implying these metrics met the internal acceptance criteria for substantial equivalence. No specific numerical thresholds are provided.
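
For readers unfamiliar with the two quantitative metrics named above, the following is a minimal sketch of how PSNR and SSIM can be computed between a reference image and a reconstruction. It is not the manufacturer's evaluation code: the function names, the toy 2x2 images, and the `data_range` parameter are illustrative assumptions, and `global_ssim` is a simplified single-window variant (standard SSIM averages the same expression over local windows).

```python
import math

def _flatten(image):
    """Turn a 2D list of pixel values into a flat list."""
    return [value for row in image for value in row]

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref, tst = _flatten(reference), _flatten(test)
    mse = sum((r - t) ** 2 for r, t in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)

def global_ssim(reference, test, data_range=1.0):
    """SSIM evaluated once over the whole image (simplification: no local windows)."""
    ref, tst = _flatten(reference), _flatten(test)
    n = len(ref)
    mu_r, mu_t = sum(ref) / n, sum(tst) / n
    var_r = sum((r - mu_r) ** 2 for r in ref) / n
    var_t = sum((t - mu_t) ** 2 for t in tst) / n
    cov = sum((r - mu_r) * (t - mu_t) for r, t in zip(ref, tst)) / n
    c1 = (0.01 * data_range) ** 2  # conventional stabilizing constants
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_r * mu_t + c1) * (2 * cov + c2)) / (
        (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    )

# Hypothetical toy images with intensities in [0, 1].
reference = [[0.0, 0.5], [0.5, 1.0]]
reconstruction = [[0.1, 0.5], [0.5, 0.9]]

print(f"PSNR: {psnr(reference, reconstruction):.2f} dB")
print(f"SSIM: {global_ssim(reference, reconstruction):.4f}")
```

Higher PSNR and an SSIM closer to 1 indicate a reconstruction closer to the reference. The submission reports no numerical thresholds for either metric.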

2. Sample Size Used for the Test Set and Data Provenance

  • Deep Resolve Boost:
    • TSE: more than 25,000 slices (implicitly for both training/validation and testing, as no separate test set is explicitly mentioned).
    • HASTE: more than 10,000 HASTE slices (refined from TSE dataset).
    • EPI Diffusion: more than 1,000,000 slices.
    • Data Provenance: Retrospective creation from acquired datasets. The data "covered a broad range of body parts, contrasts, fat suppression techniques, orientations, and field strength." Country of origin is not specified. The manufacturing sites named in the submission (Siemens Healthcare GmbH, Germany, and Siemens Shenzhen Magnetic Resonance LTD, China) suggest the data may be international, but the document does not confirm this.
  • Deep Resolve Sharp:
    • 2D images: more than 10,000 high resolution 2D images.
    • Data Provenance: Retrospective creation from acquired datasets. The data "covered a broad range of body parts, contrasts, fat suppression techniques, orientations, and field strength." Country of origin is not specified.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The document does not specify the number of experts or their qualifications for establishing ground truth for the test set specifically. It mentions that "visual comparisons" and "visual rating" were part of the evaluation for both Deep Resolve Boost and Deep Resolve Sharp, which implies human expert review. However, details about these experts are not provided.

4. Adjudication Method for the Test Set

The document does not explicitly state an adjudication method (e.g., 2+1, 3+1). It refers to "visual comparisons" and "visual rating" as part of the evaluation, which suggests expert review, but the process for resolving disagreements or reaching consensus is not detailed.

5. Whether a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of Human Reader Improvement with AI vs. Without AI Assistance

No MRMC comparative effectiveness study involving human readers with and without AI assistance is reported for the substantial equivalence submission. The non-clinical tests focus on performance metrics and visual comparisons of image quality produced by the AI feature versus predicate device features. The "Publications" section lists several clinical feasibility studies, but these are noted as "referenced to provide information" and are not direct evidence of human reader improvement with AI for this specific submission's evaluation. The submission states, "No clinical tests were conducted to support substantial equivalence for the subject devices."

6. Whether a Standalone Study (i.e., Algorithm-Only Performance Without a Human in the Loop) Was Done

Yes, standalone performance was evaluated through quantitative image quality metrics (PSNR, SSIM, perceptual loss) and direct comparison of images produced by the AI-enhanced sequences against the predicate device's features. The "Test Statistics and Test Results Summary" for both Deep Resolve Boost and Deep Resolve Sharp detail these algorithm-only evaluations.

7. The Type of Ground Truth Used

The ground truth for both Deep Resolve Boost and Deep Resolve Sharp was established from acquired datasets (raw MRI data). This data was then retrospectively manipulated to create input data for the models:

  • Deep Resolve Boost: Input data was "retrospectively created from the ground truth by data manipulation and augmentation," including undersampling k-space lines, lowering SNR, and mirroring k-space data. The acquired datasets themselves "represent the ground truth for the training and validation."
  • Deep Resolve Sharp: Input data was "retrospectively created from the ground truth by data manipulation," specifically by cropping k-space data to use only the center part, which created corresponding low-resolution input data and high-resolution output/ground truth data. The acquired datasets "represent the ground truth for the training and validation."
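
The Deep Resolve Sharp data preparation described above (keeping only the center of k-space to obtain a low-resolution input) can be illustrated in one dimension. This is a sketch under stated assumptions, not the manufacturer's pipeline: it uses a hand-written discrete Fourier transform on a hypothetical 16-sample intensity profile, and it zeroes the outer k-space samples instead of shrinking the matrix so that both profiles stay on the same grid. The helper `max_edge_slope` is an illustrative stand-in for an intensity-profile sharpness comparison.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform (image space to k-space)."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse discrete Fourier transform (k-space to image space)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * x / n) for k in range(n)) / n
        for x in range(n)
    ]

def keep_kspace_center(spectrum, keep):
    """Zero every sample whose signed frequency index exceeds `keep` in magnitude."""
    n = len(spectrum)
    result = []
    for k, value in enumerate(spectrum):
        freq = k if k <= n // 2 else k - n  # signed frequency index
        result.append(value if abs(freq) <= keep else 0j)
    return result

def max_edge_slope(profile):
    """Largest intensity jump between neighboring samples (a crude sharpness measure)."""
    return max(abs(b - a) for a, b in zip(profile, profile[1:]))

# Hypothetical high-resolution ground-truth profile: a sharp tissue edge.
high_res = [0.0] * 8 + [1.0] * 8

# Low-resolution network input derived from the ground truth.
low_res = [v.real for v in idft(keep_kspace_center(dft(high_res), keep=3))]

print("edge slope, ground truth:", max_edge_slope(high_res))
print("edge slope, low-res input:", round(max_edge_slope(low_res), 3))
```

The low-resolution profile shows a shallower edge than the ground truth, which is the blurring a sharpening network would be trained to reverse.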

Essentially, the "ground truth" refers to the high-quality, fully sampled/non-accelerated raw or reconstructed MRI data from which the training and validation inputs were derived.

8. The Sample Size for the Training Set

The sample sizes mentioned under "Training and Validation data" are implicitly for training, as they refer to the datasets from which both training and validation data were derived:

  • Deep Resolve Boost:
    • TSE: more than 25,000 slices
    • HASTE: more than 10,000 HASTE slices (refined)
    • EPI Diffusion: more than 1,000,000 slices
  • Deep Resolve Sharp:
    • more than 10,000 high resolution 2D images

9. How the Ground Truth for the Training Set Was Established

The ground truth for the training set was established from acquired datasets (raw MRI data). As explained in point 7, this involved:

  • Deep Resolve Boost: Using the acquired datasets as the "ground truth." Input data for training was then generated by manipulating this ground truth (e.g., undersampling, adding noise).
  • Deep Resolve Sharp: Using the acquired datasets as the "ground truth." Input data was then generated by manipulating the k-space data of the ground truth to create corresponding low-resolution inputs and high-resolution ground truth outputs for the model.
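
The Deep Resolve Boost input generation described above (undersampling k-space lines and lowering SNR) can be sketched as a function that turns one ground-truth k-space into an (input, target) training pair. Everything here is an illustrative assumption: the regular undersampling pattern, the Gaussian noise model, the parameter names `factor` and `sigma`, and the toy 8x4 k-space are hypothetical, and the k-space mirroring augmentation mentioned in the source is not covered.

```python
import random

def add_complex_noise(kspace, sigma, rng):
    """Lower SNR by adding independent Gaussian noise to real and imaginary parts."""
    return [
        [value + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for value in row]
        for row in kspace
    ]

def undersample_lines(kspace, factor):
    """Keep every `factor`-th phase-encoding line and zero the rest."""
    return [
        list(row) if index % factor == 0 else [0j] * len(row)
        for index, row in enumerate(kspace)
    ]

def make_training_pair(ground_truth_kspace, factor, sigma, seed=0):
    """Return (degraded input, ground-truth target) for supervised training."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    noisy = add_complex_noise(ground_truth_kspace, sigma, rng)
    degraded = undersample_lines(noisy, factor)
    return degraded, ground_truth_kspace

# Hypothetical fully sampled k-space: 8 phase-encoding lines of 4 samples each.
ground_truth = [[complex(r, c) for c in range(4)] for r in range(8)]

degraded, target = make_training_pair(ground_truth, factor=2, sigma=0.1)
kept = sum(1 for row in degraded if any(value != 0 for value in row))
print(f"kept {kept} of {len(degraded)} k-space lines")
```

The acquired data is left untouched and serves as the target, matching the source's statement that the acquired datasets "represent the ground truth for the training and validation."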

§ 892.1000 Magnetic resonance diagnostic device.

(a) Identification. A magnetic resonance diagnostic device is intended for general diagnostic use to present images which reflect the spatial distribution and/or magnetic resonance spectra which reflect frequency and distribution of nuclei exhibiting nuclear magnetic resonance. Other physical parameters derived from the images and/or spectra may also be produced. The device includes hydrogen-1 (proton) imaging, sodium-23 imaging, hydrogen-1 spectroscopy, phosphorus-31 spectroscopy, and chemical shift imaging (preserving simultaneous frequency and spatial information).

(b) Classification. Class II (special controls). A magnetic resonance imaging disposable kit intended for use with a magnetic resonance diagnostic device only is exempt from the premarket notification procedures in subpart E of part 807 of this chapter subject to the limitations in § 892.9.