The MAGNETOM system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces transverse, sagittal, coronal and oblique cross-sectional images, spectroscopic images and/or spectra, and that displays the internal structure and/or function of the head, body, or extremities. Other physical parameters derived from the images and/or spectra may also be produced. Depending on the region of interest, contrast agents may be used. These images and/or spectra and the physical parameters derived from the images and/or spectra when interpreted by a trained physician yield information that may assist in diagnosis.

The MAGNETOM system may also be used for imaging during interventional procedures when performed with MR compatible devices such as in-room displays and MR Safe biopsy needles.

Device Description

The subject devices, MAGNETOM Aera (including MAGNETOM Aera Mobile), MAGNETOM Skyra, MAGNETOM Prisma, MAGNETOM Prisma fit, MAGNETOM Vida, and MAGNETOM Lumina with software syngo MR XA60A, consist of new and modified software and hardware that are similar to what is currently offered on the predicate device, MAGNETOM Vida with syngo MR XA50A (K213693).

AI/ML Overview

This FDA 510(k) summary describes several updates to existing Siemens Medical Solutions MRI systems (MAGNETOM Vida, Lumina, Aera, Skyra, Prisma, and Prisma fit), primarily focusing on software updates (syngo MR XA60A) and some modified/new hardware components. The document highlights the evaluation of new AI features, specifically "Deep Resolve Boost" and "Deep Resolve Sharp."

Here's an analysis of the acceptance criteria and the study details for the AI features:

1. Table of Acceptance Criteria and Reported Device Performance

The document provides a general overview of the evaluation metrics used but does not explicitly state acceptance criteria in a quantitative format (e.g., "Deep Resolve Boost must achieve a PSNR of X" or "Deep Resolve Sharp must achieve Y SSIM"). Instead, it describes the types of metrics used and qualitative assessments.

AI Feature	Acceptance Criteria (Implicit from Evaluation)	Reported Device Performance (Summary)
Deep Resolve Boost	Preservation of image quality (aliasing artifacts, image sharpness, denoising levels) compared to the original; impact characterized by PSNR and SSIM.	The impact of the network has been characterized by several quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Most importantly, the performance was evaluated by visual comparisons that assessed, for example, aliasing artifacts, image sharpness, and denoising levels.
Deep Resolve Sharp	Preservation of image quality (image sharpness) compared to the original; impact characterized by PSNR, SSIM, and perceptual loss; verification and validation by visual rating and by evaluation of image sharpness through intensity profile comparisons.	The impact of the network has been characterized by several quality metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and perceptual loss. In addition, the feature has been verified and validated by in-house tests. These tests include visual rating and an evaluation of image sharpness by intensity profile comparisons of reconstructions with and without Deep Resolve Sharp.
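
For readers unfamiliar with the two metrics named in the table, the following is a minimal sketch of how PSNR and SSIM can be computed between a reference image and a processed image. It is illustrative only: the submission does not describe how the manufacturer computed these metrics, the function names and the tiny 2x2 images are hypothetical, and the SSIM here uses a single window over the whole image, whereas the usual definition averages the statistic over local windows.

```python
import math


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [p for row in reference for p in row]
    tst = [p for row in test for p in row]
    mse = sum((r - t) ** 2 for r, t in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


def global_ssim(reference, test, data_range=1.0):
    """SSIM evaluated once over the whole image (simplified, no local windows)."""
    ref = [p for row in reference for p in row]
    tst = [p for row in test for p in row]
    n = len(ref)
    mu_r = sum(ref) / n
    mu_t = sum(tst) / n
    var_r = sum((r - mu_r) ** 2 for r in ref) / n
    var_t = sum((t - mu_t) ** 2 for t in tst) / n
    cov = sum((r - mu_r) * (t - mu_t) for r, t in zip(ref, tst)) / n
    # Conventional stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_r * mu_t + c1) * (2 * cov + c2)) / (
        (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    )


# Hypothetical toy images with intensities scaled to [0, 1].
reference = [[0.0, 0.5], [0.5, 1.0]]
degraded = [[0.1, 0.5], [0.5, 0.9]]
psnr_db = psnr(reference, degraded)
ssim_value = global_ssim(reference, degraded)
```

Higher PSNR and an SSIM closer to 1 both indicate that the processed image is closer to the reference; neither metric by itself defines an acceptance threshold, which is consistent with the document reporting no quantitative criteria.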

2. Sample Size Used for the Test Set and Data Provenance

Deep Resolve Boost: The document doesn't explicitly state a separate "test set" size. It mentions the "Training and Validation data" which includes:
- TSE: more than 25,000 slices
- HASTE: pre-trained on the TSE dataset and refined with more than 10,000 HASTE slices
- EPI Diffusion: more than 1,000,000 slices
- Data Provenance: The data covered a broad range of body parts, contrasts, fat suppression techniques, orientations, and field strengths. No specific country of origin is mentioned, but the manufacturer (Siemens Healthcare GmbH) is based in Germany, and Siemens Medical Solutions USA, Inc. is the submitter. Input data was "retrospectively created from the ground truth by data manipulation and augmentation."
Deep Resolve Sharp: The document doesn't explicitly state a separate "test set" size. It describes the "Training and Validation data" as "more than 10,000 high resolution 2D images."
- Data Provenance: As with Deep Resolve Boost, the data covered a broad range of body parts, contrasts, fat suppression techniques, orientations, and field strengths. Input data was "retrospectively created from the ground truth by data manipulation." No specific country of origin is mentioned.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

Not specified. The document states that the acquired datasets "represent the ground truth." There is no mention of expert involvement in establishing ground truth for the test sets. The evaluation relies on technical metrics (PSNR, SSIM) and on "visual comparisons" or "visual rating," which suggest some form of human review, but the number and qualifications of the reviewers are not provided.

4. Adjudication Method for the Test Set

Not explicitly stated. The document mentions "visual comparisons" for Deep Resolve Boost and "visual rating" for Deep Resolve Sharp. This suggests subjective human review, but no specific adjudication method (like 2+1 or 3+1 consensus) is detailed.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs without AI Assistance

No MRMC comparative effectiveness study is described for the AI features. The studies mentioned (sections 8 and 9) focus on evaluating the technical performance and image quality of the AI algorithms themselves, not on their impact on human reader performance.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

Yes, standalone performance evaluation of the algorithms was conducted. The "Test Statistics and Test Results Summary" for both Deep Resolve Boost and Deep Resolve Sharp detail the evaluation of the network's impact using quantitative metrics (PSNR, SSIM, perceptual loss) and qualitative assessments ("visual comparisons," "visual rating," "intensity profile comparisons"). This represents the algorithm's performance independent of a human reader's diagnostic accuracy.
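
The "intensity profile comparisons" mentioned above are not described further in the document. One common way to quantify sharpness from a 1D intensity profile across an edge is the 10%-90% rise distance, sketched below. This is an assumption about how such a comparison could be done, not the manufacturer's method; `edge_width` and both example profiles are hypothetical.

```python
def edge_width(profile, low=0.1, high=0.9):
    """Distance in pixels over which a rising edge climbs from `low` to `high`
    of its full intensity span, using linear interpolation between samples."""
    lo_val, hi_val = min(profile), max(profile)
    span = hi_val - lo_val

    def first_crossing(level):
        target = lo_val + level * span
        for i in range(1, len(profile)):
            a, b = profile[i - 1], profile[i]
            if a < target <= b:
                return (i - 1) + (target - a) / (b - a)
        raise ValueError("profile has no rising edge through this level")

    return first_crossing(high) - first_crossing(low)


# Hypothetical profiles sampled across the same edge in two reconstructions.
sharp_profile = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
blurred_profile = [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
sharp_width = edge_width(sharp_profile)
blurred_width = edge_width(blurred_profile)
```

A smaller width means a steeper edge, so comparing the widths of reconstructions with and without a sharpening step gives a simple numeric counterpart to a visual rating.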

7. The Type of Ground Truth Used

The ground truth used for both Deep Resolve Boost and Deep Resolve Sharp was the acquired datasets themselves, representing the original high-quality or reference images/slices.

For Deep Resolve Boost, input data was "retrospectively created from the ground truth by data manipulation and augmentation," including undersampling k-space lines, lowering SNR, and mirroring k-space data. The original acquired data serves as the target "ground truth" for the AI to reconstruct/denoise.
For Deep Resolve Sharp, input data was "retrospectively created from the ground truth by data manipulation," specifically by cropping k-space data to create low-resolution input, with the original high-resolution data serving as the "output / ground truth" for training and validation.
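
The Deep Resolve Boost degradation described above (undersampling k-space lines and lowering SNR) can be illustrated with a small sketch. It works on a single image column along the phase-encode direction and uses a naive DFT so that it needs only the standard library. The sampling pattern, noise model, and function names are assumptions for illustration; the document does not specify them, and the mirroring augmentation is not shown.

```python
import cmath
import random


def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            signal[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m in range(n)
        )
        out.append(acc / n if inverse else acc)
    return out


def degrade_column(column, keep_every=2, noise_sigma=0.0, rng=None):
    """Create a degraded input from a ground-truth image column.

    keep_every: keep every n-th k-space line and zero the rest (undersampling).
    noise_sigma: std. dev. of complex Gaussian noise added to the kept lines.
    """
    rng = rng or random.Random(0)
    kspace = dft(column)
    sampled = []
    for k, value in enumerate(kspace):
        if k % keep_every != 0:
            sampled.append(0j)  # this line is treated as not acquired
            continue
        noise = complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
        sampled.append(value + noise)
    # Magnitude image of the undersampled, noisy data.
    return [abs(v) for v in dft(sampled, inverse=True)]


# Hypothetical ground-truth column: a single bright pixel.
ground_truth = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
aliased = degrade_column(ground_truth)  # noise-free, every second line kept
# aliased is approximately [0, 0.5, 0, 0, 0, 0.5, 0, 0]: the pixel folds over.
```

The pair (`aliased`, `ground_truth`) then plays the role of one (input, target) training example: the fold-over copy is the kind of aliasing artifact the network is expected to remove.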

8. The Sample Size for the Training Set

Deep Resolve Boost:
- TSE: more than 25,000 slices
- HASTE: pre-trained on the TSE dataset and refined with more than 10,000 HASTE slices
- EPI Diffusion: more than 1,000,000 slices
Deep Resolve Sharp: more than 10,000 high resolution 2D images.

9. How the Ground Truth for the Training Set Was Established

The ground truth for the training set was the acquired imaging data itself, left unaltered. The lower-quality network inputs were derived from this ground truth, for example by removing k-space lines, so that each training pair consists of a degraded input and its original high-quality target.

For Deep Resolve Boost: "The acquired datasets (as described above) represent the ground truth for the training and validation. Input data was retrospectively created from the ground truth by data manipulation and augmentation." This implies that the original, high-quality scans were considered the ground truth, and the AI was trained to restore manipulated, lower-quality versions to this original quality.
For Deep Resolve Sharp: "The acquired datasets represent the ground truth for the training and validation. Input data was retrospectively created from the ground truth by data manipulation. k-space data has been cropped such that only the center part of the data was used as input. With this method corresponding low-resolution data as input and high-resolution data as output / ground truth were created for training and validation." Similar to Boost, the original, higher-resolution scans served as the ground truth.
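
The k-space cropping quoted for Deep Resolve Sharp can be sketched in one dimension as follows: transform the high-resolution signal, keep only the central (low-frequency) part of k-space, and transform back to obtain a lower-resolution signal. This is a minimal illustration under assumptions; the function names, the 1D simplification, and the scaling convention are not taken from the submission.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            signal[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m in range(n)
        )
        out.append(acc / n if inverse else acc)
    return out


def crop_kspace_center(signal, keep):
    """Return a low-resolution magnitude signal with `keep` samples by
    retaining only the `keep` lowest-frequency k-space samples."""
    n = len(signal)
    if keep % 2 or not 0 < keep <= n:
        raise ValueError("keep must be a positive even number <= len(signal)")
    kspace = dft(signal)
    half = keep // 2
    # Unshifted DFT order: non-negative frequencies first, negative ones last.
    center = kspace[:half] + kspace[n - half:]
    low_res = dft(center, inverse=True)
    scale = keep / n  # preserves mean intensity after the size change
    return [abs(v) * scale for v in low_res]


# Hypothetical high-resolution signals with 8 samples each.
flat = [2.0] * 8                   # only the zero-frequency component
fine_detail = [1.0, -1.0] * 4      # only the highest-frequency component
flat_low = crop_kspace_center(flat, keep=4)
detail_low = crop_kspace_center(fine_detail, keep=4)
```

With these inputs, `flat_low` stays at the original intensity while `detail_low` is essentially zero, which shows why the cropped data lack the fine detail that the network is trained to restore from the high-resolution ground truth.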
