K Number
K243667
Device Name
Sonic DL
Date Cleared
2025-06-05

(190 days)

Product Code
Regulation Number
892.1000
Panel
RA
Reference & Predicate Devices
Intended Use

Sonic DL is a Deep Learning based reconstruction technique that is available for use on GE HealthCare 1.5T, 3.0T, and 7.0T MR systems. Sonic DL reconstructs MR images from highly under-sampled data, and thereby enables highly accelerated acquisitions. Sonic DL is intended for imaging patients of all ages. Sonic DL is not limited by anatomy and can be used for 2D cardiac cine imaging and 3D Cartesian imaging using fast spin echo and gradient echo sequences. Depending on the region of interest, contrast agents may be used.

Device Description

Sonic DL is a software feature intended for use with GE HealthCare MR systems. It includes a deep learning based reconstruction algorithm that enables highly accelerated acquisitions by reconstructing MR images from highly under-sampled data. Sonic DL is an optional feature that is integrated into the MR system software and activated through purchasable software option keys.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study details for Sonic DL, based on the provided FDA 510(k) clearance letter:

Acceptance Criteria and Device Performance for Sonic DL

1. Table of Acceptance Criteria and Reported Device Performance

The clearance summary does not list explicit quantitative acceptance criteria with a single pass/fail performance result for every aspect of the device. Instead, it reports a mix of quantitative metrics and qualitative assessments that demonstrate acceptable performance relative to existing methods and the device's stated claims. The table below summarizes the implied acceptance thresholds and the corresponding comparative findings from the quantitative studies and qualitative assessments.

| Metric/Criterion | Acceptance Criteria (Implied/Comparative) | Reported Device Performance (Sonic DL) |
|---|---|---|
| Non-Clinical Testing (Sonic DL 3D) | | |
| Peak Signal-to-Noise Ratio (PSNR) | Equal to or above 30 dB | Equal to or above 30 dB at all acceleration factors (up to 12) |
| Structural Similarity Index Measure (SSIM) | Equal to or above 0.8 | Equal to or above 0.8 at all acceleration factors (up to 12) |
| Resolution | Preservation of resolution grid structure and resolution | Resolution grid structure and resolution preserved |
| Medium/High Contrast Detectability | Retained compared to conventional methods | Retained at all accelerations; comparable or better than conventional methods |
| Low Contrast Detectability | Non-inferior to more modestly accelerated conventional reconstruction methods at recommended acceleration rates | Maintained at lower acceleration factors; non-inferior at recommended rates (e.g., 8x Sonic DL 3D ≈ 4x parallel imaging; 12x Sonic DL 3D ≈ 8x parallel imaging) |
| Model Stability (Hallucination) | Low risk of hallucination; dataset integrity preserved | Low risk of hallucination; dataset integrity preserved across all cases |
| Clinical Testing (Sonic DL 3D), Quantitative Post Processing | | |
| Volumetric measurements (brain tissues), relative MAE 95% CI | Less than 5% for most regions | Less than 5% for most regions |
| Volumetric measurements (HOS), relative MAE 95% CI | Less than 3% for Hippocampal Occupancy Score (HOS) | Less than 3% for HOS |
| Intra-class Correlation Coefficient (ICC) | Exceeds 0.75 across all comparisons | Exceeded 0.75 across all comparisons |
| Clinical Testing (Sonic DL 3D), Clinical Evaluation Studies (Likert score) | | |
| Diagnostic quality | Images are of diagnostic quality | Sonic DL 3D images are of diagnostic quality across all anatomies, field strengths, and acceleration factors investigated |
| Pathology retention | Pathology seen in comparator images is accurately retained | Pathology seen in ARC + HyperSense images is accurately retained |
| Decline with acceleration | Diagnostic quality retained overall despite decline | Scores gradually declined with increasing acceleration factors yet retained diagnostic quality overall |
| Clinical Claims | | |
| Scan time reduction | Substantial reduction in scan time | Yields a substantial reduction in scan time |
| Diagnostic image quality | Preservation of diagnostic image quality | Preserves diagnostic image quality |
| Acceleration factors | Up to 12x | Provides acceleration factors up to 12 |
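The two headline quantitative metrics in the table, PSNR and SSIM, are standard full-reference image-quality measures computed against the fully sampled reconstruction. As a hedged illustration (this is a generic NumPy sketch, not GE HealthCare's validation code, and the standard SSIM additionally averages a local sliding-window version over the image):

```python
# Generic PSNR and simplified (single-window) SSIM, computed against a
# fully sampled reference image. The 30 dB / 0.8 values in the comments
# are the acceptance thresholds reported in the 510(k) summary.
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray) -> float:
    """Peak signal-to-noise ratio in dB vs. the fully sampled reference."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    peak = reference.max()  # dynamic range of the reference image
    return 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(reference: np.ndarray, test: np.ndarray) -> float:
    """Simplified global SSIM (standard SSIM averages local windows)."""
    x, y = reference.astype(np.float64), test.astype(np.float64)
    L = x.max() - x.min()                      # dynamic range
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2  # stabilizing constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# A lightly perturbed reconstruction easily clears the >= 30 dB and
# >= 0.8 thresholds; an identical one scores SSIM = 1.0, infinite PSNR.
ref = np.random.default_rng(0).random((64, 64))
noisy = ref + 0.01 * np.random.default_rng(1).standard_normal((64, 64))
print(psnr(ref, noisy) > 30.0, 0.8 < global_ssim(ref, noisy) <= 1.0)
```

In the summary's testing, the reference images came from digital reference objects and fully sampled phantom/in vivo acquisitions rather than synthetic arrays as above.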

2. Sample Size Used for the Test Set and Data Provenance

  • Quantitative Post Processing Test Set:
    • Sample Size: 15 fully-sampled datasets.
    • Data Provenance: Retrospective, acquired at GE HealthCare in Waukesha, USA, from 1.5T, 3.0T, and 7.0T scanners.
  • Clinical Evaluation Studies (Likert-score based):
    • Study 1 (Brain, Spine, Extremities):
      • Number of image series evaluated: 120 de-identified cases.
      • Number of unique subjects: 54 subjects (48 patients, 6 healthy volunteers).
      • Age range: 11-80 years.
      • Gender: 26 Male, 28 Female.
      • Pathology: Mixture of small, large, focal, diffuse, hyper- and hypo-intense lesions.
      • Contrast: Used in a subset as clinically indicated.
      • Data Provenance: Retrospective and prospective (implied by "obtained from clinical sites and from healthy volunteers scanned at GE HealthCare facilities"). Data collected from 7 sites (4 in United States, 3 outside of United States).
    • Study 2 (Brain):
      • Number of additional cases: 120 cases.
      • Source data: the 120 cases were derived from 30 fully-sampled acquisitions (the number of unique subjects is not explicitly stated).
      • Data Provenance: Retrospective, collected internally at GE HealthCare, 1.5T, 3.0T, and 7.0T field strengths.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

  • Quantitative Post Processing: The "ground truth" here is the fully sampled data and the quantitative measurements derived from it. No human experts are explicitly mentioned for establishing this computational 'ground truth'.
  • Clinical Evaluation Studies:
    • Study 1: 3 radiologists. Their specific qualifications (e.g., years of experience, subspecialty) are not provided in the document.
    • Study 2: 3 radiologists. Their specific qualifications are not provided in the document.

4. Adjudication Method for the Test Set

The document does not explicitly state a formal adjudication method (like 2+1 or 3+1). For the clinical evaluation studies, it mentions that "three radiologists were asked to evaluate the diagnostic quality of images" and "radiologists were also asked to comment on the presence of any pathology." This suggests individual assessments were either aggregated, or findings were considered concordant if a majority agreed, but a specific arbitration or adjudication process for disagreements is not detailed.


5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study

  • Was an MRMC study done? Yes, the two Likert-score based clinical studies involved multiple readers (3 radiologists) evaluating multiple cases (120 de-identified cases in Study 1 and 120 additional cases in Study 2) for comparative effectiveness against ARC + HyperSense.
  • Effect size of human-reader improvement with vs. without AI assistance: Not quantified. The document states that "Sonic DL 3D images are of diagnostic quality while yielding a substantial reduction in the scan time compared to ARC + HyperSense images" and that "pathology seen in the ARC + HyperSense images can be accurately retained in Sonic DL 3D images." The studies were designed to demonstrate comparable (non-inferior) diagnostic quality despite acceleration, not a gain in reader diagnostic accuracy or confidence; the claimed benefit is maintained diagnostic quality with faster acquisition.
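A "retains diagnostic quality" reading of multi-reader Likert data is typically supported by comparing per-case aggregated scores against a comparator arm within a non-inferiority margin. The sketch below is purely illustrative: the 1–5 scale, the "diagnostic" cutoff of 3, the margin, and the simulated scores are all assumptions, since the 510(k) summary discloses none of these analysis details.

```python
# Hedged sketch: summarizing 3 readers x 120 cases of Likert scores
# (scale, cutoff, margin, and data are assumptions for illustration).
import numpy as np

rng = np.random.default_rng(42)
n_cases, n_readers = 120, 3                  # matches the study sizes reported
sonic_dl = rng.integers(3, 6, size=(n_cases, n_readers))    # hypothetical 3-5 scores
comparator = rng.integers(3, 6, size=(n_cases, n_readers))  # ARC + HyperSense arm

def diagnostic_rate(scores: np.ndarray, cutoff: int = 3) -> float:
    """Fraction of cases whose mean reader score meets the cutoff."""
    return float((scores.mean(axis=1) >= cutoff).mean())

# Non-inferiority-style readout: Sonic DL's diagnostic-quality rate
# should not fall below the comparator's by more than a chosen margin.
margin = 0.05
print(diagnostic_rate(sonic_dl) >= diagnostic_rate(comparator) - margin)
```

An actual MRMC effectiveness analysis would additionally model reader and case variability (e.g., Obuchowski-Rockette methods), which this document does not report.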

6. Standalone (Algorithm Only) Performance Study

  • Was a standalone study done? Yes, extensive non-clinical testing was performed as a standalone assessment of the algorithmic performance. This included evaluations using:
    • Digital reference objects (DROs) and MR scans of physical ACR phantoms to measure PSNR, RMSE, SSIM, resolution, and low contrast detectability.
    • A task-based study using a convolutional neural network ideal observer (CNN-IO) to quantify low contrast detectability.
    • Reconstruction of in vivo datasets with unseen data inserted to assess model stability and hallucination risk.
      These studies directly evaluated the algorithm's output metrics and behavior independently of human interpretation in a clinical workflow, making them standalone performance assessments.

7. Type of Ground Truth Used

  • Non-Clinical Testing:
    • Quantitative Metrics (PSNR, RMSE, SSIM, Resolution, Contrast Detectability): Fully sampled data was used as the reference "ground truth" against which the under-sampled and reconstructed Sonic DL 3D images were compared.
    • Model Stability (Hallucination): The "ground truth" was the original in vivo datasets before inserting previously unseen data, allowing for evaluation of whether the algorithm introduced artifacts or hallucinations.
  • Quantitative Post Processing (Clinical Testing):
    • Fully sampled data sets were used as the reference for comparison of volumetric measurements with Sonic DL 3D and ARC + HyperSense images.
  • Clinical Evaluation Studies (Likert-score based):
    • The implied "ground truth" was the diagnostic quality and presence/absence of pathology as assessed by the conventional ARC + HyperSense images, which were considered the clinical standard for comparison. The radiologists were essentially comparing Sonic DL images against the standard of care images without a separate, absolute ground truth like pathology for every lesion.
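The quantitative post-processing comparison above rests on two agreement statistics: relative mean absolute error (reported with a 95% CI) and an intraclass correlation coefficient. A generic sketch of both follows; the simulated volumes are illustrative, the bootstrap CI is one common way to obtain the interval, and ICC(2,1) is one common ICC variant (the summary does not state which form or CI method was used).

```python
# Relative MAE with a percentile-bootstrap 95% CI, and ICC(2,1),
# computed on simulated volumetric measurements (illustrative data).
import numpy as np

def relative_mae_ci(ref, test, n_boot=2000, seed=0):
    """Mean |test - ref| / ref, with a percentile bootstrap 95% CI."""
    rel_err = np.abs(test - ref) / ref
    rng = np.random.default_rng(seed)
    boots = [rel_err[rng.integers(0, len(rel_err), len(rel_err))].mean()
             for _ in range(n_boot)]
    return rel_err.mean(), np.percentile(boots, [2.5, 97.5])

def icc_2_1(data: np.ndarray) -> float:
    """Two-way random-effects, single-measure ICC(2,1) for an
    (n_subjects, k_methods) array of measurements."""
    n, k = data.shape
    grand = data.mean()
    msr = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # rows (subjects)
    msc = n * ((data.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # columns (methods)
    sse = ((data - grand) ** 2).sum() - msr * (n - 1) - msc * (k - 1)
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# 15 subjects, mirroring the 15 fully-sampled test datasets: reference
# volumes vs. hypothetical accelerated-reconstruction volumes (~1% noise).
rng = np.random.default_rng(7)
ref = rng.uniform(500.0, 1500.0, 15)           # e.g., regional volumes in mm^3
test = ref * (1 + rng.normal(0.0, 0.01, 15))
mae, (lo, hi) = relative_mae_ci(ref, test)
print(hi < 0.05, icc_2_1(np.column_stack([ref, test])) > 0.75)
```

Under this simulation both reported acceptance conditions (relative MAE 95% CI below 5%, ICC above 0.75) are satisfied, mirroring the comparison structure described in the summary.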

8. Sample Size for the Training Set

The document does not specify the sample size used for training the Sonic DL 3D deep learning model. It only mentions that Sonic DL is a "Deep Learning based reconstruction technique" and includes a "deep learning convolutional neural network."


9. How the Ground Truth for the Training Set Was Established

The document does not describe how ground truth for the training set was established. For supervised deep-learning reconstruction models such as Sonic DL, standard practice is to train on pairs of under-sampled inputs and corresponding fully sampled (or otherwise high-quality, e.g., conventionally reconstructed) images, with the high-quality image serving as the target the network learns to reconstruct. However, the specifics of this process (data types, annotation, expert involvement) are not mentioned in this document.
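That standard supervised setup, which is an assumption here since the document gives no training details, would pair a retrospectively under-sampled input with the fully sampled image as the target. A minimal NumPy sketch of constructing one such pair:

```python
# Build one (input, target) training pair by retrospectively
# under-sampling the k-space of a fully sampled image. This is a
# generic illustration, not Sonic DL's actual training pipeline.
import numpy as np

def make_training_pair(image: np.ndarray, acceleration: int, seed: int = 0):
    """Under-sample the image's k-space along one phase-encoding axis,
    keeping roughly 1/acceleration of the lines at random."""
    rng = np.random.default_rng(seed)
    kspace = np.fft.fftshift(np.fft.fft2(image))
    mask = rng.random(image.shape[0]) < 1.0 / acceleration
    mask[image.shape[0] // 2] = True           # always keep the central (DC) line
    undersampled_k = kspace * mask[:, None]
    zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(undersampled_k)))
    return zero_filled, image                   # (network input, target)

# Example: a 12x-accelerated pair, matching the device's maximum stated
# acceleration factor, from a synthetic "fully sampled" image.
full = np.random.default_rng(1).random((128, 128))
x, y = make_training_pair(full, acceleration=12)
print(x.shape == y.shape)
```

In a real pipeline the under-sampling pattern would match the product's acquisition scheme (e.g., 3D Cartesian with coil sensitivity information), details this document does not disclose.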

§ 892.1000 Magnetic resonance diagnostic device.

(a) Identification. A magnetic resonance diagnostic device is intended for general diagnostic use to present images which reflect the spatial distribution and/or magnetic resonance spectra which reflect frequency and distribution of nuclei exhibiting nuclear magnetic resonance. Other physical parameters derived from the images and/or spectra may also be produced. The device includes hydrogen-1 (proton) imaging, sodium-23 imaging, hydrogen-1 spectroscopy, phosphorus-31 spectroscopy, and chemical shift imaging (preserving simultaneous frequency and spatial information).

(b) Classification. Class II (special controls). A magnetic resonance imaging disposable kit intended for use with a magnetic resonance diagnostic device only is exempt from the premarket notification procedures in subpart E of part 807 of this chapter subject to the limitations in § 892.9.