Search Results
Found 2 results
510(k) Data Aggregation
Sonic DL (190 days)
Sonic DL is a Deep Learning based reconstruction technique that is available for use on GE HealthCare 1.5T, 3.0T, and 7.0T MR systems. Sonic DL reconstructs MR images from highly under-sampled data, and thereby enables highly accelerated acquisitions. Sonic DL is intended for imaging patients of all ages. Sonic DL is not limited by anatomy and can be used for 2D cardiac cine imaging and 3D Cartesian imaging using fast spin echo and gradient echo sequences. Depending on the region of interest, contrast agents may be used.
Sonic DL is a software feature intended for use with GE HealthCare MR systems. It includes a deep learning based reconstruction algorithm that enables highly accelerated acquisitions by reconstructing MR images from highly under-sampled data. Sonic DL is an optional feature that is integrated into the MR system software and activated through purchasable software option keys.
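To make the idea of "highly under-sampled data" concrete: at an acceleration factor R, only about 1/R of the phase-encode lines are acquired. Below is a minimal, hypothetical sketch of a Cartesian undersampling mask with a fully sampled center block; the actual sampling patterns used by Sonic DL are not described in the clearance letter, so the function name, center width, and random density here are illustrative only.

```python
import numpy as np

def cartesian_undersampling_mask(n_lines: int, acceleration: float,
                                 n_center: int = 16, seed: int = 0) -> np.ndarray:
    """Boolean mask over phase-encode lines: True = line acquired.

    Keeps a fully sampled center block (for calibration/contrast) and
    draws the remaining lines at random so that roughly
    n_lines / acceleration lines are acquired in total.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_lines, dtype=bool)
    mask[(n_lines - n_center) // 2:(n_lines + n_center) // 2] = True
    n_target = max(int(round(n_lines / acceleration)), n_center)
    candidates = np.flatnonzero(~mask)
    extra = rng.choice(candidates, size=n_target - n_center, replace=False)
    mask[extra] = True
    return mask

mask = cartesian_undersampling_mask(n_lines=256, acceleration=12)
print(f"acquired {mask.sum()}/256 lines -> effective R = {256 / mask.sum():.1f}")
```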
Here's a breakdown of the acceptance criteria and the study details for Sonic DL, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Device Performance for Sonic DL
1. Table of Acceptance Criteria and Reported Device Performance
The document does not list explicit quantitative acceptance criteria with pass/fail results for every aspect of performance. Instead, it reports a mix of quantitative metrics and qualitative assessments that together demonstrate acceptable performance relative to existing methods and the device's stated claims. The table below summarizes the implied acceptance thresholds and the corresponding reported findings.
| Metric/Criterion | Acceptance Criteria (Implied/Comparative) | Reported Device Performance (Sonic DL) |
|---|---|---|
| **Non-Clinical Testing (Sonic DL 3D)** | | |
| Peak Signal-to-Noise Ratio (PSNR) | Equal to or above 30 dB | Equal to or above 30 dB at all acceleration factors (up to 12) |
| Structural Similarity Index Measure (SSIM) | Equal to or above 0.8 | Equal to or above 0.8 at all acceleration factors (up to 12) |
| Resolution | Preservation of resolution grid structure and resolution | Preserved resolution grid structure and resolution |
| Medium/High Contrast Detectability | Retained compared to conventional methods | Retained at all accelerations; comparable to or better than conventional methods |
| Low Contrast Detectability | Non-inferior to more modestly accelerated conventional reconstruction methods at recommended acceleration rates | Maintained at lower acceleration factors; non-inferior at recommended rates (e.g., 8x Sonic DL 3D ~ 4x parallel imaging; 12x Sonic DL 3D ~ 8x parallel imaging) |
| Model Stability (Hallucination) | Low risk of hallucination; dataset integrity preserved | Low risk of hallucination; dataset integrity preserved across all cases |
| **Clinical Testing (Sonic DL 3D): Quantitative Post Processing** | | |
| Volumetric Measurements (Brain Tissues), Relative MAE 95% CI | Less than 5% for most regions (brain tissues) | Less than 5% for most regions |
| Volumetric Measurements (HOS), Relative MAE 95% CI | Less than 3% for Hippocampal Occupancy Score (HOS) | Less than 3% for HOS |
| Intra-class Correlation Coefficient (ICC) | Above 0.75 across all comparisons | Exceeded 0.75 across all comparisons |
| **Clinical Testing (Sonic DL 3D): Clinical Evaluation Studies (Likert score)** | | |
| Diagnostic Quality | Images are of diagnostic quality | Sonic DL 3D images are of diagnostic quality (across all anatomies, field strengths, and acceleration factors investigated) |
| Pathology Retention | Pathology seen in comparator images can be accurately retained | Pathology seen in ARC + HyperSense images can be accurately retained |
| Decline with Acceleration | Retain diagnostic quality overall despite decline | Scores gradually declined with increasing acceleration factors yet retained diagnostic quality overall |
| **Clinical Claims** | | |
| Scan Time Reduction | Substantial reduction in scan time | Yields substantial reduction in scan time |
| Diagnostic Image Quality | Preservation of diagnostic image quality | Preserves diagnostic image quality |
| Acceleration Factors | Up to 12x | Provides acceleration factors up to 12 |
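For reference, the PSNR and SSIM thresholds in the table can be checked with standard image-quality tooling. The sketch below uses scikit-image and assumes magnitude images compared against the fully sampled reference; the letter does not specify the exact implementation, so treat this as illustrative.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def reference_metrics(reference: np.ndarray, reconstruction: np.ndarray) -> dict:
    """PSNR/SSIM of a reconstruction against the fully sampled reference.

    Magnitude images are assumed; data_range is taken from the reference
    so the thresholds (PSNR >= 30 dB, SSIM >= 0.8) are comparable across scans.
    """
    data_range = float(reference.max() - reference.min())
    return {
        "psnr_db": peak_signal_noise_ratio(reference, reconstruction,
                                           data_range=data_range),
        "ssim": structural_similarity(reference, reconstruction,
                                      data_range=data_range),
    }

# Toy usage: a lightly perturbed copy of a synthetic "fully sampled" image.
rng = np.random.default_rng(0)
ref = rng.random((128, 128))
rec = ref + 0.01 * rng.standard_normal(ref.shape)
m = reference_metrics(ref, rec)
print(f"PSNR = {m['psnr_db']:.1f} dB, SSIM = {m['ssim']:.3f}, "
      f"pass = {m['psnr_db'] >= 30 and m['ssim'] >= 0.8}")
```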
2. Sample Size Used for the Test Set and Data Provenance
- Quantitative Post Processing Test Set:
  - Sample Size: 15 fully-sampled datasets.
  - Data Provenance: Retrospective, acquired at GE HealthCare in Waukesha, USA, from 1.5T, 3.0T, and 7.0T scanners.
- Clinical Evaluation Studies (Likert-score based):
  - Study 1 (Brain, Spine, Extremities):
    - Number of image series evaluated: 120 de-identified cases.
    - Number of unique subjects: 54 subjects (48 patients, 6 healthy volunteers).
    - Age range: 11-80 years.
    - Gender: 26 male, 28 female.
    - Pathology: Mixture of small, large, focal, diffuse, hyper- and hypo-intense lesions.
    - Contrast: Used in a subset as clinically indicated.
    - Data Provenance: Retrospective and prospective (implied by "obtained from clinical sites and from healthy volunteers scanned at GE HealthCare facilities"). Data collected from 7 sites (4 in the United States, 3 outside the United States).
  - Study 2 (Brain):
    - Number of additional cases: 120 cases.
    - Number of unique subjects: From 30 fully-sampled acquisitions.
    - Data Provenance: Retrospective, collected internally at GE HealthCare, at 1.5T, 3.0T, and 7.0T field strengths.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- Quantitative Post Processing: The "ground truth" here is the fully sampled data and the quantitative measurements derived from it. No human experts are explicitly mentioned for establishing this computational 'ground truth'.
- Clinical Evaluation Studies:
  - Study 1: 3 radiologists. Their specific qualifications (e.g., years of experience, subspecialty) are not provided in the document.
  - Study 2: 3 radiologists. Their specific qualifications are not provided in the document.
4. Adjudication Method for the Test Set
The document does not explicitly state a formal adjudication method (such as 2+1 or 3+1). For the clinical evaluation studies, it mentions that "three radiologists were asked to evaluate the diagnostic quality of images" and that "radiologists were also asked to comment on the presence of any pathology." This suggests the individual assessments were aggregated, but no arbitration or adjudication process for disagreements is described.
5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? Yes, the two Likert-score based clinical studies involved multiple readers (3 radiologists) evaluating multiple cases (120 de-identified cases in Study 1 and 120 additional cases in Study 2) for comparative effectiveness against ARC + HyperSense.
- Effect size of human reader improvement with AI vs. without AI assistance: The document states that "Sonic DL 3D images are of diagnostic quality while yielding a substantial reduction in the scan time compared to ARC + HyperSense images" and that "pathology seen in the ARC + HyperSense images can be accurately retained in Sonic DL 3D images." However, it does not quantify how much human readers improve with Sonic DL versus without it. The studies were designed to demonstrate non-inferior or comparable diagnostic quality despite acceleration; the stated benefit is maintained diagnostic quality with faster acquisition, not a gain in reader accuracy or confidence.
6. Standalone (Algorithm Only) Performance Study
- Was a standalone study done? Yes, extensive non-clinical testing was performed as a standalone assessment of the algorithmic performance. This included evaluations using:
  - Digital reference objects (DROs) and MR scans of physical ACR phantoms to measure PSNR, RMSE, SSIM, resolution, and low contrast detectability.
  - A task-based study using a convolutional neural network ideal observer (CNN-IO) to quantify low contrast detectability.
  - Reconstruction of in vivo datasets with unseen data inserted to assess model stability and hallucination risk.
These studies directly evaluated the algorithm's output metrics and behavior independently of human interpretation in a clinical workflow, making them standalone performance assessments.
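The letter does not detail how the hallucination assessment was performed. A schematic version of the described idea (insert previously unseen data, reconstruct, and verify the model changes nothing outside the inserted region) might look like the following, where `reconstruct` is a hypothetical stand-in for the proprietary model and the tolerance is an arbitrary choice:

```python
import numpy as np

def stability_check(image: np.ndarray, reconstruct, mask: np.ndarray,
                    feature: np.ndarray, region: tuple[slice, slice],
                    tol: float = 0.05) -> bool:
    """Schematic hallucination probe.

    `reconstruct(kspace, mask)` stands in for the deep-learning model, and
    `mask` is a 2D k-space sampling mask the same shape as the image.
    We insert an unseen feature, undersample both versions retrospectively,
    reconstruct, and require that the two reconstructions differ only
    inside the inserted region (difference elsewhere below `tol`).
    """
    perturbed = image.copy()
    perturbed[region] += feature

    def recon(img):
        kspace = np.fft.fft2(img) * mask  # retrospective undersampling
        return reconstruct(kspace, mask)

    diff = np.abs(recon(perturbed) - recon(image))
    outside = np.ones_like(image, dtype=bool)
    outside[region] = False
    scale = np.abs(image).max()
    return float(diff[outside].max()) <= tol * scale
```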
7. Type of Ground Truth Used
- Non-Clinical Testing:
  - Quantitative Metrics (PSNR, RMSE, SSIM, Resolution, Contrast Detectability): Fully sampled data was used as the reference "ground truth" against which the under-sampled and reconstructed Sonic DL 3D images were compared.
  - Model Stability (Hallucination): The "ground truth" was the original in vivo datasets before inserting previously unseen data, allowing for evaluation of whether the algorithm introduced artifacts or hallucinations.
- Quantitative Post Processing (Clinical Testing):
  - Fully sampled data sets were used as the reference for comparison of volumetric measurements from Sonic DL 3D and ARC + HyperSense images (a computational sketch of the reported agreement metrics follows this list).
- Clinical Evaluation Studies (Likert-score based):
  - The implied "ground truth" was the diagnostic quality and presence/absence of pathology as assessed on the conventional ARC + HyperSense images, which were treated as the clinical standard for comparison. The radiologists were essentially comparing Sonic DL images against standard-of-care images, without a separate, absolute ground truth such as pathology for every lesion.
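The letter reports relative MAE 95% confidence intervals and ICCs for the volumetric comparisons but not how they were computed. A plausible sketch, assuming a case-level bootstrap for the CI and the Shrout-Fleiss ICC(2,1) (two-way random effects, absolute agreement, single measurement):

```python
import numpy as np

def relative_mae_ci(reference: np.ndarray, test: np.ndarray,
                    n_boot: int = 10_000, seed: int = 0):
    """Relative MAE (%) of volumetric measurements, with bootstrap 95% CI.

    `reference` holds volumes from fully sampled reconstructions, `test`
    the matched volumes from the reconstruction under evaluation.
    """
    rel_err = np.abs(test - reference) / reference * 100.0
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(rel_err), size=(n_boot, len(rel_err)))
    boot = rel_err[idx].mean(axis=1)
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return rel_err.mean(), lo, hi

def icc_2_1(Y: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    Y has shape (n_subjects, k_methods_or_raters).
    """
    n, k = Y.shape
    grand = Y.mean()
    ms_r = k * ((Y.mean(axis=1) - grand) ** 2).sum() / (n - 1)
    ms_c = n * ((Y.mean(axis=0) - grand) ** 2).sum() / (k - 1)
    ss_e = ((Y - Y.mean(axis=1, keepdims=True)
               - Y.mean(axis=0, keepdims=True) + grand) ** 2).sum()
    ms_e = ss_e / ((n - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)
```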
8. Sample Size for the Training Set
The document does not specify the sample size used for training the Sonic DL 3D deep learning model. It only mentions that Sonic DL is a "Deep Learning based reconstruction technique" and includes a "deep learning convolutional neural network."
9. How the Ground Truth for the Training Set Was Established
The document does not describe how the ground truth for the training set was established. It is standard practice for supervised deep learning models like Sonic DL to be trained on pairs of under-sampled and corresponding fully-sampled or high-quality (e.g., conventionally reconstructed) images, where the high-quality image serves as the 'ground truth' for the network to learn to reconstruct from the under-sampled data. However, the specifics of this process (e.g., data types, annotation, expert involvement) are not mentioned in this document.
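To illustrate that standard practice (not Sonic DL's actual, undisclosed training pipeline), a supervised training pair can be built by retrospectively undersampling fully sampled k-space, with the fully sampled image serving as the target:

```python
import numpy as np

def make_training_pair(fully_sampled_kspace: np.ndarray,
                       mask: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """One (input, target) pair for supervised reconstruction training.

    The network input is the zero-filled reconstruction of retrospectively
    undersampled k-space; the target is the fully sampled image, which
    plays the role of the training "ground truth".
    """
    target = np.abs(np.fft.ifft2(fully_sampled_kspace))
    zero_filled = np.abs(np.fft.ifft2(fully_sampled_kspace * mask))
    return zero_filled, target
```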
Sonic DL (188 days)
Sonic DL is a Deep Learning based image reconstruction technique that is available for use on GE Healthcare 1.5T and 3.0T MR systems. Sonic DL reconstructs MR images from highly under-sampled data, and thereby enables highly accelerated acquisitions. Sonic DL is intended for cardiac imaging, and for patients of all ages.
Sonic DL is a new software feature intended for use with GE Healthcare MR systems. It consists of a deep learning based reconstruction algorithm that is applied to data from MR cardiac cine exams obtained using a highly accelerated acquisition technique.
Sonic DL is an optional feature that is integrated into the MR system software and activated through a purchasable software option key.
Here's a breakdown of the acceptance criteria and the study details for the Sonic DL device, based on the provided document:
Sonic DL Acceptance Criteria and Study Details
1. Table of Acceptance Criteria and Reported Device Performance
The document describes the performance of Sonic DL in comparison to conventional ASSET Cine images. While explicit numerical acceptance criteria for regulatory clearance are not stated, the studies aim to demonstrate non-inferiority or superiority in certain aspects. The implicit acceptance criteria are:
- Diagnostic Quality: Sonic DL images must be rated as being of diagnostic quality.
- Functional Measurement Agreement: Functional cardiac measurements (e.g., LV volumes, EF, CO) from Sonic DL images must agree closely with those from conventional ASSET Cine images, ideally within typical inter-reader variability.
- Reduced Scan Time: Sonic DL must provide significantly shorter scan times.
- Preserved Image Quality: Image quality must be preserved despite higher acceleration.
- Single Heartbeat Imaging (Functional): Enable functional imaging in a single heartbeat.
- Rapid Free-Breathing Functional Imaging: Enable rapid functional imaging without breath-holds.
| Implicit Acceptance Criterion | Reported Device Performance |
|---|---|
| Diagnostic Quality | "on average the Sonic DL images were rated as being of diagnostic quality" (second reader study) |
| Functional Measurement Agreement | "the inter-method variability (coefficient of variability comparing functional measurements taken with Sonic DL images versus measurements using the conventional ASSET Cine images) was smaller than the inter-observer intra-method variability for the conventional ASSET Cine images for all parameters, indicating that Sonic DL is suitable for performing functional cardiac measurements" (first reader study). "Functional measurements using Sonic DL 1 R-R free breathing images from 10 subjects were compared to functional measurements using the conventional ASSET Cine breath hold images, and showed close agreement" (additional clinical testing for 1 R-R free breathing). |
| Reduced Scan Time | "providing a significant reduction in scan time compared to the conventional ASSET Cine images" (second reader study). "the Sonic DL feature provided significantly shorter scan times than the conventional Cine imaging" (overall conclusion). |
| Preserved Image Quality | "capable of reconstructing Cine images from highly under sampled data that are similar to the fully sampled Cine images in terms of image quality and temporal sharpness" (nonclinical testing). "the image quality of 13 Sonic DL 1 R-R free breathing cases was evaluated by a U.S. board certified radiologist, and scored higher than the corresponding conventional free breathing Cine images from the same subjects" (additional clinical testing for 1 R-R free breathing). |
| Single Heartbeat Functional Imaging | "Sonic DL is capable of achieving a 12 times acceleration factor and obtaining free-breathing images in a single heartbeat (1 R-R)" (additional clinical testing) |
| Rapid Free-Breathing Functional Imaging | "Sonic DL is capable of... obtaining free-breathing images in a single heartbeat (1 R-R)" (additional clinical testing) |
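The "coefficient of variability" used in the functional-measurement criterion is not defined in the letter. One common convention, sketched below with hypothetical variable names, estimates the within-subject SD from paired differences; the study's implicit pass condition is that the inter-method CoV be smaller than the inter-observer intra-method CoV.

```python
import numpy as np

def cov_percent(a: np.ndarray, b: np.ndarray) -> float:
    """Coefficient of variability (%) between two paired measurement sets.

    One common convention: the within-subject SD of paired measurements,
    estimated as SD(differences) / sqrt(2), divided by the grand mean.
    """
    within_sd = np.std(a - b, ddof=1) / np.sqrt(2)
    return 100.0 * within_sd / np.mean(np.concatenate([a, b]))

# Hypothetical usage (arrays of per-subject ejection fractions):
#   inter_method   = cov_percent(ef_sonic, ef_asset)        # Sonic DL vs ASSET
#   inter_observer = cov_percent(ef_asset_r1, ef_asset_r2)  # two readers, ASSET
#   passes = inter_method < inter_observer
```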
2. Sample Size Used for the Test Set and Data Provenance
The document describes two primary reader evaluation studies and additional clinical testing.
- First Reader Study (Functional Measurements):
  - Sample Size: 107 image series from 57 unique subjects (46 patients, 11 healthy volunteers).
  - Data Provenance: Data from 7 sites: 2 GE Healthcare facilities and 5 external clinical collaborators, indicating a multi-center study with a likely mix of prospective and retrospective collection. The geographic origin of the sites is not stated.
- Second Reader Study (Image Quality Assessment):
  - Sample Size: 127 image sets, which included a subset of the subjects from the first study.
  - Data Provenance: Same as the first reader study (clinical sites and healthy volunteers at GE Healthcare facilities).
- Additional Clinical Testing (1 R-R Free Breathing):
  - Functional Measurements: 10 subjects.
  - Image Quality Evaluation: 13 subjects.
  - Data Provenance: In vivo cardiac cine images from 19 healthy volunteers, implying prospective collection.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- First Reader Study (Functional Measurements): Three radiologists. Qualifications are not explicitly stated, but their role in making quantitative measurements implies expertise in cardiac MRI.
- Second Reader Study (Image Quality Assessment): Three radiologists. Qualifications are not explicitly stated, but their role in blinded image quality assessments implies expertise in cardiac MRI interpretation.
- Additional Clinical Testing (1 R-R Free Breathing Image Quality): One U.S. board certified radiologist.
4. Adjudication Method for the Test Set
The document does not explicitly state an adjudication method (like 2+1, 3+1, or none) for either the functional measurements or the image quality assessments. For the first study, it mentions "inter-method variability" and "inter-observer intra-method variability," suggesting that the readings from the three radiologists were compared against each other and against the conventional method, but not necessarily adjudicated to establish a single "ground truth" per case. For the second study, "blinded image quality assessments" were performed, and ratings were averaged, but no adjudication process is described.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done, and the effect size
A clear MRMC comparative effectiveness study, in the sense of measuring human reader improvement with AI vs. without AI assistance, is not explicitly described.
The studies compare the performance of Sonic DL images (algorithm output) against conventional images, with human readers evaluating both.
- The first reader study compares quantitative measurements from Sonic DL images to conventional images, indicating suitability for performing functional cardiac measurements by showing smaller inter-method variability than inter-observer intra-method variability for conventional images. This suggests Sonic DL is at least as reliable as the variability between conventional human measurements.
- The second reader study involves blinded image quality assessments of both conventional and Sonic DL images, confirming that Sonic DL images were rated as diagnostic quality.
- The additional clinical testing for 1 R-R free breathing shows that Sonic DL images were "scored higher than the corresponding conventional free breathing Cine images" by a U.S. board-certified radiologist.
These are comparisons of the image quality and output from the AI system versus conventional imaging, interpreted by readers, rather than measuring human reader performance assisted by the AI system.
Therefore, no effect size for reader improvement with versus without AI assistance is provided: the studies evaluated the image quality and measurement agreement of the AI-reconstructed images themselves, not an AI-assisted reading workflow.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was done
Yes, standalone performance was assessed for image quality metrics.
- Nonclinical Testing: "Model accuracy metrics such as Peak-Signal-to-Noise (PSNR), Root-Mean-Square Error (RMSE), Structural Similarity Index Measure (SSIM), and Mean Absolute Error (MAE) were used to compare simulated Sonic DL images with different levels of acceleration and numbers of phases to the fully sampled images." This is a standalone evaluation of the algorithm's output quality against a reference.
- In Vivo Testing: "model accuracy and temporal sharpness evaluations were conducted using in vivo cardiac cine images obtained from 19 healthy volunteers." This is also a standalone technical evaluation of the algorithm's output on real data.
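The PSNR/SSIM sketch shown for the 3D clearance above applies here as well; RMSE and MAE, the other two metrics named, can be computed per cardiac phase of a cine series against the fully sampled reference. A minimal sketch (the array layout is an assumption, not specified in the letter):

```python
import numpy as np

def rmse_mae_per_phase(reference: np.ndarray, test: np.ndarray):
    """RMSE and MAE for each cardiac phase of a cine series.

    Arrays are shaped (n_phases, ny, nx); metrics are computed per phase
    against the fully sampled reference.
    """
    err = test - reference
    rmse = np.sqrt((err ** 2).mean(axis=(1, 2)))
    mae = np.abs(err).mean(axis=(1, 2))
    return rmse, mae
```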
7. The Type of Ground Truth Used
- Nonclinical Testing (Simulated Data): The ground truth was the "fully sampled images" generated from an MRXCAT phantom and a digital phantom.
- Clinical Testing (Reader Studies):
  - Functional Measurements: The "ground truth" for comparison was the measurements taken from the "conventional ASSET Cine images." The variability of these conventional measurements across readers also served as a baseline for comparison. This is a form of clinical surrogate ground truth (comparing to an established accepted method).
  - Image Quality Assessments: The "ground truth" was the expert consensus/opinion of the radiologists during their blinded assessments of diagnostic quality.
- Additional Clinical Testing (1 R-R Free Breathing): Functional measurements were compared to "conventional ASSET Cine breath hold images" (clinical surrogate ground truth). Image quality was based on the scoring by a "U.S. board certified radiologist" (expert opinion).
No pathology or outcomes data were used as ground truth. The ground truth in the clinical setting was primarily based on established imaging techniques (conventional MR) and expert radiologist assessments.
8. The Sample Size for the Training Set
The document does not explicitly state the sample size for the training set used for the deep learning model. It only describes the data used for testing the device.
9. How the Ground Truth for the Training Set Was Established
Since the training set size is not provided, the method for establishing its ground truth is also not described in the provided text. Typically, for deep learning reconstruction, the "ground truth" for training often involves fully sampled or high-quality reference images corresponding to the undersampled input data.