K Number
K193170
Date Cleared
2019-12-13 (28 days)
Product Code
Regulation Number
892.1750
Panel
RA
Reference & Predicate Devices
Intended Use

The Deep Learning Image Reconstruction software is a deep learning based reconstruction method intended to produce cross-sectional images of the head and whole body by computer reconstruction of X-ray transmission data taken at different angles and planes, including Axial, Helical (Volumetric), and Cardiac acquisitions, for all ages. Deep Learning Image Reconstruction software can be used for head, whole body, cardiac, and vascular CT applications.

Device Description

Deep Learning Image Reconstruction is an image reconstruction method that uses a dedicated Deep Neural Network (DNN) designed and trained specifically to generate CT images whose appearance, as shown on axial NPS plots, is similar to traditional FBP images, while maintaining the performance of ASiR-V in the following areas: image noise (pixel standard deviation), low-contrast detectability, high-contrast spatial resolution, and streak artifact suppression.
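
A key claim here is that DLIR's noise texture, measured via the noise power spectrum (NPS), resembles that of FBP. As a rough illustration only (this is not GE's documented test procedure; the ROI handling, normalization, and pixel spacing below are assumptions), an axial 2D NPS and its radial profile could be estimated from uniform-phantom noise ROIs roughly as follows:

```python
import numpy as np

def axial_nps(noise_rois, pixel_spacing_mm):
    """Estimate a 2D noise power spectrum from mean-subtracted uniform-phantom ROIs.

    noise_rois: array of shape (n_rois, N, N); each ROI is taken from a uniform
    region of a phantom scan (ideally the difference of two repeated scans so
    structured background cancels out).
    """
    n_rois, n, _ = noise_rois.shape
    nps = np.zeros((n, n))
    for roi in noise_rois:
        roi = roi - roi.mean()                            # remove the DC component
        nps += np.abs(np.fft.fftshift(np.fft.fft2(roi))) ** 2
    # Normalization: NPS(fx, fy) = (dx * dy / (Nx * Ny)) * <|DFT{ROI}|^2>
    return nps * (pixel_spacing_mm ** 2) / (n * n * n_rois)

def radial_nps_profile(nps_2d, pixel_spacing_mm, n_bins=50):
    """Collapse a 2D NPS into a 1D radial profile (spatial frequency vs. noise power)."""
    n = nps_2d.shape[0]
    freqs = np.fft.fftshift(np.fft.fftfreq(n, d=pixel_spacing_mm))
    fx, fy = np.meshgrid(freqs, freqs)
    fr = np.hypot(fx, fy).ravel()
    power = nps_2d.ravel()
    bins = np.linspace(0.0, fr.max(), n_bins + 1)
    idx = np.digitize(fr, bins)
    profile = np.array([power[idx == i].mean() if np.any(idx == i) else 0.0
                        for i in range(1, n_bins + 1)])
    return bins[1:], profile

# Usage sketch: compare DLIR, ASiR-V, and FBP texture by overlaying radial profiles
# computed from ROIs of the same uniform phantom region (arrays are hypothetical).
# f, prof_dlir = radial_nps_profile(axial_nps(dlir_rois, 0.5), 0.5)
# f, prof_fbp  = radial_nps_profile(axial_nps(fbp_rois, 0.5), 0.5)
```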

The images produced are branded as "TrueFidelity™ CT Images". Reconstruction times with Deep Learning Image Reconstruction software support a normal throughput for routine CT.

The deep learning technology is integrated into the scanner's existing raw data-based image reconstruction chain to produce DICOM-compatible "TrueFidelity™ CT Images".

The system allows the user to select one of three strengths of Deep Learning Image Reconstruction: Low, Medium, or High. The appropriate strength varies with individual users' preferences and experience and with the specific clinical need.

Deep Learning Image Reconstruction software was initially introduced on the Revolution CT systems (K133705, K163213). The DLIR algorithm has now been ported to Revolution EVO (K131576), which offers 64 detector rows, up to 40 mm collimation, and an ASiR-V reconstruction option.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details based on the provided text:

1. Table of Acceptance Criteria and Reported Device Performance

The document doesn't explicitly state quantitative acceptance criteria as numerical pass/fail thresholds. Instead, it describes performance goals relative to the predicate reconstruction (ASiR-V) or to traditional FBP images, and the reported device performance generally indicates "as good as or better than" the reference (a sketch of this kind of check follows the table).

| Acceptance Criteria (Stated Goal) | Reported Device Performance |
| --- | --- |
| Image appearance (axial NPS plots) | Similar to traditional FBP images |
| Image noise (pixel standard deviation) | As good as or better than ASiR-V |
| Low contrast detectability (LCD) | As good as or better than ASiR-V |
| High-contrast spatial resolution (MTF) | As good as or better than ASiR-V |
| Streak artifact suppression | As good as or better than ASiR-V |
| Image quality preference (reader study) | DLIR images preferred over ASiR-V for image noise texture, image sharpness, and image noise texture homogeneity (implied acceptance criterion: DLIR is preferred) |
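
As a minimal sketch of how the relative "as good as or better than ASiR-V" criteria could be checked once bench metrics are measured (the metric names and numbers below are hypothetical placeholders, not values from the submission):

```python
# Hypothetical bench measurements (illustrative numbers only, not values from the submission).
dlir   = {"noise_sd": 8.1, "lcd": 0.92, "mtf50": 6.8, "streak_index": 0.10}
asir_v = {"noise_sd": 9.0, "lcd": 0.90, "mtf50": 6.7, "streak_index": 0.12}

# "As good as or better than ASiR-V": lower is better for noise and streak artifacts,
# higher is better for low-contrast detectability and spatial resolution.
LOWER_IS_BETTER = {"noise_sd", "streak_index"}

def meets_criterion(metric, test_value, reference_value):
    if metric in LOWER_IS_BETTER:
        return test_value <= reference_value
    return test_value >= reference_value

results = {metric: meets_criterion(metric, dlir[metric], asir_v[metric]) for metric in dlir}
print(results)  # e.g. {'noise_sd': True, 'lcd': True, 'mtf50': True, 'streak_index': True}
```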

2. Sample Size Used for the Test Set and Data Provenance

  • Sample Size: 60 retrospectively collected clinical cases.
  • Data Provenance: Retrospective. The country of origin is not explicitly stated; the submitter is GE Healthcare Japan Corporation, so some or all cases may have originated in Japan or another region where that company operates.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

  • Number of Experts: 7 board-certified radiologists.
  • Qualifications: Board-certified radiologists with expertise in the specialty areas that align with the anatomical region of each case. The document does not specify years of experience.

4. Adjudication Method for the Test Set

  • Adjudication Method: Each image was read by 3 different radiologists who provided independent assessments of image quality. The readers were blinded to the results of other readers' assessments. There is no explicit mention of an adjudication process (e.g., 2+1 or 3+1 decision) for discrepant reader opinions; it appears the individual assessments were analyzed.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and If So, the Effect Size of Human Reader Improvement with vs. Without AI Assistance

  • MRMC Study: Yes, a clinical reader study was performed where 7 radiologists read images reconstructed with both ASiR-V (without DLIR) and DLIR.
  • Effect Size of Human Reader Improvement: Readers were asked to "compare directly the ASIR-V and Deep Learning Image Reconstruction (DLIR) images according to three key metrics of image quality preference – image noise texture, image sharpness, and image noise texture homogeneity." The document reports that the results support substantial equivalence and the performance claims, implying a preference for DLIR images, but it does not quantify how much human readers "improve" with AI assistance in terms of diagnostic accuracy or efficiency; the study focused on radiologists' preference for image quality characteristics (a plausible analysis sketch follows).
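
The submission text does not describe the statistical analysis of the reader study, so the following is only a plausible sketch of how paired 5-point Likert preference data could be summarized; the simulated scores, the reader/case counts, and the use of an exact sign test are assumptions:

```python
import numpy as np
from scipy.stats import binomtest

# Hypothetical paired 5-point Likert scores for one attribute (e.g. image noise texture):
# one pair per read, conceptually (ASiR-V, DLIR). The data are simulated purely to make
# the example runnable.
rng = np.random.default_rng(0)
asirv_scores = rng.integers(2, 5, size=180)                       # e.g. 60 cases x 3 readers
dlir_scores = np.clip(asirv_scores + rng.integers(0, 2, size=180), 1, 5)

diff = dlir_scores - asirv_scores
n_prefer_dlir = int(np.sum(diff > 0))
n_prefer_asirv = int(np.sum(diff < 0))
preference_rate = n_prefer_dlir / diff.size

# Exact sign test on non-tied pairs (one plausible choice; the submission does not state one).
test = binomtest(n_prefer_dlir, n_prefer_dlir + n_prefer_asirv, p=0.5)
print(f"DLIR preferred in {preference_rate:.0%} of paired reads, sign-test p = {test.pvalue:.3g}")
```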

6. If a Standalone (i.e., Algorithm-Only, Without Human-in-the-Loop) Performance Study Was Done

  • Standalone Performance: Yes, extensive non-clinical engineering bench testing was performed in which DLIR and ASiR-V reconstructions were compared using identical raw datasets. This included objective metrics such as Low Contrast Detectability (LCD), Image Noise (pixel standard deviation), High-Contrast Spatial Resolution (MTF), Streak Artifact Suppression, Noise Power Spectrum (NPS), CT Number Accuracy and Uniformity, and Contrast-to-Noise Ratio (CNR). This constitutes a standalone (algorithm-only) performance evaluation (a minimal measurement sketch follows).
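
Two of the listed phantom metrics are easy to state concretely. The following is a minimal sketch, assuming simple square ROIs in a uniform phantom with a low-contrast insert; the ROI placement and sizes are illustrative, not the submission's protocol:

```python
import numpy as np

def roi(image, center, half_size):
    """Extract a square ROI (in pixels) around center = (row, col)."""
    r, c = center
    return image[r - half_size:r + half_size, c - half_size:c + half_size]

def image_noise_sd(image, bg_center, half_size=20):
    """Image noise as the pixel standard deviation in a uniform background ROI (HU)."""
    return float(np.std(roi(image, bg_center, half_size)))

def cnr(image, insert_center, bg_center, half_size=10):
    """Contrast-to-noise ratio between a low-contrast insert and the background."""
    insert = roi(image, insert_center, half_size)
    background = roi(image, bg_center, half_size)
    return float(abs(insert.mean() - background.mean()) / background.std())

# Usage sketch: identical raw data reconstructed two ways (hypothetical image arrays).
# noise_dlir  = image_noise_sd(dlir_image,  bg_center=(256, 256))
# noise_asirv = image_noise_sd(asirv_image, bg_center=(256, 256))
# "As good as or better than ASiR-V" would then mean noise_dlir <= noise_asirv.
```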

7. The Type of Ground Truth Used

  • For the Reader Study (Clinical Performance): The ground truth was board-certified radiologists' assessment of image quality related to diagnostic use on a 5-point Likert scale. This is a form of expert consensus on image quality suitable for diagnosis, rather than a definitive "truth" established by pathology or patient outcomes.
  • For the Bench Testing (Technical Performance): The "ground truth" was the objective measurement of various image quality metrics (e.g., pixel standard deviation for noise, MTF for spatial resolution) in phantoms, which have known properties.

8. The Sample Size for the Training Set

  • The document states that the Deep Neural Network (DNN) used in Deep Learning Image Reconstruction was "trained specifically" but does not disclose the sample size of the training set.

9. How the Ground Truth for the Training Set Was Established

  • The document implies that the DNN was trained to generate CT images with an appearance similar to traditional FBP images while maintaining ASiR-V performance in certain areas. This suggests that existing traditional FBP images, or images reconstructed with ASiR-V, served as a reference or a form of "ground truth" during training. However, the exact methodology for establishing ground truth during the training phase (e.g., using paired low-dose/high-dose images, or simulated noise reduction) is not detailed in the provided text (a purely illustrative sketch follows).
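
Since the training methodology is not disclosed, the snippet below is purely illustrative of the general idea that reference reconstructions (e.g., FBP-like targets) could serve as training targets for a supervised image-domain DNN; nothing here is taken from the submission:

```python
import numpy as np

# Purely illustrative: the submission does not describe the training data or loss.
# The idea sketched here is supervised image-domain training, in which the network's
# output for a given acquisition is compared against a reference reconstruction
# (e.g. a high-quality FBP-like target) of the same acquisition.

def training_loss(dnn_output, reference_image):
    """Hypothetical per-image loss: mean squared error against the reference reconstruction."""
    return float(np.mean((dnn_output - reference_image) ** 2))

# dnn_output      : image produced by the network being trained (hypothetical array)
# reference_image : target image with the desired appearance/noise properties
# loss = training_loss(dnn_output, reference_image)   # minimized over the training set
```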

§ 892.1750 Computed tomography x-ray system.

(a) Identification. A computed tomography x-ray system is a diagnostic x-ray system intended to produce cross-sectional images of the body by computer reconstruction of x-ray transmission data from the same axial plane taken at different angles. This generic type of device may include signal analysis and display equipment, patient and equipment supports, component parts, and accessories.

(b) Classification. Class II.