K Number
K223659
Device Name
Jazz
Manufacturer
Date Cleared
2023-09-22 (290 days)

Product Code
Regulation Number
892.2050
Reference & Predicate Devices
Predicate For
N/A
Intended Use

Jazz is intended for the labeling, visualization and volumetric quantification of segmentable brain structures from a set of MR images, for patients with a known diagnosis of multiple sclerosis (for the multiple sclerosis pipeline) and/or brain metastasis (for the metastasis pipeline), and the production of a radiological report.

Device Description

Jazz is a Software as a Medical Device (SaMD) intended as a facilitating tool for physicians: a semi-automatic pipeline for identifying, labeling, and quantifying the volume of segmentable brain structures on MR images. "Semi-automatic" means the physician can correct the software's segmentation before saving.
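
Since volumetric quantification is the device's core output, here is a minimal sketch of how a structure volume is typically derived from a segmentation label map. The file name, label IDs, and code are illustrative assumptions, not AI Medical's actual pipeline:

```python
# Minimal sketch of volumetric quantification from a segmentation label map.
# File names and label IDs are illustrative, not taken from the Jazz device.
import numpy as np
import nibabel as nib  # common library for reading NIfTI-format MR volumes

def structure_volumes_ml(label_map_path: str, labels: dict[int, str]) -> dict[str, float]:
    """Return the volume in millilitres of each labeled structure.

    `labels` maps integer label values in the segmentation to structure names.
    """
    img = nib.load(label_map_path)
    data = np.asarray(img.dataobj)
    # Physical size of one voxel in mm^3, read from the image header.
    voxel_mm3 = float(np.prod(img.header.get_zooms()[:3]))
    volumes = {}
    for value, name in labels.items():
        n_voxels = int(np.count_nonzero(data == value))
        volumes[name] = n_voxels * voxel_mm3 / 1000.0  # mm^3 -> mL
    return volumes

# Hypothetical usage; label IDs depend entirely on the segmentation scheme:
# structure_volumes_ml("subject01_labels.nii.gz", {1: "lesion", 2: "thalamus"})
```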

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details for the Jazz device, extracted from the provided text:

1. A table of acceptance criteria and the reported device performance

| Acceptance Criteria | Reported Device Performance |
|---|---|
| Accuracy Experiment 1 (Lesion Segmentation): voxel-wise sensitivity of at least 40%; voxel-wise specificity of at least 95%; lesion-wise Dice score of at least 0.5 (for both the multiple sclerosis high-sensitivity and high-specificity models); lesion-wise true positive rate of at least 60% (high-sensitivity model); lesion-wise false negative rate of at most 40% (high-sensitivity model); lesion-wise false discovery rate of at most 50% (high-specificity model) | All experiments passed the acceptance criteria. |
| Accuracy Experiment 2 (Anatomy Localization): anatomy localization score of 1 (best) in at least 80% of cases; anatomy localization score of 6 in fewer than 10% of lesions | All experiments passed the acceptance criteria. |
| Accuracy Experiment 3 (Coregistration): average coregistration quality score greater than 4 (good coregistration); fewer than 10% of cases with a coregistration score of 2 (bad coregistration) or worse | All experiments passed the acceptance criteria. |
| Reproducibility Experiment: identical number of lesions, volumes, and reports generated in a process-reprocess experiment with Jazz | All experiments passed the acceptance criteria. |
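
The summary names these metrics but not their exact definitions. Below is a minimal sketch of common definitions for the segmentation metrics, assuming binary 3D masks; the any-overlap lesion-matching rule is an assumption on my part, not something stated in the submission:

```python
# Sketch of voxel-wise and lesion-wise metrics like those named above.
# The lesion-matching rule (any voxel overlap) is an assumed convention.
import numpy as np
from scipy import ndimage

def voxel_metrics(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """Voxel-wise (sensitivity, specificity) for binary lesion masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.count_nonzero(pred & truth)
    tn = np.count_nonzero(~pred & ~truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    return tp / (tp + fn), tn / (tn + fp)

def lesion_metrics(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """Lesion-wise (true positive rate, false discovery rate).

    Lesions are connected components; a lesion counts as detected if it
    overlaps the other mask by at least one voxel (assumed rule).
    """
    pred, truth = pred.astype(bool), truth.astype(bool)
    truth_cc, n_truth = ndimage.label(truth)
    pred_cc, n_pred = ndimage.label(pred)
    detected = sum(np.any(pred[truth_cc == i]) for i in range(1, n_truth + 1))
    false_pos = sum(not np.any(truth[pred_cc == j]) for j in range(1, n_pred + 1))
    tpr = detected / n_truth if n_truth else 1.0
    fdr = false_pos / n_pred if n_pred else 0.0
    return tpr, fdr
```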

2. Sample size used for the test set and the data provenance

  • Test set sample size: 344 subject datasets.
  • Data provenance: Not explicitly stated; neither the country of origin nor whether the data were retrospective or prospective is given. The text only mentions that the subjects included "healthy subjects, multiple sclerosis and metastasis patients."

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

  • Number of experts: Not explicitly stated.
  • Qualifications of experts: "gold standard human expert opinion." No further details on specific qualifications (e.g., years of experience, specialty) are provided for the experts who established the ground truth for the test set.

4. Adjudication method for the test set

  • Not explicitly stated. The text mentions "gold standard human expert opinion" for ground truth, but doesn't detail how multiple expert opinions (if applicable) were adjudicated.

5. If a multi-reader, multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI versus without AI assistance

  • No, an MRMC comparative effectiveness study was not conducted to assess how human readers improve with AI assistance. The study focuses on the standalone performance of the Jazz device. The Jazz software is described as a "semi-automatic pipeline" where the physician has the opportunity to review and correct segmentations before the final report. However, the performance data presented is for the device's accuracy and reproducibility against ground truth, not for human reader improvement with the device.

6. If standalone performance (i.e., algorithm only, without a human in the loop) was evaluated

  • Yes, the performance data presented primarily reflects standalone (algorithm only) performance against manually labeled ground truth volumes. While the device is "semi-automatic" and allows physician correction, the acceptance criteria are set for the device's algorithmic segmentation and localization capabilities. The phrase "the measured volumes... are validated for accuracy against manually labeled ground truth volumes" supports this. The "neuroradiological confirmation" step in the flowchart implies human review, but the reported performance metrics appear to be for the initial algorithmic output prior to this confirmation.
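
As an illustration only (the summary does not describe Jazz's actual validation procedure or tolerances), a standalone volume-accuracy check of this kind might compare algorithm-measured volumes against expert-labeled ones. All names and numbers below are made up:

```python
# Hypothetical standalone volume-accuracy check; values are illustrative.
def relative_volume_error(measured_ml: float, truth_ml: float) -> float:
    """Signed relative error of an algorithm-measured volume vs. ground truth."""
    return (measured_ml - truth_ml) / truth_ml

algo = {"lesion_load": 12.4, "thalamus": 6.1}    # algorithm output, mL (made up)
manual = {"lesion_load": 11.9, "thalamus": 6.3}  # expert-labeled truth, mL (made up)

for name in algo:
    err = relative_volume_error(algo[name], manual[name])
    print(f"{name}: {err:+.1%} relative volume error")
```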

7. The type of ground truth used

  • Expert Consensus / Expert Opinion: The primary ground truth for accuracy experiments was established using "manually labeled ground truth volumes" and "gold standard human expert opinion."

8. The sample size for the training set

  • The training set sample size is not specified. The text only states that "Networks were trained using brain images, which were fully segregated from the test set."

9. How the ground truth for the training set was established

  • The ground truth for the training set was established "using a gold standard human expert opinion."

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).