510(k) Data Aggregation
uMR 680 (57 days)
The uMR 680 system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces sagittal, transverse, coronal, and oblique cross-sectional images and spectroscopic images, and that displays internal anatomical structure and/or function of the head, body, and extremities.
These images and the physical parameters derived from the images, when interpreted by a trained physician, yield information that may assist in diagnosis. Contrast agents may be used depending on the region of interest of the scan.
The uMR 680 is a 1.5T superconducting magnetic resonance diagnostic device with a 70 cm patient bore. It consists of components such as the magnet, RF power amplifier, RF coils, gradient power amplifier, gradient coils, patient table, spectrometer, computer, equipment cabinets, power distribution system, internal communication system, and vital-signal module. The uMR 680 Magnetic Resonance Diagnostic Device is designed to conform to NEMA and DICOM standards.
This Special 510(k) requests modifications to the cleared uMR 680 (K243397). The modifications to the uMR 680 in this submission comprise the following changes:
(1) Addition of RF coils: Tx/Rx Head Coil.
(2) Addition of a mobile configuration.
N/A — no acceptance-criteria or study analysis is provided for this entry.
ECHELON Synergy (135 days)
The ECHELON Synergy System is an imaging device and is intended to provide the physician with physiological and clinical information, obtained non-invasively and without the use of ionizing radiation. The MR system produces transverse, coronal, sagittal, oblique, and curved cross sectional images that display the internal structure of the head, body, or extremities. The images produced by the MR system reflect the spatial distribution of protons (hydrogen nuclei) exhibiting magnetic resonance. The NMR properties that determine the image appearance are proton density, spin-lattice relaxation time (T1), spin-spin relaxation time (T2) and flow. When interpreted by a trained physician, these images provide information that can be useful in diagnosis determination.
The ECHELON Synergy is a Magnetic Resonance Imaging System that utilizes a 1.5 Tesla superconducting magnet in a gantry design. The control and image processing hardware and the base elements of the system software are identical to the predicate device.
This document describes the ECHELON Synergy MRI system's acceptance criteria and the studies conducted to demonstrate its performance. The submission for FDA 510(k) clearance (K251386) references a predicate device, the ECHELON Synergy MRI System (K241429), and outlines modifications to hardware and software.
1. Table of Acceptance Criteria and Reported Device Performance
The document does not explicitly present a table of "acceptance criteria" against which a numeric performance metric is listed for each new feature. Instead, it details that certain functionalities (DLR Symmetry and AutoPose) underwent performance evaluations. The "performance" reported is described qualitatively or comparatively to conventional methods.
| Feature/Metric | Acceptance Criteria (Implicit/Derived) | Reported Device Performance |
|---|---|---|
| DLR Symmetry - Artifact Reduction | Reduction of artifacts should be demonstrated. | Phantom testing demonstrated DLR Symmetry could reduce artifacts in the image using Normalized Root Mean Square Error (NRMSE). Clinical image review by radiologists indicated superior artifact reduction (p<0.05) compared to conventional images. |
| DLR Symmetry - Image Quality (SNR, Sharpness, Contrast, Lesion Conspicuity, Overall) | Should not degrade image quality compared to conventional methods. Images should be clinically acceptable. | Phantom Testing: Did not degrade image quality based on SNR, Relative Edge Sharpness, and Contrast Change Rate. Clinical Image Review: Radiologists reported superior SNR, image sharpness, lesion conspicuity, and overall image quality (p<0.05) in DLR Symmetry images. All DLR Symmetry images were evaluated as clinically acceptable. |
| AutoPose (Shoulder, Knee, HipJoint, Abdomen, Pelvis (male/female), Cardiac) - Automatic Slice Positioning | Should be able to set slice positions for a scan without manual adjustment in most cases. For the remaining cases, user operation steps should be equivalent to manual positioning. | Evaluation by certified radiological technologists showed that almost all cases ("almost cases" in the source) could have slice positions set without manual adjustment. The remaining cases required user operation steps equivalent to manual slice positioning. |
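The phantom criterion above relies on the Normalized Root Mean Square Error (NRMSE). As a point of reference, here is a minimal NumPy sketch of one common NRMSE convention; the submission does not state which normalization was used, so normalizing by the reference's root mean square is assumed here:

```python
import numpy as np

def nrmse(reference: np.ndarray, image: np.ndarray) -> float:
    """Normalized root mean square error between a reference image and a
    test image. Normalization conventions vary (reference range, mean, or
    root mean square); the submission does not specify which was used, so
    the reference's root mean square is assumed here."""
    rmse = np.sqrt(np.mean((reference - image) ** 2))
    return rmse / np.sqrt(np.mean(reference ** 2))

# A lower NRMSE for the DLR Symmetry reconstruction than for the
# conventional reconstruction would indicate artifact reduction.
```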
2. Sample Sizes Used for the Test Set and Data Provenance
- DLR Symmetry:
- Clinical Image Test Set: 89 unique subjects (patients and healthy subjects).
- Data Provenance: From U.S. and Japan.
- Data Type: Retrospective (clinical images collected).
- AutoPose:
- Shoulder: 60 cases
- Knee: 60 cases
- HipJoint: 65 cases
- Abdomen: 115 cases
- Pelvis for male: 60 cases
- Pelvis for female: 68 cases
- Cardiac: 126 cases
- Data Provenance: FUJIFILM Corp., FUJIFILM Healthcare Americas Corp., and clinical sites.
- Data Type: Subjects include healthy volunteers and patients, suggesting a mix of prospective collection for testing the new features and possibly retrospective patient data.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- DLR Symmetry:
- Number of Experts: Three US board certified radiologists.
- Qualifications: "US board certified radiologists." Specific years of experience are not mentioned.
- AutoPose:
- Number of Experts/Evaluators: Three certified radiological technologists.
- Qualifications: "Certified radiological technologists." Specific years of experience are not mentioned.
4. Adjudication Method for the Test Set
- DLR Symmetry: The document states that comparisons were made by "the reviewers" (plural) in terms of image quality metrics using a 3-point scale. It doesn't explicitly state an adjudication method like 2+1 or 3+1 if there were disagreements among the three radiologists. It implies a consensus or majority rule might have been used for the reported "superior" findings, but this isn't detailed.
- AutoPose: The evaluation results are simply described as "evaluation results showed," implying a summary of the technologists' findings. No specific adjudication method is described.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done
- DLR Symmetry: A form of MRMC study was conducted where three US board-certified radiologists reviewed images reconstructed with DLR Symmetry versus conventional methods.
- Effect Size of Human Readers with AI vs. Without AI Assistance: The document indicates that images with DLR Symmetry (AI-assisted reconstruction) were "superior to those in the conventional images with statistically significant difference (p<0.05)" across various image quality metrics. This shows an improvement in the perceived image quality for human readers when using DLR Symmetry, but it does not quantify the "effect size of how much human readers improve with AI vs. without AI assistance" in terms of diagnostic performance (e.g., improved sensitivity/specificity for a given task). Instead, it focuses on the quality of the image presented to the reader.
6. If a Standalone (i.e. algorithm only without human-in-the-loop performance) was done
- DLR Symmetry: Yes, in part. Phantom testing was conducted to evaluate artifact reduction (NRMSE), SNR, Relative Edge Sharpness, and Contrast Change Rate. This is an algorithmic performance evaluation independent of human interpretation of clinical images, although it assesses image characteristics rather than diagnostic output.
- AutoPose: The evaluation by certified radiological technologists focuses on the algorithm's ability to set slice positions automatically, which is a standalone performance metric for the automation function.
7. The Type of Ground Truth Used
- DLR Symmetry:
- For phantom testing: "Ground truth" refers to the known characteristics of the phantom and metrics like NRMSE, SNR, etc.
- For clinical image review: The ground truth was established by expert consensus or individual assessment of the "clinical acceptability" of the images and comparative judgment (superiority) of image quality metrics by three US board-certified radiologists. This isn't pathology or outcomes data, but rather expert radiological opinion on image quality and clinical utility.
- AutoPose: The "ground truth" was whether the automated positioning successfully set the slice positions without manual adjustment, as evaluated by certified radiological technologists.
8. The Sample Size for the Training Set
The document explicitly states regarding DLR Symmetry: "The test dataset was independent of the training and validation datasets." However, it does not provide the sample size or details for the training set (or validation set) for DLR Symmetry or AutoPose.
9. How the Ground Truth for the Training Set Was Established
The document does not provide details on how the ground truth for the training set was established for either DLR Symmetry or AutoPose, as the training set details are not included in the provided text.
SIGNA™ Sprint (128 days)
The SIGNA™ Sprint is a whole body magnetic resonance scanner designed to support high resolution, high signal-to-noise ratio, and short scan times. It is indicated for use as a diagnostic imaging device to produce axial, sagittal, coronal, and oblique images, spectroscopic images, parametric maps, and/or spectra, dynamic images of the structures and/or functions of the entire body, including, but not limited to, head, neck, TMJ, spine, breast, heart, abdomen, pelvis, joints, prostate, blood vessels, and musculoskeletal regions of the body. Depending on the region of interest being imaged, contrast agents may be used.
The images produced by SIGNA™ Sprint reflect the spatial distribution or molecular environment of nuclei exhibiting magnetic resonance. These images and/or spectra when interpreted by a trained physician yield information that may assist in diagnosis.
SIGNA™ Sprint is a whole-body magnetic resonance scanner designed to support high resolution, high signal-to-noise ratio, and short scan time. The system uses a combination of time-varying magnet fields (Gradients) and RF transmissions to obtain information regarding the density and position of elements exhibiting magnetic resonance. The system can image in the sagittal, coronal, axial, oblique, and double oblique planes, using various pulse sequences, imaging techniques and reconstruction algorithms. The system features a 1.5T superconducting magnet with 70cm bore size. The system is designed to conform to NEMA DICOM standards (Digital Imaging and Communications in Medicine).
Key aspects of the system design:
- Uses the same magnet as a conventional whole-body 1.5T system, with integral active shielding and a zero boil-off cryostat.
- A gradient coil that achieves up to 65 mT/m peak gradient amplitude and 200 T/m/s peak slew rate.
- An embedded body coil designed to reduce heating and enhance intra-bore visibility.
- A newly designed 1.5T AIR Posterior Array.
- A detachable patient table.
- Platform software with various PSDs and applications, including AI features such as AIRx™, AIR™ Recon DL, and Sonic DL™ (discussed below).
The provided text is a 510(k) clearance letter and summary for a new MRI device, SIGNA™ Sprint. It states explicitly that no clinical studies were required to support substantial equivalence. Therefore, the information requested regarding acceptance criteria, study details, sample sizes, ground truth definitions, expert qualifications, and MRMC studies is not available in this document.
The document highlights the device's technical equivalence to a predicate device (SIGNA™ Premier) and reference devices (SIGNA™ Artist, SIGNA™ Champion) and relies on non-clinical tests and sample clinical images to demonstrate acceptable diagnostic performance.
Here's a breakdown of what can be extracted from the document regarding testing, and why other requested information is absent:
1. A table of acceptance criteria and the reported device performance
- Acceptance Criteria (Implicit): The document states that the device's performance is demonstrated through "bench testing and clinical testing that show the image quality performance of SIGNA™ Sprint compared to the predicate device." It also mentions "acceptable diagnostic image performance... in accordance with the FDA Guidance 'Submission of Premarket Notifications for Magnetic Resonance Diagnostic Devices' issued on October 10, 2023."
- Specific quantitative acceptance criteria (e.g., minimum SNR, CNR, spatial resolution thresholds) are not explicitly stated in this document.
- Reported Device Performance: "The images produced by SIGNA™ Sprint reflect the spatial distribution or molecular environment of nuclei exhibiting magnetic resonance. These images and/or spectra when interpreted by a trained physician yield information that may assist in diagnosis."
- No specific quantitative performance metrics (e.g., sensitivity, specificity, accuracy, or detailed image quality scores) are provided in this regulatory summary. The statement "The image quality of the SIGNA™ Sprint is substantially equivalent to that of the predicate device" is the primary performance claim.
2. Sample size used for the test set and the data provenance (e.g. country of origin of the data, retrospective or prospective)
- Test Set Sample Size: Not applicable/Not provided. The document explicitly states: "The subject of this premarket submission, the SIGNA™ Sprint, did not require clinical studies to support substantial equivalence."
- Data Provenance: Not applicable/Not provided for a formal clinical test set. The document only mentions "Sample clinical images have been included in this submission," but does not specify their origin or nature beyond being "sample."
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)
- Not applicable. Since no formal clinical study was conducted for substantial equivalence, there was no "test set" requiring ground truth established by experts in the context of an effectiveness study. The "interpretation by a trained physician" is mentioned in the Indications for Use, which is general to MR diagnostics, not specific to a study.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
- Not applicable. No clinical test set requiring adjudication was conducted for substantial equivalence.
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance
- No. The document explicitly states: "The subject of this premarket submission, the SIGNA™ Sprint, did not require clinical studies to support substantial equivalence." While the device incorporates AI features cleared in other submissions (AIRx™, AIR™ Recon DL, Sonic DL™), this specific 510(k) for the SIGNA™ Sprint system itself does not include an MRMC study or an assessment of human reader improvement with these integrated AI features. The focus is on the substantial equivalence of the overall MR system.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- No, not for the SIGNA™ Sprint as a whole system. This 510(k) is for the MR scanner itself, not for a standalone algorithm. Any standalone performance for the integrated AI features (AIRx™, AIR™ Recon DL, Sonic DL™) would have been part of their respective clearance submissions (K183231, K202238, K223523), not this one.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)
- Not applicable. No formal clinical study requiring ground truth was conducted for this submission.
8. The sample size for the training set
- Not applicable/Not provided. This submission is for the SIGNA™ Sprint MR system itself, not a new AI algorithm requiring a training set developed for this specific submission. The AI features mentioned (AIRx™, AIR™ Recon DL, Sonic DL™) were cleared in previous 510(k)s and would have had their own training and validation processes.
9. How the ground truth for the training set was established
- Not applicable/Not provided. As explained in point 8, this submission does not detail the training of new AI algorithms.
MuscleView 2.0 (102 days)
MuscleView 2.0 is a magnetic resonance diagnostic software device used in adults and pediatrics aged 18 and older which automatically segments muscle, bone, fat and other anatomical structures from magnetic resonance imaging. After segmentation, it enables the generation, display and review of magnetic resonance imaging data. The segmentation results need to be reviewed and edited using appropriate software. Other physical parameters derived from the images may also be produced. This device is not intended for use with patients who have tumors in the trunk, arms and/or lower limb(s). When interpreted by a trained clinician, these images and physical parameters may yield information that may assist in diagnosis.
MuscleView 2.0 is a software-only medical device which performs automatic segmentation of musculoskeletal structures. The software utilizes a locked artificial intelligence/machine learning (AI/ML) algorithm to identify and segment anatomical structures for quantitative analysis. The input to the software is DICOM data from magnetic resonance imaging (MRI), but the subject device does not directly interface with any devices. The output includes volumetric and dimensional metrics of individual and grouped regions of interest (ROIs) (such as muscles, bones and adipose tissue) and comparative analysis against a Virtual Control Group (VCG) derived from reference population data.
MuscleView 2.0 builds upon the predicate device, MuscleView 1.0 (K241331, cleared 10/01/2024), which was cleared for the segmentation and analysis of lower extremity structures (hips to ankles). The subject device extends functionality to include:
- Upper body regions (neck to hips)
- Adipose tissue segmentation (subcutaneous, visceral, intramuscular, and hepatic fat)
- Quantitative comparison with a Virtual Control Group
- Additional derived metrics including Z-scores and composite scores (e.g., muscle-bone score)
The submission includes a Predetermined Change Control Plan which details the procedure for retraining AI/ML algorithms or adding data to the Virtual Control Groups in order to improve performance without negatively impacting the safety or efficacy of the device.
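Among the derived metrics above are Z-scores against a Virtual Control Group. For orientation, here is a minimal sketch of how such a Z-score is conventionally computed; variable names are hypothetical, and the summary does not describe how the VCG is stratified (e.g., by age, sex, or BMI):

```python
import numpy as np

def z_score(patient_value: float, vcg_values: np.ndarray) -> float:
    """Z-score of a patient's ROI metric (e.g., a muscle volume) against a
    Virtual Control Group (VCG) reference distribution. A single matched
    reference sample is assumed; the vendor's stratification scheme is
    not detailed in the summary."""
    mu = vcg_values.mean()
    sigma = vcg_values.std(ddof=1)  # sample standard deviation
    return (patient_value - mu) / sigma
```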
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) clearance letter for MuscleView 2.0:
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria for MuscleView 2.0 were based on the device's segmentation accuracy, measured by Dice Similarity Coefficient (DSC) and absolute Volume Difference (VDt), remaining within the interobserver variability observed among human experts. The study demonstrated the device met these criteria. Since the text explicitly states the AI model's performance was "consistently within these predefined interobserver ranges," and "passed validation," the reported performance for all ROIs was successful in meeting the acceptance criteria.
| Metric | Acceptance Criteria | Comment on Reported Performance |
|---|---|---|
| Dice Similarity Coefficient (DSC) | DSC values where the 95% confidence interval for each ROI (across all subgroup analyses) indicates performance at or below interobserver variability (meaning higher DSC, closer to 1.0, is better). Specifically, a desired outcome was "a mean better than or equal to the acceptance criteria." | Consistently within predefined interobserver ranges and passed validation for all evaluated ROIs and subgroups. (See Table 1 for 95% CIs of individual ROIs across subgroups). |
| Absolute Volume Difference (VDt) | VDt values where the 95% confidence interval for each ROI (across all subgroup analyses) indicates performance at or below interobserver variability (meaning lower VDt, closer to 0, is better). Specifically, a desired outcome was "a mean better than or equal to the acceptance criteria." | Consistently within predefined interobserver ranges and passed validation for all evaluated ROIs and subgroups. (See Table 2 for 95% CIs of individual ROIs across subgroups). |
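Both acceptance metrics above are standard segmentation measures. For reference, here is a minimal NumPy sketch of DSC and absolute volume difference on binary masks; helper names are hypothetical, and whether VDt is reported in mL or as a percentage of the reference volume is not stated in the summary:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice Similarity Coefficient between a predicted and an expert
    binary segmentation mask (1.0 = perfect overlap)."""
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

def volume_difference(pred: np.ndarray, truth: np.ndarray,
                      voxel_volume_ml: float) -> float:
    """Absolute volume difference between the two masks, in mL.
    Absolute mL is assumed here; the summary does not state the unit."""
    return abs(int(pred.sum()) - int(truth.sum())) * voxel_volume_ml
```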
2. Sample Sizes Used for the Test Set and Data Provenance
| AI Setting | Number of Unique Scans | Number of Unique Subjects | Data Provenance |
|---|---|---|---|
| AI Setting 1 (Lower Extremity) | 148 | 148 | Retrospective, "diverse population," multiple imaging sites and MRI manufacturers (GE, Siemens, Philips, Canon, Toshiba/Other). Countries of origin not explicitly stated, but "regional demographics" are provided, implying a mix of populations. |
| AI Setting 2 & 3 (Upper Extremity and Adipose Tissue) | 171 | 171 | Retrospective, "diverse population," multiple imaging sites and MRI manufacturers (GE, Siemens, Philips, Canon, Other/Unknown). Countries of origin not explicitly stated, but "regional demographics" are provided, implying a mix of populations. |
- Overall Test Set: 148 unique subjects (for AI Setting 1) + 171 unique subjects (for AI Settings 2 & 3) = 319 unique subjects.
- Data Provenance: Retrospective, curated collection of MRI datasets from a diverse patient population (age, BMI, biological sex, ethnicity) from multiple imaging sites and MRI manufacturers (GE, Siemens, Philips, Canon, Toshiba/Other/Unknown). Independent from training datasets. De-identified.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: Not explicitly stated, but referred to as "expert segmentation analysts" and "expert human annotation." The study mentions "consensus process by expert segmentation analysts" for training data, and for testing, "manual segmentation performed by experts" and that the "interobserver variability range observed among experts" was used as a benchmark. The document does not specify the exact number of experts or their specific qualifications (e.g., years of experience or board certification).
4. Adjudication Method for the Test Set
- Adjudication Method: The ground truth for both training and testing datasets was established through a "consensus process by expert segmentation analysts" for training data and "manual segmentation performed by experts" for the test set. It does not explicitly state a 2+1 or 3+1 method; rather, it implies a consensus was reached among the experts. The key here is the measurement of "interobserver variability," suggesting that multiple experts initially segmented the data, and their agreement (or discordance) defined the benchmark, from which a final consensus might have been derived.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No MRMC study was performed. The performance testing was a standalone study comparing the AI segmentation to expert manual segmentation (ground truth) rather than comparing human readers with and without AI assistance. The text states: "Performance results demonstrated segmentation accuracy within the interobserver variability range observed among experts." This indicates a comparison of the AI's output against what multiple human experts would agree upon, not an evaluation of human performance improvement with AI.
6. Standalone Performance Study
- Yes, a standalone study was done. The document states: "To evaluate the performance of the MuscleView AI segmentation algorithm, a comprehensive test was conducted using a test set that was fully independent from the training set. The AI was blinded to the ground truth segmentation labels during inference, ensuring an unbiased comparison." This clearly describes a standalone performance evaluation of the algorithm.
7. Type of Ground Truth Used
- Expert Consensus / Expert Manual Segmentation: The ground truth was established by "manual segmentation performed by experts" and through a "consensus process by expert segmentation analysts." This is a form of expert consensus derived from detailed manual annotation. The benchmark for acceptance was the "interobserver variability range observed among experts."
8. Sample Size for the Training Set
| AI Setting | Number of Unique Scans | Number of Unique Subjects |
|---|---|---|
| AI Setting 1 (Lower Extremity) | 1658 | 1294 |
| AI Setting 2 & 3 (Upper Extremity and Adipose Tissue) | 392 | 209 |
- Total Training Set: 1658 (scans for AI Setting 1) + 392 (scans for AI Settings 2 & 3) = 2050 unique scans.
- Total Unique Subjects: 1294 (for AI Setting 1) + 209 (for AI Settings 2 & 3) = 1503 unique subjects. (Note: some subjects might appear in both sets if they had both lower and upper extremity scans; the table reports unique subjects per AI setting.)
9. How Ground Truth for the Training Set Was Established
- The ground truth for the training set was established through a "consensus process by expert segmentation analysts" on a "curated collection of retrospective MRI datasets."
Vista OS (141 days)
Vista OS is an accessory to 1.5T and 3.0T whole-body magnetic resonance diagnostic devices (MRDD). It is intended to operate alongside, and in parallel with, the existing MR console to acquire traditional, real-time and accelerated images.
Vista OS software controls the MR scanner to acquire, reconstruct and display static and dynamic transverse, coronal, sagittal, and oblique cross-sectional images that display the internal structures and/or functions of the entire body. The images produced reflect the spatial distribution of nuclei exhibiting magnetic resonance. The magnetic resonance properties that determine image appearance are proton density, spin-lattice relaxation time (T1), spin-spin relaxation time (T2) and flow. When interpreted by a trained physician, these images provide information that may assist in the determination of a diagnosis.
Vista OS is intended for use as an accessory to the following MRI systems:
Manufacturers: GE Healthcare (GEHC), Siemens Healthineers
Field Strength: 1.5T and 3.0T
GE Software Versions: 12, 15, 16, 23, 24, 25, 26, 30
Siemens Software Versions: N4/VE; NX/VA
The Vista AI "Vista OS" product provides a seamless user experience for performing MRI studies on GE and Siemens scanners. The underlying software platform that we use to accomplish this task is called "RTHawk".
RTHawk is a software platform designed from the ground up to provide efficient MRI data acquisition, data transfer, image reconstruction, and interactive scan control and display of static and dynamic MR imaging data. It can control MR pulse sequences provided by Vista AI and, on scanners that support it, it can equally control MR pulse sequences provided by the scanner vendor. Scan protocols can be created by the user that mix and match among all available sequences.
RTHawk is an accessory to clinical 1.5T and 3.0T MR systems, operating alongside, and in parallel with, the MR scanner console with no permanent physical modifications to the MRI system required.
The software runs on a stand-alone Linux-based computer workstation with color monitor, keyboard and mouse. It is designed to operate alongside, and in parallel with, the existing MR console with no hardware modifications required to be made to the MR system or console. This workstation (the "Vista Workstation") is sourced by the Customer in conformance with specifications provided by Vista AI, and is verified prior to installation.
A private Ethernet network connects the Vista Workstation to the MR scanner computer. When not in use, the Vista Workstation may be detached from the MR scanner with no detrimental, residual impact upon MR scanner function, operation, or throughput.
RTHawk is an easy-to-use, yet fully functional, MR Operating System environment. RTHawk has been designed to provide a platform for the efficient acquisition, control, reconstruction, display, and storage of high-quality static and dynamic MRI images and data.
Data is continuously acquired and displayed. By user interaction or data feedback, fundamental scan parameters can be modified. Real-time and high-resolution image acquisition methods are used throughout RTHawk for scan plane localization, for tracking of patient motion, for detection of transient events, for on-the-fly, sub-second latency adjustment of image acquisition parameters (e.g., scan plane, flip angle, field-of-view, etc.) and for image visualization.
RTHawk implements the conventional MRI concept of anatomy- and indication-specific Protocols (e.g., ischemia evaluation, valvular evaluation, routine brain, etc.). Protocols are pre-set by Vista AI, but new protocols can be created and modified by the end user.
RTHawk Apps (Applications) are composed of a pulse sequence, predefined fixed and adjustable parameters, reconstruction pipeline(s), and a tailored graphical user interface containing image visualization and scan control tools. RTHawk Apps may provide real-time interactive scanning, conventional (traditional) batch-mode scanning, accelerated scanning, or calibration functions, in which data acquired may be used to tune or optimize other Apps.
When vendor-supplied pulse sequences are used in Vista OS, parameters and scan planes are prescribed in the Vista interface and images reconstructed by the scanner appear on the Vista Workstation. RTHawk Apps and vendor-supplied sequences can be mixed within a single protocol with a unified user experience for both.
Here's a breakdown of the acceptance criteria and study information for Vista OS, Vista AI Scan, and RTHawk, based on the provided FDA 510(k) clearance letter:
1. Table of Acceptance Criteria and Reported Device Performance
The document describes several clinical verification studies for new AI-powered features. Each feature has specific acceptance criteria.
| Feature Tested | Acceptance Criterion | Reported Performance (meets criteria?) |
|---|---|---|
| Automatic Detection of Motion Artifacts in Cine Cartesian SSFP | 80% agreement between neural-network assessment at its default sensitivity level and the cardiologist reader | Meets or exceeds |
| Automatic Detection of Ungateable Cardiac Waveforms | 80% agreement between neural-network assessment at its default sensitivity level and the cardiologist reader | Meets or exceeds |
| Automatic Cardiac Image Denoising | 1. Denoising should not detract from diagnostic accuracy in all cases. 2. Diagnostic quality of denoised data judged superior to paired non-denoised series in > 80% of test cases. | Meets or exceeds |
| Automatic Brain Localizer Prescriptions | Mean error in plane angulation < 3 degrees with standard deviation < 5 degrees, AND mean plane position error < 5 mm with standard deviation < 15 mm. | Meets or exceeds |
| Automatic Prostate Localizer Prescriptions | Mean 3D Intersection-over-Union (IoU) metrics of at least 0.65 for each volumetric scan prescription. | Meets or exceeds |
| Automatic Prediction of Velocity-Encoding VENC for Cine Flow Studies | Average velocity error < 10% individually for all vessels and views. | Meets or exceeds |
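The prostate localizer criterion above uses a mean 3D Intersection-over-Union of at least 0.65. As a point of reference, here is a minimal sketch of 3D IoU over binary volumes; the submission does not describe how oblique scan prescriptions were compared, so rasterization onto a common voxel grid is assumed:

```python
import numpy as np

def iou_3d(pred: np.ndarray, truth: np.ndarray) -> float:
    """3D Intersection-over-Union between two binary volumes, e.g. an
    automatically predicted scan-prescription volume versus the expert's
    manual prescription, both rasterized onto the localizer grid."""
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return intersection / union
```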
2. Sample Sizes and Data Provenance for Test Sets
The document provides sample sizes for each clinical verification study test set:
- Automatic Detection of Motion Artifacts: 120 sample images.
- Automatic Detection of Ungateable Cardiac Waveforms: 100 sample ECGs.
- Automatic Cardiac Image Denoising: 209 sample image series (paired with non-denoised).
- Automatic Brain Localizer Prescriptions: 323 sample image localizations.
- Automatic Prostate Localizer Prescriptions: 329 sample image localizations.
- Automatic Prediction of Velocity-Encoding VENC: 42 sample VENC peak estimates.
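The 80% agreement criteria in the table above compare binary neural-network outputs against the cardiologist reader over these test sets. A minimal sketch of percent agreement, with hypothetical names:

```python
import numpy as np

def percent_agreement(model_labels: np.ndarray,
                      reader_labels: np.ndarray) -> float:
    """Fraction of cases where the neural network's binary call (e.g.,
    motion artifact present/absent) matches the cardiologist reader.
    The acceptance criterion cited above is >= 80% agreement at the
    model's default sensitivity level."""
    return float(np.mean(model_labels == reader_labels))
```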
Data Provenance:
- Data was "collected from prior versions of Vista OS."
- "Data used in clinical verification were obtained from multiple clinical sites representing diverse ethnic groups, genders, and ages."
- The document implies the data is retrospective as it was "collected from prior versions of Vista OS" and used for verification after model training.
- Specific countries of origin are not mentioned, but the mention of "multiple clinical sites" and "diverse ethnic groups" suggests a broad geographic scope.
3. Number of Experts and Qualifications for Ground Truth - Test Set
The document states:
- "Clinical assessments were performed by independent board-certified radiologists or cardiologists."
- The number of experts is not explicitly stated (e.g., "three experts"), but it says "cardiologist reader" for cardiac studies and "trained physician" for other interpretations, implying at least one expert per study type.
- Qualifications: "board-certified radiologists or cardiologists." Specific experience (e.g., "10 years of experience") is not provided.
4. Adjudication Method for Test Set
The adjudication method is not explicitly stated. It refers to "agreement between neural-network assessment... and the cardiologist reader" for cardiac studies, and "judged superior" for denoising, which suggests a single expert's assessment was used as ground truth for comparison. It does not mention methods like 2+1 or 3+1 consensus.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No MRMC comparative effectiveness study is explicitly mentioned. The studies focus on the performance of the AI models against expert assessment, not on comparing human readers with AI assistance versus without. The statement "all automations are provided as an additional aid to the trained operator who has the final decision power to accept or reject any suggestion or image enhancement that is provided" implies human-in-the-loop, but a specific MRMC study to quantify improvement is not described.
6. Standalone Performance Study
Yes, standalone (algorithm only without human-in-the-loop) performance studies were done for each of the new AI features. The acceptance criteria and reported performance directly measure the accuracy and agreement of the AI algorithm outputs against expert-established ground truth. The technologist retains the ability to reject or modify, but the initial validation is on the AI's standalone output.
7. Type of Ground Truth Used
The ground truth used for the clinical verification test sets was expert consensus / expert opinion.
- For artifact detection and ungateable waveforms: "cardiologist reader" assessment.
- For denoising: "diagnostic quality... judged superior" by "independent board-certified radiologists or cardiologists."
- For localizer prescriptions and VENC prediction: Implicitly, metrics like angular error, positional error, IoU, and velocity error are measured against a "correct" or "optimal" ground truth typically established by expert manual prescription or known physical values.
8. Sample Size for the Training Set
The document explicitly states that the test data was "segregated from training and tuning data." However, the exact sample size for the training set is not provided in the given text.
9. How Ground Truth for the Training Set Was Established
The document states:
- "Neural-network models were developed and trained using industry-standard methods for partitioning and isolating training, tuning, and internal testing datasets."
- "Model development data was partitioned by unique anonymous patient identifiers to prevent overlap across training, internal testing, and clinical verification datasets."
- "Clinical assessments were performed by independent board-certified radiologists or cardiologists who were not involved in any aspect of model development (including providing labels for training, tuning or internal testing)."
This implies that ground truth for the training set was established by expert labeling or consensus, but these experts were different from those who performed the final clinical verification. The exact number of experts involved in training data labeling and their qualifications are not specified.
InVision™ 3T Recharge Operating Suite (20 days)
The InVision™ 3T Recharge Operating Suite is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces transverse, sagittal, coronal and oblique cross sectional images, spectroscopic images and/or spectra, and that displays the internal structure and/or function of the head, body or extremities.
Other physical parameters derived from the images and/or spectra may also be produced. Depending on the region of interest, contrast agents may be used. These images and/or spectra and the physical parameters derived from the images and/or spectra when interpreted by a trained physician yield information that may assist in diagnosis.
The InVision™ 3T Recharge Operating Suite may also be used for imaging during intraoperative and interventional procedures when performed with MR safe devices or MR conditional devices approved for use with the MR scanner.
The InVision™ 3T Recharge Operating Suite may also be used for imaging in a multi-room suite.
The proposed InVision™ 3T Recharge Operating Suite featuring the Siemens MAGNETOM Skyra Fit upgrade from Verio is a traditional Magnetic Resonance Imaging (MRI) scanner that is suspended on an overhead rail system. It is designed to operate inside a Radio Frequency (RF) shielded room to facilitate intraoperative and multi-room use. The InVision™ 3T Recharge Operating Suite uses a scanner, the Siemens MAGNETOM Skyra Fit (K220589, reference device), to produce images of the internal structures of the head as well as the whole body. The Siemens 3T MAGNETOM Skyra Fit MRI scanner is an actively shielded magnet with a magnetic field strength of 3 Tesla.
The InVision™ 3T Recharge Operating Suite provides surgeons with access to magnetic resonance (MR) images while in the surgical field without changing the surgical/clinical workflow. When images are requested in the operating room (OR), the magnet is moved from the diagnostic room (DR) to the OR on a pair of overhead rails while the patient remains stationary during the procedure. Imaging is performed and once complete the magnet is moved out of the OR to the DR. The magnet can be moved in and out of the surgical field multiple times, as required, throughout the course of the surgical procedure. When the Siemens MAGNETOM Skyra Fit MRI scanner is in the DR, the OR may be used as a standard OR, utilizing standard surgical instruments and equipment during surgery. When not required in the OR, the scanner is available for use in the DR as a standard diagnostic MRI.
The provided FDA 510(k) Clearance Letter states that the "InVision™ 3T Recharge Operating Suite" has been tested and determined to be substantially equivalent to the predicate device "IMRIS iMRI 3T V (K212367)". However, the document does not contain specific acceptance criteria, reported device performance metrics (e.g., sensitivity, specificity, accuracy), or details of a de novo study proving the device meets said criteria.
Instead, the document focuses on demonstrating substantial equivalence through:
- Comparison of indications for use, principles of operation, and technological characteristics.
- Conformity to FDA recognized consensus standards.
- Successful completion of non-clinical performance, electrical, mechanical, structural, electromagnetic compatibility, and software testing.
- Successful completion of standard Siemens QA tests and expert review of sample clinical images to assess clinically acceptable MR imaging performance.
Therefore, many of the requested details about acceptance criteria and a specific study proving those criteria are not present in this document. The device is a "Magnetic resonance diagnostic device" (MRDD), implying its performance is related to image quality and ability to assist in diagnosis, but quantitative metrics are not provided.
Here is a summary of the information that can be extracted or inferred from the provided text, with blank or "N/A" for information not present:
1. Table of Acceptance Criteria and Reported Device Performance
As noted, the document does not specify quantitative acceptance criteria or reported device performance metrics in terms of clinical diagnostic efficacy (e.g., sensitivity, specificity, accuracy). The acceptance is based on demonstrating substantial equivalence to a predicate device and successful completion of various engineering and functional tests.
| Acceptance Criteria (Not explicitly stated in quantitative terms; inferred from substantial equivalence) | Reported Device Performance (Not explicitly stated in quantitative terms) |
|---|---|
| - Substantially equivalent Indications for Use | Same as predicate device (stated in "Equivalence Comparison" columns) |
| - Substantially equivalent Principles of Operation | Same as predicate device |
| - Substantially equivalent Technological Characteristics (with differences validated) | Differences in Siemens MRI scanner, magnet mover, and software. Validation data supports equivalent safety and performance profile. |
| - Conformity to FDA recognized consensus standards (e.g., for safety, EMC, software) | Conforms to listed standards (Table 2) |
| - Performance equivalent to predicate device for intraoperative features | Equivalence demonstrated through testing, no new safety/effectiveness issues. |
| - Clinically acceptable MR imaging performance in DR and OR | Demonstrated through standard Siemens QA tests and expert review of sample clinical images. |
| - Passed non-clinical testing (functional, imaging, integration, software, acoustic, heating) | Successful completion of all listed non-clinical tests (Table 3) |
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size: Not specified. The document mentions "Sample Clinical Images in DR and OR" were assessed, but the number of images or cases is not given.
- Data Provenance: Not specified.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: Not specified. The document mentions "expert review of sample clinical images."
- Qualifications of Experts: It states "interpreted by a trained physician" in the Indications for Use, and mentions "expert review" for image assessment, but specific qualifications (e.g., years of experience, subspecialty) are not provided.
4. Adjudication Method for the Test Set
- Adjudication Method: Not specified. The document states "expert review of sample clinical images," but does not detail how consensus or adjudication was reached if multiple experts were involved.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- MRMC Study: No, a multi-reader multi-case (MRMC) comparative effectiveness study was not explicitly mentioned or described. The submission focuses on demonstrating substantial equivalence via engineering/functional performance and image quality, not directly on reader performance with/without AI assistance.
- Effect Size of Human Readers Improve with AI vs. without AI Assistance: Not applicable, as no MRMC study involving AI assistance was described. The device itself is an MRI system, not an AI-assisted diagnostic tool.
6. Standalone (Algorithm Only Without Human-in-the-loop Performance) Study
- Standalone Study: Yes, in a way. The "system imaging performance testing" and successful completion of "standard Siemens QA tests" would represent standalone performance assessments of the MRI hardware and integrated software components responsible for image acquisition, without human interpretation in the loop for the performance assessment itself. However, the primary purpose of the device is to produce images "interpreted by a trained physician," meaning human-in-the-loop is part of its intended use. The document does not describe an "algorithm only" study in the context of an AI-driven diagnostic algorithm.
7. Type of Ground Truth Used
- Ground Truth Type: For the "Sample Clinical Images," the ground truth establishment method is not detailed beyond "expert review." For the general device functionality and image quality, the "ground truth" would be established by engineering specifications, recognized standards, and the performance of the predicate device.
8. Sample Size for the Training Set
- Training Set Sample Size: Not applicable. This document describes the clearance of an MRI system, not an AI/ML algorithm that typically requires a discrete "training set." The software components mentioned (Magnet Mover Software and Application Platform Software) likely underwent standard software verification and validation, but not in the context of a "training set" for machine learning.
9. How the Ground Truth for the Training Set Was Established
- Ground Truth Establishment for Training Set: Not applicable (see point 8).
uMR Jupiter (190 days)
The uMR Jupiter system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces sagittal, transverse, coronal, and oblique cross-sectional images and spectroscopic images, and that displays internal anatomical structure and/or function of the head, body, and extremities.
These images and the physical parameters derived from the images, when interpreted by a trained physician, yield information that may assist in diagnosis. Contrast agents may be used depending on the region of interest of the scan.
The device is intended for patients > 20 kg/44 lbs.
uMR Jupiter is a 5T superconducting magnetic resonance diagnostic device with a 60 cm patient bore and an 8-channel RF transmit system. It consists of components such as the magnet, RF power amplifier, RF coils, gradient power amplifier, gradient coils, patient table, spectrometer, computer, equipment cabinets, power distribution system, internal communication system, and vital-signal module. uMR Jupiter is designed to conform to NEMA and DICOM standards.
The modifications performed on the uMR Jupiter in this submission comprise the following changes:
- Addition of RF coils: SuperFlex Large - 24 and Foot & Ankle Coil - 24.
- Addition of applied body part for a certain coil: SuperFlex Small - 24 (adds imaging of the ankle).
- Addition and modification of pulse sequences:
  - a) New sequences: fse_wfi, gre_fsp_c (3D), gre_bssfp_ucs, epi_fid (3D), epi_dti_msh.
  - b) Added associated options for certain sequences: asl_3d (add mPLD; only original images are output, no quantification images), gre_fsp_c (add Cardiac Cine, Cardiac Perfusion, PSIR, Cardiac mapping), gre_quick (add WFI, MRCA), gre_bssfp (add Cardiac Cine, Cardiac mapping), epi_dwi (add IVIM; only original images are output, no quantification images).
- Addition of functions: EasyScan, EasyCrop, t-ACS, QScan, tFAST, DeepRecon, and WFI.
- Addition of workflow: EasyFACT.
This FDA 510(k) summary (K250246) for the uMR Jupiter provides details on several new AI-assisted features. Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Important Note: The document is not an MRMC study comparing human readers with and without AI. Instead, it focuses on the performance of individual AI modules and their integration into the MRI system, often verified by radiologists' review of image quality.
Acceptance Criteria and Reported Device Performance
The document presents acceptance criteria implicitly through the "Test Result" or "Performance Verification" sections for each AI feature. The "Performance" column below summarizes the device's reported achievement for these criteria.
| Feature | Acceptance Criteria (Implicit) | Reported Device Performance |
|---|---|---|
| WFI | Expected to produce diagnostic quality images and effectively overcome water-fat swap artifacts, providing accurate initialization for the RIPE algorithm. Modes (default, standard, fast) should meet clinical diagnosis requirements. | "Based on the clinical evaluation of this independent testing dataset by three U.S. certificated radiologists, all three WFI modes meet the requirements for clinical diagnosis. In summary, the WFI performed as intended and passed all performance evaluations." |
| t-ACS | AI Module Test: AI prediction output should be much closer to the reference compared to the AI module input images. Integration Test: Better consistency between t-ACS and reference than CS and reference; no large structural differences; motion-time curves and Bland-Altman analysis showing consistency. | AI Module Test: "AI prediction (AI module output) was much closer to the reference comparing to the AI module input images in all t-ACS application types." Integration Test: 1. "A better consistency between t-ACS and the reference than that between CS and the reference was shown in all t-ACS application types." 2. "No large structural difference appeared between t-ACS and the reference in all t-ACS application types." 3. "The motion-time curves and Bland-Altman analysis showed the consistency between t-ACS and the reference based on simulated and real acquired data in all t-ACS application types." Overall: "The t-ACS on uMR Jupiter was shown to perform better than traditional Compressed Sensing in the sense of discrepancy from fully sampled images and PSNR using images from various age groups, BMIs, ethnicities and pathological variations. The structure measurements on paired images verified that same structures of t-ACS and reference were significantly the same. And t-ACS integration tests in two applications proved that t-ACS had good agreement with the reference." |
| DeepRecon | Expected to provide image de-noising and super-resolution, resulting in diagnostic quality images, with equivalent or higher scores than reference images in terms of diagnostic quality. | "The DeepRecon has been validated to provide image de-nosing and super-resolution processing using various ethnicities, age groups, BMIs, and pathological variations. In addition, DeepRecon images were evaluated by American Board of Radiologists certificated physicians, covering a range of protocols and body parts. The evaluation reports from radiologists verified that DeepRecon meets the requirements of clinical diagnosis. All DeepRecon images were rated with equivalent or higher scores in terms of diagnosis quality." |
| EasyFACT | Expected to effectively automate ROI placement and numerical statistics for FF and R2* values, with results subjectively evaluated as effective. | "The subjective evaluation method was used [to verify effectiveness]." "The proposal of algorithm acceptance criteria and score processing are conducted by the licensed physicians with U.S. credentials." (Implied successful verification from context) |
| EasyScan | Pass criterion of 99.3% for automatic slice group positioning, meeting safety and effectiveness requirements. | The pass criterion for the EasyScan feature is 99.3%, and the results were evaluated by a licensed MRI technologist with U.S. credentials. EasyScan therefore meets the criteria for safety and effectiveness and the requirements for automatic positioning of slice groups. (Implied to reach or exceed 99.3%.) |
| EasyCrop | Pass criterion of 100% for automatic image cropping, meeting safety and effectiveness requirements. | The pass criterion for the EasyCrop feature is 100%, and the results were evaluated by a licensed MRI technologist with U.S. credentials. EasyCrop therefore meets the criteria for safety and effectiveness and the requirements for automatic cropping. (Implied to reach or exceed 100%.) |
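The t-ACS integration testing cited above includes Bland-Altman analysis of structure measurements against fully sampled references. For orientation, here is a minimal sketch of the standard Bland-Altman statistics; names are hypothetical, and the submission does not detail its exact implementation:

```python
import numpy as np

def bland_altman(tacs: np.ndarray, reference: np.ndarray):
    """Bland-Altman statistics for paired measurements, e.g. a structure
    measurement on t-ACS images versus the same structure on the fully
    sampled reference. Returns the mean bias and the 95% limits of
    agreement (bias +/- 1.96 * SD of the differences)."""
    diff = tacs - reference
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```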
Study Details
- Sample sizes used for the test set and the data provenance:
- WFI: 144 cases from 28 volunteers. Data collected on the UIH uMR Jupiter, "completely separated from the previous mentioned training dataset by collecting from different volunteers and during different time periods." (Retrospective for testing; the specific country of origin of the testing data is not explicitly stated beyond "UIH MRI systems," while the training data has an "Asian" majority.)
- t-ACS: 35 subjects (data from 76 volunteers were used for the overall training/validation/test split). Test data were collected independently from the training data, with separate subjects and during different time periods. "White," "Black," and "Asian" ethnicities are mentioned, implying a potentially multi-country or otherwise diverse internal dataset.
- DeepRecon: 20 subjects (2216 cases). "Diverse demographic distributions" including "White" and "Asian" ethnicities. "Collecting testing data from various clinical sites and during separated time periods."
- EasyFACT: 5 subjects. "Data were acquired from 5T magnetic resonance imaging equipment from UIH," and "Asia" ethnicity is listed.
- EasyScan: 30 cases from 18 "Asia" subjects (initial testing); 40 cases from 8 "Asia" subjects (validation on uMR Jupiter system).
- EasyCrop: 5 subjects. "Data were acquired from 5T magnetic resonance imaging equipment from UIH," and "Asia" ethnicity is listed.
Data provenance isn't definitively "retrospective" or "prospective" for the test sets, but the emphasis on "completely separated" and "independent" data collected at "different time periods" suggests these were distinct, potentially newly acquired or curated sets for evaluation. The presence of multiple ethnicities (White, Black, Asian) suggests geographical origins potentially broader than just China, where the company is based, or a focus on creating diverse internal datasets.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- WFI: Three U.S. certificated radiologists. (Qualifications: U.S. board-certified radiologists).
- t-ACS: No separate experts establishing ground truth for the test set performance evaluation are mentioned beyond the quantitative metrics (MAE, PSNR, SSIM) compared against "fully sampled images" (reference/ground truth). The document states that fully-sampled k-space data transformed into image domain served as the reference.
- DeepRecon: American Board of Radiologists certificated physicians. (Qualifications: American Board of Radiologists certificated physicians).
- EasyFACT: Licensed physicians with U.S. credentials. (Qualifications: Licensed physicians with U.S. credentials).
- EasyScan: Licensed MRI technologist with U.S. credentials. (Qualifications: Licensed MRI technologist with U.S. credentials).
- EasyCrop: Licensed MRI technologist with U.S. credentials. (Qualifications: Licensed MRI technologist with U.S. credentials).
- Adjudication method (e.g., 2+1, 3+1, none) for the test set:
The document does not explicitly state an adjudication method (like 2+1 or 3+1) for conflict resolution among readers. For WFI, DeepRecon, EasyFACT, EasyScan, and EasyCrop, it implies a consensus or majority opinion model based on the "evaluation reports from radiologists/technologists." For t-ACS, the evaluation of the algorithm's output is based on quantitative metrics against a reference image ground truth.
- If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance:
No, a traditional MRMC comparative effectiveness study where human readers interpret cases with AI assistance versus without AI assistance was not described. The studies primarily validated the AI features' standalone performance (e.g., image quality, accuracy of automated functions) or their output's equivalence/superiority to traditional methods, often through expert review of the AI-generated images. Therefore, no effect size of human reader improvement with AI assistance is provided.
- If a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done:
Yes, standalone performance was the primary focus for most AI features mentioned, though the output was often subject to human expert review.
- WFI: The AI network provides initialization for the RIPE algorithm. The output image quality was then reviewed by radiologists.
- t-ACS: Performance was evaluated quantitatively against fully sampled images (reference/ground truth), indicating a standalone algorithm evaluation.
- DeepRecon: Evaluated based on images processed by the algorithm, with expert review of the output images.
- EasyFACT, EasyScan, EasyCrop: These are features that automate parts of the workflow. Their output (e.g., ROI placement, slice positioning, cropping) was evaluated, often subjectively by experts, but the automation itself is algorithm-driven.
- The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- WFI: Expert consensus/review by three U.S. board-certified radiologists for "clinical diagnosis" quality. No ground truth for water-fat separation accuracy itself is explicitly stated, but the problem being addressed (water-fat swap artifacts) implies the evaluation centered on the stability of the algorithm's output.
- t-ACS: "Fully-sampled k-space data were collected and transformed into image domain as reference." This serves as the "true" or ideal image for comparison, not derived from expert interpretation or pathology.
- DeepRecon: "Multiple-averaged images with high-resolution and high SNR were collected as the ground-truth images." Expert review confirms diagnostic quality of processed images.
- EasyFACT: Subjective evaluation by licensed physicians with U.S. credentials, implying their judgment regarding the correctness of ROI placement and numerical statistics.
- EasyScan: Evaluation by a licensed MRI technologist with U.S. credentials against the "correctness" of automatic slice positioning.
- EasyCrop: Evaluation by a licensed MRI technologist with U.S. credentials against the "correctness" of automatic cropping.
The sample size for the training set:
- WFI AI module: 59 volunteers (2604 cases). Each scanned for multiple body parts and WFI protocols.
- t-ACS AI module: Not specified as a distinct number, but "collected from a variety of anatomies, image contrasts, and acceleration factors... resulting in a large number of cases." The overall dataset for training, validation, and testing was 76 volunteers.
- DeepRecon: 317 volunteers.
- EasyFACT, EasyScan, EasyCrop: "The training data used for the training of the EasyFACT algorithm is independent of the data used to test the algorithm." For EasyScan and EasyCrop, it states "The testing dataset was collected independently from the training dataset," but does not provide specific training set sizes for these workflow features.
How the ground truth for the training set was established:
- WFI AI module: The AI network was trained to provide accurate initialization for the RIPE algorithm. The document implies that the RIPE algorithm itself with human oversight or internal validation would have been used to establish correct water/fat separation for training.
- t-ACS AI module: "Fully-sampled k-space data were collected and transformed into image domain as reference." This served as the ground truth against which the AI was trained to reconstruct undersampled data.
- DeepRecon: "The multiple-averaged images with high-resolution and high SNR were collected as the ground-truth images." This indicates that high-quality, non-denoised, non-super-resolved images were used as the ideal target for the AI.
- EasyFACT, EasyScan, EasyCrop: Not explicitly detailed beyond stating that training data ground truth was established to enable the algorithms for automatic ROI placement, slice group positioning, and image cropping, respectively. It implies a process of manually annotating or identifying the correct ROIs/positions/crops on training data for the AI to learn from.
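For DeepRecon's "multiple-averaged" ground truth, the underlying principle is that averaging N repeated acquisitions suppresses uncorrelated noise by roughly √N, yielding a high-SNR target without any algorithmic enhancement. A toy illustration (assumed mechanics, not from the submission):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.ones((64, 64))                                  # stand-in anatomy
repeats = clean + 0.1 * rng.standard_normal((8, 64, 64))   # 8 noisy acquisitions

ground_truth = repeats.mean(axis=0)                        # high-SNR training target
print(repeats[0].std(), ground_truth.std())                # noise std drops ~1/sqrt(8)
```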
(118 days)
Vantage Fortian/Orian 1.5T systems are indicated for use as a diagnostic imaging modality that produces cross-sectional transaxial, coronal, sagittal, and oblique images that display anatomic structures of the head or body. Additionally, this system is capable of non-contrast enhanced imaging, such as MRA.
MRI (magnetic resonance imaging) images correspond to the spatial distribution of protons (hydrogen nuclei) that exhibit nuclear magnetic resonance (NMR). The NMR properties of body tissues and fluids are:
- Proton density (PD) (also called hydrogen density)
- Spin-lattice relaxation time (T1)
- Spin-spin relaxation time (T2)
- Flow dynamics
- Chemical Shift
Depending on the region of interest, contrast agents may be used. When interpreted by a trained physician, these images yield information that can be useful in diagnosis.
The Vantage Fortian (Model MRT-1550/WK, WM, WO, WQ)/Vantage Orian (Model MRT-1550/U3, U4, U7, U8) is a 1.5 Tesla Magnetic Resonance Imaging (MRI) System. These Vantage Fortian/Orian models use a short (1.4 m), lightweight (4.1 t) magnet. They include the Canon Pianissimo™ Sigma and Pianissimo Zen scan-noise-reduction technologies. The design of the gradient coil and the whole-body coil of these Vantage Fortian/Orian models provides a maximum field of view of 55 x 55 x 50 cm, and the models include the standard (STD) gradient system.
The Vantage Orian (Model MRT-1550/UC, UD, UG, UH, UK, UL, UO, UP) is a 1.5 Tesla Magnetic Resonance Imaging (MRI) System. These Vantage Orian models use a short (1.4 m), lightweight (4.1 t) magnet. They include the Canon Pianissimo™ and Pianissimo Zen scan-noise-reduction technologies. The design of the gradient coil and the whole-body coil of these Vantage Orian models provides a maximum field of view of 55 x 55 x 50 cm. The Model MRT-1550/UC, UD, UG, UH, UK, UL, UO, UP includes the XGO gradient system.
This system is based upon the technology and materials of previously marketed Canon Medical Systems MRI systems and is intended to acquire and display cross-sectional transaxial, coronal, sagittal, and oblique images of anatomic structures of the head or body. The Vantage Fortian/Orian MRI System is comparable to the current 1.5T Vantage Fortian/Orian MRI System (K240238), cleared April 12, 2024, with the following modifications.
Acceptance Criteria and Study for Canon Medical Systems Vantage Fortian/Orian 1.5T with AiCE Reconstruction Processing Unit for MR
This document outlines the acceptance criteria and the study conducted to demonstrate that the Canon Medical Systems Vantage Fortian/Orian 1.5T with AiCE Reconstruction Processing Unit for MR (V10.0) device meets these criteria, specifically focusing on the new features: 4D Flow, Zoom DWI, and PIQE.
The provided text focuses on the updates in V10.0 of the device, which primarily include software enhancements: 4D Flow, Zoom DWI, and an extended Precise IQ Engine (PIQE). The acceptance criteria and testing are described for these specific additions.
1. Table of Acceptance Criteria and Reported Device Performance
The general acceptance criterion for all new features appears to be demonstrating clinical acceptability and performance that is either equivalent to or better than conventional methods, maintaining image quality, and confirming intended functionality. Specific quantitative acceptance criteria are not explicitly detailed in the provided document beyond qualitative assessments and comparative statements.
| Feature | Acceptance Criteria (Implied from testing) | Reported Device Performance |
|---|---|---|
| 4D Flow | Velocity measurement with and without PIQE of a phantom should meet the acceptance criteria for known flow values. Images in volunteers should demonstrate velocity stream lines consistent with physiological flow. | The testing confirmed that the flow velocity of the 4DFlow sequence met the acceptance criteria. Images in volunteers demonstrated velocity stream lines. |
| Zoom DWI | Effective suppression of wraparound artifacts in the PE direction. Reduction of image distortion level when setting a smaller PE-FOV. Accurate measurement of ADC values. | Testing confirmed that Zoom DWI is effective for suppressing wraparound artifacts in the PE direction; setting a smaller PE-FOV in Zoom DWI scan can reduce the image distortion level; and the ADC values can be measured accurately. |
| PIQE (Bench Testing) | Generate higher in-plane matrix images from low matrix images. Mitigate ringing artifacts. Maintain similar or better contrast and SNR compared to standard clinical techniques. Achieve sharper edges. | Bench testing demonstrated that PIQE generates images with sharper edges while mitigating the smoothing and ringing effects and maintaining similar or better contrast and SNR compared to standard clinical techniques (zero-padding interpolation and typical clinical filters). |
| PIQE (Clinical Image Review) | Images reconstructed with PIQE should be scored clinically acceptable or better by radiologists/cardiologists across various categories (ringing, sharpness, SNR, overall image quality (IQ), and feature conspicuity). PIQE should generate higher spatial in-plane resolution images from lower resolution images (e.g., triple matrix dimensions, 9x factor). PIQE should contribute to ringing artifact reduction, denoising, and increased sharpness. PIQE should be able to accelerate scanning by reducing acquisition matrix while maintaining clinical matrix size and image quality. PIQE benefits should be obtainable on regular clinical protocols without requiring acquisition parameter adjustment. Reviewer agreement should be strong. | The resulting reconstructions were scored on average at, or above, clinically acceptable. Exhibiting a strong agreement at the "good" and "very good" level in the IQ metrics, the Reviewers' scoring confirmed all the specific criteria listed (higher spatial resolution, ringing reduction, denoising, sharpness, acceleration, and applicability to regular protocols). |
2. Sample Size Used for the Test Set and Data Provenance
- 4D Flow & Zoom DWI: Evaluated utilizing phantom images and "representative volunteer images." Specific numbers for volunteers are not provided.
- PIQE Clinical Image Review Study:
- Subjects: A total of 75 unique subjects.
- Scans: Comprising a total of 399 scans.
- Reconstructions: Each scan was reconstructed multiple ways with or without PIQE, totaling 1197 reconstructions for scoring.
- Data Provenance: Subjects were from two sites in USA and Japan. The study states that although the dataset includes subjects from outside the USA, the population is expected to be representative of the intended US population due to PIQE being an image post-processing algorithm that is not disease-specific and not dependent on factors like population variation or body habitus.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- PIQE Clinical Image Review Study:
- Number of Experts: 14 USA board-certified radiologists/cardiologists.
- Distribution: 3 experts per anatomy (Body, Breast, Cardiac, Musculoskeletal (MSK), and Neuro).
- Qualifications: "USA board-certified radiologists/cardiologists." Specific years of experience are not mentioned.
4. Adjudication Method for the Test Set
- PIQE Clinical Image Review Study: The study describes a randomized, blinded clinical image review study. Images reconstructed with either the conventional method or the new PIQE method were randomized and blinded to the reviewers. Reviewers scored the images independently using a modified 5-point Likert scale. Analytical methods used included Gwet's Agreement Coefficient for reviewer agreement and Generalized Estimating Equations (GEE) for differences between reconstruction techniques, implying a statistical assessment of agreement and comparison across reviewers rather than a simple consensus adjudication method (e.g., 2+1, 3+1).
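Gwet's Agreement Coefficient (AC1) is preferred over kappa when ratings cluster near one end of the scale, as Likert image-quality scores often do. A minimal multi-rater AC1, assuming every reviewer scores every case (a generic implementation, not the study's statistical code):

```python
import numpy as np

def gwet_ac1(ratings: np.ndarray) -> float:
    """Gwet's first-order agreement coefficient for categorical ratings.

    ratings: (n_items, n_raters) integer labels; assumes no missing data,
    at least two raters, and at least two observed categories.
    """
    n_items, n_raters = ratings.shape
    categories = np.unique(ratings)
    # r_iq: number of raters assigning category q to item i
    counts = np.stack([(ratings == q).sum(axis=1) for q in categories], axis=1)
    # Observed agreement: average pairwise agreement per item
    pa = ((counts * (counts - 1)).sum(axis=1) / (n_raters * (n_raters - 1))).mean()
    # Chance agreement under Gwet's model
    pi_q = counts.mean(axis=0) / n_raters
    pe = (pi_q * (1.0 - pi_q)).sum() / (len(categories) - 1)
    return float((pa - pe) / (1.0 - pe))
```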
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Yes, an MRMC comparative effectiveness study was done for PIQE.
- Effect Size of Human Readers' Improvement with AI vs. Without AI Assistance: The document states that "the Reviewers' scoring confirmed that: (a) PIQE generates higher spatial in-plane resolution images from lower resolution images (with the ability to triple the matrix dimensions in both in-plane directions, i.e. a factor of 9x); (b) PIQE contributes to ringing artifact reduction, denoising and increased sharpness; (c) PIQE is able to accelerate scanning by reducing the acquisition matrix only, while maintaining clinical matrix size and image quality; and (d) PIQE benefits can be obtained on regular clinical protocols without requiring acquisition parameter adjustment."
- While it reports positive outcomes ("scored on average at, or above, clinically acceptable," "strong agreement at the 'good' and 'very good' level"), it does not provide a quantitative effect size (e.g., AUC difference, diagnostic accuracy improvement percentage) of how much human readers improve with AI (PIQE) assistance compared to without it. The focus is on the quality of PIQE-reconstructed images as perceived by experts, rather than the direct impact on diagnostic accuracy or reader performance metrics. It confirms that the performance is "similar or better" compared to conventional methods.
6. Standalone (Algorithm Only) Performance Study
- Yes, standalone performance was conducted for PIQE and other features.
- 4D Flow and Zoom DWI: Evaluated using phantom images, which represents standalone, objective measurement of the algorithm's performance against known physical properties.
- PIQE: Bench testing was performed on typical clinical images to evaluate metrics like Edge Slope Width (sharpness), Ringing Variable Mean (ringing artifacts), Signal-to-Noise ratio (SNR), and Contrast Ratio. This is an algorithmic-only evaluation against predefined metrics, without direct human interpretation as part of the performance metric.
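The bench metrics named above are not defined in the clearance summary. Purely as an illustration of the general form such ROI-based measurements take, here are sketches of SNR and contrast ratio, two of the named quantities (definitions assumed, not Canon's):

```python
import numpy as np

def roi_snr(image: np.ndarray, signal_roi, noise_roi) -> float:
    """SNR as mean signal in a tissue ROI over the std of a background ROI."""
    return float(image[signal_roi].mean() / image[noise_roi].std())

def contrast_ratio(image: np.ndarray, roi_a, roi_b) -> float:
    """Simple two-ROI contrast ratio."""
    return float(image[roi_a].mean() / image[roi_b].mean())

# Hypothetical usage with slice-based ROIs:
# snr = roi_snr(img, (slice(100, 120), slice(40, 60)), (slice(0, 20), slice(0, 20)))
```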
7. Type of Ground Truth Used
- 4D Flow & Zoom DWI:
- Phantom Studies: Known physical values (e.g., known flow values for velocity measurement, known distortion levels, known ADC values; a two-point ADC computation is sketched after this list).
- PIQE:
- Bench Testing: Quantitative imaging metrics derived from the images themselves (Edge Slope Width, Ringing Variable Mean, SNR, Contrast Ratio) are used to assess the impact of the algorithm. No external ground truth (like pathology) is explicitly mentioned here, as the focus is on image quality enhancement.
- Clinical Image Review Study: Expert consensus/opinion (modified 5-point Likert scale scores from 14 board-certified radiologists/cardiologists) was used as the ground truth for image quality, sharpness, ringing, SNR, and feature conspicuity, compared against images reconstructed with conventional methods. No pathology or outcomes data is mentioned as ground truth.
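For the "known ADC values" check, ADC follows from the mono-exponential diffusion model S(b) = S0·exp(−b·ADC), so a phantom's measured ADC can be compared against its nominal value. A two-point sketch under that model (b in s/mm²):

```python
import numpy as np

def adc_two_point(s0: np.ndarray, sb: np.ndarray, b: float) -> np.ndarray:
    """ADC map (mm^2/s) from b=0 and b=b magnitude images."""
    eps = 1e-6  # guards against division by zero in background air
    return np.log((s0 + eps) / (sb + eps)) / b
```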
8. Sample Size for the Training Set
The document explicitly states that the 75 unique subjects used in the PIQE clinical image review study were "separate from the training data sets." However, it does not specify the sample size for the training set used for the PIQE deep learning model.
9. How the Ground Truth for the Training Set Was Established
The document does not provide information on how the ground truth for the training set for PIQE was established. It only mentions that the study test data sets were separate from the training data sets.
(244 days)
The uMR Ultra system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces sagittal, transverse, coronal, and oblique cross sectional images, and spectroscopic images, and that display internal anatomical structure and/or function of the head, body and extremities. These images and the physical parameters derived from the images when interpreted by a trained physician yield information that may assist the diagnosis. Contrast agents may be used depending on the region of interest of the scan.
uMR Ultra is a 3T superconducting magnetic resonance diagnostic device with a 70cm size patient bore and 2 channel RF transmit system. It consists of components such as magnet, RF power amplifier, RF coils, gradient power amplifier, gradient coils, patient table, spectrometer, computer, equipment cabinets, power distribution system, internal communication system, and vital signal module etc. uMR Ultra is designed to conform to NEMA and DICOM standards.
Here's a breakdown of the acceptance criteria and study details for the uMR Ultra device, based on the provided FDA 510(k) clearance letter.
1. Table of Acceptance Criteria and Reported Device Performance
Given the nature of the document, which focuses on device clearance, multiple features are discussed. I will present the acceptance criteria and results for the AI-powered features, as these are the most relevant to the "AI performance" aspect.
Acceptance Criteria and Device Performance for AI-Enabled Features
| AI-Enabled Feature | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| ACS | - Ratio of error: NRMSE(output)/NRMSE(input) is always less than 1 (see the NRMSE sketch after this table). - ACS has higher SNR than CS. - ACS has higher (standard deviation (SD) / mean value (S)) values than CS. - Bland-Altman analysis of image intensities acquired using fully sampled and ACS shows less than 1% bias, with all sample points falling within the 95% confidence interval. - Measurement differences on ACS and fully sampled images of the same structures under 5% are acceptable. - Radiologists rate all ACS images with equivalent or higher scores in terms of diagnosis quality. | - Pass - Pass - Pass - Pass - Pass - Verified that ACS meets the requirements of clinical diagnosis. All ACS images were rated with equivalent or higher scores in terms of diagnosis quality. |
| DeepRecon | - DeepRecon images achieve higher SNR compared to NADR images. - Uniformity difference between DeepRecon images and NADR images under 5%. - Intensity difference between DeepRecon images and NADR images under 5%. - Measurements on NADR and DeepRecon images of same structures, measurement difference under 5%. - Radiologists rate all DeepRecon images with equivalent or higher scores in terms of diagnosis quality. | - NADR: 343.63, DeepRecon: 496.15 (PASS) - 0.07% (PASS) - 0.2% (PASS) - 0% (PASS) - Verified that DeepRecon meets the requirements of clinical diagnosis. All DeepRecon images were rated with equivalent or higher scores in terms of diagnosis quality. |
| EasyScan | No Fail cases and auto position success rate P1/(P1+P2+F) exceeds 80%. (P1: Pass with auto positioning; P2: Pass with user adjustment; F: Fail) | 99.6% |
| t-ACS | - AI prediction (AI module output) much closer to reference compared to AI module input images. - Better consistency between t-ACS and reference than between CS and reference. - No large structural difference appeared between t-ACS and reference. - Motion-time curves and Bland-Altman analysis consistency between t-ACS and reference. | - Pass - Pass - Pass - Pass |
| AiCo | - AiCo images exhibit improved PSNR and SSIM compared to the originals. - No significant structural differences from the gold standard. - Radiologists confirm image quality is diagnostically acceptable, fewer motion artifacts, and greater benefits for clinical diagnosis. | - Pass - Pass - Confirmed. |
| SparkCo | - Average detection accuracy needs to be > 90%. - Average PSNR of spark-corrected images needs to be higher than that of the spark images. - Spark artifacts need to be reduced or corrected after enabling SparkCo. | - 94% - 1.6 dB higher - Successfully corrected |
| ImageGuard | Success rate P/(P+F) exceeds 90%. (P: Pass if prompt appears for motion / no prompt for no motion; F: Fail if prompt doesn't appear for motion / prompt appears for no motion) | 100% |
| EasyCrop | No Fail cases and pass rate P1/(P1+P2+F) exceeds 90%. (P1: Other peripheral tissues cropped, meets user requirements; P2: Cropped images don't meet user requirements, but can be re-cropped; F: EasyCrop fails or original images not saved) | 100% |
| EasyFACT | Satisfied and Acceptable ratio (S+A)/(S+A+F) exceeds 95%. (S: All ROIs placed correctly; A: Fewer than five ROIs placed correctly; F: ROIs positioned incorrectly or none placed) | 100% |
| Auto TI Scout | Average frame difference between auto-calculated TI and gold standard is ≤ 1 frame, and maximum frame difference is ≤ 2 frames. | Average: 0.37-0.44 frames, Maximum: 1-2 frames (PASS) |
| Inline MOCO | Average Dice coefficient of the left ventricular myocardium after motion correction is > 0.87. | Cardiac Perfusion Images: 0.92 Cardiac Dark Blood Images: 0.96 |
| Inline ED/ES Phases Recognition | The average error between the phase indices calculated by the algorithm for the ED and ES of test data and the gold standard phase indices does not exceed 1 frame. | 0.13 frames |
| Inline ECV | No failure cases, satisfaction rate S/(S+A+F) > 95%. (S: Segmentation adheres to myocardial boundary, blood pool ROI correct; A: Small missing/redundant areas but blood pool ROI correct; F: Myocardial mask fails or blood pool ROI incorrect) | 100% |
| EasyRegister (Height Estimation) | PH5 (Percentage of height error within 5%); PH15 (Percentage of height error within 15%); MEAN_H (Average error of height). (Specific numerical criteria not explicitly stated beyond these metrics) | PH5: 92.4% PH15: 100% MEAN_H: 31.53mm |
| EasyRegister (Weight Estimation) | PW10 (Percentage of weight error within 10%); PW20 (Percentage of weight error within 20%); MEAN_W (Average error of weight). (Specific numerical criteria not explicitly stated beyond these metrics) | PW10: 68.64% PW20: 90.68% MEAN_W: 6.18kg |
| EasyBolus | No Fail cases and success rate (P1+P2)/(P1+P2+F) reaches 100%. (P1: Monitoring point positioning meets user requirements, frame difference ≤ 1 frame; P2: Monitoring point positioning meets user requirements, frame difference = 2 frames; F: Auto position fails or frame difference > 2 frames) | P1: 80% P2: 20% Total Failure Rate: 0% Pass: 100% |
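The ACS module-verification criterion, NRMSE(output)/NRMSE(input) < 1, simply requires that the AI module bring the image closer to the fully sampled reference than its undersampled input was. A sketch of that check (NRMSE normalized by the reference norm, an assumption since the document does not define it):

```python
import numpy as np

def nrmse(img: np.ndarray, ref: np.ndarray) -> float:
    """Root-mean-square error normalized by the norm of the reference."""
    return float(np.linalg.norm(img - ref) / np.linalg.norm(ref))

def acs_module_passes(module_in: np.ndarray, module_out: np.ndarray,
                      fully_sampled: np.ndarray) -> bool:
    """True if the AI module moved the image toward the reference."""
    return nrmse(module_out, fully_sampled) < nrmse(module_in, fully_sampled)
```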
For the rest of the questions, I will consolidate the information where possible, as some aspects apply across multiple AI features.
2. Sample Sizes Used for the Test Set and Data Provenance
- ACS: 749 samples from 25 volunteers. Diverse demographic distributions covering various genders, age groups, ethnicity (White, Asian, Black), and BMI (Underweight, Healthy, Overweight/Obesity). Data collected from various clinical sites during separated time periods.
- DeepRecon: 25 volunteers (nearly 2200 samples). Diverse demographic distributions covering various genders, age groups, ethnicity (White, Asian, Black), and BMI. Data collected from various clinical sites during separated time periods.
- EasyScan: 444 cases from 116 subjects. Diverse demographic distributions covering various genders, age groups, and ethnicities. Data acquired from UIH MRI equipment (1.5T and 3T). Data provenance not explicitly stated (e.g., country of origin), but given the company location (China) and "U.S. credentials" for evaluators, it likely includes data from both. The document states "The testing dataset was collected independently from the training dataset".
- t-ACS: 1173 cases from 60 volunteers. Diverse demographic distributions covering various genders, age groups, ethnicities (White, Black, Asian) and BMI. Data acquired by uMR Ultra scanners. Country of origin not explicitly stated beyond the diverse demographics.
- AiCo: 218 samples from 24 healthy volunteers. Diverse demographic distributions covering various genders, age groups, BMI (Under/healthy weight, Overweight/Obesity), and ethnicity (White, Black, Asian). Data provenance not explicitly stated.
- SparkCo: 59 cases from 15 patients for real-world spark raw data testing. Diverse demographic distributions including gender, age, and BMI (Underweight, Healthy, Overweight, Obesity); ethnicity is recorded as Asian, with White listed as "N.A.", implying it was not tested because spark detection is independent of ethnicity. Data acquired by uMR 1.5T and uMR 3T scanners.
- ImageGuard: 191 cases from 80 subjects. Diverse demographic distributions covering various genders, age groups, and ethnicities (White, Black, Asian). Data acquired from UIH MRI equipment (1.5T and 3T).
- EasyCrop: Tested on the 5 intended imaging body parts with N=65, implying 65 cases/scans, possibly from fewer than 65 distinct subjects if some were scanned more than once. Diverse demographic distributions covering various genders, age groups, and ethnicities (Asian, Black, White). Data acquired from UIH MRI equipment (1.5T and 3T).
- EasyFACT: 25 cases from 25 volunteers. Diverse demographic distributions covering various genders, age groups, weight, and ethnicity (Asian, White, Black).
- Auto TI Scout: 27 patients. Diverse demographic distributions covering various genders, age groups, ethnicity (Asian, White), and BMI. Data acquired from 1.5T and 3T scanners.
- Inline MOCO: Cardiac Perfusion Images: 105 cases from 60 patients. Cardiac Dark Blood Images: 182 cases from 33 patients. Diverse demographic distributions covering age, gender, ethnicity (Asian, White, Black, Hispanic), BMI, field strength, and disease conditions (Positive, Negative, Unknown).
- Inline ED/ES Phases Recognition: 95 cases from 56 volunteers, covering various genders, age groups, field strength, disease conditions (NOR, MINF, DCM, HCM, ARV), and ethnicity (Asian, White, Black).
- Inline ECV: 90 images from 28 patients. Diverse demographic distributions covering gender, age, BMI, field strength, ethnicity (Asian, White), and health status (Negative, Positive, Unknown).
- EasyRegister (Height/Weight Estimation): 118 cases from 63 patients. Diverse ethnic groups (Chinese, US, France, Germany).
- EasyBolus: 20 subjects. Diverse demographic distributions covering gender, age, field strength, and ethnicity (Asian).
Data Provenance (Retrospective/Prospective and Country of Origin):
The document states "The testing dataset was collected independently from the training dataset, with separated subjects and during different time periods." This implies a prospective collection for validation that is distinct from the training data. For ACS and DeepRecon, it explicitly mentions "US subjects" for some evaluations, but for many features, the specific country of origin for the test set is not explicitly stated beyond "diverse ethnic groups" or "Asian" which could be China (where the company is based) or other Asian populations. The use of "U.S. board-certified radiologists" and "licensed MRI technologist with U.S. credentials" for evaluation suggests the data is intended to be representative of, or directly includes, data relevant to the U.S. clinical context.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- ACS & DeepRecon: Evaluated by "American Board of Radiologists certificated physicians" (plural, implying multiple, at least 2). Not specified how many exactly, but strong qualifications.
- EasyScan, ImageGuard, EasyCrop, EasyBolus: Evaluated by "licensed MRI technologist with U.S. credentials." For EasyBolus, it specifies "certified professionals in the United States." Number not explicitly stated beyond "the" technologist/professionals, but implying multiple for robust evaluation.
- Inline MOCO & Inline ECV: Ground truth annotations done by a "well-trained annotator" and "finally, all ground truth was evaluated by three licensed physicians with U.S. credentials." This indicates a 3-expert consensus/adjudication.
- SparkCo: "One experienced evaluator" for subjective image quality improvement.
- For other features (t-ACS, EasyFACT, Auto TI Scout, Inline ED/ES Phases Recognition, EasyRegister), the ground truth seems to be based on physical measurements (for EasyRegister) or computational metrics (for t-ACS based on fully-sampled images, and for accuracy of ROI placement against defined standards), rather than human expert adjudication for ground truth.
4. Adjudication Method (e.g., 2+1, 3+1, none) for the Test Set
- Inline MOCO & Inline ECV: "Evaluated by three licensed physicians with U.S. credentials." This implies a 3-expert consensus method for ground truth establishment.
- ACS, DeepRecon, AiCo: "Evaluated by American Board of Radiologists certificated physicians" (plural). While not explicitly stated as 2+1 or 3+1, it suggests a multi-reader review, where consensus was likely reached for the reported diagnostic quality.
- SparkCo: "One experienced evaluator" was used for subjective evaluation, implying no formal multi-reader adjudication for this specific metric.
- For features like EasyScan, ImageGuard, EasyCrop, EasyBolus (evaluated by MRI technologists) and those relying on quantitative metrics against a reference (t-ACS, EasyFACT, Auto TI Scout, EasyRegister, Inline ED/ES Phases Recognition), the "ground truth" is either defined by the system's intended function (e.g., correct auto-positioning) or a mathematically derived reference, so a traditional human adjudication method is not applicable in the same way as for diagnostic image interpretation.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
The document does not explicitly state that a formal MRMC comparative effectiveness study was performed to quantify the effect size of how much human readers improve with AI vs. without AI assistance.
Instead, the evaluations for ACS, DeepRecon, and AiCo involve "American Board of Radiologists certificated physicians" who "verified that [AI feature] meets the requirements of clinical diagnosis. All [AI feature] images were rated with equivalent or higher scores in terms of diagnosis quality." For AiCo, they confirmed images "exhibit fewer motion artifacts and offer greater benefits for clinical diagnosis." This is a qualitative assessment of diagnostic quality by experts, but not a comparative effectiveness study in the sense of measuring reader accuracy or confidence change with AI assistance.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was Done
Yes, for many of the AI-enabled features, a standalone performance evaluation was conducted:
- ACS: Performance was evaluated by comparing quantitative metrics (NRMSE, SNR, Resolution, Contrast, Uniformity, Structure Measurement) against fully-sampled images or CS. This is a standalone evaluation.
- DeepRecon: Quantitative metrics (SNR, uniformity, contrast, structure measurement) were compared between DeepRecon and NADR (without DeepRecon) images. This is a standalone evaluation.
- t-ACS: Quantitative tests (MAE, PSNR, SSIM, structural measurements, motion-time curves) were performed comparing t-ACS and CS results against a reference. This is a standalone evaluation.
- AiCo: PSNR and SSIM values were quantitatively compared, and structural dimensions were assessed, between AiCo processed images and original/motionless reference images. This is a standalone evaluation.
- SparkCo: Spark detection accuracy was calculated, and PSNR of spark-corrected images was compared to original spark images. This is a standalone evaluation.
- Inline MOCO: Evaluated using the Dice coefficient, a quantitative metric for segmentation overlap (a minimal implementation is sketched after this list). This is a standalone evaluation.
- Inline ED/ES Phases Recognition: Evaluated by quantifying the error between algorithm output and gold standard phase indices. This is a standalone evaluation.
- Inline ECV: Evaluated by quantitative scoring for segmentation accuracy (S, A, F criteria). This is a standalone evaluation.
- EasyRegister (Height/Weight): Evaluated by quantitative error metrics (PH5, PH15, MEAN_H; PW10, PW20, MEAN_W) against physical measurements. This is a standalone evaluation.
Features like EasyScan, ImageGuard, EasyCrop, and EasyBolus involve automated workflow assistance where the direct "diagnostic" outcome isn't solely from the algorithm, but the automated function's performance is evaluated in a standalone manner against defined success criteria.
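The Dice coefficient used for Inline MOCO measures the overlap between the algorithm's left-ventricular myocardium segmentation after motion correction and the annotated ground truth; the acceptance threshold (average > 0.87) is applied across test cases. A minimal implementation:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return float(2.0 * np.logical_and(pred, truth).sum() / denom) if denom else 1.0
```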
7. The Type of Ground Truth Used
The type of ground truth varies depending on the specific AI feature:
- Reference/Fully-Sampled Data:
- ACS, DeepRecon, t-ACS, AiCo: Fully-sampled k-space data transformed to image space served as "ground-truth" for training and as a reference for quantitative performance metrics in testing. For AiCo, "motionless data" served as gold standard.
- SparkCo: Simulated spark artifacts generated from "spark-free raw data" provided ground truth for spark point locations in training.
- Expert Consensus/Subjective Evaluation:
- ACS, DeepRecon, AiCo: "American Board of Radiologists certificated physicians" provided qualitative assessment of diagnostic image quality ("equivalent or higher scores," "diagnostically acceptable," "fewer motion artifacts," "greater benefits for clinical diagnosis").
- EasyScan, ImageGuard, EasyCrop, EasyBolus: "Licensed MRI technologist with U.S. credentials" or "certified professionals in the United States" performed subjective evaluation against predefined success criteria for the workflow functionality.
- SparkCo: One experienced evaluator for subjective image quality improvement.
- Anatomical/Physiological Measurements / Defined Standards:
- EasyFACT: Defined criteria for ROI placement within liver parenchyma, avoiding borders/vascular structures.
- Auto TI Scout, Inline ED/ES Phases Recognition: Gold standard phase indices were presumably established by expert review or a reference method.
- Inline MOCO & Inline ECV: Ground truth for cardiac left ventricular myocardium segmentation was established by a "well-trained annotator" and "evaluated by three licensed physicians with U.S. credentials" (expert consensus based on anatomical boundaries).
- EasyRegister (Height/Weight Estimation): "Precisely measured height/weight value" using "physical examination standards."
8. The Sample Size for the Training Set
- ACS: 1,262,912 samples (collected from variety of anatomies, image contrasts, and acceleration factors, scanned by UIH MRI systems).
- DeepRecon: 165,837 samples (collected from 264 volunteers, scanned by UIH MRI systems for multiple body parts and clinical protocols).
- EasyScan: Training data collection not explicitly detailed in the same way as ACS/DeepRecon (refers to "collected independently from the training dataset").
- t-ACS: Datasets collected from 108 volunteers ("large number of samples").
- AiCo: 140,000 images collected from 114 volunteers across multiple body parts and clinical protocols.
- SparkCo: 24,866 spark slices generated from 61 cases collected from 10 volunteers.
- EasyFACT, Auto TI Scout, Inline MOCO, Inline ED/ES Phases Recognition, Inline ECV, EasyRegister, EasyBolus: The document states that training data was independent of testing data but does not provide specific sample sizes for the training datasets for these features.
9. How the Ground Truth for the Training Set was Established
- ACS, DeepRecon, t-ACS, AiCo: "Fully-sampled k-space data were collected and transformed to image space as the ground-truth." For DeepRecon specifically, "multiple-averaged images with high-resolution and high SNR were collected as the ground-truth images." For AiCo, "motionless data" served as gold standard. All training data were "manually quality controlled."
- SparkCo: "The training dataset... was generated by simulating spark artifacts from spark-free raw data... with the corresponding ground truth (i.e., the location of spark points)."
- Inline MOCO & Inline ECV: The document states "all ground truth was annotated by a well-trained annotator. The annotator used an interactive tool to observe the image, and then labeled the left ventricular myocardium in the image."
- For EasyScan, EasyFACT, Auto TI Scout, Inline ED/ES Phases Recognition, EasyRegister, and EasyBolus, training ground truth establishment is not explicitly detailed; the document states only that the testing data were independent of the training data. For EasyRegister, it implies physical measurements were the basis for ground truth.
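Because isolated high-amplitude k-space points are what produce spark (spike) artifacts, training pairs of the kind described for SparkCo can be manufactured by corrupting clean raw data and keeping the corruption coordinates as labels. A toy version (mechanics entirely assumed, not the manufacturer's simulation code):

```python
import numpy as np

def inject_sparks(kspace: np.ndarray, n_spikes: int = 3, gain: float = 50.0,
                  seed=None):
    """Corrupt random k-space points with high-amplitude spikes.

    Returns (corrupted k-space, ground-truth spike coordinates), mirroring
    the (input, label) pairs described for SparkCo training.
    """
    rng = np.random.default_rng(seed)
    corrupted = kspace.copy()
    rows = rng.integers(0, kspace.shape[0], n_spikes)
    cols = rng.integers(0, kspace.shape[1], n_spikes)
    corrupted[rows, cols] += gain * np.abs(kspace).max()
    return corrupted, np.stack([rows, cols], axis=1)
```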
(258 days)
The uMR 680 system is indicated for use as a magnetic resonance diagnostic device (MRDD) that produces sagittal, transverse, coronal, and oblique cross sectional images, and spectroscopic images, and that display internal anatomical structure and/or function of the head, body and extremities.
These images and the physical parameters derived from the images when interpreted by a trained physician yield information that may assist the diagnosis. Contrast agents may be used depending on the region of interest of the scan.
The uMR 680 is a 1.5T superconducting magnetic resonance diagnostic device with a 70cm size patient bore. It consists of components such as magnet, RF power amplifier, RF coils, gradient power amplifier, gradient coils, patient table, spectrometer, computer, equipment cabinets, power distribution system, internal communication system, and vital signal module etc. The uMR 680 Magnetic Resonance Diagnostic Device is designed to conform to NEMA and DICOM standards.
This traditional 510(k) is to request modifications for the cleared uMR 680(K240744). The modifications performed on the uMR 680 in this submission are due to the following changes that include:
(1) Addition of RF coils and corresponding accessories: Breast Coil -12, Biopsy Configuration, Head Coil-16, Positioning Couch-top, Coil Support.
(2) Deletion of VSM (Wireless UIH Gating Unit REF 453564324621, ECG module Ref 989803163121, SpO2 module Ref 989803163111).
(3) Modification of the dimensions of the detachable table: from width 826 mm, height 880 mm, length 2578 mm to width 810 mm, height 880 mm, length 2505 mm.
(4) Addition and modification of pulse sequences
a) New sequences: gre_snap, gre_quick_4dncemra, gre_pass, gre_mtp, gre_trass, epi_dwi_msh, epi_dti_msh, svs_hise.
b) Added associated options for certain sequences: fse(add Silicone-Only Imaging, MicroView, MTC, MultiBand), fse_arms(add Silicone-Only Imaging), fse_ssh(add Silicone-Only Imaging), fse_mx(add CEST, T1rho, MicroView, MTC), fse_arms_dwi(add MultiBand), asl_3d(add multi-PLD), gre(add T1rho, MTC, output phase image), gre_fsp(add FSP+), gre_bssfp(add CASS, TI Scout), gre_fsp_c(add 3D LGE, DB/GB PSIR), gre_bssfp_ucs(add real time cine), gre_fq(add 4D Flow), epi_dwi(add IVIM), epi_dti(add DKI, DSI).
c) Added additional accessory equipment required for certain sequences: gre_bssfp(add Virtual ECG Trigger).
d) Name change of certain sequences: gre_fine(old name: gre_bssfp_fi).
e) Added applicable body parts: gre_ute, gre_fine, fse_mx.
(5) Addition of imaging reconstruction methods: AI-assisted Compressed Sensing (ACS), Spark artifact Correction (SparkCo).
(6) Addition of imaging processing methods: Inline Cardiac Function, Inline ECV, Inline MRS, Inline MOCO, 4D Flow, SNAP, CEST, T1rho, FSP+, CASS, PASS, MTP.
(7) Addition of workflow features: TI Scout, EasyCrop, ImageGuard, Mocap, EasyFACT, Auto Bolus tracker, Breast Biopsy and uVision.
(8) Modification of workflow features: EasyScan(add applicable body parts)
The modification does not affect the intended use or alter the fundamental scientific technology of the device.
The provided FDA 510(k) clearance letter and summary for the uMR 680 Magnetic Resonance Imaging System outlines performance data for several new features and algorithms.
Here's an analysis of the acceptance criteria and the studies that prove the device meets them for the AI-assisted Compressed Sensing (ACS), SparkCo, Inline ED/ES Phases Recognition, and Inline MOCO algorithms.
1. Table of Acceptance Criteria and Reported Device Performance
| Feature/Algorithm | Evaluation Item | Acceptance Criteria | Reported Performance |
|---|---|---|---|
| AI-assisted Compressed Sensing (ACS) | AI Module Verification Test | The ratio of error: NRMSE(output)/ NRMSE(input) is always less than 1. | Pass |
| Image SNR | ACS has higher SNR than CS. | Pass (ACS shown to perform better than CS in SNR) | |
| Image Resolution | ACS has higher (standard deviation (SD) / mean value(S)) values than CS. | Pass (ACS shown to perform better than CS in resolution) | |
| Image Contrast | Bland-Altman analysis of image intensities acquired using fully sampled and ACS showed less than 1% bias, with all sample points falling within the 95% confidence interval (see the Bland-Altman sketch after this table). | Pass (less than 1% bias, all sample points within 95% confidence interval) | |
| Image Uniformity | ACS achieved significantly same image uniformities as fully sampled image. | Pass | |
| Structure Measurement | Measurements differences on ACS and fully sampled images of same structures under 5% is acceptable. | Pass | |
| Clinical Evaluation | All ACS images were rated with equivalent or higher scores in terms of diagnosis quality. | "All ACS images were rated with equivalent or higher scores in terms of diagnosis quality" (implicitly, it passed) | |
| SparkCo | Spark Detection Accuracy | The average detection accuracy needs to be larger than 90%. | The average detection accuracy is 94%. |
| Spark Correction Performance (Simulated) | The average PSNR of spark-corrected images needs to be higher than the spark images. Spark artifacts need to be reduced or corrected. | The average PSNR of spark-corrected images is 1.6 dB higher than the spark images. The images with spark artifacts were successfully corrected after enabling SparkCo. | |
| Spark Correction Performance (Real-world) | Spark artifacts need to be reduced or corrected (evaluated by one experienced evaluator assessing image quality improvement). | The images with spark artifacts were successfully corrected after enabling SparkCo. | |
| Inline ED/ES Phases Recognition | Error between algorithm and gold standard | The average error does not exceed 1 frame. | The error between the frame indexes calculated by the algorithm for the ED and ES of all test data and the gold standard frame index is 0.13 frames, which does not exceed 1 frame. |
| Inline MOCO | Dice Coefficient (Left Ventricular Myocardium after Motion Correction) Cardiac Perfusion Images | The average Dice coefficient of the left ventricular myocardium after motion correction is greater than 0.87. | The average Dice coefficient of the left ventricular myocardium after motion correction is 0.92, which is greater than 0.87. Subgroup analysis also showed good generalization: - Age: 0.92-0.93 - Gender: 0.92 - Ethnicity: 0.91-0.92 - BMI: 0.91-0.95 - Magnetic field strength: 0.92-0.93 - Disease conditions: 0.91-0.93 |
| Dice Coefficient (Left Ventricular Myocardium after Motion Correction) Cardiac Dark Blood Images | The average Dice coefficient of the left ventricular myocardium after motion correction is greater than 0.87. | The average Dice coefficient of the left ventricular myocardium after motion correction is 0.96, which is greater than 0.87. Subgroup analysis also showed good generalization: - Age: 0.95-0.96 - Gender: 0.96 - Ethnicity: 0.95-0.96 - BMI: 0.96-0.98 - Magnetic field strength: 0.96 - Disease conditions: 0.96-0.97 |
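The Bland-Altman criterion in the table compares paired intensity samples from ACS and fully sampled images: the mean difference (bias) must stay under 1%, and every sample point must fall within the 95% limits of agreement. A generic sketch (percent bias relative to mean intensity is an assumption; the document does not give the normalization):

```python
import numpy as np

def bland_altman_check(acs: np.ndarray, full: np.ndarray) -> bool:
    """Bias < 1% of mean intensity and all points within the 95% limits."""
    diff = acs.ravel() - full.ravel()
    bias = diff.mean()
    sd = diff.std(ddof=1)
    lo, hi = bias - 1.96 * sd, bias + 1.96 * sd
    pct_bias = abs(bias) / ((acs.mean() + full.mean()) / 2.0)
    return pct_bias < 0.01 and bool(np.all((diff >= lo) & (diff <= hi)))
```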
2. Sample Size Used for the Test Set and Data Provenance
- AI-assisted Compressed Sensing (ACS):
- Sample Size: 1724 samples from 35 volunteers.
- Data Provenance: Diverse demographic distributions (gender, age groups, ethnicity, BMI) covering various clinical sites and separated time periods. Implied to be prospective or a carefully curated retrospective set, collected specifically for validation on the uMR 680 system, and independent of training data.
- SparkCo:
- Simulated Spark Testing Dataset: 159 spark slices (generated from spark-free raw data).
- Real-world Spark Testing Dataset: 59 cases from 15 patients.
- Data Provenance: Real-world data acquired from uMR 1.5T and uMR 3T scanners, covering representative clinical protocols. The report specifies "Asian" for 100% of the real-world dataset's ethnicity, noting that performance is "irrelevant with human ethnicity" due to the nature of spark signal detection. This is retrospective data.
- Inline ED/ES Phases Recognition:
- Sample Size: 95 cases from 56 volunteers.
- Data Provenance: Includes various ages, genders, field strengths (1.5T, 3.0T), disease conditions (NOR, MINF, DCM, HCM, ARV), and ethnicities (Asian, White, Black). The data is independent of the training data. Implied to be retrospective from UIH MRI systems.
- Inline MOCO:
- Sample Size: 287 cases in total (105 cardiac perfusion images from 60 patients, 182 cardiac dark blood images from 33 patients).
- Data Provenance: Acquired from 1.5T and 3T magnetic resonance imaging equipment from UIH. Covers various ages, genders, ethnicities (Asian, White, Black, Hispanic), BMI, field strengths (1.5T, 3.0T), and disease conditions (Positive, Negative, Unknown). The data is independent of the training data. Implied to be retrospective from UIH MRI systems.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and the Qualifications of Those Experts
- AI-assisted Compressed Sensing (ACS):
- Number of Experts: More than one (plural "radiologists" used).
- Qualifications: Physicians certified by the American Board of Radiology.
- SparkCo:
- Number of Experts: One expert for real-world SparkCo evaluation.
- Qualifications: "one experienced evaluator." (Specific qualifications like board certification or years of experience are not provided for this specific evaluator).
- Inline ED/ES Phases Recognition:
- Number of Experts: Not explicitly stated for ground truth establishment ("gold standard phase indices"). It implies a single, established method or perhaps a consensus by a team, but details are missing.
- Inline MOCO:
- Number of Experts: Three licensed physicians.
- Qualifications: U.S. credentials.
4. Adjudication Method for the Test Set
- AI-assisted Compressed Sensing (ACS): Not explicitly stated, but implies individual review by "radiologists" to rate diagnostic quality.
- SparkCo: For the real-world dataset, evaluation by "one experienced evaluator."
- Inline ED/ES Phases Recognition: Not explicitly stated; "gold standard phase indices" are referenced, implying a pre-defined or established method without detailing a multi-reader adjudication process.
- Inline MOCO: "Finally, all ground truth was evaluated by three licensed physicians with U.S. credentials." This suggests an adjudication or confirmation process, but the specific method (e.g., 2+1, consensus) is not detailed beyond "evaluated by."
5. If a Multi-Reader, Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance
- No MRMC comparative effectiveness study was explicitly described to evaluate human reader improvement with AI assistance. The described studies focus on the standalone performance of the algorithms or a qualitative assessment of images by radiologists for diagnostic quality.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, standalone performance was done for all listed algorithms.
- ACS: Evaluated quantitatively (SNR, Resolution, Contrast, Uniformity, Structure Measurement) and then qualitatively by radiologists. The quantitative metrics are standalone.
- SparkCo: Quantitative metrics (Detection Accuracy, PSNR) and qualitative assessment by an experienced evaluator. The quantitative metrics are standalone.
- Inline ED/ES Phases Recognition: Evaluated quantitatively as the error between algorithmic output and gold standard. This is a standalone performance metric.
- Inline MOCO: Evaluated using the Dice coefficient, which is a standalone quantitative metric comparing algorithm output to ground truth.
7. The Type of Ground Truth Used
- AI-assisted Compressed Sensing (ACS):
- Quantitative: Fully-sampled k-space data transformed to image space.
- Clinical: Radiologist evaluation ("American Board of Radiologists certificated physicians").
- SparkCo:
- Spark Detection Module: Location of spark points (ground truth for simulated data).
- Spark Correction Module: Visual assessment by "one experienced evaluator."
- Inline ED/ES Phases Recognition: "Gold standard phase indices" (method for establishing this gold standard is not detailed, but implies expert-derived or a highly accurate reference).
- Inline MOCO: Left ventricular myocardium segmentation annotated by a "well-trained annotator" and "evaluated by three licensed physicians with U.S. credentials." This is an expert consensus/pathology-like ground truth.
8. The Sample Size for the Training Set
- AI-assisted Compressed Sensing (ACS): 1,262,912 samples (from a variety of anatomies, image contrasts, and acceleration factors).
- SparkCo: 24,866 spark slices (generated from 61 spark-free cases from 10 volunteers).
- Inline ED/ES Phases Recognition: Not explicitly provided, but stated to be "independent of the data used to test the algorithm."
- Inline MOCO: Not explicitly provided, but stated to be "independent of the data used to test the algorithm."
9. How the Ground Truth for the Training Set Was Established
- AI-assisted Compressed Sensing (ACS): Fully-sampled k-space data were collected and transformed to image space as the ground-truth. All data were manually quality controlled.
- SparkCo: "The training dataset for the AI module in SparkCo was generated by simulating spark artifacts from spark-free raw data... a total of 24,866 spark slices, along with the corresponding ground truth (i.e., the location of spark points), were generated for training." This indicates a hybrid approach using real spark-free data to simulate and generate the ground truth for spark locations.
- Inline ED/ES Phases Recognition: Not explicitly provided.
- Inline MOCO: Not explicitly provided.