
510(k) Data Aggregation (3 results)

    K Number: K190017
    Device Name: LiverMultiScan (LMSv3)
    Date Cleared: 2019-06-27 (175 days)
    Regulation Number: 892.1000
    Intended Use

LiverMultiScan (LMSv3) is indicated for use as a magnetic resonance diagnostic device software application for noninvasive liver evaluation that enables the generation, display and review of 2D magnetic resonance medical image data and pixel maps for MR relaxation times.

    LiverMultiScan (LMSv3) is designed to utilize DICOM 3.0 compliant magnetic resonance image datasets, acquired from compatible MR Systems, to display the internal structure of the abdomen including the liver. Other physical parameters derived from the images may also be produced.

LiverMultiScan (LMSv3) provides a number of tools, such as automated liver segmentation and region of interest (ROI) placements, to be used for the assessment of selected regions of an image. Quantitative assessment of selected regions includes the determination of triglyceride fat fraction in the liver (PDFF), T2* and iron-corrected T1 (cT1) measurements. PDFF may optionally be computed using the LMS IDEAL or three-point Dixon methodology.

    These images and the physical parameters derived from the images, when interpreted by a trained clinician, yield information that may assist in diagnosis.

    Device Description

    LiverMultiScan (LMSv3) is a standalone software application for displaying 2D Magnetic Resonance (MR) medical image data acquired from compatible MR Scanners. LiverMultiScan runs on general-purpose workstations with a colour monitor, keyboard and mouse.

    The main functionality of LiverMultiScan (LMSv3) includes:

    • Reading DICOM 3.0 compliant datasets stored on workstations, and display of the data acquisition information.
    • Post-processing of MRI data to generate parametric maps of Proton Density Fat Fraction (PDFF), T2*, T1 and iron-corrected T1 (cT1) of the liver (an illustrative sketch of this kind of fit follows this list).
    • Quantification and calculation of PDFF, T2* and cT1 metrics using tools such as automatic liver segmentation and ROI (region of interest) placement.
    • Generation of a summary report presenting the quantitative assessment results of fat fraction in the liver (PDFF), T2* and iron-corrected T1 (cT1).
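
    The parametric-map bullet above describes standard MR relaxometry post-processing. As a minimal illustrative sketch only (the function name and the log-linear fitting choice are assumptions, not Perspectum's actual algorithm), a T2* map can be produced by fitting the mono-exponential decay S(TE) = S0 * exp(-TE / T2*) to multi-echo data pixel by pixel:

```python
# Illustrative sketch of per-pixel T2* mapping; not the device's actual method.
import numpy as np

def fit_t2star_map(images: np.ndarray, echo_times: np.ndarray) -> np.ndarray:
    """Fit S(TE) = S0 * exp(-TE / T2*) pixel-wise via log-linear least squares.

    images: (n_echoes, height, width) gradient-echo magnitude images.
    echo_times: (n_echoes,) echo times in milliseconds.
    Returns a (height, width) T2* map in milliseconds.
    """
    n, h, w = images.shape
    # ln S = ln S0 - TE / T2*: a straight line in TE for each pixel.
    log_s = np.log(np.clip(images, 1e-6, None)).reshape(n, -1)
    design = np.stack([echo_times, np.ones_like(echo_times)], axis=1)  # (n, 2)
    coeffs, *_ = np.linalg.lstsq(design, log_s, rcond=None)            # (2, h*w)
    slope = coeffs[0]
    with np.errstate(divide="ignore"):
        t2star = np.where(slope < 0, -1.0 / slope, np.inf)  # T2* = -1 / slope
    return t2star.reshape(h, w)
```

    A vectorised log-linear fit like this solves every pixel in a single least-squares call; production tools typically use more robust non-linear or noise-floor-corrected fits.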
    AI/ML Overview

    Here's a summary of the acceptance criteria and the study proving the device meets them, based on the provided text:

    1. Table of Acceptance Criteria and Reported Device Performance

    The document does not explicitly present a "table of acceptance criteria" with corresponding reported performance for specific metrics like sensitivity, specificity, accuracy, etc., as one might expect for a diagnostic device. Instead, the performance testing focuses on the accuracy, repeatability, reproducibility, and inter-/intra-operator variability of quantitative measurements, and particularly on demonstrating substantial equivalence to a predicate device (LMSv2).

    The acceptance criteria are implicitly defined by the reported performance ranges and the conclusion that the device performs "as well as its predicate" and that "all the testing results are well within the acceptance criteria."

    | Metric / Test Type | Acceptance Criteria (Implicit from Study Conclusion) | Reported Device Performance (95% CI Limits of Agreement) |
    | --- | --- | --- |
    | Phantom Testing | | |
    | T1 Accuracy | T1 measurements consistent with literature-reported underestimation for MOLLI techniques | Up to 18.89% lower than ground truth |
    | T2* Accuracy | Accurate over expected physiological range | -9.31% to 7.53% of ground truth |
    | DIXON PDFF Accuracy (<30%) | Relatively accurate over expected physiological range; minor deviations due to known fat bias | -7.37% to 1.72% |
    | DIXON PDFF Accuracy (>30%) | Relatively accurate over expected physiological range; minor deviations due to known fat bias | -28.93% to 6.83% |
    | IDEAL PDFF Accuracy (<30%) | Accurate over expected physiological range | -1.17% to 1.43% |
    | IDEAL PDFF Accuracy (>30%) | Accurate over expected physiological range | -5.05% to 10.70% |
    | T1 Repeatability (same scanner) | Highly repeatable | -13.88 to 14.47 ms |
    | T2* Repeatability (same scanner) | Highly repeatable | -0.89 to 1.43 ms |
    | DIXON PDFF Repeatability (<30%) | Highly repeatable | -0.66 to 0.82% |
    | DIXON PDFF Repeatability (>30%) | Highly repeatable | -2.11 to 1.96% |
    | IDEAL PDFF Repeatability (<30%) | Highly repeatable | -1.27 to 0.87% |
    | IDEAL PDFF Repeatability (>30%) | Highly repeatable | -3.80 to 1.93% |
    | T1 Reproducibility (different scanners) | Reproducible between different scanners | -2.66 to 10.78% |
    | T2* Reproducibility (different scanners) | Reproducible between different scanners | -3.43 to 2.42 ms |
    | DIXON PDFF Reproducibility (<30%) | Reproducible between different scanners | -1.86 to 5.95% |
    | DIXON PDFF Reproducibility (>30%) | Reproducible between different scanners | -8.64 to 23.52% |
    | IDEAL PDFF Reproducibility (<30%) | Reproducible between different scanners | -1.99 to 2.80% |
    | IDEAL PDFF Reproducibility (>30%) | Reproducible between different scanners | -13.46 to 6.98% |
    | In-Vivo Testing | | |
    | cT1 Repeatability | Highly repeatable | -94.38 to 63.38 ms (ROI); -76.93 to 59.39 ms (Segmentation) |
    | T2* Repeatability | Highly repeatable | -6.07 to 5.70 ms (ROI) |
    | DIXON PDFF Repeatability | Highly repeatable | -1.77 to 3.64% (ROI); -1.20 to 1.06% (Segmentation) |
    | IDEAL PDFF Repeatability | Highly repeatable | -1.92 to 1.54% (ROI); -1.83 to 1.28% (Segmentation) |
    | cT1 Reproducibility (between scanners) | Reproducible between scanners | -89.70 to 120.58 ms (ROI); -84.91 to 121.79 ms (Segmentation) |
    | T2* Reproducibility (between scanners) | Reproducible between scanners | -3.68 to 6.35 ms (ROI) |
    | DIXON PDFF Reproducibility (between scanners) | Reproducible between scanners | -6.21 to 2.63% (ROI); -3.14 to 0.88% (Segmentation) |
    | IDEAL PDFF Reproducibility (between scanners) | Reproducible between scanners | -2.66 to 2.77% (ROI); -1.74 to 1.21% (Segmentation) |
    | cT1 Intra-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -27.38 to 28.33 ms (ROI); -20.81 to 13.06 ms (Segmentation) |
    | T2* Intra-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -2.29 to 2.91 ms (ROI) |
    | DIXON PDFF Intra-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -0.78 to 1.90% (ROI); -0.29 to 0.45% (Segmentation) |
    | IDEAL PDFF Intra-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -1.26 to 1.05% (ROI); -0.16 to 0.14% (Segmentation) |
    | cT1 Inter-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -48.05 to 39.89 ms (ROI); -37.84 to 26.51 ms (Segmentation) |
    | T2* Inter-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -2.64 to 4.90 ms (ROI) |
    | DIXON PDFF Inter-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -2.27 to 4.57% (ROI); -0.55 to 1.22% (Segmentation) |
    | IDEAL PDFF Inter-Operator Variability | Variation well within prescribed criteria; minor additional variation for ROI method | -2.09 to 1.82% (ROI); -0.37 to 0.26% (Segmentation) |
    | cT1 Worst-Case Variability | Highly reproducible | -126.52 to 104.19 ms (ROI); -65.27 to 120.27 ms (Segmentation) |
    | T2* Worst-Case Variability | Highly reproducible | -3.68 to 6.35 ms (ROI) |
    | DIXON PDFF Worst-Case Variability | Highly reproducible | -2.04 to 0.76% (ROI); -2.72 to 1.24% (Segmentation) |
    | IDEAL PDFF Worst-Case Variability | Highly reproducible | -3.75 to 2.83% (ROI); -1.92 to 1.35% (Segmentation) |
    | Substantial Equivalence (LMSv3 vs. LMSv2.1) | Performs as well as its predicate | |
    | Phantom T1 | Negligible difference | -1.96 to 2.09 ms |
    | Phantom T2* | Negligible difference | -0.08 to 0.08 ms |
    | Phantom DIXON PDFF (<30%) | Within 1% of predicate | -0.18 to 0.10% |
    | Phantom DIXON PDFF (≥30%) | Within 2% of predicate | -1.62 to 1.02% |
    | In-vivo T1 | cT1 values within 30 ms of predicate | -28.08 to 28.73 ms |
    | In-vivo T2* | T2* values within 2 ms of predicate | -0.43 to 1.69 ms |
    | In-vivo DIXON PDFF | Negligible difference | -0.18 to 0.10% |
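
    All of the ranges above are 95% confidence-interval limits of agreement, i.e. Bland-Altman statistics on paired measurements. As a sketch of how such limits are conventionally computed (the submission does not spell out the exact statistical procedure used):

```python
# Conventional Bland-Altman 95% limits of agreement for paired measurements,
# e.g. two repeated scans of the same subjects. Illustrative values only.
import numpy as np

def limits_of_agreement(a: np.ndarray, b: np.ndarray) -> tuple[float, float]:
    """Return (lower, upper) 95% limits of agreement between paired arrays."""
    diff = a - b
    bias = diff.mean()        # mean difference (systematic offset)
    sd = diff.std(ddof=1)     # sample standard deviation of the differences
    return bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical example: T2* (ms) from two repeated scans of five subjects.
scan1 = np.array([23.1, 25.4, 21.8, 24.0, 22.5])
scan2 = np.array([23.5, 24.9, 22.1, 24.6, 22.0])
print(limits_of_agreement(scan1, scan2))
```

    A criterion such as "cT1 values within 30 ms of predicate" then amounts to checking that both limits fall inside ±30 ms, as the reported -28.08 to 28.73 ms range does.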

    2. Sample Sizes Used for the Test Set and Data Provenance

    • Phantom Testing: The sample size for phantom testing is not explicitly stated as a number of phantoms, but it involved phantoms "designed to mimic the human data but provide a wider range" and covered "worst-case scenarios." The data provenance is controlled laboratory conditions, using prepared phantoms.
    • In-Vivo Testing (Clinical): The study used "in-vivo volunteer data." The precise number of volunteers is not specified. The data provenance is implicitly prospective, as it refers to "volunteer scans." No country of origin is explicitly mentioned, but the submitter (Perspectum Diagnostics Ltd) is based in the United Kingdom.
    • Substantial Equivalence Testing: Used both "phantom measurements" and "in-vivo measurements." The specific sample sizes for these comparisons are not detailed.

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

    The document does not describe the establishment of a "ground truth" through expert consensus for the test set, specifically in the context of diagnostic accuracy. The performance testing focuses on the accuracy and variability of the quantitative measurements (cT1, T2*, PDFF) derived by the software rather than a diagnostic outcome.

    For phantom testing, ground truth values are inherent to the precisely calibrated phantoms. For in-vivo testing, variability is measured, and for substantial equivalence, the comparison is against the predicate device's measurements. There is no mention of human experts establishing a ground truth for the quantitative values directly measured by the device for these performance tests.

    4. Adjudication Method for the Test Set

    Not applicable. The performance testing described does not involve human adjudication of diagnostic outcomes or image interpretation in the way a clinical study for sensitivity/specificity for a diagnosis would. It focuses on the quantitative output of the software.

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    No. The document describes performance testing for device accuracy, repeatability, reproducibility, and substantial equivalence to a predicate device. It does not mention an MRMC comparative effectiveness study involving human readers with and without AI assistance or an effect size of improvement.

    6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study

    Yes, the performance testing described is primarily standalone performance testing. LiverMultiScan (LMSv3) is explicitly described as a "standalone software application" and a "post-processing, standalone software device." The tests assess the algorithms' performance in generating quantitative metrics (accuracy, repeatability, reproducibility) without requiring human interpretation as part of the core measurement. Operators (trained PD operators) are involved in using the device to generate reports and place ROIs, and inter/intra-operator variability is assessed, but the fundamental measured values are algorithmically derived from the MR data.

    7. Type of Ground Truth Used

    • Phantom Testing: The ground truth used was established by the precisely characterized properties of the phantoms, which were designed to mimic human data over a wide range of physiological values.
    • In-Vivo Testing: For the in-vivo volunteer data, the concept of "ground truth" for the measured parameters (cT1, T2*, PDFF) refers to the true physiological values within the volunteers, against which the device's precision and variability are assessed. While not explicitly stated how this "true" value would be independently confirmed, the focus is on self-consistency (repeatability, reproducibility, operator variability) of the device's measurements rather than comparison to an external gold standard like pathology.
    • Substantial Equivalence Testing: The ground truth for this comparison was the performance and measurements of the legally marketed predicate device, LiverMultiScan (LMSv2.1).

    8. Sample Size for the Training Set

    The document does not explicitly state the sample size (or any details) for a "training set." This type of detail is typically associated with AI/machine learning models where a dataset is used to train the algorithm. While LMSv3 includes "New algorithms" like Automatic Liver Segmentation, the document does not elaborate on how these algorithms were developed or if they involved a distinct training phase with a specific dataset.

    9. How the Ground Truth for the Training Set Was Established

    Since a "training set" with established ground truth is not detailed, this information is not provided in the document.


    K Number: K183133
    Device Name: MRCP+ v1.0
    Date Cleared: 2019-01-09 (57 days)
    Regulation Number: 892.2050
    Intended Use

    MRCP+v1 is indicated for use as a software-based image processing system for non-invasive, quantitative assessment of biliary system structures by facilitating the generation, visualisation and review of three-dimensional quantitative biliary system models and anatomical image data.

    MRCP+v1 calculates quantitative three-dimensional biliary system models that enable measurement of bile duct widths and automatic detection of regions of variation (ROV) of tubular structures. MRCP+v1 includes tools for interactive segmentation and labelling of the biliary system and tubular structures. MRCP+v1 allows for regional volumetric analysis of segmented tree-like, tubular structures and the gallbladder.

By combining image viewing, processing and reporting tools, the metrics provided are designed to support physicians in the visualization, evaluation and reporting of hepatobiliary structures. These models and the physical parameters derived from the models, when interpreted by a trained physician, yield information that may assist in biliary system assessment.

    MRCP+v1 is designed to utilize DICOM compliant MRCP datasets, acquired on supported MR scanners using supported MRCP acquisition protocols.

    MRCP+v1 is suitable for all patients not contra-indicated for MRI.

    Device Description

MRCP+v1 is a standalone software device. The purpose of the MRCP+v1 device is to assist the trained operator with the evaluation of information from Magnetic Resonance (MR) images from a single time-point (patient visit). A trained operator, typically a radiographer or technician trained in radiological anatomy, loads previously acquired MRCP data into the MRCP+v1 device. The device is intended to be used by a trained operator to generate metrics in the form of a summary report to enable reporting by a radiologist for subsequent interpretation and diagnosis by a clinician.

    MRCP+v1 is designed to utilize DICOM 3.0 compliant MRCP datasets, acquired on supported MR scanners, to display the fluid-filled tubular structures in the abdomen, including the intra- and extrahepatic biliary tree, gallbladder and pancreatic ductal system.

    Analysis and evaluation tools include:

    • Segmentation of structures utilising user input of seeding points (see the illustrative sketch after this list).
    • Interactive labelling of segmented areas.
    • Quantitative measurement derived from segmentation and labelling results.
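
    The summary does not describe the segmentation algorithm itself. A minimal sketch of the generic technique named in the first bullet, growing a region outward from user-supplied seed points, might look like this (the function name and the fixed intensity criterion are illustrative assumptions):

```python
# Illustrative seeded region growing on a 3D intensity volume; not MRCP+v1's
# actual algorithm, which the 510(k) summary does not disclose.
import numpy as np
from collections import deque

def region_grow(volume: np.ndarray, seeds: list[tuple[int, int, int]],
                threshold: float) -> np.ndarray:
    """Grow a binary mask from seed voxels, accepting 6-connected neighbours
    whose intensity lies within `threshold` of the mean seed intensity."""
    mask = np.zeros(volume.shape, dtype=bool)
    target = float(np.mean([volume[s] for s in seeds]))
    queue = deque(seeds)
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        if mask[z, y, x]:
            continue
        mask[z, y, x] = True
        for dz, dy, dx in neighbours:
            nz, ny, nx = z + dz, y + dy, x + dx
            if (0 <= nz < volume.shape[0] and 0 <= ny < volume.shape[1]
                    and 0 <= nx < volume.shape[2] and not mask[nz, ny, nx]
                    and abs(volume[nz, ny, nx] - target) <= threshold):
                queue.append((nz, ny, nx))
    return mask
```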

    MRCP+v1 calculates quantitative three-dimensional biliary system models that enable measurement of bile duct widths and automatic detection of regions of variation (ROV) of tubular structures. MRCP+v1 includes tools for interactive segmentation and labelling of the biliary system and tubular structures. MRCP+v1 allows for regional volumetric analysis of segmented tree-like, tubular structures and the gallbladder.
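
    Likewise, how bile-duct widths are measured from the tubular model is not detailed. One common generic approach, shown purely as an illustration, samples a Euclidean distance transform along the skeleton of the binary segmentation; regions of variation (ROV) could then, for instance, be flagged where the width profile changes sharply, though the actual ROV criterion is not described:

```python
# Illustrative duct-width estimation from a binary tubular segmentation;
# not necessarily the method used by MRCP+v1.
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.morphology import skeletonize  # handles 3D in recent versions

def duct_widths_mm(mask: np.ndarray, voxel_size_mm: float) -> np.ndarray:
    """Approximate local duct diameters (mm) at every centreline voxel,
    assuming isotropic voxels for simplicity."""
    # Distance from each foreground voxel to the nearest background voxel.
    dist = distance_transform_edt(mask)
    # Thin the segmentation to its centreline.
    centreline = skeletonize(mask).astype(bool)
    # Diameter is roughly twice the inscribed-sphere radius on the centreline.
    return 2.0 * dist[centreline] * voxel_size_mm
```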

The radiologist may use existing radiological tools to report on the biliary tree. The reviewing radiologist needs to take the device's limitations and accuracy into consideration during review. The MRCP+v1 report is intended to supplement a conventional radiology report, for interpretation by a clinician.

    MRCP+v1 does not replace the usual procedures for assessment of the biliary system by a reviewing radiologist or interpreting clinician, providing many opportunities for competent human intervention in the interpretation of images and information displayed.

    The metrics are intended to be used as an additional input to radiologists and clinicians in addition to that provided by conventional MRCP.

    AI/ML Overview

    The provided FDA 510(k) document for MRCP+ v1.0 describes its acceptance criteria and the studies performed to demonstrate its performance.

    Here's a breakdown of the requested information:

    1. Table of Acceptance Criteria and Reported Device Performance

    The document does not explicitly state acceptance criteria in terms of specific performance thresholds that must be met. Instead, it reports the observed performance metrics and concludes that the device is "highly accurate," "highly repeatable," and "reproducible," and that "the variation introduced by operator analysis is well within the prescribed acceptance criteria set for performance testing."

    However, we can infer the acceptance criteria are related to the reported 95% Confidence Intervals (CI) for the Limits of Agreement, with the implicit goal being that these limits should be acceptably narrow.

    Here's a table summarizing the reported device performance:

    | Performance Metric | Lower Limit (95% LoA) | Upper Limit (95% LoA) |
    | --- | --- | --- |
    | Algorithmic Accuracy – Digital Synthetic Data | | |
    | Clinical Phantom | -0.9 mm | 0.9 mm |
    | Tube Width Phantom | -0.6 mm | 1.3 mm |
    | Device Accuracy – Physical Phantoms | | |
    | Clinical Phantom | -1.1 mm | 1.0 mm |
    | Tube Width Phantom | -0.9 mm | 0.7 mm |
    | Precision (Repeatability) – Physical Phantoms | | |
    | Clinical Phantom | -0.4 mm | 0.4 mm |
    | Tube Width Phantom | -0.3 mm | 0.3 mm |
    | Precision (Reproducibility) – Physical Phantoms | | |
    | Clinical Phantom | -1.1 mm | 0.5 mm |
    | Tube Width Phantom | -0.8 mm | 0.7 mm |
    | Precision (Repeatability) – In-vivo Data | | |
    | Tree Volume (ml) | -3.5 | 3.4 |
    | Gallbladder Volume (ml) | -9.4 | 4.4 |
    | 3-5 mm (%) | -19.2 | 17.7 |
    | 5-7 mm (%) | -8.0 | 8.5 |
    | Greater than 7 mm (%) | -1.8 | 1.6 |
    | Less than 3 mm (%) | -19.2 | 20.2 |
    | Duct Median (mm) | -2.0 | 1.7 |
    | Duct Minimum (mm) | -2.2 | 1.9 |
    | Duct Maximum (mm) | -2.9 | 3.2 |
    | Duct IQR (mm) | -1.7 | 1.8 |
    | Precision (Reproducibility) – In-vivo Data | | |
    | Tree Volume (ml) | -6.9 | 4.7 |
    | Gallbladder Volume (ml) | -18.4 | 13.8 |
    | 3-5 mm (%) | -15.8 | 27.8 |
    | 5-7 mm (%) | -11.8 | 10.8 |
    | Greater than 7 mm (%) | -3.1 | 2.6 |
    | Less than 3 mm (%) | -23.4 | 30.8 |
    | Duct Median (mm) | -2.8 | 2.6 |
    | Duct Minimum (mm) | -2.8 | 2.6 |
    | Duct Maximum (mm) | -3.5 | 4.0 |
    | Duct IQR (mm) | -2.3 | 1.9 |
    | Intra-Operator Performance | | |
    | Tree Volume (ml) | -0.6 | 0.9 |
    | Gallbladder Volume (ml) | 0.0 | 0.0 |
    | 3-5 mm (%) | -7.7 | 7.4 |
    | 5-7 mm (%) | -5.1 | 5.0 |
    | Greater than 7 mm (%) | -4.7 | 3.9 |
    | Less than 3 mm (%) | -10.7 | 11.7 |
    | Duct Median (mm) | -0.4 | 0.4 |
    | Duct Minimum (mm) | -0.4 | 0.5 |
    | Duct Maximum (mm) | -1.6 | 2.0 |
    | Duct IQR (mm) | -0.4 | 0.5 |
    | Inter-Operator Performance | | |
    | Tree Volume (ml) | -6.8 | 3.6 |
    | Gallbladder Volume (ml) | -0.7 | 0.6 |
    | 3-5 mm (%) | -11.1 | 16.8 |
    | 5-7 mm (%) | -6.3 | 6.3 |
    | Greater than 7 mm (%) | -1.7 | 1.4 |
    | Less than 3 mm (%) | -17.4 | 12.3 |
    | Duct Median (mm) | -1.8 | 1.1 |
    | Duct Minimum (mm) | -1.3 | 1.0 |
    | Duct Maximum (mm) | -3.2 | 2.0 |
    | Duct IQR (mm) | -1.9 | 1.6 |

    2. Sample Size Used for the Test Set and Data Provenance

    • Algorithmic Accuracy (Digital Synthetic Data): "a dataset of digital synthetic data" - no specific number of samples is provided.
    • Device Accuracy (Physical Phantoms): "Two different types of phantom, a clinical and tubewidth" - no specific number of scans or repetitions is provided. One would infer 'multiple' scans given the reporting of CI, but the exact count isn't in the provided text.
    • Precision (In-vivo Data): "in-vivo data acquired from volunteers" - no specific number of volunteers or scans is provided.
    • Intra- and Inter-Operator: No specific sample size (number of cases or readings) is provided.

    Data Provenance: The provenance is described as:

    • "digital synthetic data"
    • "phantom scans"
    • "in-vivo (healthy volunteer and patients with suspected hepatobiliary disease) scans"
    • MR systems from Siemens, GE, and Philips across 1.5T and 3T field strengths were used for in-vivo testing (and presumably for phantoms as well).
    • The submitter's address ("OXFORD, OX1 2ET OXFORDSHIRE UNITED KINGDOM") implies the company is based in the United Kingdom, suggesting the data could originate from there or from other international sites. The text does not specify whether the data was retrospective or prospective, though "acquired from volunteers" may suggest prospective collection for the precision study if those scans were performed specifically for validation.

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

    The document does not explicitly state the number of experts or their qualifications used to establish ground truth for the test set. The data provenance for "Algorithmic Accuracy – Digital Synthetic Data" mentions "the duct file specification of the physically printed phantoms," which suggests an objective, engineered ground truth rather than expert consensus on images for this part of the testing. For in-vivo data, the ground truth establishment method isn't detailed, nor are the experts.

    It does state that the device is to be used by a "trained operator, typically a radiographer or technician trained in radiological anatomy" to generate reports for "reporting by a radiologist for subsequent interpretation and diagnosis by a clinician." This describes the user, but not the experts for ground truth.

    4. Adjudication Method for the Test Set

    The document does not describe any specific adjudication method (e.g., 2+1, 3+1, none) for establishing ground truth for the test set.

    5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study

    No Multi Reader Multi Case (MRMC) comparative effectiveness study evaluating human reader improvement with AI assistance versus without AI assistance is mentioned in the provided text. The "Intra- and Inter-Operator" section assesses the variability of the device's output when operated by different or the same human operators, not directly the diagnostic performance improvement of human readers using the device.

    6. Standalone (Algorithm-Only) Performance Study

    Yes, a standalone performance study was conducted. The section "Algorithmic Accuracy – Digital Synthetic Data" describes testing the algorithm's accuracy "against the duct file specification of the physically printed phantoms, a dataset of digital synthetic data." The conclusion clarifies this: "It can be said that the algorithmic accuracy of MRCP+v1 when used to quantify tubular structures from acquired MRCP data is high when not presented with the variability introduced by phantom or in-vivo scanning." This directly speaks to the algorithm's performance independent of scanning variability, which constitutes a standalone assessment.

    7. Type of Ground Truth Used

    • Algorithmic Accuracy: "Duct file specification of the physically printed phantoms" (an engineered/objective ground truth).
    • Device Accuracy (Physical Phantoms): The document implies the "physical phantoms" themselves serve as the ground truth or have known parameters against which the device's measurements are compared.
    • Precision (In-vivo Data) and Intra-/Inter-Operator: The ground truth for these measurements (Tree Volume, Gallbladder Volume, Duct Median, etc.) would be the presumed true values derived from the MRCP images, but the method of establishing this ground truth (e.g., expert manual segmentation, another reference standard) is not explicitly described. It measures consistency and agreement between measurements rather than agreement with an external gold standard for these particular sections.

    8. Sample Size for the Training Set

    The document does not provide any information regarding the sample size used for the training set.

    9. How the Ground Truth for the Training Set Was Established

    The document does not provide any information on how the ground truth for the training set was established.


    K Number: K172685
    Device Name: LiverMultiScan
    Date Cleared: 2017-11-21 (76 days)
    Regulation Number: 892.1000
    Intended Use

LiverMultiScan (LMSv2) is indicated for use as a magnetic resonance diagnostic device software application for noninvasive liver evaluation that enables the generation, display and review of 2D magnetic resonance medical image data and pixel maps for MR relaxation times.

    LiverMultiScan (LMSv2) is designed to utilize DICOM 3.0 compliant magnetic resonance image datasets, acquired from compatible MR Systems, to display the internal structure of the abdomen including the liver. Other physical parameters derived from the images may also be produced.

    LiverMultiScan (LMSv2) provides a number of quantification tools, such as Region of Interest (ROI) placements, to be used for the assessment of regions of an image to quantify liver tissue characteristics, including the determination of triglyceride fat fraction in the liver, T2* and iron-corrected T1 measurements.

    These images and the physical parameters derived from the images, when interpreted by a trained clinician, yield information that may assist in diagnosis.

    Device Description

LiverMultiScan (LMSv2) is a standalone software application for displaying 2D Magnetic Resonance medical image data acquired from compatible MR Scanners. LiverMultiScan runs on a general-purpose workstation with a colour monitor, keyboard and mouse.

    LiverMultiScan (LMSv2) is designed to allow the review of DICOM 3.0 compliant datasets stored on the workstation and the operator may also create, display, print, store and distribute reports resulting from interpretation of the datasets.

    LiverMultiScan (LMSv2) allows the display and comparison of combinations of magnetic resonance images and provides a number of tools for the quantification of magnetic resonance images, including the determination of triglyceride fat fraction in the liver, T2* and iron-corrected T1 measurements.

    LiverMultiScan (LMSv2) provides a number of tools, such as circular region of interest placements, to be used for the assessment of regions of an image to support a clinical workflow.

    LiverMultiScan (LMSv2) allows the operator to create relaxometry parameter maps of the abdomen which can be used by clinicians to help determine different tissue characteristics to support a clinical workflow. Examples of such workflows include, but are not limited to, the evaluation of the presence or absence of liver fat.

    LiverMultiScan (LMSv2) is intended to be used by trained operators. Reports generated by trained operators are intended for use by interpreting clinicians, including, but not limited to radiologists, gastroenterologists, and hepatologists.

    LiverMultiScan (LMSv2) is an aid to diagnosis. When interpreted by a trained clinician, the results provide information, which may be used as an input into existing clinical procedures and diagnostic workflows.

    LiverMultiScan (LMSv2) offers the following:

    • Advanced visualisation of MR data.
    • Processing of MR data to quantify tissue characteristics including MR relaxivity constants such as T2*, T1, iron-corrected T1 (cT1) and triglyceride fat fraction (expressed as liver fat percentage); an illustrative fat-fraction sketch follows this list.
    • Circular region of interest statistics.
    • Snapshot of images to include in a report.
    • Report to include region statistics, snapshot images and user-entered text.
    • Export of snapshot images to report.
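
    For the fat-fraction item above, the summary does not give the computation. As textbook background only (LiverMultiScan's actual pipeline may use multi-point methods with confounder correction), a two-point Dixon fat fraction is derived from in-phase and opposed-phase images as follows:

```python
# Textbook two-point Dixon fat-fraction calculation; illustrative only.
import numpy as np

def dixon_fat_fraction(in_phase: np.ndarray, opposed_phase: np.ndarray) -> np.ndarray:
    """Fat-fraction map (%) from in-phase (W + F) and opposed-phase (W - F)
    magnitude images: water = (IP + OP) / 2, fat = (IP - OP) / 2."""
    water = (in_phase + opposed_phase) / 2.0
    fat = (in_phase - opposed_phase) / 2.0
    ff = 100.0 * fat / np.maximum(water + fat, 1e-9)  # avoid division by zero
    return np.clip(ff, 0.0, 100.0)
```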
    AI/ML Overview

    Here's a summary of the acceptance criteria and study details for LiverMultiScan (LMSv2) based on the provided FDA 510(k) summary:

    1. Acceptance Criteria and Reported Device Performance

    The acceptance criteria are framed in terms of accuracy, repeatability, reproducibility, and equivalence to the predicate device (LMSv1) for the measurements of T1 (corrected T1 or cT1 for in vivo), T2*, and Proton Density Fat Fraction (PDFF, also referred to as triglyceride fat fraction).

    Phantom Study Performance:

    | Phantom Metric | Acceptance Criteria | Reported Device Performance (95% CI Limits of Agreement) |
    | --- | --- | --- |
    | Accuracy | | |
    | T1 | Consistent with literature-reported underestimation | 19-25% lower than ground truth (consistent with MOLLI techniques) |
    | T2* | Accurate over expected physiological range | ±2 ms |
    | PDFF < 30% | Accurate over expected physiological range | ±3% |
    | PDFF ≥ 30% | Accurate over expected physiological range | ±21% |
    | Repeatability | Highly repeatable | |
    | T1 | ±10 ms | ±10 ms |
    | T2* | ±1.7 ms | ±1.7 ms |
    | PDFF < 30% | -2.5 to 1% | -2.5 to 1% |
    | PDFF ≥ 30% | ±5% | ±5% |
    | Reproducibility | Reproducible between systems (at same field strength for T2*) | |
    | T1 | -34 to 27 ms | -34 to 27 ms |
    | T2* | ±4 ms | ±4 ms (between systems at same field strength) |
    | PDFF < 30% | ±4% | ±4% |
    | PDFF ≥ 30% | ±10% | ±10% (between systems) |
    | Equivalence to LMSv1 | | |
    | T1 | Within 1 ms of predicate | -0.5 to 0.4 ms |
    | T2* | Within 1 ms of predicate | -0.4 to 0.2 ms |
    | PDFF < 30% | Within 1% of predicate | -0.6 to 0.4% |
    | PDFF ≥ 30% | Within 2% of predicate | -1.6 to 2% |
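
    As background on the MOLLI rows above (standard MR physics, not taken from the submission): MOLLI is a Look-Locker-based T1-mapping scheme whose apparent T1* systematically underestimates the true T1, conventionally recovered with the three-parameter correction:

```latex
% Look-Locker three-parameter signal model and the standard MOLLI correction
% (background only; not from the 510(k) summary).
S(t) = A - B\, e^{-t/T_1^{*}}, \qquad T_1 \approx T_1^{*} \left( \frac{B}{A} - 1 \right)
```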

    In Vivo Study Performance:

    | Volunteer Metric | Acceptance Criteria | Reported Device Performance (95% CI Limits of Agreement) |
    | --- | --- | --- |
    | Repeatability | Highly repeatable | |
    | cT1 | ±60 ms | ±60 ms |
    | T2* | ±7 ms | ±7 ms |
    | PDFF | ±1% | ±1% |
    | Reproducibility | Reproducible between systems (at same field strength for T2*) | |
    | cT1 | ±120 ms | ±120 ms |
    | T2* | ±10 ms | ±10 ms (between systems at same field strength) |
    | PDFF | ±2% | ±2% (between systems) |
    | Intra-operator Variation | Well within prescribed acceptance criteria | |
    | cT1 | ±18 ms | ±18 ms |
    | T2* | ±3 ms | ±3 ms |
    | PDFF | ±1% | ±1% |
    | Inter-operator Variation | Minor additional variation compared to intra-operator | |
    | cT1 | ±25 ms | ±25 ms |
    | T2* | ±4 ms | ±4 ms |
    | PDFF | ±1.1% | ±1.1% |
    | Equivalence to LMSv1 | | |
    | cT1 | Within 40 ms of predicate | -14.8 to 40 ms |
    | T2* | Within 1 ms of predicate | -0.5 to 0.8 ms |
    | PDFF | Negligible difference | -0.4 to 0.2% |

    Conclusion: The document states that "all acceptance criteria were met," and LMSv2 is concluded to be substantially equivalent to LMSv1.

    2. Sample Size Used for the Test Set and Data Provenance

    • Test Set Sample Size:
      • In Vivo: The document mentions "in vivo volunteer data" but does not explicitly state the number of volunteers used for the test set.
      • Phantom: The document mentions "phantom scans" but does not explicitly state the number of phantoms or scans.
      • Substantial Equivalence (In Vivo): The document mentions "in vivo measurements" show specific equivalence values, implying a test set was used, but the size is not specified.
      • Substantial Equivalence (Phantom): Similarly, "phantom measurements" are cited for equivalence, but the sample size is not specified.
    • Data Provenance: The document does not specify the country of origin of the data nor whether the studies were retrospective or prospective. It only mentions "volunteer scans" and "phantom scans."

    3. Number of Experts and Qualifications for Ground Truth

    The document does not provide details on the number of experts, their specific qualifications, or their role in establishing ground truth for the test set. The device is intended to be used by trained operators, and reports interpreted by trained clinicians (radiologists, gastroenterologists, hepatologists).

    4. Adjudication Method

    The document does not describe any specific adjudication method (e.g., 2+1, 3+1) used for the test set. It does mention "intra-operator" and "inter-operator" variation, indicating that multiple readings by the same operator and different operators were performed.

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    No explicit MRMC comparative effectiveness study is described where human readers' improvement with AI vs. without AI assistance is quantified. The evaluation focuses on the performance of the software itself and the variability introduced by operators using the software.

    6. Standalone Performance Study

    Yes, a standalone performance study was done for the algorithm (LiverMultiScan v2). The entire performance data presented in the tables (accuracy, repeatability, reproducibility, equivalence) represents the algorithm's performance. The "intra-operator" and "inter-operator" studies also demonstrate the variability introduced by the human-in-the-loop when using the standalone software.

    7. Type of Ground Truth Used

    • Phantom Studies: For phantom studies, the ground truth for T1 was based on "literature-reported underestimation of ground truth T1 using MOLLI techniques." For T2* and PDFF, the phantoms likely had known, controlled values for these parameters, which is implied by the term "ground truth" and the quantitative comparisons.
    • In Vivo Studies: The document does not explicitly state how ground truth was established for "in vivo volunteer data." Given the measurements are for MRI parameters (cT1, T2*, PDFF), the "ground truth" for repeatability and reproducibility would likely be the measurements themselves when repeated under identical or varied conditions, rather than an external gold standard like pathology, unless otherwise specified. For accuracy, a reference standard might be implied, but it's not detailed.

    8. Sample Size for the Training Set

    The document does not specify the sample size used for the training set.

    9. How Ground Truth for the Training Set was Established

    The document does not describe how ground truth was established for the training set. It focuses on the validation and verification against user needs and intended use, and comparison to a predicate device, rather than the development or training of the algorithm itself.

