K Number
K233930
Manufacturer
Date Cleared
2024-03-13

(90 days)

Product Code
Regulation Number
892.2050
Panel
RA
Reference & Predicate Devices
Predicate For
N/A
Intended Use

MRCP+ v2 is indicated for use as a software-based image processing system for noninvasive, quantitative assessment of biliary system structures.

MRCP+ v2 calculates quantitative three-dimensional biliary system models that enable the measurement of bile duct widths and quantification of strictures and dilatations. MRCP+ v2 includes tools for interactive segmentation and labelling of the biliary system, including the gallbladder.

MRCP+ v2 provides metrics for interpretation by trained physicians to assist in the assessment of the biliary system. These metrics support physicians in the visualization, evaluation and reporting of hepatobiliary structures.

MRCP+ v2 is designed to utilize DICOM-compliant MRCP datasets acquired on supported MR scanners using supported MRCP acquisition protocols.

Device Description

MRCP+ v2 is a standalone software medical device. The purpose of the MRCP+ v2 device is to assist a trained operator with the quantitative evaluation of biliary system structures acquired from magnetic resonance (MR) images from a single time-point (a patient visit). A Perspectum-trained operator loads previously acquired magnetic resonance cholangiopancreatography (MRCP) data as input into the MRCP+ v2 device. A structured summary report is generated as output by the device, which includes quantitative analysis results and geometric characteristics of the biliary system and pancreatic ducts. The MRCP+ v2 report is intended to facilitate reporting by a radiologist for subsequent interpretation and to aid diagnosis by a physician as part of a panel of testing, including conventional radiological tools.

MRCP+ v2 is designed to utilize Digital Imaging and Communications in Medicine (DICOM)-compliant MRCP datasets acquired on supported MR scanners to display the fluid-filled tubular structures in the abdomen, including intra- and extra-hepatic biliary tree, gallbladder, and pancreatic duct system. Analysis and evaluation tools in MRCP+ v2 include segmentation of structures utilizing user input of seeding points, interactive labelling of segmented areas, and quantitative measurement derived from segmentation results. MRCP+ v2 calculates quantitative three-dimensional biliary system models that enable measurement of bile duct widths and automatic detection of regions of variation in the width of tubular structures, including those quantified as strictures and dilations. MRCP+ v2 includes tools allowing for regional volumetric analysis of segmented tree-like, tubular structures and the gallbladder.

AI/ML Overview

The following summarizes the acceptance criteria and study information for the Perspectum MRCP+ version 2 (MRCP+ v2) device, based on the provided 510(k) summary text:

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are expressed mainly as Bland-Altman 95% limits of agreement (LoA), maximum bias, and other precision metrics. The document states that the device met all predetermined acceptance criteria. This summary therefore lists the acceptance criteria only; the reported device performance is stated as having met each criterion.
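As a rough illustration of how a Bland-Altman criterion of this kind can be evaluated, the Python sketch below computes the mean bias and the 95% limits of agreement for paired measurements. The function names, the sample values, and the reading of "LoA: ±X" as a bound on the largest absolute limit are assumptions made for illustration; the document does not describe the manufacturer's analysis code.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    bias is the mean of (measured - reference); the 95% limits of
    agreement are bias +/- 1.96 * sample standard deviation of the
    differences.
    """
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def meets_criterion(measured, reference, max_abs_bias, max_abs_loa):
    """Assumed pass rule: |bias| and the largest |LoA| are within bounds."""
    bias, lower, upper = bland_altman(measured, reference)
    largest_loa = max(abs(lower), abs(upper))
    return abs(bias) <= max_abs_bias and largest_loa <= max_abs_loa


# Hypothetical duct widths in mm (not data from the submission).
measured_mm = [5.1, 4.9, 5.2, 4.8]
reference_mm = [5.0, 5.0, 5.0, 5.0]
print(bland_altman(measured_mm, reference_mm))
print(meets_criterion(measured_mm, reference_mm, max_abs_bias=0.5, max_abs_loa=1.5))
```

The same two functions can be applied to any metric in the table by substituting that row's bias and LoA bounds.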

Table: Acceptance Criteria for MRCP+ v2 Performance

Metric | Acceptance Criteria (from Tables 2, 3, 4, 5, 6, 7, 8, 9)

Individual Duct Metrics (Stricture & Dilatation Phantom)
Duct length (mm) | Max bias: ±9.5, LoA: ±81.3
Stricture length max (mm) | Max bias: ±1.1, LoA: ±14.6
Stricture length mean (mm) | Max bias: ±1.1, LoA: ±14.3
Stricture length sum (mm) | Max bias: ±1.8, LoA: ±17.8
Dilatation length sum (mm) | Max bias: ±2.2, LoA: ±23.2
Dilatation diameter max (mm) | Max bias: ±0.7, LoA: ±6.0
Stricture absolute severity sum (mm) | Max bias: ±2.0, LoA: ±3.8
Dilatation absolute severity sum (mm) | Max bias: ±3.3, LoA: ±5.5
Dilatation absolute severity max (mm) | Max bias: ±1.1, LoA: ±3.8
Stricture relative severity sum (%) | Max bias: ±18.6, LoA: ±76.7
Dilatation relative severity sum (%) | Max bias: ±407.5, LoA: ±520.6
Dilatation relative severity max (%) | Max bias: ±209.6, LoA: ±281.2
Stricture score sum (mm) | Max bias: ±1.95, LoA: ±8.6
Dilatation score sum (mm) | Max bias: ±33.3, LoA: ±49.6
Number of strictures | Max bias: ±1.0, LoA: ±3.0
Number of dilatations | Max bias: ±1.0, LoA: ±3.0

Tube Width Measurements (Tube Width & Clinical Biliary Phantoms)
Accuracy of algorithm - synthetic data: % points stably matched | > 80%
Accuracy of algorithm - synthetic data: Absolute value of mean bias | ≤ 0.4 mm
Accuracy of algorithm - synthetic data: Largest absolute LoA | ≤ 1.4 mm
Accuracy of algorithm - synthetic data: Slope of bias | ≤ 0.1 mm per mm
Accuracy of device - acquired data: % points stably matched | > 80%
Accuracy of device - acquired data: Absolute value of mean bias | ≤ 0.5 mm
Accuracy of device - acquired data: Largest absolute LoA | ≤ 1.5 mm
Accuracy of device - acquired data: Slope of bias | ≤ 0.1 mm per mm
Repeatability: % points stably matched | > 80%
Repeatability: Absolute value of mean bias | ≤ 0.4 mm
Repeatability: Largest absolute LoA | ≤ 1.4 mm
Repeatability: Slope of bias | ≤ 0.15 mm per mm
Reproducibility: % points stably matched | > 80%
Reproducibility: Absolute value of mean bias | ≤ 0.6 mm
Reproducibility: Largest absolute LoA | ≤ 1.6 mm
Reproducibility: Slope of bias | ≤ 0.15 mm per mm
Substantial equivalence (v1 vs v2): % points stably matched | > 80%
Substantial equivalence (v1 vs v2): Absolute value of mean bias | ≤ 0.6 mm
Substantial equivalence (v1 vs v2): Largest absolute LoA | ≤ 1.6 mm
Substantial equivalence (v1 vs v2): Slope of bias | ≤ 0.15 mm per mm

In Vivo Performance (Summary Whole Tree Metrics - LoA)
Total number of ducts | ±116.2
Total number of strictures | ±16.5
Total number of dilatations | ±33.8
Total number of ducts with a stricture or dilatation | ±27.4
Total number of ducts with a stricture and dilatation | ±9.6
Number of ducts with a stricture | ±12.4
Number of ducts with a dilatation | ±24.5
Duct length mean (mm) | ±12.3
Duct length sum (mm) | ±2082.7
Stricture length sum (mm) | ±121.4
Dilatation length sum (mm) | ±215.8
Abnormal length sum (mm) | ±326.3
Stricture absolute severity sum (mm) | ±32.0
Dilatation absolute severity sum (mm) | ±78.7
Dilatation absolute severity max (mm) | ±4.5
Stricture relative severity sum (%) | ±708.0
Dilatation relative severity sum (%) | ±2437.6
Dilatation relative severity max (%) | ±165.3
Stricture score sum (mm) | ±51.7
Dilatation score sum (mm) | ±179.2
Dilatation diameter max (mm) | ±6.3
Biliary tree volume | ±20 mL
Gallbladder volume | ±30 mL
Percentage of ducts with median width less than 3 mm | ±40%
Percentage of ducts with median width greater than 3 mm up to 5 mm | ±40%
Percentage of ducts with median width greater than 5 mm up to 9 mm | ±40%
Percentage of ducts with median width greater than 9 mm | ±40%

Precision: Repeatability and Reproducibility (Whole Tree Metrics - LoA)
All whole tree metrics | Same as In Vivo Performance

Precision: Repeatability and Reproducibility (Single Duct Metrics - LoA)
Total number of strictures | ±2.0
Total number of dilatations | ±2.0
Duct length (mm) | ±71.8
Stricture length sum (mm) | ±16.0
Dilatation length sum (mm) | ±21.0
Stricture length mean (mm) | ±13.2
Stricture length max (mm) | ±13.5
Stricture absolute severity sum (mm) | ±1.8
Dilatation absolute severity sum (mm) | ±2.2
Dilatation absolute severity max (mm) | ±2.7
Stricture relative severity sum (%) | ±58.1
Dilatation relative severity sum (%) | ±113.1
Dilatation relative severity max (%) | ±71.6
Stricture score sum (mm) | ±6.6
Dilatation score sum (mm) | ±16.3
Dilatation diameter max (mm) | ±5.3
Median duct width (mm) | ±3 (Repeatability), ±3.5 (Reproducibility)
Duct width interquartile range (mm) | ±3.5 (Repeatability), ±4.0 (Reproducibility)
Maximum duct width (mm) | ±4.5 (Repeatability), ±4.8 (Reproducibility)
Minimum duct width (mm) | ±3.8 (Repeatability), ±3.8 (Reproducibility)

Intra and Inter-Operator Assessment (Whole Tree Metrics - LoA)
All whole tree metrics | Same as In Vivo Performance

Intra and Inter-Operator Assessment (Single Duct Metrics - LoA)
Total number of strictures | ±2.0
Total number of dilatations | ±2.0
Duct length (mm) | ±71.8
Stricture length sum (mm) | ±16.0
Dilatation length sum (mm) | ±21.0
Stricture length mean (mm) | ±13.2
Stricture length max (mm) | ±13.5
Stricture absolute severity sum (mm) | ±1.8
Dilatation absolute severity sum (mm) | ±2.2
Dilatation absolute severity max (mm) | ±2.7
Stricture relative severity sum (%) | ±58.1
Dilatation relative severity sum (%) | ±113.1
Dilatation relative severity max (%) | ±71.6
Stricture score sum (mm) | ±6.6
Dilatation score sum (mm) | ±16.3
Dilatation diameter max (mm) | ±5.3
Median duct width (mm) | ±2.0
Duct width interquartile range (mm) | ±2.0
Maximum duct width (mm) | ±3.0
Minimum duct width (mm) | ±2.5

Equivalence Testing to Predicate Device (LoA)
Median duct width (mm) | ±3
Duct width interquartile range (mm) | ±3.5
Maximum duct width (mm) | ±4.5
Minimum duct width (mm) | ±3.8
Biliary tree volume | ±20 mL
Gallbladder volume | ±30 mL
Percentage of ducts with median width less than 3 mm | ±40%
Percentage of ducts with median width greater than 3 mm up to 5 mm | ±40%
Percentage of ducts with median width greater than 5 mm up to 9 mm | ±40%
Percentage of ducts with median width greater than 9 mm | ±40%
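
The tube width criteria above combine four checks: percentage of points stably matched, absolute mean bias, largest absolute LoA, and slope of bias. The Python sketch below shows one way such a combined check could be coded, using the "Accuracy of device - acquired data" thresholds from the table. It assumes the slope of bias is the least-squares slope of the measurement error against the reference width and that the matched percentage is supplied by an upstream point-matching step; the document defines neither, and all names and sample values are hypothetical.

```python
from statistics import mean, stdev

# Thresholds taken from the "Accuracy of device - acquired data" rows.
ACQUIRED_DATA_LIMITS = {
    "min_pct_matched": 80.0,   # % points stably matched must exceed this
    "max_abs_bias_mm": 0.5,
    "max_abs_loa_mm": 1.5,
    "max_slope_mm_per_mm": 0.1,
}


def slope_of_bias(measured, reference):
    """Least-squares slope of (measured - reference) against reference.

    Assumption: this is one plausible definition of "slope of bias".
    """
    diffs = [m - r for m, r in zip(measured, reference)]
    x_bar, d_bar = mean(reference), mean(diffs)
    sxx = sum((x - x_bar) ** 2 for x in reference)
    if sxx == 0:
        raise ValueError("reference widths must not all be equal")
    sxy = sum((x - x_bar) * (d - d_bar) for x, d in zip(reference, diffs))
    return sxy / sxx


def check_tube_width(measured, reference, pct_matched, limits):
    """Return a dict of pass/fail flags, one per criterion."""
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    largest_loa = max(abs(bias - half_width), abs(bias + half_width))
    slope = slope_of_bias(measured, reference)
    return {
        "pct_matched": pct_matched > limits["min_pct_matched"],
        "mean_bias": abs(bias) <= limits["max_abs_bias_mm"],
        "largest_loa": largest_loa <= limits["max_abs_loa_mm"],
        "slope_of_bias": abs(slope) <= limits["max_slope_mm_per_mm"],
    }


# Hypothetical phantom tube widths in mm (not data from the submission).
reference_mm = [2.0, 4.0, 6.0, 8.0]
measured_mm = [2.1, 4.2, 6.3, 8.4]
print(check_tube_width(measured_mm, reference_mm, 95.0, ACQUIRED_DATA_LIMITS))
```

Returning one flag per criterion, rather than a single boolean, makes it easier to see which criterion a given dataset fails.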

2. Sample Size Used for the Test Set and Data Provenance

The document explicitly mentions the use of "previously acquired data" for substantial equivalence testing with MRCP+ v1 and "acquired phantom data" for accuracy, repeatability, and reproducibility. Additionally, "volunteer data" was used for clinical precision and operator variability assessments.

  • Phantoms:

    • Types: Stricture and Dilatation phantom, Tube Width and Clinical Biliary phantoms, 3D printed phantoms, digital synthetic phantoms.
    • Data Provenance: The phantoms were "scanned" on various MRI systems (Siemens 1.5T/3T, GE 1.5T/3T, Philips 1.5T/3T). This indicates controlled, prospective data acquisition specifically for testing. Digital synthetic phantoms are likely "in silico" generated.
    • Sample Size: Not explicitly stated as a single number for all phantom tests. The document refers to "the metrics introduced in MRCP+ v2" and comparison of "results from scanned 3D printed phantoms" suggesting multiple phantom uses. For substantial equivalence, it mentions "previously acquired data from Tube Width and Clinical Biliary phantoms."
  • Clinical Data (for Precision, Repeatability & Reproducibility, Intra/Inter-Operator Assessment):

    • Types: "Volunteers scanned under the same measurement conditions (twice, on the same scanner, a short time apart for repeatability) and under different measurement conditions (on more than one scanner, compared to a reference scanner for reproducibility)." This indicates prospective data acquisition; the document refers only to "volunteers" and does not state their health status.
    • Data Provenance: Not explicitly stated (e.g., country of origin), but implies a controlled clinical setting for volunteer scanning.
    • Sample Size: Not explicitly stated. The text notes "volunteer data used for the precision experiments was also used to assess inter-operator and intra-operator variability."

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

  • Phantom Data:

    • Ground Truth: For accuracy assessments, the ground truth was the "specification used to generate the phantom (i.e., the specification file used to create the phantom)" and the "ground truth specification" for digital synthetic phantoms. This implies a known, engineered ground truth rather than expert-derived ground truth.
    • Number/Qualifications of Experts: Not applicable, as the ground truth was based on phantom specifications, not expert interpretation.
  • Clinical Data:

    • Operators: Per the device description, a Perspectum-trained operator loads previously acquired MRCP data, and the device generates a structured summary report containing quantitative analysis results and geometric characteristics of the biliary system and pancreatic ducts. Internal operators also performed the inter- and intra-operator assessments.
    • Ground Truth: The document does not explicitly state how ground truth was established for the clinical precision/variability studies or if a separate, expert-derived ground truth was used for assessing the accuracy of metrics against real anatomy. The focus for clinical testing was on device precision (repeatability and reproducibility) and operator variability, implying a comparison of the device's output against itself under different conditions rather than a comparison to an independent "true" clinical assessment of duct characteristics.

4. Adjudication Method (for the test set)

  • Phantoms: Not applicable, as ground truth was derived from phantom specifications.
  • Clinical Data (Precision/Operator Variability):
    • Adjudication: For inter-operator variability, "two internal operators separately processed the same cases and the metrics obtained by each operator were compared statistically." For intra-operator variability, "one internal operator processed the same set of cases twice... and the metrics from each analysis were compared statistically." The operators were "blinded to all analyses conducted by other operators and origins of datasets." This indicates a statistical comparison rather than a consensus-based adjudication of ground truth.

5. Whether a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of Human Reader Improvement With vs. Without AI Assistance

The document does not describe a Multi Reader Multi Case (MRMC) comparative effectiveness study evaluating human reader improvement with AI assistance. The study focuses on the standalone performance characteristics of the MRCP+ v2 device itself, including its accuracy, precision, repeatability, reproducibility, and operator variability, and its substantial equivalence to its predicate device. It assesses the device's ability to generate quantitative measurements, not its impact on human reader performance.

6. Whether a Standalone (Algorithm-Only, Without Human-in-the-Loop) Performance Study Was Done

Yes, a standalone performance assessment was conducted, particularly for the "Accuracy of algorithm - synthetic data" (Table 3). This involved comparing the algorithm's output on digital synthetic phantoms directly against the known ground truth specifications of those phantoms. The device is also described as a "standalone software medical device" and the tests for accuracy, repeatability, and reproducibility of the device are essentially standalone assessments of the product's output.

7. The Type of Ground Truth Used

  • Phantoms: The ground truth for phantom studies was the specification used to generate the phantom (e.g., the CAD file or design parameters for 3D printed phantoms, or the configuration files for digital synthetic phantoms). This is a precise, known, and engineered ground truth.
  • Clinical Data (Precision/Operator Variability): For the clinical precision studies, the ground truth was implicitly the device's own output from repeated measurements or different operators. The goal was to assess consistency of the device, not necessarily its accuracy against an independent clinical ground truth (like pathology or clinical outcomes). The document does not mention ground truth being established by pathology, outcomes data, or expert consensus for the clinical performance evaluation of the metrics themselves.

8. The Sample Size for the Training Set

The document does not specify the sample size used for the training set of the MRCP+ v2 algorithms. The provided text is a 510(k) summary, which focuses on validation and comparison to a predicate, rather than detailed algorithm development or training data.

9. How the Ground Truth for the Training Set Was Established

The document does not provide information on how the ground truth for the training set was established. Again, this level of detail is typically not included in a 510(k) summary.

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards - Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).