Search Results

Found 2 results

510(k) Data Aggregation

    K Number
    K233822
    Device Name
    i2Contour
    Manufacturer
    Date Cleared
    2024-08-08

    (251 days)

    Product Code
    Regulation Number
    892.2050
    Reference & Predicate Devices
    Predicate For
    N/A
    AI/ML, SaMD, IVD (In Vitro Diagnostic), Therapeutic, Diagnostic, is PCCP Authorized, Third-party, Expedited review
    Intended Use

    MRIMath i2Contour is intended for the semi-automatic labeling, visualization, and volumetric quantification of WHO grade 4 glioblastoma (GBM) from a set of standard MRI images of male or female patients 18 years of age or older who are known to have pathologically proven glioblastoma. Volumetric measurements may be compared to past measurements if available. MRIMath i2Contour is not to be used for primary diagnosis and is not intended to be the sole diagnostic metric.

    Device Description

    The MRIMath i2Contour is a web-based software platform designed for the contouring and segmentation of the T1c and FLAIR sequences of the MRIs of patients already diagnosed with GBM. It combines AI with a user interface (UI) for review, manual contouring, and approval. The software is intended to be used by trained medical professionals as an aid in the tumor contouring process. Review by a trained professional is a requirement for completion.

    The AI algorithm within MRIMath i2Contour generates an initial tumor contour, which serves as a starting point for medical professionals to complete the contouring process manually. It is important to note that the software does not alter the original MRI images and is not intended for tumor detection or diagnostic purposes. MRIMath i2Contour is specifically designed to generate tumor volume contours for GBM. It is not intended for use with images of other brain tumor types.

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided text:

    Device: i2Contour

    1. Table of Acceptance Criteria and Reported Device Performance

    The acceptance criteria are implicit: the device's performance is compared against the predicate device's best mean Dice score (DSC), with the requirement that the proportion of AI measurements exceeding this threshold be statistically significantly greater than a 50% chance.
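For context, the Dice similarity coefficient (DSC) cited throughout is a standard overlap measure between two segmentations. A minimal, generic sketch for binary masks (not MRIMath's actual implementation) might look like:

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape.
    Returns 1.0 when both masks are empty (perfect agreement by convention)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    # Twice the overlapping voxel count, normalized by the combined mask sizes
    return 2.0 * np.logical_and(a, b).sum() / total
```

In this sense, the predicate's best mean DSC of 0.88 means the compared contours share 88% of their combined voxel count.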

    | Acceptance Criteria / Performance Metric | Target Value / Threshold | Reported Device Performance and Confidence Interval (CI) or P-value |
    | --- | --- | --- |
    | Proportion of FLAIR AI DSC measurements exceeding predicate's best mean DSC (0.88) | Significantly different from 50% (P < 0.05) | 85% exceeding (CI: 72%-92%), P < 0.001 |
    | Proportion of T1c AI DSC measurements exceeding predicate's best mean DSC (0.88) | Significantly different from 50% (P < 0.05) | 93% exceeding (CI: 82%-98%), P < 0.001 |
    | Mean Overall DICE Score (DSC) for T1c AI | Closely matching radiologists' scores | 0.95 (CI: 93%-96%) |
    | Mean Overall DICE Score (DSC) for FLAIR AI | Closely matching radiologists' scores | 0.92 (CI: 90%-94%) |
    | Mean DSC for True Positive T1c images (AI segmentation) | Comparable to radiologists' range | 83% (Radiologists: 76%-86%) |
    | Mean DSC for True Positive FLAIR images (AI segmentation) | Comparable to radiologists' range | 80% (Radiologists: 75%-83%) |
    | Sensitivity for T1c AI | Not explicitly stated as acceptance criteria, but reported | 92.7% |
    | Specificity for T1c AI | Not explicitly stated as acceptance criteria, but reported | 97.2% |
    | Median Sensitivity for FLAIR AI | Not explicitly stated as acceptance criteria, but reported | 93.4% |
    | Median Specificity for FLAIR AI | Not explicitly stated as acceptance criteria, but reported | 98.6% |
    | Mean Hausdorff distances | < 5 mm and aligning closely with radiologists' measurements | < 5 mm and aligned with radiologists' measurements |
    | Volume measurements, kappa scores, and Bland-Altman differences | Aligning closely with radiologists' measurements | Aligned closely with radiologists' measurements |
    | Inter-user variability between radiologists (T1c) | Low (explicit criteria not specified, but reported as an indicator of ground truth reliability) | < 5% |
    | Inter-user variability between radiologists (FLAIR) | Low (explicit criteria not specified, but reported as an indicator of ground truth reliability) | < 10% |
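The P < 0.001 results above come from testing whether the proportion of cases exceeding the 0.88 threshold differs from 50%. The submission does not name its exact statistical procedure; an exact two-sided binomial test of this kind can be sketched with the standard library (the counts below are hypothetical, since per-case counts are not reported):

```python
from math import comb

def binom_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial test: the P-value is the total probability,
    under H0: p = p0, of all outcomes no more likely than the observed k."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Sum over all outcomes at most as probable as the observed count
    return min(1.0, sum(p for p in pmf if p <= observed + 1e-12))

# Hypothetical illustration: 39 of 46 cases above threshold (~85%)
p_value = binom_two_sided_p(39, 46)
```

A result far from 23/46 (the 50% expectation) drives the P-value well below 0.001, consistent with the magnitude reported in the table.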

    2. Sample Sizes Used for the Test Set and Data Provenance

    • Test Set Sample Size: 46 pre- and post-operative MRIs from patients diagnosed with glioblastoma multiforme. The text mentions "46 pre- and post-operative MRIs," suggesting that each patient likely contributed at least one pre- and one post-operative MRI, but the exact number of patients is not explicitly stated.
    • Data Provenance: The test MRIs were obtained from 19 centers in the United States, including 13 community hospitals and clinics, 4 imaging centers, and 2 university hospitals and clinics. Specific locations listed: University of Alabama at Birmingham Hospital and Clinics, MD Anderson Cancer Center, St Vincent Hospital (Birmingham, AL), Southwest Diagnostic Imaging Center (Dallas, TX), Thomas Medical Center (Fairhope, AL), Carmichael Imaging Center (Montgomery, AL), East Alabama Medical Center (Opelika, AL), St Dominic (Jackson, MS), Mobile Infirmary (Mobile, AL), North Mississippi Medical Center (Tupelo, MS), Sacred Heart Airport Medical Center (Pensacola, FL), SHHP, LX DCH (Tuscaloosa, AL), Leeds Imaging Center (Leeds, AL), Black Warrior Medical Center (Tuscaloosa, AL), American Health Imaging (Birmingham, AL), Main, Floyd Medical Center (Rome, GA), Trinity Medical Center (Birmingham, AL), Jackson Hospital (Jackson, MS).
    • Retrospective or Prospective: Not explicitly stated, but the mention of "obtained at" various centers suggests these were existing scans, implying a retrospective study.

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

    • Number of Experts: Three.
    • Qualifications: Board-certified neuro-radiologists with expertise in measuring glioblastoma multiforme.

    4. Adjudication Method for the Test Set

    The text states that the ground truth was established by comparing AI contours to manual segmentations by "three neuro-radiologists, who used the MRIMath smart manual contouring platform." However, it does not specify an adjudication method (e.g., 2+1, 3+1 consensus, averaging, etc.) for resolving discrepancies among the three readers if they existed. The "kappa scores" mentioned and "inter-user variability" suggest direct comparison of their outputs, implying that each expert's contour was used to assess both the AI's performance and inter-reader agreement, rather than creating a single adjudicated ground truth for each case. Given the wording, it appears the AI was compared against each neuro-radiologist's segmentation, and then the overall performance statistics (mean DSC, sensitivity, specificity) were derived from these comparisons.
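The kappa scores and inter-user variability mentioned here are standard agreement statistics. As an illustration only (the submission does not describe its exact computation), Cohen's kappa for two binary label sequences, e.g. flattened voxel masks from two readers, could be computed as:

```python
def cohen_kappa_binary(labels_a, labels_b):
    """Cohen's kappa for two binary raters: (p_o - p_e) / (1 - p_e),
    where p_o is observed agreement and p_e is chance agreement."""
    n = len(labels_a)
    # Observed agreement: fraction of positions where the raters match
    p_o = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    pa = sum(labels_a) / n
    pb = sum(labels_b) / n
    # Chance agreement from each rater's marginal positive rate
    p_e = pa * pb + (1 - pa) * (1 - pb)
    if p_e == 1.0:
        return 1.0  # degenerate case: both raters constant and identical
    return (p_o - p_e) / (1 - p_e)
```

Kappa near 1 between each pair of radiologists would support the low inter-user variability (< 5% T1c, < 10% FLAIR) reported as an indicator of ground-truth reliability.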

    5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance

    No, an MRMC comparative effectiveness study was not explicitly done to evaluate human reader improvement with AI assistance. The study focuses on the standalone performance of the AI model and its alignment with human expert segmentations, rather than on how the AI helps human readers perform better. The human readers' segmentations serve as the reference "ground truth" to validate the AI, not as a baseline for measuring AI-assisted improvement.

    6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

    Yes, a standalone performance evaluation of the algorithm was performed. The study evaluates the "accuracy of the MRIMath i2Contour FLAIR and T1c AI contours" by comparing "their outputs with the manual segmentations by three board certified neuro-radiologists." This means the AI generated its segmentations independently, and these were then compared against the expert manual segmentations.

    7. The Type of Ground Truth Used

    The ground truth used was expert consensus (or expert individual segmentation, considering the lack of specific adjudication) derived from manual segmentations performed by three board-certified neuro-radiologists using the MRIMath smart manual contouring platform. The patients were "known to have pathologically proven glioblastoma," meaning the disease ground truth (presence of GBM) was established via pathology. The "ground truth" for the segmentation itself was the expert manual contour.

    8. The Sample Size for the Training Set

    The training set sample size is not explicitly stated in the provided text. The text only mentions the testing dataset of 46 MRIs.

    9. How the Ground Truth for the Training Set Was Established

    The provided text does not detail how the ground truth for the training set was established. It only describes the ground truth establishment for the test set.


    K Number
    K132847
    Device Name
    I2C
    Date Cleared
    2013-11-20

    (70 days)

    Product Code
    Regulation Number
    892.5050
    Reference & Predicate Devices
    Predicate For
    AI/ML, SaMD, IVD (In Vitro Diagnostic), Therapeutic, Diagnostic, is PCCP Authorized, Third-party, Expedited review
    Intended Use

    I2C is used with a charged particle or photon radiation therapy system for localization of the patient position with respect to the therapy equipment and to provide correction feedback to the radiation therapy device.

    Device Description

    For clinical use, I2C must be integrated into a radiation therapy system. I2C will interact with components of the radiation therapy center. I2C supports the acquisition of 2D, 2D stereoscopic, and 3D images using 2D detectors. I2C will be used by the clinical therapist to verify by imaging that the treatment target position received from the treatment control applicative layer is 'valid', i.e. that it brings the center of the treatment target volume to the isocenter of the therapy equipment with the required accuracy. If it is not, I2C will propose a correction shift, or correction vector, that will be exported to the radiation therapy system.
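The correction-vector concept in this description amounts to simple vector arithmetic: the shift that would move the detected target-volume center onto the machine isocenter, with the setup considered 'valid' when it is already within tolerance. A minimal sketch, where the function name and the 1 mm default (borrowed from the achievable-matching-accuracy claim) are illustrative, not IBA's API:

```python
import math

def correction_vector(target_center_mm, isocenter_mm, tolerance_mm=1.0):
    """Shift (dx, dy, dz) that would bring the target center to the
    isocenter; the setup is 'valid' if it is already within tolerance."""
    shift = tuple(i - t for t, i in zip(target_center_mm, isocenter_mm))
    valid = math.hypot(*shift) <= tolerance_mm
    return shift, valid
```

If the position is not valid, the computed shift is what would be exported to the radiation therapy system as the proposed correction.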

    AI/ML Overview

    Here's a summary of the acceptance criteria and study information for the I2C device:

    1. Table of Acceptance Criteria and Reported Device Performance

    | Performance / Technological Specification | Acceptance Criteria (Predicate Devices) | Reported Device Performance (I2C) |
    | --- | --- | --- |
    | Generator operating range (radiographic) | 40-150 kVp | 40-150 kVp |
    | Generator operating range (CBCT) | 60-140 kVp (OBI) | 40-125 kVp |
    | Flat panel pixel size | 127 µm (Verisuite) / 194 µm (OBI) | 148 µm |
    | Flat panel pixel matrix | 3200x3200 pixels (Verisuite) / 3200x2304 pixels (OBI) | > 2880x2880 pixels |
    | CBCT scale & distance accuracy | 1% (OBI) | 1% |
    | CBCT spatial resolution | 4-7 lp/cm (OBI) | At least 5 lp/cm |
    | CBCT low contrast resolution | 15 mm @ 1% (OBI) | 15 mm @ 1% |
    | CBCT numbers accuracy | +/- 40 HU (OBI) | +/- 40 HU |
    | CBCT uniformity | +/- 40 HU (OBI) | +/- 40 HU |
    | Achievable matching accuracy | < 1 mm (Verisuite) / 1-2 mm (ExacTrac) | < 1 mm |

    2. Sample Size Used for the Test Set and Data Provenance

    The document does not specify a distinct "test set" sample size in the traditional sense. Instead, it describes various verification and validation activities:

    • Simulated Clinical Environment: The X-Ray imaging equipment was installed on a test bench with a phantom to represent different configuration setups and simulate gantry rotation.
    • Communication Testing: A second test environment was used to verify communication with different third-party software configurations (Elekta Mosaiq, Varian Aria).
    • Additional Performance Tests: Conducted on a stand-alone system using:
      • Appropriate datasets collected from simulated treatments.
      • Radiographs of phantoms acquired in IBA treatment centers.
      • Anonymized patient data provided by IBA treatment centers.
    • User Evaluation: Intermediate releases were distributed to a group of "α-users" (reference users in proton therapy) to assess usability.

    The data provenance for the additional performance tests includes:

    • Simulated treatments.
    • Phantom data acquired in IBA treatment centers.
    • Anonymized patient data (from IBA treatment centers).
      The document does not explicitly state the country of origin for the patient data, but "IBA treatment centers" suggests it likely comes from facilities where IBA technology is used. The data appears to be retrospective in nature for these tests, as it mentions "anonymised patient data provided by IBA treatment centers."

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

    The document does not specify the number or qualifications of experts used to establish ground truth for the test set. The validation primarily relies on performance metrics derived from physical phantoms, simulated scenarios, and anonymized patient data.

    4. Adjudication Method for the Test Set

    No specific adjudication method (e.g., 2+1, 3+1) is mentioned for the test set. The evaluation seems to be based on direct measurement of performance metrics against predefined technological specifications and comparison to predicate devices, rather than a consensus-based expert review for individual cases.

    5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done

    No, an MRMC comparative effectiveness study is not mentioned. The document focuses on the technical performance of the device itself and its equivalence to predicate devices, not on the improvement of human reader performance with or without AI assistance.

    6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

    Yes, standalone performance tests were done. The document states: "Third, additional performance tests were done on a stand-alone system with appropriate datasets collected from simulated treatments and radiographs of phantom acquired in IBA treatment centres, and from anonymised patient data provided by IBA treatment centers."

    7. The Type of Ground Truth Used

    The ground truth used for these non-clinical tests appears to be primarily:

    • Physical measurements/known values from phantoms: For accuracy, resolution, contrast, and uniformity tests.
    • Simulated treatment parameters: For evaluating the device's ability to process and generate correction vectors in controlled scenarios.
    • Anonymized patient data: Used as input for the standalone system, likely comparing its output (e.g., calculated shifts) against expected or clinically established values, though the exact method of ground truth for patient data isn't detailed.
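For the phantom-based portion of this ground truth, one plausible acceptance check (hypothetical; the document does not describe its procedure) is to apply a known displacement to a phantom, let the software compute the correction, and require the residual against the known shift to stay within the claimed < 1 mm matching accuracy:

```python
import math

def passes_matching_accuracy(computed_shift_mm, known_shift_mm, tol_mm=1.0):
    """Residual between the computed correction and the physically applied
    phantom shift; pass when the 3D error is below the tolerance."""
    error = math.dist(computed_shift_mm, known_shift_mm)
    return error < tol_mm
```

Here the known, physically applied shift plays the role of ground truth, matching the document's reliance on physical measurements rather than expert review.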

    8. The Sample Size for the Training Set

    The document does not specify a separate "training set" or its sample size. The focus is on the verification and validation of the developed system, suggesting that the algorithm's training (if any involving machine learning) was either done prior to these V&V activities or is not detailed in this summary.

    9. How the Ground Truth for the Training Set Was Established

    As no training set is explicitly mentioned or detailed, the method for establishing its ground truth is not provided in this document.
