510(k) Data Aggregation (251 days)
MRIMath i2contour is intended for the semi-automatic labeling, visualization, and volumetric quantification of WHO grade 4 glioblastoma (GBM) from a set of standard MRI images of male or female patients 18 years of age or older who are known to have pathologically proven glioblastoma. Volumetric measurements may be compared to past measurements if available. MRIMath i2contour is not to be used for primary diagnosis and is not intended to be the sole diagnostic metric.
The MRIMath i2Contour is a web-based software platform designed for the contouring and segmentation of the T1c and FLAIR sequences of the MRIs of patients already diagnosed with GBM. It combines AI with a user interface (UI) for review, manual contouring, and approval. The software is intended to be used by trained medical professionals as an aid in the tumor contouring process. Review by a trained professional is a requirement for completion.
The AI algorithm within MRIMath i2Contour generates an initial tumor contour, which serves as a starting point for medical professionals to complete the contouring process manually. The software does not alter the original MRI images and is not intended for tumor detection or diagnostic purposes. MRIMath i2Contour is specifically designed to generate tumor volume contours for GBM and is not intended for use with images of other brain tumor types.
Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided text:
Device: i2Contour
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria are implicit: the device's performance is compared to the predicate device's best mean Dice score (DSC), with success defined as a statistically significant improvement over a 50% chance of exceeding that threshold.
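The statistical test described above (is the proportion of cases exceeding the predicate's threshold significantly different from 50%?) is, in effect, an exact binomial test. A minimal sketch, using assumed illustrative counts (39 of 46 FLAIR cases exceeding the 0.88 threshold, roughly the reported 85%) rather than the submission's actual data:

```python
import math

def binom_p_one_sided(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least
    k cases exceed the threshold if each case only had probability p
    of exceeding it."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Assumed illustrative counts, not the submission's raw data:
p_value = binom_p_one_sided(39, 46)
print(f"{p_value:.2e}")  # well below the P < 0.001 reported
```

With 39 of 46 successes the observed proportion sits several standard deviations above the 50% null, which is why the reported p-values are so small.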
| Acceptance Criteria / Performance Metric | Target Value / Threshold | Reported Device Performance and Confidence Interval (CI) or P-value |
|---|---|---|
| Proportion of FLAIR AI DSC measurements exceeding predicate's best mean DSC (0.88) | Significantly different from 50% (P < 0.05) | 85% exceeding (CI: 72%-92%), P < 0.001 |
| Proportion of T1c AI DSC measurements exceeding predicate's best mean DSC (0.88) | Significantly different from 50% (P < 0.05) | 93% exceeding (CI: 82%-98%), P < 0.001 |
| Mean Overall Dice Score (DSC) for T1c AI | Closely matching radiologists' scores | 0.95 (CI: 0.93-0.96) |
| Mean Overall Dice Score (DSC) for FLAIR AI | Closely matching radiologists' scores | 0.92 (CI: 0.90-0.94) |
| Mean DSC for True Positive T1c images (AI segmentation) | Comparable to radiologists' range | 83% (Radiologists: 76%-86%) |
| Mean DSC for True Positive FLAIR images (AI segmentation) | Comparable to radiologists' range | 80% (Radiologists: 75%-83%) |
| Sensitivity for T1c AI | Not explicitly stated as acceptance criteria, but reported | 92.7% |
| Specificity for T1c AI | Not explicitly stated as acceptance criteria, but reported | 97.2% |
| Median Sensitivity for FLAIR AI | Not explicitly stated as acceptance criteria, but reported | 93.4% |
| Median Specificity for FLAIR AI | Not explicitly stated as acceptance criteria, but reported | 98.6% |
| Mean Hausdorff distances | < 5 mm and aligning closely with radiologists' measurements | < 5 mm and aligned with radiologists' measurements |
| Volume measurements, kappa scores, and Bland-Altman differences | Aligning closely with radiologists' measurements | Aligned closely with radiologists' measurements |
| Inter-user variability between radiologists (T1c) | Low (explicit criteria not specified, but reported as an indicator of ground truth reliability) | < 5% |
| Inter-user variability between radiologists (FLAIR) | Low (explicit criteria not specified, but reported as an indicator of ground truth reliability) | < 10% |
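The per-case metrics in the table above (DSC, sensitivity, specificity) can all be computed from a predicted binary mask and a reference binary mask. A minimal sketch on toy 1-D masks; the flattened lists stand in for full 3-D voxel masks and are assumed, not taken from the submission:

```python
from typing import Sequence

def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def sensitivity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """TP / (TP + FN): fraction of true tumor voxels the AI labeled."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    return tp / (tp + fn)

def specificity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """TN / (TN + FP): fraction of background voxels kept as background."""
    tn = sum((not p) and (not t) for p, t in zip(pred, truth))
    fp = sum(p and (not t) for p, t in zip(pred, truth))
    return tn / (tn + fp)

# Toy masks (assumed data): 1 = tumor voxel, 0 = background
truth = [0, 1, 1, 1, 0, 0, 1, 0]
pred  = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
```

In practice these would run over 3-D voxel arrays; flattening them first, as here, gives identical numbers.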
2. Sample Sizes Used for the Test Set and Data Provenance
- Test Set Sample Size: 46 pre- and post-operative MRIs from patients diagnosed with glioblastoma multiforme. The exact number of patients is not explicitly stated; the phrase "46 pre- and post-operative MRIs" suggests some patients contributed both a pre- and a post-operative scan.
- Data Provenance: The test MRIs were obtained from 19 centers in the United States, including 13 community hospitals and clinics, 4 imaging centers, and 2 university hospitals and clinics. Specific locations listed: University of Alabama at Birmingham Hospital and Clinics, MD Anderson Cancer Center, St Vincent Hospital (Birmingham, AL), Southwest Diagnostic Imaging Center (Dallas, TX), Thomas Medical Center (Fairhope, AL), Carmichael Imaging Center (Montgomery, AL), East Alabama Medical Center (Opelika, AL), St Dominic (Jackson, MS), Mobile Infirmary (Mobile, AL), North Mississippi Medical Center (Tupelo, MS), Sacred Heart Airport Medical Center (Pensacola, FL), SHHP, LX DCH (Tuscaloosa, AL), Leeds Imaging Center (Leeds, AL), Black Warrior Medical Center (Tuscaloosa, AL), American Health Imaging (Birmingham, AL), Main, Floyd Medical Center (Rome, GA), Trinity Medical Center (Birmingham, AL), Jackson Hospital (Jackson, MS).
- Retrospective or Prospective: Not explicitly stated, but the mention of "obtained at" various centers suggests these were existing scans, implying a retrospective study.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- Number of Experts: Three.
- Qualifications: Board-certified neuro-radiologists with expertise in measuring glioblastoma multiforme.
4. Adjudication Method for the Test Set
The text states that the ground truth was established by comparing AI contours to manual segmentations by "three neuro-radiologists, who used the MRIMath smart manual contouring platform." It does not, however, specify an adjudication method (e.g., 2+1, 3+1 consensus, or averaging) for resolving discrepancies among the three readers. The mention of "kappa scores" and "inter-user variability" suggests that each expert's contour was used both to assess the AI's performance and to measure inter-reader agreement, rather than to create a single adjudicated ground truth per case. In other words, the AI appears to have been compared against each neuro-radiologist's segmentation individually, with the overall performance statistics (mean DSC, sensitivity, specificity) derived from these comparisons.
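The kappa scores mentioned above are agreement statistics between readers. A sketch of Cohen's kappa for two readers' binary voxel labels, with assumed toy labels; the submission does not specify which kappa variant was actually used:

```python
def cohens_kappa(a: list, b: list) -> float:
    """Cohen's kappa for two raters' binary labels: observed agreement
    corrected for the agreement expected by chance alone."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n            # each rater's 'tumor' rate
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)       # chance agreement
    return (po - pe) / (1 - pe)

# Toy voxel labels from two readers (assumed, illustrative only):
reader1 = [1, 1, 0, 0, 1]
reader2 = [1, 0, 0, 0, 1]
print(round(cohens_kappa(reader1, reader2), 3))  # 0.615
```

A kappa near 1 indicates the readers' contours are nearly interchangeable, which is what low inter-user variability (< 5% for T1c, < 10% for FLAIR) implies about the reliability of the reference segmentations.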
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of Human Reader Improvement with vs. Without AI Assistance
No, an MRMC comparative effectiveness study was not done to evaluate human reader improvement with AI assistance. The study focuses on the standalone performance of the AI model and its alignment with human expert segmentations, rather than on how the AI helps human readers perform better. The human readers' segmentations serve as the reference "ground truth" for validating the AI, not as a baseline for measuring AI-assisted improvement.
6. If a Standalone (i.e., Algorithm-Only, Without Human-in-the-Loop) Performance Evaluation Was Done
Yes, a standalone performance evaluation of the algorithm was performed. The study evaluates the "accuracy of the MRIMath i2Contour FLAIR and T1c AI contours" by comparing "their outputs with the manual segmentations by three board certified neuro-radiologists." This means the AI generated its segmentations independently, and these were then compared against the expert manual segmentations.
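A standalone evaluation of this kind reduces to scoring the AI's fixed output against each reader's segmentation on each case and aggregating. A minimal sketch with assumed toy masks (two cases, three readers each), not the submission's data:

```python
from statistics import mean

def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def standalone_mean_dsc(ai_masks, reader_masks_per_case):
    """Mean DSC of the AI against every reader on every case: each case
    contributes one AI-vs-reader comparison per reader, mirroring a
    standalone (no human-in-the-loop) evaluation."""
    scores = [dice(ai, ref)
              for ai, refs in zip(ai_masks, reader_masks_per_case)
              for ref in refs]
    return mean(scores)

# Two toy cases, three readers each (assumed data):
ai = [[1, 1, 0, 0], [0, 1, 1, 0]]
readers = [
    [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]],  # case 1
    [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 1]],  # case 2
]
print(round(standalone_mean_dsc(ai, readers), 3))  # 0.822
```

Scoring against each reader separately, rather than against one adjudicated reference, matches the comparison structure the adjudication section above infers from the text.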
7. The Type of Ground Truth Used
The ground truth used was expert consensus (or expert individual segmentation, considering the lack of specific adjudication) derived from manual segmentations performed by three board-certified neuro-radiologists using the MRIMath smart manual contouring platform. The patients were "known to have pathologically proven glioblastoma," meaning the disease ground truth (presence of GBM) was established via pathology. The "ground truth" for the segmentation itself was the expert manual contour.
8. The Sample Size for the Training Set
The training set sample size is not explicitly stated in the provided text. The text only mentions the testing dataset of 46 MRIs.
9. How the Ground Truth for the Training Set Was Established
The provided text does not detail how the ground truth for the training set was established. It only describes the ground truth establishment for the test set.