K Number: K250035
Manufacturer: MIM Software Inc.
Date Cleared: 2025-02-03 (27 days)
Product Code:
Regulation Number: 892.2050
Panel: RA
Reference & Predicate Devices: N/A
Intended Use

Trained medical professionals use Contour ProtégéAI as a tool to assist in the automated processing of digital medical images of the CT and MR modalities, as supported by ACR/NEMA DICOM 3.0. In addition, Contour ProtégéAI supports the following indications:

· Creation of contours using machine-learning algorithms for applications including, but not limited to, quantitative analysis, aiding adaptive therapy, transferring contours to radiation therapy treatment planning systems, and archiving contours for patient follow-up and management.

· Segmenting anatomical structures across a variety of CT anatomical locations.

· Segmenting the prostate, the seminal vesicles, and the urethra within T2-weighted MR images.

Appropriate image visualization software must be used to review and, if necessary, edit results automatically generated by Contour ProtégéAI.
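
The intended use assumes CT and MR input conforming to ACR/NEMA DICOM 3.0. As a rough illustration only, and not MIM's implementation, the sketch below shows one common way to load a DICOM CT series into a Hounsfield-unit volume ready for a segmentation algorithm, using pydicom; the directory layout, `.dcm` file extension, and stacking convention are assumptions.

```python
# Minimal sketch (not the Contour ProtégéAI implementation): load a DICOM CT
# series into a z-ordered Hounsfield-unit volume using pydicom.
from pathlib import Path

import numpy as np
import pydicom


def load_ct_series(series_dir: str) -> np.ndarray:
    """Read every .dcm slice in a directory and stack it along the patient z-axis."""
    slices = [pydicom.dcmread(p) for p in Path(series_dir).glob("*.dcm")]
    # Order slices by their z position (ImagePositionPatient is present for CT slices).
    slices.sort(key=lambda ds: float(ds.ImagePositionPatient[2]))
    volume = np.stack([ds.pixel_array.astype(np.float32) for ds in slices])
    # Convert stored pixel values to Hounsfield units via the DICOM rescale tags.
    return volume * float(slices[0].RescaleSlope) + float(slices[0].RescaleIntercept)


if __name__ == "__main__":
    ct = load_ct_series("example_ct_series")  # hypothetical directory name
    print(ct.shape, ct.min(), ct.max())
```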

Device Description

Contour ProtégéAI+ is an accessory to MIM software that automatically creates contours on medical images using machine-learning algorithms. It is designed for use in the processing of medical images and operates on Windows, Mac, and Linux computer systems. Contour ProtégéAI+ is deployed either on a remote server using the MIMcloud service for data management and transfer, or locally on the workstation or server running MIM software.
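
The description covers two deployment modes: processing on a remote server reached through MIMcloud, or processing locally alongside MIM software. The dispatcher below is a purely hypothetical sketch of that pattern; the function names, endpoint URL, and payload handling are invented for illustration and do not reflect MIM's or MIMcloud's actual interfaces.

```python
# Hypothetical illustration of the two deployment modes described above; none of
# these names correspond to actual MIM software or MIMcloud interfaces.
import numpy as np


def segment_locally(volume: np.ndarray) -> np.ndarray:
    """Placeholder for a model run on the local workstation or server."""
    return np.zeros(volume.shape, dtype=np.uint8)  # stand-in for real contours


def segment_remotely(volume: np.ndarray, endpoint: str) -> np.ndarray:
    """Placeholder for shipping the study to a remote server and awaiting contours."""
    raise NotImplementedError(f"Would upload {volume.shape} voxels to {endpoint}")


def create_contours(volume: np.ndarray, use_remote: bool) -> np.ndarray:
    # The same segmentation request can be served by either back end.
    if use_remote:
        return segment_remotely(volume, endpoint="https://cloud.example.invalid/segment")
    return segment_locally(volume)
```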

AI/ML Overview

Here's a breakdown of Contour ProtégéAI+'s acceptance criteria and study information, based on the provided text:

Acceptance Criteria and Device Performance

The acceptance criteria for each structure's inclusion in the final models were a combination of statistical tests and user evaluation:

| Acceptance Criteria | Reported Device Performance (Contour ProtégéAI+) |
| --- | --- |
| Statistical non-inferiority of the Dice score compared with the reference predicate (MIM Atlas). | For most structures, the Contour ProtégéAI+ Dice score mean and 95th percentile confidence bound were equivalent to or better than the MIM Atlas. Equivalence was defined as the lower 95th percentile confidence bound of Contour ProtégéAI+ being no more than 0.1 Dice below the mean MIM Atlas performance. Results are shown in Table 2, with '*' indicating demonstrated equivalence. |
| Statistical non-inferiority of the Mean Distance to Agreement (MDA) score compared with the reference predicate (MIM Atlas). | For most structures, the Contour ProtégéAI+ MDA score mean and 95th percentile confidence bound were equivalent to or better than the MIM Atlas, with equivalence defined analogously relative to mean MIM Atlas performance. Results are shown in Table 2, with '*' indicating demonstrated equivalence. |
| Average user evaluation of 2 or higher (on a three-point scale: 1 = negligible, 2 = moderate, 3 = significant time savings). | The "External Evaluation Score" (Table 2) is consistently 2 or higher across all listed structures, indicating moderate to significant time savings. |
| (For models as a whole) Statistically non-inferior cumulative Added Path Length (APL) compared to the reference predicate. | For all 4.2.0 CT models (Thorax, Abdomen, Female Pelvis, SurePlan MRT), equivalence in cumulative APL was demonstrated (Table 3), with Contour ProtégéAI+ showing lower mean APL values than MIM Atlas. |
| (For localization accuracy) No specific passing criterion; results are reported for information. | Localization accuracy results (Table 4) are reported as the percentage of images successfully localized for both "Relevant FOV" and "Whole Body CT," ranging from 77% to 100% depending on the structure and model. |

Note: Cells highlighted in orange in the original document (highlighting not reproducible here) indicate that equivalence was not demonstrated, and cells marked with '**' indicate that equivalence was not demonstrated because the minimum sample size was not met for that contour.
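
For context on the first two table rows, the sketch below shows one common way to compute a Dice coefficient and a symmetric mean distance to agreement (MDA) between an automatic contour and a ground-truth contour given binary masks. The symmetric surface-distance formulation and the isotropic 1 mm spacing are assumptions for illustration, not details taken from the submission.

```python
# Illustrative implementations of the Dice and MDA metrics for binary masks.
# Assumes isotropic 1 mm voxels; not code from the 510(k) submission.
import numpy as np
from scipy import ndimage


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())


def surface(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels: inside the mask but not in its eroded interior."""
    mask = mask.astype(bool)
    return mask & ~ndimage.binary_erosion(mask)


def mean_distance_to_agreement(a: np.ndarray, b: np.ndarray, spacing_mm: float = 1.0) -> float:
    """Symmetric mean surface distance between two binary masks, in millimetres."""
    surf_a, surf_b = surface(a), surface(b)
    # Distance from every voxel to the nearest boundary voxel of the other mask.
    dist_to_b = ndimage.distance_transform_edt(~surf_b) * spacing_mm
    dist_to_a = ndimage.distance_transform_edt(~surf_a) * spacing_mm
    return float(np.concatenate([dist_to_b[surf_a], dist_to_a[surf_b]]).mean())
```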

Study Details

  1. Sample size used for the test set and the data provenance:

    • Test Set Sample Size: The Contour ProtégéAI+ subject device was evaluated on a pool of 770 images.
    • Data Provenance: The images were gathered from 32 institutions. The verification data used for testing come from a set of institutions entirely disjoint from those whose data were used to train each model. Patient demographics for the testing data are: 53.4% female, 31.3% male, 15.3% unknown; 0.3% ages 0-20, 4.7% ages 20-40, 20.9% ages 40-60, 50.0% ages 60+, 24.1% unknown; scanner manufacturers vary (GE, Siemens, Philips, Toshiba, unknown). The data are retrospective, originating from clinical treatment plans according to the training set description.
  2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

    • The document states that Dice and MDA were measured against "original ground-truth contours" in the comparison with MIM Maestro, but expert qualifications are described explicitly only for the training-set ground truth; a similar standard is implied, though not stated, for the test set.
    • Ground truth (for training/re-segmentation) was established by:
      • Consultants (physicians and dosimetrists) specifically for this purpose, outside of clinical practice.
      • Initial segmentations were reviewed and corrected by radiation oncologists.
      • Final review and correction by qualified staff at MIM Software (MD or licensed dosimetrists).
      • All segmenters and reviewers were instructed to ensure the highest quality training data according to relevant published contouring guidelines.
  3. Adjudication method for the test set:

    • The document doesn't explicitly describe a specific adjudication method like "2+1" or "3+1" for the test set ground truth. However, it does state that "Detailed instructions derived from relevant published contouring guidelines were prepared for the dosimetrists. The initial segmentations were then reviewed and corrected by radiation oncologists against the same standards and guidelines. Qualified staff at MIM Software (MD or licensed dosimetrists) then performed a final review and correction." This process implies a multi-expert review and correction process to establish the ground truth used for both training and evaluation, ensuring a high standard of accuracy.
  4. Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI versus without AI assistance:

    • A direct MRMC comparative effectiveness study measuring human readers' improvement with AI versus without AI assistance (i.e., human-in-the-loop performance) is not explicitly described in terms of effect size.
    • Instead, the study evaluates the standalone performance of the AI device (Contour ProtégéAI+) against a reference device (MIM Maestro atlas segmentation) and user evaluation of time savings.
    • The "Average user evaluation of 2 or higher" on a three-point scale (1=negligible, 2=moderate, 3=significant time savings) provides qualitative evidence of perceived improvement in workflow rather than a quantitative measure of diagnostic accuracy improvement due to AI assistance. "Preliminary user evaluation conducted as part of testing demonstrated that Contour ProtégéAI+ yields comparable time-saving functionality when creating contours as other commercially available automatic segmentation products."
  5. Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done:

    • Yes, a standalone performance evaluation was conducted. The primary comparisons for Dice score, MDA, and cumulative APL are between the Contour ProtégéAI+ algorithm's output and the ground truth, benchmarked against the standalone performance of the predicate device (MIM Maestro atlas segmentation). The results in Table 2 and Table 3 directly report the algorithm's performance; a sketch of how such a non-inferiority comparison can be checked follows this list.
  6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

    • Expert Consensus Contour (and review): The ground truth was established by expert re-segmentation of images (by consultants, physicians, and dosimetrists) specifically for this purpose, reviewed and corrected by radiation oncologists, and then subjected to a final review and correction by qualified MIM Software staff (MD or licensed dosimetrists). This indicates a robust expert consensus process based on established clinical guidelines.
  7. The sample size for the training set:

    • The document states that the CT images for the training set "were obtained from clinical treatment plans for patients prescribed external beam or molecular radiotherapy". However, it does not provide a specific numerical sample size for the training set; a count (770 images) is given only for the test set. The training images are described only as being "re-segmented by consultants... specifically for this purpose".
  8. How the ground truth for the training set was established:

    • The ground truth for the training set was established through a multi-step expert process:
      • CT images from clinical treatment plans were re-segmented by consultants (physicians and dosimetrists), explicitly for the purpose of creating training data, outside of clinical practice.
      • Detailed instructions from relevant published contouring guidelines were provided to the dosimetrists.
      • Initial segmentations were reviewed and corrected by radiation oncologists against the same standards and guidelines.
      • A final review and correction was performed by qualified staff at MIM Software (MD or licensed dosimetrists).
      • All experts were instructed to spend additional time to ensure the highest quality training data, contouring all specified OAR structures on all images according to referenced standards.
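
As noted in item 5 above, the standalone comparison hinges on a lower 95th percentile confidence bound clearing a 0.1 Dice margin below the predicate's mean. The submission does not state how that bound was computed; the sketch below checks the rule using a percentile-bootstrap lower bound on the device's mean per-case Dice, which is an assumed (not documented) choice, with synthetic example scores.

```python
# Assumed (not documented) check of the non-inferiority rule: the device's lower
# 95% confidence bound on mean Dice must exceed the predicate mean minus 0.1.
import numpy as np


def bootstrap_lower_bound(scores: np.ndarray, n_boot: int = 10_000, alpha: float = 0.05) -> float:
    """Percentile-bootstrap lower confidence bound on the mean of per-case scores."""
    rng = np.random.default_rng(0)
    boot_means = np.array(
        [rng.choice(scores, size=scores.size, replace=True).mean() for _ in range(n_boot)]
    )
    return float(np.quantile(boot_means, alpha))


def non_inferior(device_dice: np.ndarray, atlas_dice: np.ndarray, margin: float = 0.1) -> bool:
    """True when the device's lower bound clears the atlas mean minus the margin."""
    return bootstrap_lower_bound(device_dice) > atlas_dice.mean() - margin


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    device = np.clip(rng.normal(0.85, 0.05, 40), 0.0, 1.0)  # synthetic example scores
    atlas = np.clip(rng.normal(0.80, 0.08, 40), 0.0, 1.0)
    print(non_inferior(device, atlas))
```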

§ 892.2050 Medical image management and processing system.

(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).