K Number
K220598
Manufacturer
Date Cleared
2022-08-24

(175 days)

Product Code
Regulation Number
892.2050
Panel
RA
Reference & Predicate Devices
N/A
Intended Use

AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.

Device Description

As with AutoContour RADAC, the AutoContour RADAC V2 device is software that uses DICOM-compliant image data (CT or MR) as input to (1) automatically contour various structures of interest for radiation therapy treatment planning using machine-learning-based contouring, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system. The deep-learning-based structure models are trained on imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen, and pelvis for adult male and female patients.

AutoContour RADAC V2 consists of 3 main components:

    1. A .NET client application designed to run on the Windows Operating System allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
    2. A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
    3. A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
AI/ML Overview

The provided text describes the acceptance criteria and the study that proves the device, AutoContour RADAC V2, meets these criteria. Here's a breakdown of the requested information:

1. A table of acceptance criteria and the reported device performance

The acceptance criterion for contouring accuracy is measured by the Mean Dice Similarity Coefficient (DSC), which varies based on the estimated volume of the structure.

| Structure Size Category | DSC Acceptance Criteria (Mean) | Reported Device Performance (Mean DSC +/- STD) |
|---|---|---|
| Large volume structures | > 0.80 | 0.94 +/- 0.03 |
| Medium volume structures | > 0.65 | 0.82 +/- 0.09 |
| Small volume structures | > 0.50 | 0.61 +/- 0.14 |

The document also provides detailed DSC results for each contoured structure, which all meet or exceed their respective size category's acceptance criteria. For example, for "A_Aorta" (Large), the reported DSC Mean is 0.91, which is >0.80. For "Brainstem" (Medium), the reported DSC Mean is 0.90, which is >0.65. For "OpticChiasm" (Small), the reported DSC Mean is 0.63, which is >0.50.
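The Dice Similarity Coefficient used as the acceptance metric is defined as DSC = 2|A ∩ B| / (|A| + |B|), where A and B are the voxel sets of the automatic and ground-truth contours. A minimal sketch (masks represented as sets of voxel coordinates; illustrative only, not the submission's actual computation code):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks,
    each given as a set of voxel coordinates.

    DSC = 2*|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap,
    0.0 means no overlap.
    """
    if not mask_a and not mask_b:
        return 1.0  # convention: two empty contours agree perfectly
    intersection = len(mask_a & mask_b)
    return 2.0 * intersection / (len(mask_a) + len(mask_b))
```

For example, two contours of four voxels each that share two voxels score 2*2/(4+4) = 0.5.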

2. Sample size used for the test set and the data provenance

  • CT Test Set:
    • Sample Size: An average of 140 test image sets per CT structure model, corresponding to a 20% holdout from the pooled image sets (the remainder used for training). The specific number of test data sets for each CT structure is provided in the table (e.g., A_Aorta: 60, Bladder: 372).
    • Data Provenance:
      • Country of Origin: Not explicitly stated; the reported racial demographics and cancer prevalence suggest a US-based population. Images were acquired using a Philips Big Bore CT simulator.
      • Retrospective or Prospective: Not explicitly stated, but common in such validation studies, the data is typically retrospective patient data.
      • Demographics: 51.7% male, 48.3% female. Age range: 11-30 (0.3%), 31-50 (6.2%), 51-70 (43.3%), 71-100 (50.3%). Race: 84.0% White, 12.8% Black or African American, 3.2% Other.
      • Clinical Relevance: Data spanned across common radiation therapy treatment subgroups (Prostate, Breast, Lung, Head and Neck cancers).
  • MR Test Set:
    • Sample Size: An average of 16 test image sets per MR structure model. Specific numbers are not provided for each MR structure, but the total validation set for sensitivity and specificity was 16 datasets.
    • Data Provenance:
      • Country of Origin: Massachusetts General Hospital, Boston, MA.
      • Retrospective or Prospective: The text states "These training sets consisted primarily of glioblastoma and astrocytoma cases from the Cancer Imaging Archive (TCIA) Glioma data set." and that "The testing dataset was acquired at a different institution using a different scanner and sequence parameters", implying retrospective data collection from existing archives/institutions.
      • Demographics: 56% Male and 44% Female patients, with ages ranging from 20-80. No Race or Ethnicity data was provided.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

  • Number of Experts: Three clinically experienced experts.
  • Qualifications: Two radiation therapy physicists and one radiation dosimetrist.

4. Adjudication method for the test set

  • Method: Ground truthing of each test dataset was generated manually using consensus (NRG/RTOG) guidelines, as appropriate, by the three clinically experienced experts. This implies a form of expert consensus adjudication.

5. If a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done, if so, what was the effect size of how much human readers improve with AI vs without AI assistance

  • MRMC Study: No, an MRMC comparative effectiveness study involving human readers with and without AI assistance was not conducted. The performance data focuses on the software's standalone accuracy (Dice Similarity Coefficient, sensitivity, and specificity). The text states: "As with the Predicate Device, no clinical trials were performed for AutoContour RADAC V2."

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

  • Standalone Performance: Yes, the primary performance evaluation provided is for the software's standalone performance, measured by the Dice Similarity Coefficient (DSC), sensitivity, and specificity of the auto-generated contours against expert-established ground truth. The study explicitly states, "Further tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." This is a validation of the algorithm's performance.
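Voxel-wise sensitivity and specificity of an auto-generated contour against a ground-truth contour can be sketched as below. This is an illustrative assumption about how such metrics are typically computed for segmentation; the submission does not spell out the exact formulas.

```python
def sensitivity_specificity(auto_mask, truth_mask, all_voxels):
    """Voxel-wise sensitivity and specificity of an automatic
    contour (auto_mask) against expert ground truth (truth_mask).

    Masks and the full image domain (all_voxels) are sets of
    voxel coordinates.
    """
    tp = len(auto_mask & truth_mask)       # truth voxels found
    fn = len(truth_mask - auto_mask)       # truth voxels missed
    fp = len(auto_mask - truth_mask)       # background voxels included
    background = all_voxels - truth_mask
    tn = len(background - auto_mask)       # background correctly excluded
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```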

7. The type of ground truth used

  • Type of Ground Truth: Expert consensus of manually contoured structures, established using NRG/RTOG (Radiation Therapy Oncology Group) guidelines. This is a form of expert consensus.

8. The sample size for the training set

  • CT Training Set: An average of 700 training image sets per CT structure model. The specific number of training data sets for each CT structure is provided in the table (e.g., A_Aorta: 240, Bladder: 1000).
  • MR Training Set: An average of 81 training image sets for MR structure models.
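The roughly 80/20 train/test partition implied above (e.g., an average of 700 training and 140 test image sets per CT structure model) can be sketched as a simple random holdout. This is an assumption about the general procedure; the submission does not detail how the split was performed, only that test sets were removed from the training pool before training began.

```python
import random

def holdout_split(dataset_ids, test_fraction=0.2, seed=0):
    """Randomly hold out test_fraction of dataset IDs for testing.

    Held-out sets are removed from the training pool entirely and
    used exclusively for testing, as the submission describes.
    Returns (train_ids, test_ids).
    """
    rng = random.Random(seed)
    ids = list(dataset_ids)
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]
```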

9. How the ground truth for the training set was established

  • The document does not explicitly state how training-set ground truth was established. It notes that "Datasets used for testing were removed from the training dataset pool before model training began, and used exclusively for testing." Given the consistent methodology described for the test set, the training contours were most likely also manually drawn by clinical experts following established (NRG/RTOG) guidelines, which is standard practice for medical imaging AI.
    • Source for MR Training Data: Primarily glioblastoma and astrocytoma cases from The Cancer Imaging Archive (TCIA) Glioma data set.

§ 892.2050 Medical image management and processing system.

(a)
Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.

(b)

Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).