Search Results
Found 10 results
510(k) Data Aggregation
(90 days)
Radformation, Inc.
AutoContour is intended to assist radiation treatment planners in contouring and reviewing structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour Model RADAC V3, the AutoContour Model RADAC V4 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine learning based contouring. The deep-learning based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour Model RADAC V4 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
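The components above describe a watch-folder architecture: the local agent polls a network location and forwards new datasets to the cloud contouring service. Purely as an illustrative sketch of that pattern (the vendor's actual agent implementation is not described in this summary; the path, polling interval, and upload function below are hypothetical):

```python
import time
from pathlib import Path

WATCH_DIR = Path(r"\\storage\incoming_dicom")  # hypothetical network storage location
POLL_SECONDS = 30                              # hypothetical polling interval

def upload_for_contouring(dataset: Path) -> None:
    """Placeholder for sending a newly found CT/MR dataset to the cloud contouring service."""
    print(f"uploading {dataset} for automatic contouring ...")

def watch_for_new_datasets() -> None:
    """Poll the monitored folder and upload any dataset not seen before."""
    seen: set[Path] = set()
    while True:
        for dataset in sorted(WATCH_DIR.glob("*")):
            if dataset not in seen:
                seen.add(dataset)
                upload_for_contouring(dataset)
        time.sleep(POLL_SECONDS)
```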
Here's an analysis of the acceptance criteria and study findings for the Radformation AutoContour (Model RADAC V4) device, based on the provided text:
1. Acceptance Criteria and Reported Device Performance
The primary acceptance criterion for the automated contouring models is the Dice Similarity Coefficient (DSC), which measures the spatial overlap between the AI-generated contour and the ground truth contour. The criteria vary based on the estimated size of the anatomical structure. Additionally, for external clinical testing, an external reviewer rating was used to assess clinical appropriateness.
Acceptance Criteria Category | Metric | Performance Criterion | Reported Device Performance (Mean ± Std Dev)
---|---|---|---
Contouring Accuracy (CT Models) | Mean Dice Similarity Coefficient (DSC) | Large volume structures: ≥ 0.8 | 0.92 ± 0.06
Contouring Accuracy (CT Models) | Mean Dice Similarity Coefficient (DSC) | Medium volume structures: ≥ 0.65 | 0.85 ± 0.09
Contouring Accuracy (CT Models) | Mean Dice Similarity Coefficient (DSC) | Small volume structures: ≥ 0.5 | 0.81 ± 0.12
Clinical Appropriateness (CT Models) | External reviewer rating (1-5 scale, higher is better) | Average score ≥ 3 | 4.57 (across all CT models)
Contouring Accuracy (MR Models) | Mean Dice Similarity Coefficient (DSC) | Large volume structures: ≥ 0.8 | 0.96 ± 0.03 (training data); 0.80 ± 0.09 (external data)
Contouring Accuracy (MR Models) | Mean Dice Similarity Coefficient (DSC) | Medium volume structures: ≥ 0.65 | 0.84 ± 0.07 (training data); 0.84 ± 0.09 (external data)
Contouring Accuracy (MR Models) | Mean Dice Similarity Coefficient (DSC) | Small volume structures: ≥ 0.5 | 0.74 ± 0.09 (training data); 0.61 ± 0.14 (external data)
Clinical Appropriateness (MR Models) | External reviewer rating (1-5 scale, higher is better) | Average score ≥ 3 | 4.6 (across all MR models)
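The Dice Similarity Coefficient used as the acceptance metric has a standard definition, DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch of how the metric and the size-based acceptance check could be computed from binary contour masks is shown below; it is illustrative only and is not the submission's validation code (the masks and size category are hypothetical):

```python
import numpy as np

# Size-based mean-DSC acceptance thresholds quoted in the submission summary.
DSC_THRESHOLDS = {"large": 0.80, "medium": 0.65, "small": 0.50}

def dice_coefficient(auto_mask: np.ndarray, truth_mask: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    auto, truth = auto_mask.astype(bool), truth_mask.astype(bool)
    denom = auto.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(auto, truth).sum() / denom

def meets_criterion(mean_dsc: float, size_category: str) -> bool:
    """Compare a structure model's mean DSC against its size-based criterion."""
    return mean_dsc >= DSC_THRESHOLDS[size_category]

# Hypothetical example: a ground-truth mask and a slightly perturbed auto contour.
rng = np.random.default_rng(0)
truth = rng.random((64, 64, 64)) > 0.7
auto = truth.copy()
auto[:4] = False  # mimic an under-segmented region
dsc = dice_coefficient(auto, truth)
print(f"DSC = {dsc:.3f}, passes medium-volume criterion: {meets_criterion(dsc, 'medium')}")
```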
2. Sample Size Used for the Test Set and Data Provenance
- CT Models Test Set:
  - Sample Size: For individual CT structure models, the number of testing sets ranged from 10 to 116 for the internal validation (Table 4) and from 13 to 82 for the external clinical testing (Table 6). The document states that "approximately 10% of the number of training image sets" were used for testing in the internal validation, with an average of 54 testing image sets per CT structure model.
  - Data Provenance: Imaging data for training was gathered from 4 institutions in 2 countries (United States and Switzerland). External clinical testing data for CT models was sourced from various TCIA (The Cancer Imaging Archive) datasets (Pelvic-Ref, Head-Neck-PET-CT, Pancreas-CT-CB, NSCLC, LCTSC, QIN-BREAST) and shared by several unidentified institutions in the United States. The data was retrospective, as it was acquired before being used for model validation.
- MR Models Test Set:
  - Sample Size: For internal validation, an average of 45 testing image sets per MR Brain model and 77 testing image sets per MR Pelvis model were used (Table 8); for external clinical testing, the number of testing sets ranged from 5 to 45 per structure model (Table 10).
  - Data Provenance: Imaging data for training and internal testing was acquired from the Cancer Imaging Archive GLIS-RT dataset (for Brain models) and from two open-source datasets plus one institution in the United States (for Pelvis models). External clinical testing data for MR models came from a clinical partner (for Brain models), publicly available datasets (Prostate-MRI-US-Biopsy, Gold Atlas Pelvis, SynthRad), and two institutions utilizing MR Linacs for image acquisition. The data was retrospective.
- General Note: Test datasets were independent from those used for training.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- Number of Experts: Three (3) experts were used.
- Qualifications of Experts: The ground truth was established by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist.
4. Adjudication Method for the Test Set
- Method: Ground truthing of each test data set was generated manually using consensus (NRG/RTOG) guidelines as appropriate by the three experts. This implies an expert consensus method, likely involving discussion and agreement among the three. The document does not specify a quantitative adjudication method like "2+1" or "3+1" but rather a "consensus" guided by established clinical guidelines.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
- The document does not report an MRMC comparative effectiveness study comparing human readers with AI assistance versus without AI assistance. The study focuses purely on the AI's performance and its clinical appropriateness as rated by external reviewers.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was Done
- Yes, a standalone performance evaluation was done. The core of the performance data presented (Dice Similarity Coefficient) is a measure of the algorithm's direct output compared to the ground truth, without a human in the loop during the contour generation phase. The external reviewer ratings also assess the standalone performance of the AI-generated contours regarding their clinical utility for subsequent editing and approval.
7. The Type of Ground Truth Used
- Type: The ground truth used was expert consensus, specifically from three clinically experienced experts (2 radiation therapy physicists and 1 radiation dosimetrist), guided by NRG/RTOG guidelines.
8. The Sample Size for the Training Set
- CT Models Training Set: For CT structure models, there was an average of 341 training image sets.
- MR Models Training Set: For MR Brain models, there was an average of 149 training image sets. For MR Pelvis models, there was an average of 306 training image sets.
9. How the Ground Truth for the Training Set Was Established
The document states that the deep-learning based structure models were "trained using imaging datasets consisting of anatomical organs" and that the "test datasets were independent from those used for training." While it extensively details how ground truth was established for the test sets (manual generation by three experts using consensus and NRG/RTOG guidelines), it does not explicitly describe how the ground truth for the training sets was established. However, given the nature of deep learning models for medical image segmentation, it is highly probable that the training data also had meticulously generated, expert-annotated ground truth contours, likely following similar rigorous processes as the test sets, potentially from various institutions or public datasets. The consistency of the model architecture and training methodologies (e.g., "very similar CNN architecture was used to train these new CT models") suggests a standardized approach to data preparation, including ground truth generation, for both training and testing.
(32 days)
Radformation, Inc.
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour Model RADAC V2, the AutoContour Model RADAC V3 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine learning based contouring. The deep-learning based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour Model RADAC V3 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided document:
1. Table of Acceptance Criteria & Reported Device Performance
Feature/Metric | Acceptance Criteria | Reported Device Performance (Mean DSC / Rating)
---|---|---
CT structures: Large volume DSC | ≥ 0.8 | Initial validation: 0.88 ± 0.06; External validation: 0.90 ± 0.09
CT structures: Medium volume DSC | ≥ 0.65 | Initial validation: 0.88 ± 0.08; External validation: 0.83 ± 0.12
CT structures: Small volume DSC | ≥ 0.5 | Initial validation: 0.75 ± 0.12; External validation: 0.79 ± 0.11
CT structures: Clinical appropriateness (1-5 scale, 5 best) | Average score ≥ 3 | Average rating of 4.5
MR structures: Medium volume DSC | ≥ 0.65 | Initial validation: 0.87 ± 0.07; External validation: 0.87 ± 0.07
MR structures: Small volume DSC | ≥ 0.5 | Initial validation: 0.74 ± 0.07; External validation: 0.74 ± 0.07
MR structures: Clinical appropriateness (1-5 scale, 5 best) | Average score ≥ 3 | Average rating of 4.4
2. Sample Sizes Used for the Test Set and Data Provenance
- CT Test Set (Internal Validation): Approximately 10% of the training images, averaging 50 test images per structure model.
- Provenance: Retrospective data. Among the patients used for CT training and testing, 51.7% were male and 48.3% female; ages: 11-30: 0.3%, 31-50: 6.2%, 51-70: 43.3%, 71-100: 50.3%; race: 84.0% White, 12.8% Black or African American, 3.2% Other. No specific country of origin is mentioned, but the description implies internal company data.
- CT Test Set (External Clinical Validation): Variable per structure model, ranging from 19 to 63 images.
- Provenance: Publicly available CT datasets from The Cancer Imaging Archive (TCIA archive). This suggests diverse, likely multi-national origin, but exact countries are not specified. The studies cited are primarily from US institutions (e.g., Memorial Sloan Kettering Cancer Center, MD Anderson Cancer Center). This data is retrospective.
- MR Test Set (Internal Validation):
- Brain models: 92 testing images (from TCIA GLIS-RT dataset).
- Pelvis models: Sample size not explicitly stated for testing, but refers to "Prostate-MRI-US-Biopsy dataset."
- Provenance: TCIA datasets (implying diverse origin, likely US-centric as above), retrospective.
- MR Test Set (External Clinical Validation):
- Brain models: 20 MR T1 Ax post (BRAVO) image scans acquired from a clinical partner (no specific country mentioned, but likely US given the context).
- Pelvis models: 19 images from the publicly available Gold Atlas dataset. The Gold Atlas project's references indicate collaboration across European and US institutions (e.g., publications in Medical Physics).
- Provenance: Retrospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Number of Experts: Three (3)
- Qualifications: "clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist." No specific years of experience are mentioned.
4. Adjudication Method for the Test Set
- Method: "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts". This implies a consensus-based approach, likely 3-way consensus. If initial contours differed, discussions and adjustments would lead to a final agreed-upon ground truth.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
- No, an MRMC comparative effectiveness study was not explicitly done to measure improvement for human readers with AI vs without AI assistance.
- The study focuses on the performance of the AI algorithm itself (standalone performance) and its clinical appropriateness as rated by experts. The "External Reviewer Average Rating" indicates how much editing would be required by a human, rather than directly measuring human reader performance improvement with assistance.
- "independent reviewers (not employed by Radformation) were used to evaluate the clinical appropriateness of structure models as they would be evaluated for the purposes of treatment planning. This external review was performed as a replacement to intraobserver variability testing done with the RADAC V2 structure models as it better quantified the usefulness of the structure model outputs in an unbiased clinical setting." This suggests an assessment of the usability of the AI-generated contours for human review and modification, but not a direct MRMC study comparing assisted vs. unassisted human performance.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, standalone performance was done.
- The Dice Similarity Coefficient (DSC) metrics presented are a measure of the algorithm's performance in generating contours when compared to expert-defined ground truth, without human intervention during the contour generation process. The "External Reviewer Average Rating" also evaluates the standalone output's quality before any human editing.
7. The Type of Ground Truth Used
- Type of Ground Truth: Expert consensus.
- "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist."
8. The Sample Size for the Training Set
- CT Training Set: Average of 373 training image sets per structure model.
- MR Training Set:
- Brain models: Average of 274 training image sets.
- Pelvis models: Sample size not explicitly stated for training, but refers to "Prostate-MRI-US-Biopsy dataset."
- It's important to note that specific numbers vary per structure, as shown in Table 4 and Table 8.
9. How the Ground Truth for the Training Set Was Established
- The document implies that the training data and their corresponding ground truths were prepared internally prior to the testing phase. While it doesn't explicitly state how the ground truth for the training set was established, it strongly suggests a similar rigorous, expert-driven approach as described for the test sets.
- "The test datasets were independent from those used for training and consisted of approximately 10% of the number of training image sets used as input for the model." This indicates that ground truth was established for both training and testing datasets.
- "Publically available CT datasets from The Cancer Imaging Archive (TCIA archive) were used and both AutoContour and manually added ground truth contours following the same structure guidelines used for structure model training were added to the image sets." This suggests that for publicly available datasets used for both training and external validation, ground truth was added following the same NRG/RTOG guidelines. For proprietary training data, a similar expert-based ground truth creation likely occurred.
(175 days)
Radformation, Inc.
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour RADAC, the AutoContour RADAC V2 device is software that uses DICOM-compliant image data (CT or MR) as input to (1) automatically contour various structures of interest for radiation therapy treatment planning using machine learning based contouring. The deep-learning based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour RADAC V2 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
The provided text describes the acceptance criteria and the study that proves the device, AutoContour RADAC V2, meets these criteria. Here's a breakdown of the requested information:
1. A table of acceptance criteria and the reported device performance
The acceptance criterion for contouring accuracy is measured by the Mean Dice Similarity Coefficient (DSC), which varies based on the estimated volume of the structure.
Structure Size Category | DSC Acceptance Criteria (Mean) | Reported Device Performance (Mean DSC +/- STD) |
---|---|---|
Large volume structures | > 0.80 | 0.94 +/- 0.03 |
Medium volume structures | > 0.65 | 0.82 +/- 0.09 |
Small volume structures | > 0.50 | 0.61 +/- 0.14 |
The document also provides detailed DSC results for each contoured structure, which all meet or exceed their respective size category's acceptance criteria. For example, for "A_Aorta" (Large), the reported DSC Mean is 0.91, which is >0.80. For "Brainstem" (Medium), the reported DSC Mean is 0.90, which is >0.65. For "OpticChiasm" (Small), the reported DSC Mean is 0.63, which is >0.50.
2. Sample size used for the test set and the data provenance
- CT Test Set:
- Sample Size: An average of 140 test image sets per CT structure model, constituting 20% of the training images. The specific number of test data sets for each CT structure is provided in the table (e.g., A_Aorta: 60, Bladder: 372).
- Data Provenance:
- Country of Origin: Not explicitly stated, but the patient demographics suggest diverse origins, likely within the US, given the prevalence of specific cancers and racial demographics. The acquisition was done using a Philips Big Bore CT simulator.
- Retrospective or Prospective: Not explicitly stated; as is common in such validation studies, the data is most likely retrospective patient data.
- Demographics: 51.7% male, 48.3% female. Age range: 11-30 (0.3%), 31-50 (6.2%), 51-70 (43.3%), 71-100 (50.3%). Race: 84.0% White, 12.8% Black or African American, 3.2% Other.
- Clinical Relevance: Data spanned across common radiation therapy treatment subgroups (Prostate, Breast, Lung, Head and Neck cancers).
- MR Test Set:
- Sample Size: An average of 16 test image sets per MR structure model. Specific numbers are not provided for each MR structure, but the total validation set for sensitivity and specificity was 16 datasets.
- Data Provenance:
- Country of Origin: Massachusetts General Hospital, Boston, MA.
- Retrospective or Prospective: The text states "These training sets consisted primarily of glioblastoma and astrocytoma cases from the Cancer Imaging Archive (TCIA) Glioma data set." and that "The testing dataset was acquired at a different institution using a different scanner and sequence parameters", implying retrospective data collection from existing archives/institutions.
- Demographics: 56% Male and 44% Female patients, with ages ranging from 20-80. No Race or Ethnicity data was provided.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of Experts: Three clinically experienced experts.
- Qualifications: Two radiation therapy physicists and one radiation dosimetrist.
4. Adjudication method for the test set
- Method: Ground truthing of each test dataset was generated manually using consensus (NRG/RTOG) guidelines, as appropriate, by the three clinically experienced experts. This implies a form of expert consensus adjudication.
5. If a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done, and if so, the effect size of human reader improvement with AI vs. without AI assistance
- MRMC Study: No, an MRMC comparative effectiveness study involving human readers with and without AI assistance was not conducted. The performance data focuses on the software's standalone accuracy (Dice Similarity Coefficient, sensitivity, and specificity). The text states: "As with the Predicate Device, no clinical trials were performed for AutoContour RADAC V2."
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Standalone Performance: Yes, the primary performance evaluation provided is for the software's standalone performance, measured by the Dice Similarity Coefficient (DSC), sensitivity, and specificity of the auto-generated contours against expert-established ground truth. The study explicitly states, "Further tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." This is a validation of the algorithm's performance.
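Sensitivity and specificity here are voxel-wise overlap measures computed between the auto-generated contour and the expert ground truth. A minimal illustrative sketch (not the submission's code) of how they could be derived from binary masks:

```python
import numpy as np

def voxel_sensitivity_specificity(auto_mask, truth_mask):
    """Voxel-wise sensitivity and specificity of an auto contour vs. a ground-truth contour."""
    auto = np.asarray(auto_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    tp = np.count_nonzero(auto & truth)    # contoured voxels that are truly structure
    tn = np.count_nonzero(~auto & ~truth)  # background voxels correctly left uncontoured
    fp = np.count_nonzero(auto & ~truth)
    fn = np.count_nonzero(~auto & truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```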
7. The type of ground truth used
- Type of Ground Truth: Expert consensus of manually contoured structures, established using NRG/RTOG (Radiation Therapy Oncology Group) guidelines. This is a form of expert consensus.
8. The sample size for the training set
- CT Training Set: An average of 700 training image sets per CT structure model. The specific number of training data sets for each CT structure is provided in the table (e.g., A_Aorta: 240, Bladder: 1000).
- MR Training Set: An average of 81 training image sets for MR structure models.
9. How the ground truth for the training set was established
- The document implies that the ground truth for the training set was also established manually, similar to the test set, as it states "Datasets used for testing were removed from the training dataset pool before model training began, and used exclusively for testing." It is standard practice for medical imaging AI to train on expertly contoured data. While not explicitly detailed for the training set, the consistency in ground truth methodology for both training and testing in such submissions suggests expert manual contouring based on established guidelines would have been used for training as well.
- Source for MR Training Data: Primarily glioblastoma and astrocytoma cases from The Cancer Imaging Archive (TCIA) Glioma data set.
(175 days)
Radformation, Inc.
ClearCheck is intended to assist radiation therapy professionals in generating and assessing the quality of radiotherapy treatment plans. ClearCheck is also intended to assist radiation treatment planners in predicting when a treatment plan might result in a collision between the treatment machine and the patient or support structures.
The ClearCheck Model RADCC V2 device is software that uses treatment data, image data, and structure set data obtained from supported Treatment Planning System and Application Programming Interfaces to present radiotherapy treatment plans in a user-friendly way for user approval of the treatment plan. The ClearCheck device (Model RADCC V2) is also intended to assist users to identify where collisions between the treatment machine and the patient or support structures may occur in a treatment plan.
It is designed to run on Windows Operating Systems. ClearCheck Model RADCC V2 performs calculations on the incoming supported treatment data. Supported Treatment Planning Systems are used by trained medical professionals to simulate radiation therapy treatments for malignant or benign diseases.
The provided text describes the acceptance criteria and study for the ClearCheck Model RADCC V2 device, which assists radiation therapy professionals in generating and assessing treatment plans, including predicting potential collisions.
Here's a breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria | Reported Device Performance |
---|---|
BED / EQD2 Functionality | |
Passing criteria for dose type constraints | 0.5% difference when compared to hand calculations using well-known BED/EQD2 formulas. |
Passing criteria for Volume type constraints | 3% difference when compared to hand calculations using well-known BED/EQD2 formulas. |
Deformed Dose Functionality | |
Qualitative DVH analysis | Good agreement for all cases compared to known dose deformations. |
Quantitative Dmax and Dmin differences | +/- 3% difference for deformed dose results compared to known dose deformations. |
Overall Verification & Validation Testing | All test cases for BED/EQD2 and Deformed Dose functionalities passed. Overall software verification tests were performed to ensure intended functionality, and pass/fail criteria were used to verify requirements. |
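For context, the "well-known BED / EQD2 formulas" referenced above are the standard linear-quadratic expressions BED = n·d·(1 + d/(α/β)) and EQD2 = BED / (1 + 2/(α/β)). A minimal sketch of such a hand check and a 0.5% relative-difference comparison follows; the α/β ratio and fractionation values are illustrative, and this is not ClearCheck's implementation:

```python
def bed(n_fractions: int, dose_per_fraction_gy: float, alpha_beta_gy: float) -> float:
    """Biologically effective dose: BED = n*d*(1 + d/(alpha/beta))."""
    return n_fractions * dose_per_fraction_gy * (1.0 + dose_per_fraction_gy / alpha_beta_gy)

def eqd2(n_fractions: int, dose_per_fraction_gy: float, alpha_beta_gy: float) -> float:
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2/(alpha/beta))."""
    return bed(n_fractions, dose_per_fraction_gy, alpha_beta_gy) / (1.0 + 2.0 / alpha_beta_gy)

def within_dose_tolerance(device_value: float, hand_value: float, tol: float = 0.005) -> bool:
    """Dose-type constraint check: relative difference no greater than 0.5%."""
    return abs(device_value - hand_value) <= tol * abs(hand_value)

# Illustrative hand calculation: 20 fractions of 2.75 Gy with alpha/beta = 3 Gy.
hand_bed = bed(20, 2.75, 3.0)     # 55 * (1 + 2.75/3) ≈ 105.42 Gy
hand_eqd2 = eqd2(20, 2.75, 3.0)   # ≈ 63.25 Gy
print(round(hand_bed, 2), round(hand_eqd2, 2), within_dose_tolerance(105.0, hand_bed))
```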
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size: The document does not explicitly state a specific numerical sample size for the test set used for the BED/EQD2 and Deformed Dose functionality validation. It mentions "all cases" for Deformed Dose and "a plan and plan sum" for BED/EQD2. This implies testing was done on an unspecified number of representative cases, but not a statistically powered cohort.
- Data Provenance: Not specified in the provided text. It does not mention the country of origin or whether the data was retrospective or prospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- The document does not mention the use of experts to establish ground truth for the test set.
- For BED/EQD2, the ground truth was established by "values calculated by hand using the well-known BED / EQD2 formulas."
- For Deformed Dose, the ground truth was established by "known dose deformations."
4. Adjudication Method for the Test Set
- Not applicable as there is no mention of expert review or adjudication for the test set. Ground truth was established by calculation or "known" deformations.
5. If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs Without AI Assistance
- No, a Multi Reader Multi Case (MRMC) comparative effectiveness study was not performed. The document explicitly states: "no clinical trials were performed for ClearCheck Model RADCC V2." The device is intended to "assist radiation therapy professionals," but its impact on human reader performance was not evaluated through a clinical study.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, performance evaluations for the de novo functionalities (BED/EQD2 and Deformed Dose) appear to be standalone algorithm performance assessments. The device's calculations were compared against established mathematical formulas (BED/EQD2) or known deformations (Deformed Dose) without human intervention in the evaluation process.
7. The Type of Ground Truth Used (expert consensus, pathology, outcomes data, etc.)
- BED/EQD2: Ground truth was based on "values calculated by hand using the well-known BED / EQD2 formulas." This is a computational/mathematical ground truth.
- Deformed Dose: Ground truth was based on "known dose deformations." This implies a physically or computationally derived ground truth where the expected deformation results were already established.
8. The Sample Size for the Training Set
- The document does not specify a sample size for the training set. It primarily focuses on the validation of new features against calculated or known results, rather than reporting on a machine learning model's training data.
9. How the Ground Truth for the Training Set Was Established
- The document does not provide information on how the ground truth for any training set was established. Given the nature of the device (software for calculations and collision prediction, building on predicate devices), it's possible that analytical methods and established physics/dosimetry principles form the basis, rather than a large labeled training dataset in the typical machine learning sense for image interpretation.
(174 days)
Radformation, Inc.
ClearCalc is intended to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The ClearCalc Model RADCA V2 device is software that uses treatment data, image data, and structure set data obtained from supported Treatment Planning System and Application Programming interfaces to perform a dose and/or monitor unit (MU) calculation on the incoming treatment planning parameters. It is designed to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The provided text describes ClearCalc Model RADCA V2 and its substantial equivalence to predicate devices. However, the document does NOT contain a detailed study proving the device meets specific acceptance criteria in the manner of a clinical trial or a performance study with detailed statistical results. Instead, it states that "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements. Validation testing was performed to ensure that the software was behaving as intended, and results from ClearCalc were validated against accepted results for known planning parameters from clinically-utilized treatment planning systems."
Therefore, I can provide the acceptance criteria as stated for the ClearCalc Model RADCA V2's primary dose calculation algorithms and the monte carlo calculations, but comprehensive study details such as sample size, data provenance, expert ground truth, adjudication methods, or separate training/test sets are not available in the provided text.
Here's a summary of the available information:
1. Table of Acceptance Criteria and Reported Device Performance
Calculation Algorithm | Acceptance Criteria | Reported Device Performance |
---|---|---|
FSPB (Photon Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
TG-71 (Electron Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
TG-43 (Brachytherapy Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
Monte Carlo Calculations | Gamma analysis passing rate of >93% with +/-3% relative dose agreement and 3mm Distance To Agreement (DTA) | Passed in all test cases |
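The Monte Carlo check above is scored with a gamma analysis, the standard dose-comparison metric that combines a relative dose difference (here ±3%) with a distance-to-agreement (here 3 mm). A simplified 1D, globally normalized sketch of computing a gamma passing rate follows; it is illustrative only (the dose profiles are synthetic) and is not ClearCalc's algorithm:

```python
import numpy as np

def gamma_pass_rate(ref_dose, eval_dose, positions_mm, dose_tol=0.03, dta_mm=3.0):
    """Simplified 1D gamma analysis with global dose normalization.

    For each reference point, gamma is the minimum over evaluated points of
    sqrt((distance / DTA)^2 + (dose difference / (dose_tol * max reference dose))^2);
    the point passes if gamma <= 1.
    """
    ref = np.asarray(ref_dose, dtype=float)
    ev = np.asarray(eval_dose, dtype=float)
    x = np.asarray(positions_mm, dtype=float)
    dose_crit = dose_tol * ref.max()  # global normalization

    gammas = np.empty_like(ref)
    for i in range(ref.size):
        dist_term = (x - x[i]) / dta_mm
        dose_term = (ev - ref[i]) / dose_crit
        gammas[i] = np.sqrt(dist_term**2 + dose_term**2).min()
    return float((gammas <= 1.0).mean())

# Synthetic example: the evaluated profile is shifted by 1 mm and scaled by 1%.
x = np.linspace(0.0, 100.0, 501)
reference = np.exp(-((x - 50.0) / 20.0) ** 2)
evaluated = 1.01 * np.exp(-((x - 51.0) / 20.0) ** 2)
print(f"gamma passing rate: {gamma_pass_rate(reference, evaluated, x):.1%}")  # vs. the >93% criterion
```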
2. Sample size used for the test set and the data provenance:
- Sample Size: Not explicitly stated. The text mentions "all test cases" without quantifying the number of cases.
- Data Provenance: Not explicitly stated. The text refers to "known planning parameters from clinically-utilized treatment planning systems," suggesting the data would be representative of clinical use but does not specify country of origin or if it's retrospective or prospective.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Not specified.
- Qualifications of Experts: The ground truth was established from "accepted results for known planning parameters from clinically-utilized treatment planning systems." This implies that the ground truth is derived from established clinical practices and systems, which are typically validated by qualified medical physicists and radiation oncologists, but specific expert involvement in this validation is not detailed.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
- Adjudication Method: Not specified. The validation relies on comparison to "accepted results."
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of human reader improvement with AI vs. without AI assistance:
- MRMC Study: No. The device is a "Secondary Check Quality Assurance Software" designed to assist radiation treatment planners by providing independent calculation. It does not involve human readers making diagnoses or interpretations that would be augmented by AI in a MRMC study context.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
- Standalone Performance: Yes, the described verification and validation tests assess the algorithm's performance against "accepted results" from clinical systems, which is a standalone evaluation.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- Ground Truth Type: "Accepted results for known planning parameters from clinically-utilized treatment planning systems." This implies a form of established, clinically validated ground truth based on the outputs of other (predicate/reference) treatment planning systems the device is checking.
8. The sample size for the training set:
- Training Set Sample Size: Not specified. The text focuses on verification and validation testing, not on the training of the underlying algorithms.
9. How the ground truth for the training set was established:
- Training Set Ground Truth: Not specified. As the document focuses on validation rather than algorithm training, this information is not provided.
(263 days)
Radformation, Inc.
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
AutoContour consists of 3 main components:
- An "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT datasets that are to be automatically uploaded to:
- A cloud-based AutoContour automatic contouring service that produces initial contours and
- A web application accessed via web browser which allows the user to perform registration with other image sets as well as review, edit, and export the structure set containing the contours.
The provided text describes the acceptance criteria and study proving the device meets those criteria. Here's a breakdown of the requested information:
1. Table of Acceptance Criteria & Reported Device Performance
The document states that formal acceptance criteria and reported device performance are detailed in "Radformation's AutoContour Complete Test Protocol and Report." However, this specific report is not included in the provided text. The summary only generally states that "Nonclinical tests were performed... which demonstrates that AutoContour performs as intended per its indications for use" and "Verification and validation tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements."
Therefore, a table of acceptance criteria and reported device performance cannot be constructed from the provided text.
2. Sample Size Used for the Test Set and Data Provenance
The document mentions that "tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." However, the sample size for the test set is not explicitly stated.
Regarding data provenance:
- The document implies the data used was medical image data (specifically CT, and for registration purposes, MR and PET).
- The country of origin is not specified.
- The terms "training and validation sets" and "independent datasets" suggest these were retrospective datasets used for model development and evaluation. There is no mention of prospective data collection.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
The document does not provide any information about the number of experts used to establish ground truth for the test set or their qualifications.
4. Adjudication Method for the Test Set
The document does not specify any adjudication method (e.g., 2+1, 3+1, none) used for the test set.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance?
The document explicitly states: "As with the Predicate Devices, no clinical trials were performed for AutoContour." This indicates that an MRMC comparative effectiveness study involving human readers and AI assistance was not conducted. Therefore, no effect size for human reader improvement is reported.
6. If a Standalone (i.e. algorithm only without human-in-the-loop performance) was done
The document mentions "tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." This strongly suggests that standalone performance of the algorithm was evaluated. Although specific metrics for this standalone performance are not detailed in the provided text, the validation of a machine learning model against independent datasets implies a standalone evaluation.
7. The Type of Ground Truth Used
The document mentions that AutoContour is intended to "assist radiation treatment planners in contouring structures within medical images." Given this, the ground truth for the contours would typically be expert consensus or expert-annotated contours. However, the document itself does not explicitly state the type of ground truth used (e.g., expert consensus, pathology, outcomes data).
8. The Sample Size for the Training Set
The document mentions "training and validation sets" but does not provide the sample size for the training set.
9. How the Ground Truth for the Training Set Was Established
The document mentions "training and validation sets" but does not detail how the ground truth for the training set was established. Similar to the test set, it would likely involve expert contouring, but this is not explicitly stated.
(60 days)
Radformation, Inc.
The ChartCheck device is intended for the quality assessment of radiotherapy treatment plans and on treatment chart review.
The ChartCheck device is software that enables trained radiation oncology personnel to perform quality assessments of treatment plans and treatment chart review utilizing plan, treatment, imaging, and documentation data obtained from the ARIA Radiation Therapy Management database.
ChartCheck contains 3 main components:
- An agent service that is configured by the user to monitor their ARIA Radiation Therapy Management database. The agent watches for new treatment plans, treatment records, documentation, and imaging data, and uploads new data to the cloud-based checking service.
- A cloud-based checking service that calculates check states as new records are uploaded from the agent.
- A web application accessed via a web browser which contains several components.
  - a. It allows trained radiation oncology personnel to review treatment records, view the check state calculation results, record comments, and mark the chart checks as approved.
  - b. It allows an administrator to set check state colors, configure default settings, and define check state logic.
Here's an analysis of the provided text regarding the acceptance criteria and supporting study for the ChartCheck device.
Acceptance Criteria and Device Performance for Radformation, Inc.'s ChartCheck (K201119)
The provided documentation, a 510(k) premarket notification, indicates that the ChartCheck device did not undergo a traditional clinical study with established acceptance criteria and performance metrics in the way a diagnostic or therapeutic device might. Instead, the submission relies on demonstrating substantial equivalence to a predicate device, ARIA Radiation Therapy Management (K173838), primarily through software verification and validation.
The "acceptance criteria" in this context refer to the successful completion of verification tests to ensure the software functions as intended and meets its requirements. The "reported device performance" is essentially that the software successfully passed these internal tests and demonstrated functionality comparable to the predicate device.
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria (Proxy) | Reported Device Performance
---|---
Software functionality | Software correctly monitors the ARIA database for new treatment plans, treatment records, documentation, and imaging data; the agent service successfully uploads new data to the cloud-based checking service; the cloud-based checking service accurately calculates check states; the web application correctly displays treatment records and check state calculation results, allows comments, and supports approval marking; administrator functions (setting check state colors, configuring default settings, defining check state logic) work as designed; ChartCheck displays planned and treatment values along with check state indicators and presents control charts.
Substantial equivalence | Indications for use: substantially equivalent to the predicate. Pure software: equivalent. Intended users: equivalent. OTC/Rx: equivalent. Input: equivalent. Functionality: substantially equivalent (utilizes data to calculate pass/fail/override/condition check states, comparable to the predicate's pass/fail/override check states). Output: substantially equivalent.
Safety and effectiveness | Verification and validation testing demonstrated the device is safe and effective; a hazard analysis was performed; the indications for use (a subset of the predicate's) raise no new questions of safety and effectiveness.
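The submission does not detail the check-state logic itself. Purely as a hypothetical illustration of the kind of rule-based comparison such chart-check software performs (the rule, field names, and tolerances below are invented for the example), a check might map a planned-vs-treated deviation to pass/condition/fail states:

```python
from enum import Enum

class CheckState(Enum):
    PASS = "pass"
    CONDITION = "condition"
    FAIL = "fail"

def dose_per_fraction_check(planned_cgy: float, treated_cgy: float,
                            condition_pct: float = 1.0, fail_pct: float = 2.0) -> CheckState:
    """Hypothetical rule: grade the deviation of delivered dose per fraction from the plan."""
    deviation_pct = abs(treated_cgy - planned_cgy) / planned_cgy * 100.0
    if deviation_pct <= condition_pct:
        return CheckState.PASS
    if deviation_pct <= fail_pct:
        return CheckState.CONDITION
    return CheckState.FAIL

print(dose_per_fraction_check(200.0, 201.5))  # CheckState.PASS
print(dose_per_fraction_check(200.0, 206.0))  # CheckState.FAIL
```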
2. Sample Size Used for the Test Set and Data Provenance
The document explicitly states: "As with the Predicate Device, no clinical trials were performed for ChartCheck. Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements."
Therefore:
- Sample Size for Test Set: Not specified in terms of patient data or clinical cases. The testing was focused on software verification/validation, likely involving simulated data, test cases, and functional scenarios rather than a clinical dataset.
- Data Provenance: Not applicable in the context of a clinical test set. The data used for verification would be internally generated or synthetic data used to exercise the software's functions.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
Given that clinical trials were not performed and the focus was on software verification, the concept of "ground truth" derived from expert consensus on medical images or diagnoses isn't directly applicable here. The "ground truth" for the software's functional tests would be the expected output or behavior for a given input, as defined by software requirements and design specifications, and assessed by qualified software and quality assurance personnel.
4. Adjudication Method for the Test Set
Not applicable. There was no clinical test set requiring adjudication of findings by medical experts. The verification process would likely involve pass/fail criteria for individual software tests.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. The submission clearly states no clinical trials were performed.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
The performance described is largely standalone in terms of the algorithm's calculation of "check states." However, the device's purpose is to assist trained radiation oncology personnel in performing quality assessments. The "web application" component explicitly describes human interaction for review, comments, and approval. Therefore, while the core "checking service" operates algorithmically, the overall device is designed for a human-in-the-loop workflow. A standalone study demonstrating the algorithm's performance without any human interaction was not detailed as a separate component of the submitted performance data.
7. The Type of Ground Truth Used
The ground truth used for verifying the software's functionality would be defined software requirements and expected outputs for specific test cases. For instance, if the software is designed to flag a plan where a certain dose constraint is violated, the ground truth would be that specific violation. This is a functional truth rather than a clinical truth established by medical experts or pathology.
8. The Sample Size for the Training Set
The document does not mention a "training set" in the context of machine learning or AI models. The ChartCheck device appears to be a rule-based or logic-based software rather than an AI/ML model that would require extensive data for training. If there are configurable rules, these are likely defined by administrators rather than learned from a dataset.
9. How the Ground Truth for the Training Set Was Established
Not applicable, as no training set for an AI/ML model is mentioned.
(104 days)
Radformation, Inc.
ClearCalc is intended to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The ClearCalc device (model RADCA) is software intended to assist users in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation. The treatment plans are obtained from supported Treatment Planning System and Application Programming interfaces. It is designed to run on the Windows Operating System. ClearCalc performs calculations on the treatment plan data obtained from supported Treatment Planning System and Application Programming interfaces. A Treatment Planning System is software used by trained medical professionals to install and simulate radiation therapy treatments for malignant or benign diseases.
The provided document describes the Radformation ClearCalc device (K193640), a software intended to assist radiation treatment planners in determining the accuracy of their treatment planning calculations.
Here's an analysis of the acceptance criteria and the study information:
1. Table of Acceptance Criteria and Reported Device Performance
The document does not explicitly state quantitative acceptance criteria or a direct comparison of ClearCalc's performance against such criteria in a tabular format. Instead, it focuses on demonstrating substantial equivalence to a predicate device, RadCalc (K090531), through a comparison of technological characteristics and functionalities.
The "Performance Data" section (5.7) states:
"As with the Predicate Device, no clinical trials were performed for ClearCalc. Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements."
This indicates that internal verification tests were conducted, likely against pre-defined pass/fail criteria for the software's functionality and accuracy in calculating Monitor Units (MU) and dose. However, the specific metrics, thresholds, and numerical results of these tests are not provided in this document.
The "Substantial Equivalence ClearCalc vs. RadCalc" table (Table 3) highlights functional similarities and minor differences, which implicitly form the basis of the performance evaluation for regulatory submission.
Parameter | Acceptance Criteria (Implicit from Predicate Comparison) | Reported Device Performance (ClearCalc)
---|---|---
Photon MU and Dose Calculation | Utilize an independent calculation algorithm to recalculate MU/dose on a per-field basis; provide accurate MU/dose calculations for external beam radiation therapy; account for patient geometry and heterogeneity corrections. | Utilizes a Finite-Size Pencil Beam (FSPB) algorithm to calculate MU/dose on a per-field basis; utilizes the full 3D geometry of the patient for heterogeneity corrections and for simulating scatter conditions; calculates dose from fields in a plan, displays per-field MU results, and provides a difference metric for evaluation in tabular format; allows evaluation of global point doses as well as per-field doses. (No specific accuracy numbers or pass/fail thresholds are provided in this document; performance is implied to be sufficient based on "works as intended" and comparison to the predicate.)
Electron MU and Dose Calculation | Utilize an independent calculation algorithm to recalculate MU/dose on a per-field basis; provide accurate MU/dose calculations for electron external beam radiation therapy; account for custom cutouts and compute cutout factors. | Utilizes a library of custom cutouts and computes cutout factors using a sector integration method; calculates dose based on electron field parameters and cutout geometry, displays per-field MU and dose results, and provides a difference metric for evaluation in tabular format. (No specific accuracy numbers or pass/fail thresholds are provided; implied to be sufficient.)
Brachytherapy Dose Calculation | Utilize an independent calculation algorithm to recalculate dose from radioactive sources; adhere to a recognized protocol (e.g., AAPM TG-43) for brachytherapy dose calculations; provide dose calculations to arbitrary points and difference metrics. | Utilizes the AAPM TG-43 protocol for its brachytherapy dose calculations; calculates dose to arbitrary calculation point locations and presents difference metrics comparing the TPS dose vs. the ClearCalc dose in tabular format. (No specific accuracy numbers or pass/fail thresholds are provided; implied to be sufficient.)
Overall Functionality | Software works as intended without errors; provides reliable independent verification of treatment planning calculations. | "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements" (no specific details on test outcomes or criteria are provided); "Verification and Validation testing and Hazard Analysis demonstrate that ClearCalc is as safe and effective as the Predicate Device." (Implied successful performance under internal testing.)
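AAPM TG-43 is a published brachytherapy dose-calculation formalism; in its point-source approximation the dose rate at distance r is Ḋ(r) = S_K · Λ · (r₀/r)² · g(r) · φ_an(r). The sketch below illustrates that structure with invented coefficient values; it is not ClearCalc's implementation, and the numbers are not from a published source model:

```python
import numpy as np

def tg43_point_source_dose_rate(r_cm, air_kerma_strength_u, dose_rate_constant,
                                radial_dose_fn, anisotropy_fn, r0_cm=1.0):
    """Simplified TG-43 point-source dose rate (cGy/h):
    S_K * Lambda * (r0/r)^2 * g(r) * phi_an(r)."""
    r = np.asarray(r_cm, dtype=float)
    geometry_ratio = (r0_cm / r) ** 2  # point-source geometry function ratio G(r)/G(r0)
    return (air_kerma_strength_u * dose_rate_constant * geometry_ratio
            * radial_dose_fn(r) * anisotropy_fn(r))

# Invented, illustrative source data (not a published TG-43 consensus dataset).
g = lambda r: np.interp(r, [0.5, 1.0, 2.0, 5.0], [1.04, 1.00, 0.93, 0.70])  # radial dose function
phi_an = lambda r: np.full_like(r, 0.98)                                     # 1D anisotropy function

dose_rates = tg43_point_source_dose_rate([1.0, 2.0, 5.0],
                                          air_kerma_strength_u=40.0,
                                          dose_rate_constant=1.11,
                                          radial_dose_fn=g,
                                          anisotropy_fn=phi_an)
print(dose_rates)  # dose rates (cGy/h) at 1, 2, and 5 cm
```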
2. Sample Size Used for the Test Set and Data Provenance
The document does not specify the sample size used for the test set or the data provenance (e.g., country of origin, retrospective/prospective). It only mentions "Verification tests were performed to ensure that the software works as intended".
3. Number of Experts Used to Establish Ground Truth and Qualifications
The document does not mention using experts to establish ground truth for a test set. The performance evaluation seems to be based on internal verification against expected computational outcomes, rather than expert-derived ground truth from patient cases.
4. Adjudication Method
The document does not mention any adjudication method, as no expert review or consensus process for ground truth establishment is described.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
An MRMC comparative effectiveness study was not performed or described in the document. The device is a "standalone" software tool for verifying treatment plans, and its primary function is independent calculation, not assisting human readers in interpreting images or making diagnostic/treatment decisions that would typically be evaluated with MRMC studies.
6. Standalone Performance Study
A standalone performance study of the algorithm (i.e., algorithm only without human-in-the-loop performance) was performed implicitly through "Verification tests." The document states, "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements." However, concrete details, metrics, and quantitative results of this standalone performance are not provided. The entire submission focuses on demonstrating substantial equivalence, implying that its standalone performance would be comparable to the predicate device's independently calculated MU/dose values.
7. Type of Ground Truth Used
The type of ground truth used appears to be calculated ground truth or computationally derived truth. The ClearCalc software performs calculations based on established physics principles (Finite-Size Pencil Beam algorithm, AAPM TG-43 protocol) to verify the output of a primary Treatment Planning System (TPS). The "ground truth" for the verification tests would logically be the expected accurate MU/dose values as determined by these internal algorithms and validated against known calculation methodologies, rather than pathology, outcomes data, or expert consensus from clinical cases.
8. Sample Size for the Training Set
The document does not mention a training set sample size. This suggests that the ClearCalc software likely relies on deterministic algorithms (e.g., Finite-Size Pencil Beam, AAPM TG-43) which are coded based on physical models and parameters, rather than being "trained" on a dataset in the manner of machine learning algorithms.
9. How the Ground Truth for the Training Set Was Established
Since a training set is not mentioned, the method for establishing its ground truth is not applicable/not provided. The device's functionality is based on direct implementation of physics models and calculations.
(206 days)
Radformation, Inc.
EZFluence is intended to assist radiation treatment planning professionals in generating optimal fluences for producing a homogeneous dose distribution in external beam radiation therapy treatment plans consisting of photon treatment fields.
The EZFluence device (model RADEZ) is software intended to assist radiation treatment planning professionals in generating optimal fluences for producing a homogeneous dose distribution in external beam radiation therapy treatment plans consisting of photon treatment fields. Inputs are obtained from plan and patient data obtained from the Eclipse Treatment Planning System (also referred to as Eclipse TPS) of Varian Medical Systems. EZFluence runs as a dynamic link library (DLL) plugin to Varian Eclipse. It is designed to run on the Windows Operating System. EZFluence performs calculations on the plan obtained from Eclipse TPS (Version 13.5 (K141283) and Version 13.7 (K152393)), which is a software device used by trained medical professionals to design and simulate radiation therapy treatment plans for malignant or benign diseases.
The document provided discusses the EZFluence device, a software intended to assist radiation treatment planning professionals. However, it does not contain specific details about acceptance criteria, reported device performance figures (like sensitivity, specificity, or accuracy), sample sizes for test or training sets, data provenance, the number or qualifications of experts, ground truth establishment, or any MRMC studies.
The document primarily focuses on establishing substantial equivalence to a predicate device (Eclipse Treatment Planning System) based on similar indications for use and technological characteristics. It mentions that "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements," but it does not elaborate on these criteria or the results.
Therefore, based solely on the provided text, I cannot generate a table of acceptance criteria and reported performance, nor can I answer many of the specific questions about the study design and results, as this information is not present in the excerpt.
Here's what I can extract and what's missing:
Inability to Fulfill Request Due to Lack of Information
The provided document (K171352) is a 510(k) summary for the EZFluence device. While it describes the device's intended use and compares it to a predicate device, it does not contain the detailed performance data, acceptance criteria, sample sizes, expert qualifications, or ground truth methodology that would be required to answer the questions in the prompt.
The document explicitly states under "5.7 Performance Data": "As with the Predicate Device, no clinical trials were performed for EZFluence. Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements." However, it does not provide the specific "pass/fail criteria" or the results of these "verification tests."
Therefore, I cannot construct the requested table or provide answers to most of the specific questions.
What can be inferred or directly stated from the document:
- Device Name: EZFluence
- Intended Use: To assist radiation treatment planning professionals in generating optimal fluences for producing a homogeneous dose distribution in external beam radiation therapy treatment plans consisting of photon treatment fields. (Section 5.5)
- Regulatory Class: Class II (Section 5.2)
- Predicate Device: Eclipse Treatment Planning System (K152393) (Section 5.3)
- Study Type: Verification tests were performed; no clinical trials were performed. (Section 5.7)
- Performance Metrics Reported: None explicitly stated (e.g., no sensitivity, specificity, accuracy, or any quantitative metric of "optimal fluence" or "homogeneous dose distribution" quality).
Missing Information (Cannot be answered from the provided text):
- A table of acceptance criteria and the reported device performance: This information is not present. The document mentions "pass/fail criteria were used to verify requirements" but does not detail them or the results.
- Sample sizes used for the test set and the data provenance: Not mentioned.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts: Not mentioned.
- Adjudication method for the test set: Not mentioned.
- Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of human reader improvement with versus without AI assistance: The document explicitly states "no clinical trials were performed for EZFluence," suggesting no such MRMC study was conducted. The device is also described as assisting professionals, not as an AI-driven diagnostic tool for human readers in the typical MRMC context.
- Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done: Not explicitly detailed. The "verification tests" mentioned are likely standalone software tests, but no performance metrics are given.
- The type of ground truth used (expert consensus, pathology, outcomes data, etc.): Not mentioned.
- The sample size for the training set: Not mentioned.
- How the ground truth for the training set was established: Not mentioned.
Conclusion based on provided text: The document serves as a regulatory submission demonstrating substantial equivalence rather than a detailed scientific study report detailing performance metrics and validation methodologies.
(204 days)
Radformation, Inc.
CollisionCheck is intended to assist radiation treatment planners in predicting when a treatment plan might result in a collision between the treatment machine and the patient or support structures.
The CollisionCheck device (model RADCO) is software intended to assist users in identifying where collisions between the treatment machine and the patient or support structures may occur in a treatment plan. The treatment plans are obtained from the Eclipse Treatment Planning System (also referred to as Eclipse TPS) of Varian Medical Systems. CollisionCheck runs as a dynamic link library (DLL) plugin to Varian Eclipse. It is designed to run on the Windows Operating System. CollisionCheck performs calculations on the plan obtained from Eclipse TPS (Version 12 (K131891), Version 13.5 (K141283), and Version 13.7 (K152393)), which is software used by trained medical professionals to design and simulate radiation therapy treatments for malignant or benign diseases.
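As context for what such a geometric simulation entails, here is a minimal clearance-check sketch in which the gantry head is approximated as a disc sweeping on a fixed radius about isocenter and the patient/couch geometry is reduced to surface points. The radii, clearance margin, and coordinate convention are illustrative assumptions rather than CollisionCheck's actual machine model.

```python
import numpy as np

def gantry_clearance(surface_points_mm, gantry_angles_deg,
                     gantry_radius_mm=400.0, head_half_width_mm=100.0,
                     clearance_mm=10.0):
    """Return the gantry angles at which any surface point (patient, couch,
    or accessory), given in mm relative to isocenter, intrudes into the space
    occupied by the gantry head.  The head is approximated as a disc of
    radius head_half_width_mm whose face sits gantry_radius_mm from
    isocenter; angle 0 places the head directly above isocenter (+y) and the
    gantry rotates about the z (couch) axis."""
    pts = np.asarray(surface_points_mm, dtype=float)
    colliding_angles = []
    for angle in gantry_angles_deg:
        a = np.radians(angle)
        head_dir = np.array([np.sin(a), np.cos(a), 0.0])  # isocenter -> head
        head_center = gantry_radius_mm * head_dir
        rel = pts - head_center
        along = rel @ head_dir                             # distance past the head face
        radial = np.linalg.norm(rel - np.outer(along, head_dir), axis=1)
        intrudes = (along > -clearance_mm) & (radial < head_half_width_mm + clearance_mm)
        if np.any(intrudes):
            colliding_angles.append(angle)
    return colliding_angles

# Toy geometry: an elliptical patient cross-section plus an immobilization
# bar deliberately placed 420 mm laterally from isocenter to force a collision.
theta = np.linspace(0, 2 * np.pi, 72, endpoint=False)
patient = np.c_[200 * np.cos(theta), 120 * np.sin(theta), np.zeros_like(theta)]
bar = np.array([[420.0, 0.0, 0.0]])
print("Collisions at gantry angles:",
      gantry_clearance(np.vstack([patient, bar]), range(0, 360, 10)))
```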
The provided text describes the regulatory clearance of CollisionCheck (K171350) and compares it to a predicate device, Mobius3D (K153014). However, it does not contain specific details about acceptance criteria, the study design (e.g., sample size, data provenance, ground truth establishment, expert qualifications, or adjudication methods), or MRMC study results. The document states that "no clinical trials were performed for CollisionCheck" and mentions "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements." This implies that the performance demonstration was likely limited to software verification and validation, rather than a clinical performance study with human-in-the-loop or standalone AI performance metrics.
Therefore, many of the requested details cannot be extracted from the provided text. I will provide what can be inferred or stated as absent based on the document.
Acceptance Criteria and Device Performance
The document does not explicitly list quantitative acceptance criteria with corresponding performance metrics like sensitivity, specificity, or F1-score for the CollisionCheck device. Instead, the performance demonstration focuses on software verification and validation to ensure the device works as intended and is as safe and effective as the predicate device.
Table of Acceptance Criteria and Reported Device Performance (Inferred/Based on Document Context):
Acceptance Criterion (Inferred from regulatory context and V&V) | Reported Device Performance (Inferred/Based on V&V Statement) |
---|---|
Functionality: Accurately simulate treatment plan and predict gantry collisions with patient or support structures. | Verification tests confirmed the software works as intended, indicating successful simulation and collision prediction. (Pass) |
Safety: Device operation does not introduce new safety concerns compared to predicate. | Hazard Analysis demonstrated the device is as safe as the Predicate Device. (Pass) |
Effectiveness: Device effectively assists radiation treatment planners in identifying potential collisions. | Verification tests confirmed the software works as intended, indicating effective assistance in collision identification. (Pass) |
Algorithm Accuracy (Collision Prediction): Implicitly, the algorithm should correctly identify collision events when they occur and not falsely identify them when they do not. | No specific accuracy metrics (e.g., sensitivity, specificity, precision/recall) are reported. Performance is based on successful completion of verification tests. |
Comparison to Predicate: Substantially equivalent to Mobius3D's collision check feature regarding safety and effectiveness. | Minor technological differences do not raise new questions on safety and effectiveness. Deemed substantially equivalent. (Pass) |
Study Details:
Given the statement "no clinical trials were performed for CollisionCheck," and the focus on "Verification tests," most of the questions regarding a typical AI performance study (like those involving test sets, ground truth experts, MRMC studies) cannot be answered with specific data from this document. The performance demonstration appears to have been solely based on internal software verification and validation activities.
- Sample sizes used for the test set and data provenance:
- Test Set Sample Size: Not specified. The document only mentions "verification tests" and "pass/fail criteria."
- Data Provenance: Not specified. It's likely synthetic or internal clinical data used for software testing, rather than a distinct, prospectively collected, or retrospectively curated clinical test set for performance evaluation in a regulatory sense.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Not applicable/Not specified. Given that "no clinical trials were performed," it's highly improbable that a formal expert-adjudicated ground truth was established for a test set in the context of an AI performance study. Ground truth in this context would likely be defined by the physics-based simulation of collisions within the software's design.
- Adjudication method (e.g., 2+1, 3+1, none) for the test set:
- Not applicable/Not specified. No adjudication method is mentioned, consistent with the absence of a clinical performance study involving human readers.
- Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of human reader improvement with versus without AI assistance:
- No, an MRMC comparative effectiveness study was not done. The document explicitly states, "no clinical trials were performed." Therefore, no effect size of human reader improvement with AI assistance is reported.
- Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done:
- While the "verification tests" would evaluate the algorithm's standalone functionality, the document does not provide the specific performance metrics (e.g., sensitivity, specificity) that would typically be reported for a standalone AI evaluation. The device assists a human user, so its "standalone" performance wouldn't be in isolation but rather its ability to correctly identify collisions as defined by its internal models.
- The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- The document implies a physics-based or computational ground truth. The device performs calculations and simulations. The "ground truth" for its verification and validation would be whether its simulation correctly identifies collisions based on defined geometric and physical parameters. It's not based on expert consensus, pathology, or outcomes data, as it's a planning assistance tool, not a diagnostic one.
- The sample size for the training set:
- Not applicable/Not specified. The document describes CollisionCheck as software that performs calculations and simulations (modeling the linac as a cylinder, supporting applicators, etc.). It is not described as an AI or machine learning model that requires a "training set" in the conventional sense of supervised learning on a large dataset. Its functionality is likely rule-based or physics-informed, rather than learned from data.
- How the ground truth for the training set was established:
- Not applicable/Not specified. Since it's not described as an ML model with a training set, the concept of establishing ground truth for a training set does not apply here. The "ground truth" for its development would be the accurate mathematical and physical modeling of collision scenarios.