510(k) Data Aggregation
(113 days)
(30 days)
AutoContour is intended to assist radiation treatment planners in contouring and reviewing structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour Model RADAC V4, the AutoContour Model RADAC V5 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine learning based contouring. The deep-learning-based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen, and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour Model RADAC V5 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System, allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
Here's a structured summary of the acceptance criteria and study details for the AutoContour Model RADAC V5, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Device Performance Study for AutoContour Model RADAC V5
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria for each structure model varied based on its size (Large, Medium, Small) and whether it was a new model, an updated model, or an unchanged existing model. The performance was primarily evaluated through Dice Similarity Coefficient (DSC) and Likert Qualitative Review for new/updated models, and DSC and Hausdorff Distance for existing models.
| Metric Type | Acceptance Criteria (Large, Medium, Small Structures) | Reported CT Training Data Performance (Mean DSC ± Std Dev) | Reported MR Training Data Performance (Mean DSC ± Std Dev) | Reported CT External Reviewer Performance (Mean DSC) | Reported MR External Reviewer Performance (Mean DSC) | Reported External Reviewer Qualitative Performance (Average Rating) |
|---|---|---|---|---|---|---|
| DSC Evaluation (Training/External Dataset) | Large: ≥ 0.80; Medium: ≥ 0.65; Small: ≥ 0.50 | Large: 0.91 ± 0.14; Medium: 0.86 ± 0.13; Small: 0.75 ± 0.20 | Medium: 0.82 ± 0.12; Small: 0.72 ± 0.09 | Large: 0.94 (A_Aorta); Medium: 0.91 (A_Aorta_Asc); Small: 0.78 (A_Celiac) | Medium: 0.93 (Brainstem); Small: 0.81 (NVB_L) | N/A |
| Likert Qualitative Review (Internal/External) | Average grade ≥ 3 across all external image sets | N/A | N/A | N/A | N/A | 4.3 (across all MR models); 4.8 (e.g., A_Aorta); min. 3.9 (HDR_Bowel, the single structure failing DSC) |
| Existing Structure Model DSC Comparison | Large: > 0.99; Medium: > 0.98; Small: > 0.95 | (Compares new version to previous, not absolute values) | (Compares new version to previous, not absolute values) | N/A | N/A | N/A |
| Existing Structure Model Hausdorff Distance | ≤ 3 mm | (Compares new version to previous, not absolute values) | (Compares new version to previous, not absolute values) | N/A | N/A | N/A |
Note: The document provides specific DSC values for many individual structures. The table above shows aggregated or illustrative examples from the tables provided.
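The two geometric metrics named above, the Dice Similarity Coefficient (DSC) and the Hausdorff distance, are standard overlap measures for contours. As an illustrative sketch only (not Radformation's code), they can be computed on toy sets of voxel coordinates like this:

```python
# Illustrative sketch of the two overlap metrics named in the summary:
# Dice Similarity Coefficient (DSC) and Hausdorff distance, computed on
# toy 2D coordinate sets standing in for 3D voxel masks.
from math import dist

def dice(a: set, b: set) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff(a: set, b: set) -> float:
    """Symmetric Hausdorff distance: the largest distance from a point
    in either set to its nearest neighbour in the other set.
    (Toy version; assumes both sets are non-empty.)"""
    d_ab = max(min(dist(p, q) for q in b) for p in a)
    d_ba = max(min(dist(p, q) for q in a) for p in b)
    return max(d_ab, d_ba)

# Hypothetical auto-contour vs. ground-truth masks.
auto = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(0, 0), (0, 1), (1, 0), (2, 0)}

print(round(dice(auto, truth), 3))   # → 0.75 (overlap of 3 voxels out of 4+4)
print(hausdorff(auto, truth))        # → 1.0
```

A large-volume structure would pass the ≥ 0.80 DSC criterion only if this score, averaged over the test image sets, reached the threshold; the Hausdorff criterion (≤ 3 mm) additionally bounds the worst-case surface deviation, which DSC alone does not capture.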
2. Sample Size for Test Set and Data Provenance
- CT Test Sets: An average of 49 testing image sets per CT structure model (approximately 10% of training data). Specific examples include:
- A_Aorta_Asc (Update): 60 testing sets
- A_Carotid_L/R (Update): 83 testing sets
- A_Celiac: 44 testing sets
- MR Test Sets:
- Brain models: an average of 58 testing image sets per model (e.g., Amygdala_L/R: 133, CorpusCallosum: 15)
- Pelvis models: an average of 50 testing image sets per model (e.g., Rectal_Spacer: 26)
- External Clinical Test Sets:
- CT: 20 (A_Aorta), 37 (A_Carotid_L/R), 24 (A_Celiac), etc.
- MR: 20 (Amygdala_L), 45 (Bladder_Trigone), 7 (HDR_Bowel), etc.
- Data Provenance (Training and Testing): Data was gathered from several institutions in several different countries (not specifically enumerated but mentioned for CT and MR). Specific external clinical datasets for CT included TCIA - Pelvic-Ref, TCIA - Head-Neck-PET-CT, TCIA - Pancreas-CT-CB, TCIA - NSCLC data. MR external datasets included "MR - Renown," "Gold Atlas Pelvis," "SynthRad," "MRLinac Pelvis," "Female HDR MR Pelvis," and "MR Pelvis Barrigel," some of which were open-source or shared by clinical partners/institutions in Canada, Spain, Australia, and the United States. The images used for testing were sequestered from the original training and validation data population and removed from the training dataset pool before model training began.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Six (6) clinically experienced experts.
- Qualifications: 2 radiation therapy physicists, 1 radiation dosimetrist, and 3 radiation therapists with specialized training in radiation therapy contouring.
4. Adjudication Method for the Test Set
The ground truthing of each test dataset was generated manually using consensus (NRG/RTOG/ESTRO) guidelines as appropriate. While a specific (e.g., 2+1, 3+1) adjudication method for individual cases or disagreements is not explicitly stated, the use of "consensus" guidelines by multiple experts implies a form of adjudicated agreement for final ground truth.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- The document does not explicitly describe a conventional MRMC comparative effectiveness study comparing human readers with AI assistance versus without AI assistance.
- Instead, it measures the AI's standalone performance against expert-generated ground truth and uses a qualitative review by external experts (average rating 1-5 where >3 means beneficial, 5 means no edits needed) to assess the clinical appropriateness and required modifications for the AI-generated contours. This qualitative review serves as an indirect assessment of human interaction with AI output, but not a formal MRMC study as typically defined for reader performance improvement with assistance.
6. Standalone Performance (Algorithm Only without Human-in-the-loop)
- Yes, standalone performance was done. The primary performance metrics (Dice Similarity Coefficient - DSC and Hausdorff Distance) directly evaluate the algorithm's output against the expert-generated ground truth without human intervention in the contour generation process. The "Training DSC Evaluation" and "External Dataset DSC Evaluation" explicitly refer to the model's direct output.
- The qualitative review by external experts, while involving human assessment, is done after the algorithm has generated its standalone contours, effectively evaluating the standalone output's clinical utility.
7. Type of Ground Truth Used
- Expert Consensus: Ground truth for both training and test sets was established manually by six clinically experienced experts following consensus guidelines (NRG/RTOG/ESTRO).
8. Sample Size for the Training Set
- CT Training Sets: An average of 459 training image sets per CT structure model. Specific examples:
- A_Aorta_Asc (Update): 240
- A_Carotid_L/R (Update): 328
- A_Celiac: 435
- MR Training Sets:
- Brain models: An average of 259 training image sets.
- Pelvis models: An average of 243 training image sets.
- Specific examples: Amygdala_L/R: 493, CorpusCallosum: 56, Rectal_Spacer: 233.
9. How Ground Truth for Training Set was Established
- The ground truth for the training set was established manually by the same group of six clinically experienced experts (2 radiation therapy physicists, 1 radiation dosimetrist, and 3 radiation therapists with specialized training in radiation therapy contouring) using consensus guidelines (NRG/RTOG/ESTRO).
(109 days)
ChartCheck is intended to assist with the quality assessment of radiotherapy treatment plans and on treatment review.
The ChartCheck device is software that enables trained radiation oncology personnel to perform quality assessments of treatment plans and treatment chart reviews utilizing plan, treatment, imaging, as well as documentation data obtained from an Oncology Information System database(s).
ChartCheck contains 3 main components:
a. An agent service that is configured by the user to monitor an Oncology Information System (OIS) database. The agent watches for new treatment plans, treatment records, documentation, and images. The agent uploads data to a checking service.
b. A checking service that compares the treatment records to the treatment plan and calculates check states as new records are uploaded from the agent. The checking service processes on-treatment imaging data and interfaces with outside software platforms for dose calculation activities.
c. A web application accessed via a web browser that contains several components.
i. Chart checking mode, which allows a medical physicist to review treatment records and check state results, record chart check comments, and mark the chart check as approved.
ii. An image viewer that allows a medical physicist to review on-treatment imaging, on-treatment dose calculation results, and perform deformable registration editing.
iii. Settings mode, which allows an administrator to set check state colors, configure settings, define check state templates, set up check alerts, documentation generation, and billing settings.
N/A
(101 days)
ClearCalc is intended to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The ClearCalc Model RADCA V2.6 device is software that uses treatment data, image data, and structure set data obtained from a supported Treatment Planning System and Application Programming interfaces to perform a dose and/or monitor unit (MU) calculation on the incoming treatment planning parameters. It is designed to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
N/A
(90 days)
AutoContour is intended to assist radiation treatment planners in contouring and reviewing structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour Model RADAC V3, the AutoContour Model RADAC V4 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine-learning-based contouring. The deep-learning-based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen, and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour Model RADAC V4 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System, allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
Here's an analysis of the acceptance criteria and study findings for the Radformation AutoContour (Model RADAC V4) device, based on the provided text:
1. Acceptance Criteria and Reported Device Performance
The primary acceptance criterion for the automated contouring models is the Dice Similarity Coefficient (DSC), which measures the spatial overlap between the AI-generated contour and the ground truth contour. The criteria vary based on the estimated size of the anatomical structure. Additionally, for external clinical testing, an external reviewer rating was used to assess clinical appropriateness.
| Acceptance Criteria Category | Metric (for AI performance) | Performance Criteria (for AI performance) | Reported Device Performance (Mean ± Std Dev) for CT Models | Reported Device Performance (Mean ± Std Dev) for MR Models | Reported Device Performance (Mean External Reviewer Rating 1-5, higher is better) |
|---|---|---|---|---|---|
| Contouring Accuracy (CT Models) | Mean Dice Similarity Coefficient (DSC) | Large Volume Structures: ≥ 0.8 | 0.92 ± 0.06 | 0.96 ± 0.03 | N/A |
| | | Medium Volume Structures: ≥ 0.65 | 0.85 ± 0.09 | 0.84 ± 0.07 | N/A |
| | | Small Volume Structures: ≥ 0.5 | 0.81 ± 0.12 | 0.74 ± 0.09 | N/A |
| Clinical Appropriateness (CT Models) | External Reviewer Rating (1-5 scale) | Average Score ≥ 3 | N/A | N/A | 4.57 (across all CT models) |
| Contouring Accuracy (MR Models) | Mean Dice Similarity Coefficient (DSC) | Large Volume Structures: ≥ 0.8 | N/A | 0.96 ± 0.03 (training data); 0.80 ± 0.09 (external data) | N/A |
| | | Medium Volume Structures: ≥ 0.65 | N/A | 0.84 ± 0.07 (training data); 0.84 ± 0.09 (external data) | N/A |
| | | Small Volume Structures: ≥ 0.5 | N/A | 0.74 ± 0.09 (training data); 0.61 ± 0.14 (external data) | N/A |
| Clinical Appropriateness (MR Models) | External Reviewer Rating (1-5 scale) | Average Score ≥ 3 | N/A | N/A | 4.6 (across all MR models) |
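The size-tiered acceptance criteria quoted above (Large ≥ 0.8, Medium ≥ 0.65, Small ≥ 0.5) amount to a simple threshold lookup per structure category. The thresholds come from the summary; the helper function itself is a hypothetical sketch:

```python
# Hypothetical helper illustrating the size-tiered DSC acceptance
# criteria quoted in the table; thresholds are from the 510(k) summary,
# the function and its name are illustrative only.
DSC_THRESHOLDS = {"large": 0.80, "medium": 0.65, "small": 0.50}

def passes_dsc(size_category: str, mean_dsc: float) -> bool:
    """True if a structure model's mean DSC meets its size-tier criterion."""
    return mean_dsc >= DSC_THRESHOLDS[size_category]

print(passes_dsc("large", 0.92))    # CT large-volume mean from the table
print(passes_dsc("small", 0.61))    # MR small-volume external mean
print(passes_dsc("medium", 0.60))   # would fail the ≥ 0.65 criterion
```

Note that the criterion applies to the mean DSC across a model's test image sets, so an individual case may fall below the threshold while the model still passes.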
2. Sample Size Used for the Test Set and Data Provenance
- CT Models Test Set:
  - Sample Size: For individual CT structure models, the number of testing sets ranged from 10 to 116 for the internal validation (Table 4) and from 13 to 82 for the external clinical testing (Table 6). The document states "approximately 10% of the number of training image sets" were used for testing in the internal validation, with an average of 54 testing image sets per CT structure model.
  - Data Provenance: Imaging data for training was gathered from 4 institutions in 2 countries (United States and Switzerland). External clinical testing data for CT models was sourced from various TCIA (The Cancer Imaging Archive) datasets (Pelvic-Ref, Head-Neck-PET-CT, Pancreas-CT-CB, NSCLC, LCTSC, QIN-BREAST) and shared from several unidentified institutions in the United States. The data was retrospective, having been acquired before use in model validation.
- MR Models Test Set:
  - Sample Size: For individual MR structure models, the number of testing sets was 45 for internal validation (Table 8) and ranged from 5 to 45 for external clinical testing (Table 10). The document states an average of 45 testing image sets per MR Brain model and 77 testing image sets per MR Pelvis model were used for internal validation.
  - Data Provenance: Imaging data for training and internal testing was acquired from the Cancer Imaging Archive GLIS-RT dataset (for Brain models) and from two open-source datasets plus one institution in the United States (for Pelvis models). External clinical testing data for MR models came from a clinical partner (for Brain models), publicly available datasets (Prostate-MRI-US-Biopsy, Gold Atlas Pelvis, SynthRad), and two institutions utilizing MR Linacs for image acquisition. The data was retrospective.
- General Note: Test datasets were independent from those used for training.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- Number of Experts: Three (3) experts were used.
- Qualifications of Experts: The ground truth was established by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist.
4. Adjudication Method for the Test Set
- Method: Ground truthing of each test data set was generated manually using consensus (NRG/RTOG) guidelines as appropriate by the three experts. This implies an expert consensus method, likely involving discussion and agreement among the three. The document does not specify a quantitative adjudication method like "2+1" or "3+1" but rather a "consensus" guided by established clinical guidelines.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
- The document does not report an MRMC comparative effectiveness study comparing human readers with AI assistance versus without AI assistance. The study focuses purely on the AI's performance and its clinical appropriateness as rated by external reviewers.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was Done
- Yes, a standalone performance evaluation was done. The core of the performance data presented (Dice Similarity Coefficient) is a measure of the algorithm's direct output compared to the ground truth, without a human in the loop during the contour generation phase. The external reviewer ratings also assess the standalone performance of the AI-generated contours regarding their clinical utility for subsequent editing and approval.
7. The Type of Ground Truth Used
- Type: The ground truth used was expert consensus, specifically from three clinically experienced experts (2 radiation therapy physicists and 1 radiation dosimetrist), guided by NRG/RTOG guidelines.
8. The Sample Size for the Training Set
- CT Models Training Set: For CT structure models, there was an average of 341 training image sets.
- MR Models Training Set: For MR Brain models, there was an average of 149 training image sets. For MR Pelvis models, there was an average of 306 training image sets.
9. How the Ground Truth for the Training Set Was Established
The document states that the deep-learning-based structure models were "trained using imaging datasets consisting of anatomical organs" and that the "test datasets were independent from those used for training." While it details how ground truth was established for the test sets (manual generation by three experts using consensus NRG/RTOG guidelines), it does not explicitly describe how ground truth for the training sets was established. Given the nature of deep-learning models for medical image segmentation, the training data almost certainly also carried expert-annotated ground truth contours, likely produced through a similarly rigorous process, potentially drawing on various institutions or public datasets. The consistency of the model architecture and training methodologies (e.g., a "very similar CNN architecture was used to train these new CT models") suggests a standardized approach to data preparation, including ground truth generation, for both training and testing.
(32 days)
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour Model RADAC V2, the AutoContour Model RADAC V3 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine-learning-based contouring. The deep-learning-based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen, and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour Model RADAC V3 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System, allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided document:
1. Table of Acceptance Criteria & Reported Device Performance
| Feature/Metric | Acceptance Criteria | Reported Device Performance (Mean DSC/Rating) |
|---|---|---|
| CT Structures | ||
| Large volume DSC | >= 0.8 | Initial Validation: 0.88 +/- 0.06External Validation: 0.90 +/- 0.09 |
| Medium volume DSC | >= 0.65 | Initial Validation: 0.88 +/- 0.08External Validation: 0.83 +/- 0.12 |
| Small volume DSC | >= 0.5 | Initial Validation: 0.75 +/- 0.12External Validation: 0.79 +/- 0.11 |
| Clinical Appropriateness (1-5 scale, 5 best) | Average score >= 3 | Average rating of 4.5 |
| MR Structures | ||
| Medium volume DSC | >= 0.65 | Initial Validation: 0.87 +/- 0.07External Validation: 0.87 +/- 0.07 |
| Small volume DSC | >= 0.5 | Initial Validation: 0.74 +/- 0.07External Validation: 0.74 +/- 0.07 |
| Clinical Appropriateness (1-5 scale, 5 best) | Average score >= 3 | Average rating of 4.4 |
2. Sample Sizes Used for the Test Set and Data Provenance
- CT Test Set (Internal Validation): Approximately 10% of the training images, averaging 50 test images per structure model.
- Provenance: Retrospective data. Among the patients used for CT training and testing, 51.7% were male and 48.3% female; ages 11-30: 0.3%, 31-50: 6.2%, 51-70: 43.3%, 71-100: 50.3%; race 84.0% White, 12.8% Black or African American, 3.2% Other. No specific country of origin is mentioned, which suggests internal company data.
- CT Test Set (External Clinical Validation): Variable per structure model, ranging from 19 to 63 images.
- Provenance: Publicly available CT datasets from The Cancer Imaging Archive (TCIA archive). This suggests diverse, likely multi-national origin, but exact countries are not specified. The studies cited are primarily from US institutions (e.g., Memorial Sloan Kettering Cancer Center, MD Anderson Cancer Center). This data is retrospective.
- MR Test Set (Internal Validation):
- Brain models: 92 testing images (from TCIA GLIS-RT dataset).
- Pelvis models: Sample size not explicitly stated for testing, but refers to "Prostate-MRI-US-Biopsy dataset."
- Provenance: TCIA datasets (implying diverse origin, likely US-centric as above), retrospective.
- MR Test Set (External Clinical Validation):
- Brain models: 20 MR T1 Ax post (BRAVO) image scans acquired from a clinical partner (no specific country mentioned, but likely US given the context).
- Pelvis models: 19 images from a publicly available Gold Atlas Data set. The Gold Atlas project has references indicating collaboration across European and US institutions (e.g., Medical Physics - Europe/US).
- Provenance: Retrospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Number of Experts: Three (3)
- Qualifications: "clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist." No specific years of experience are mentioned.
4. Adjudication Method for the Test Set
- Method: "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts". This implies a consensus-based approach, likely 3-way consensus. If initial contours differed, discussions and adjustments would lead to a final agreed-upon ground truth.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
- No, an MRMC comparative effectiveness study was not explicitly done to measure improvement for human readers with AI vs without AI assistance.
- The study focuses on the performance of the AI algorithm itself (standalone performance) and its clinical appropriateness as rated by experts. The "External Reviewer Average Rating" indicates how much editing would be required by a human, rather than directly measuring human reader performance improvement with assistance.
- "independent reviewers (not employed by Radformation) were used to evaluate the clinical appropriateness of structure models as they would be evaluated for the purposes of treatment planning. This external review was performed as a replacement to intraobserver variability testing done with the RADAC V2 structure models as it better quantified the usefulness of the structure model outputs in an unbiased clinical setting." This suggests an assessment of the usability of the AI-generated contours for human review and modification, but not a direct MRMC study comparing assisted vs. unassisted human performance.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, standalone performance was done.
- The Dice Similarity Coefficient (DSC) metrics presented are a measure of the algorithm's performance in generating contours when compared to expert-defined ground truth, without human intervention during the contour generation process. The "External Reviewer Average Rating" also evaluates the standalone output's quality before any human editing.
7. The Type of Ground Truth Used
- Type of Ground Truth: Expert consensus.
- "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist."
8. The Sample Size for the Training Set
- CT Training Set: Average of 373 training image sets per structure model.
- MR Training Set:
- Brain models: Average of 274 training image sets.
- Pelvis models: Sample size not explicitly stated for training, but refers to "Prostate-MRI-US-Biopsy dataset."
- It's important to note that specific numbers vary per structure, as shown in Table 4 and Table 8.
9. How the Ground Truth for the Training Set Was Established
- The document implies that the training data and their corresponding ground truths were prepared internally prior to the testing phase. While it doesn't explicitly state how the ground truth for the training set was established, it strongly suggests a similar rigorous, expert-driven approach as described for the test sets.
- "The test datasets were independent from those used for training and consisted of approximately 10% of the number of training image sets used as input for the model." This indicates that ground truth was established for both training and testing datasets.
- "Publically available CT datasets from The Cancer Imaging Archive (TCIA archive) were used and both AutoContour and manually added ground truth contours following the same structure guidelines used for structure model training were added to the image sets." This suggests that for publicly available datasets used for both training and external validation, ground truth was added following the same NRG/RTOG guidelines. For proprietary training data, a similar expert-based ground truth creation likely occurred.
(175 days)
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
As with AutoContour RADAC, the AutoContour RADAC V2 device is software that uses DICOM-compliant image data (CT or MR) as input to: (1) automatically contour various structures of interest for radiation therapy treatment planning using machine-learning-based contouring. The deep-learning-based structure models are trained using imaging datasets consisting of anatomical organs of the head and neck, thorax, abdomen, and pelvis for adult male and female patients, (2) allow the user to review and modify the resulting contours, and (3) generate DICOM-compliant structure set data that can be imported into a radiation therapy treatment planning system.
AutoContour RADAC V2 consists of 3 main components:
- A .NET client application designed to run on the Windows Operating System, allowing the user to load image and structure sets for upload to the cloud-based server for automatic contouring, perform registration with other image sets, as well as review, edit, and export the structure set.
- A local "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT and MR datasets that are to be automatically contoured.
- A cloud-based automatic contouring service that produces initial contours based on image sets sent by the user from the .NET client application.
The provided text describes the acceptance criteria and the study that proves the device, AutoContour RADAC V2, meets these criteria. Here's a breakdown of the requested information:
1. A table of acceptance criteria and the reported device performance
The acceptance criterion for contouring accuracy is measured by the Mean Dice Similarity Coefficient (DSC), which varies based on the estimated volume of the structure.
| Structure Size Category | DSC Acceptance Criteria (Mean) | Reported Device Performance (Mean DSC +/- STD) |
|---|---|---|
| Large volume structures | > 0.80 | 0.94 +/- 0.03 |
| Medium volume structures | > 0.65 | 0.82 +/- 0.09 |
| Small volume structures | > 0.50 | 0.61 +/- 0.14 |
The document also provides detailed DSC results for each contoured structure, which all meet or exceed their respective size category's acceptance criteria. For example, for "A_Aorta" (Large), the reported DSC Mean is 0.91, which is >0.80. For "Brainstem" (Medium), the reported DSC Mean is 0.90, which is >0.65. For "OpticChiasm" (Small), the reported DSC Mean is 0.63, which is >0.50.
2. Sample size used for the test set and the data provenance
- CT Test Set:
- Sample Size: An average of 140 test image sets per CT structure model, constituting 20% of the training images. The specific number of test data sets for each CT structure is provided in the table (e.g., A_Aorta: 60, Bladder: 372).
- Data Provenance:
- Country of Origin: Not explicitly stated, but the patient demographics suggest diverse origins, likely within the US, given the prevalence of specific cancers and racial demographics. The acquisition was done using a Philips Big Bore CT simulator.
- Retrospective or Prospective: Not explicitly stated, but common in such validation studies, the data is typically retrospective patient data.
- Demographics: 51.7% male, 48.3% female. Age range: 11-30 (0.3%), 31-50 (6.2%), 51-70 (43.3%), 71-100 (50.3%). Race: 84.0% White, 12.8% Black or African American, 3.2% Other.
- Clinical Relevance: Data spanned across common radiation therapy treatment subgroups (Prostate, Breast, Lung, Head and Neck cancers).
- MR Test Set:
- Sample Size: An average of 16 test image sets per MR structure model. Specific numbers are not provided for each MR structure, but the total validation set for sensitivity and specificity was 16 datasets.
- Data Provenance:
- Country of Origin: Massachusetts General Hospital, Boston, MA.
    - Retrospective or Prospective: The text states that "These training sets consisted primarily of glioblastoma and astrocytoma cases from the Cancer Imaging Archive (TCIA) Glioma data set" and that "The testing dataset was acquired at a different institution using a different scanner and sequence parameters," implying retrospective collection from existing archives and institutions.
- Demographics: 56% Male and 44% Female patients, with ages ranging from 20-80. No Race or Ethnicity data was provided.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of Experts: Three clinically experienced experts.
- Qualifications: Two radiation therapy physicists and one radiation dosimetrist.
4. Adjudication method for the test set
- Method: Ground truthing of each test dataset was generated manually using consensus (NRG/RTOG) guidelines, as appropriate, by the three clinically experienced experts. This implies a form of expert consensus adjudication.
5. If a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done, if so, what was the effect size of how much human readers improve with AI vs without AI assistance
- MRMC Study: No, an MRMC comparative effectiveness study involving human readers with and without AI assistance was not conducted. The performance data focuses on the software's standalone accuracy (Dice Similarity Coefficient, sensitivity, and specificity). The text states: "As with the Predicate Device, no clinical trials were performed for AutoContour RADAC V2."
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Standalone Performance: Yes, the primary performance evaluation provided is for the software's standalone performance, measured by the Dice Similarity Coefficient (DSC), sensitivity, and specificity of the auto-generated contours against expert-established ground truth. The study explicitly states, "Further tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." This is a validation of the algorithm's performance.
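The sensitivity and specificity cited for the standalone evaluation can be computed voxelwise from a predicted mask and its ground-truth mask; a hypothetical sketch (variable names and the toy masks are illustrative, not from the submission):

```python
import numpy as np

def sensitivity_specificity(pred, truth):
    """Voxelwise sensitivity (true-positive rate) and specificity
    (true-negative rate) of a predicted contour mask vs. ground truth."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    sens = tp / (tp + fn) if (tp + fn) else 1.0
    spec = tn / (tn + fp) if (tn + fp) else 1.0
    return sens, spec

pred  = np.array([1, 1, 1, 0, 0, 0])
truth = np.array([1, 1, 0, 0, 0, 0])
sens, spec = sensitivity_specificity(pred, truth)
# sens = 2/2 = 1.0, spec = 3/4 = 0.75
```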
7. The type of ground truth used
- Type of Ground Truth: Expert consensus of manually contoured structures, established using NRG/RTOG (Radiation Therapy Oncology Group) guidelines. This is a form of expert consensus.
8. The sample size for the training set
- CT Training Set: An average of 700 training image sets per CT structure model. The specific number of training data sets for each CT structure is provided in the table (e.g., A_Aorta: 240, Bladder: 1000).
- MR Training Set: An average of 81 training image sets for MR structure models.
9. How the ground truth for the training set was established
- Not explicitly detailed. The document implies the training-set ground truth was also established manually, stating that "Datasets used for testing were removed from the training dataset pool before model training began, and used exclusively for testing." Consistent with standard practice for medical imaging AI and with the test-set methodology, expert manual contouring following established guidelines was likely used for the training data as well.
- Source for MR Training Data: Primarily glioblastoma and astrocytoma cases from The Cancer Imaging Archive (TCIA) Glioma data set.
(175 days)
ClearCheck is intended to assist radiation therapy professionals in generating and assessing the quality of radiotherapy treatment plans. ClearCheck is also intended to assist radiation treatment planners in predicting when a treatment plan might result in a collision between the treatment machine and the patient or support structures.
The ClearCheck Model RADCC V2 device is software that uses treatment data, image data, and structure set data obtained from supported Treatment Planning System and Application Programming Interfaces to present radiotherapy treatment plans in a user-friendly way for user approval of the treatment plan. The ClearCheck device (Model RADCC V2) is also intended to assist users to identify where collisions between the treatment machine and the patient or support structures may occur in a treatment plan.
It is designed to run on Windows Operating Systems. ClearCheck Model RADCC V2 performs calculations on the incoming supported treatment data. Supported Treatment Planning Systems are used by trained medical professionals to simulate radiation therapy treatments for malignant or benign diseases.
The provided text describes the acceptance criteria and study for the ClearCheck Model RADCC V2 device, which assists radiation therapy professionals in generating and assessing treatment plans, including predicting potential collisions.
Here's a breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
| Acceptance Criteria | Reported Device Performance |
|---|---|
| BED / EQD2 Functionality | |
| Passing criteria for dose type constraints | 0.5% difference when compared to hand calculations using well-known BED/EQD2 formulas. |
| Passing criteria for Volume type constraints | 3% difference when compared to hand calculations using well-known BED/EQD2 formulas. |
| Deformed Dose Functionality | |
| Qualitative DVH analysis | Good agreement for all cases compared to known dose deformations. |
| Quantitative Dmax and Dmin differences | +/- 3% difference for deformed dose results compared to known dose deformations. |
| Overall Verification & Validation Testing | All test cases for BED/EQD2 and Deformed Dose functionalities passed. Overall software verification tests were performed to ensure intended functionality, and pass/fail criteria were used to verify requirements. |
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size: The document does not explicitly state a specific numerical sample size for the test set used for the BED/EQD2 and Deformed Dose functionality validation. It mentions "all cases" for Deformed Dose and "a plan and plan sum" for BED/EQD2. This implies testing was done on an unspecified number of representative cases, but not a statistically powered cohort.
- Data Provenance: Not specified in the provided text. It does not mention the country of origin or whether the data was retrospective or prospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- The document does not mention the use of experts to establish ground truth for the test set.
- For BED/EQD2, the ground truth was established by "values calculated by hand using the well-known BED / EQD2 formulas."
- For Deformed Dose, the ground truth was established by "known dose deformations."
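The "well-known BED / EQD2 formulas" referenced as ground truth are the standard linear-quadratic expressions BED = n·d·(1 + d/(α/β)) and EQD2 = BED / (1 + 2/(α/β)). A minimal sketch of the hand calculation (illustrative; the fractionation scheme and α/β value below are example inputs, not from the submission):

```python
def bed(n_fractions: int, dose_per_fx: float, alpha_beta: float) -> float:
    """Biologically Effective Dose: BED = n * d * (1 + d / (alpha/beta))."""
    return n_fractions * dose_per_fx * (1.0 + dose_per_fx / alpha_beta)

def eqd2(n_fractions: int, dose_per_fx: float, alpha_beta: float) -> float:
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2 / (alpha/beta))."""
    return bed(n_fractions, dose_per_fx, alpha_beta) / (1.0 + 2.0 / alpha_beta)

# Example: 20 fractions of 3 Gy with alpha/beta = 10 Gy
# BED  = 20 * 3 * (1 + 3/10) = 78 Gy
# EQD2 = 78 / (1 + 2/10)     = 65 Gy
```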
4. Adjudication Method for the Test Set
- Not applicable as there is no mention of expert review or adjudication for the test set. Ground truth was established by calculation or "known" deformations.
5. If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs Without AI Assistance
- No, a Multi Reader Multi Case (MRMC) comparative effectiveness study was not performed. The document explicitly states: "no clinical trials were performed for ClearCheck Model RADCC V2." The device is intended to "assist radiation therapy professionals," but its impact on human reader performance was not evaluated through a clinical study.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, performance evaluations for the de novo functionalities (BED/EQD2 and Deformed Dose) appear to be standalone algorithm performance assessments. The device's calculations were compared against established mathematical formulas (BED/EQD2) or known deformations (Deformed Dose) without human intervention in the evaluation process.
7. The Type of Ground Truth Used (expert consensus, pathology, outcomes data, etc.)
- BED/EQD2: Ground truth was based on "values calculated by hand using the well-known BED / EQD2 formulas." This is a computational/mathematical ground truth.
- Deformed Dose: Ground truth was based on "known dose deformations." This implies a physically or computationally derived ground truth where the expected deformation results were already established.
8. The Sample Size for the Training Set
- The document does not specify a sample size for the training set. It primarily focuses on the validation of new features against calculated or known results, rather than reporting on a machine learning model's training data.
9. How the Ground Truth for the Training Set Was Established
- The document does not provide information on how the ground truth for any training set was established. Given the nature of the device (software for calculations and collision prediction, building on predicate devices), it's possible that analytical methods and established physics/dosimetry principles form the basis, rather than a large labeled training dataset in the typical machine learning sense for image interpretation.
(174 days)
ClearCalc is intended to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The ClearCalc Model RADCA V2 device is software that uses treatment data, image data, and structure set data obtained from supported Treatment Planning System and Application Programming interfaces to perform a dose and/or monitor unit (MU) calculation on the incoming treatment planning parameters. It is designed to assist radiation treatment planners in determining if their treatment planning calculations are accurate using an independent Monitor Unit (MU) and dose calculation algorithm.
The provided text describes ClearCalc Model RADCA V2 and its substantial equivalence to predicate devices. However, the document does NOT contain a detailed study proving the device meets specific acceptance criteria in the manner of a clinical trial or a performance study with detailed statistical results. Instead, it states that "Verification tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements. Validation testing was performed to ensure that the software was behaving as intended, and results from ClearCalc were validated against accepted results for known planning parameters from clinically-utilized treatment planning systems."
Therefore, I can provide the acceptance criteria as stated for the ClearCalc Model RADCA V2's primary dose calculation algorithms and the Monte Carlo calculations, but comprehensive study details such as sample size, data provenance, expert ground truth, adjudication methods, or separate training/test sets are not available in the provided text.
Here's a summary of the available information:
1. Table of Acceptance Criteria and Reported Device Performance
| Calculation Algorithm | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| FSPB (Photon Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
| TG-71 (Electron Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
| TG-43 (Brachytherapy Plans) | Passing criteria consistent with Predicate Device ClearCalc Model RADCA | Passed in all test cases |
| Monte Carlo Calculations | Gamma analysis passing rate of >93% with +/-3% relative dose agreement and 3mm Distance To Agreement (DTA) | Passed in all test cases |
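The gamma analysis criterion above combines a dose-difference tolerance (±3% relative dose) with a distance-to-agreement tolerance (3 mm); a point passes when its gamma index is ≤ 1. A simplified 1-D global-gamma sketch under those assumptions (illustrative only; actual QA implementations operate on full 3-D dose grids with interpolation):

```python
import numpy as np

def gamma_pass_rate(ref_dose, eval_dose, positions, dose_tol=0.03, dta_mm=3.0):
    """1-D global gamma analysis: for each reference point, gamma is the
    minimum over evaluated points of
        sqrt((dose diff / dose criterion)^2 + (distance / DTA)^2),
    with dose differences normalized to the reference maximum (global gamma).
    Returns the fraction of points with gamma <= 1."""
    ref = np.asarray(ref_dose, float)
    ev = np.asarray(eval_dose, float)
    pos = np.asarray(positions, float)
    norm = dose_tol * ref.max()  # global normalization of the dose criterion
    passed = 0
    for r, p in zip(ref, pos):
        dose_term = ((ev - r) / norm) ** 2
        dist_term = ((pos - p) / dta_mm) ** 2
        gamma = np.sqrt(dose_term + dist_term).min()
        passed += gamma <= 1.0
    return passed / ref.size

# Identical profiles -> every point passes
x = np.arange(0.0, 10.0, 1.0)          # positions in mm
d = np.exp(-((x - 5.0) ** 2) / 8.0)    # toy dose profile
print(gamma_pass_rate(d, d, x))  # 1.0
```

Against this metric, the stated criterion corresponds to requiring that more than 93% of evaluated points achieve gamma ≤ 1.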
2. Sample size used for the test set and the data provenance:
- Sample Size: Not explicitly stated. The text mentions "all test cases" without quantifying the number of cases.
- Data Provenance: Not explicitly stated. The text refers to "known planning parameters from clinically-utilized treatment planning systems," suggesting the data would be representative of clinical use but does not specify country of origin or if it's retrospective or prospective.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Not specified.
- Qualifications of Experts: The ground truth was established from "accepted results for known planning parameters from clinically-utilized treatment planning systems." This implies that the ground truth is derived from established clinical practices and systems, which are typically validated by qualified medical physicists and radiation oncologists, but specific expert involvement in this validation is not detailed.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
- Adjudication Method: Not specified. The validation relies on comparison to "accepted results."
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance:
- MRMC Study: No. The device is a "Secondary Check Quality Assurance Software" designed to assist radiation treatment planners by providing independent calculation. It does not involve human readers making diagnoses or interpretations that would be augmented by AI in a MRMC study context.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
- Standalone Performance: Yes, the described verification and validation tests assess the algorithm's performance against "accepted results" from clinical systems, which is a standalone evaluation.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- Ground Truth Type: "Accepted results for known planning parameters from clinically-utilized treatment planning systems." This implies a form of established, clinically validated ground truth based on the outputs of other (predicate/reference) treatment planning systems the device is checking.
8. The sample size for the training set:
- Training Set Sample Size: Not specified. The text focuses on verification and validation testing, not on the training of the underlying algorithms.
9. How the ground truth for the training set was established:
- Training Set Ground Truth: Not specified. As the document focuses on validation rather than algorithm training, this information is not provided.
(263 days)
AutoContour is intended to assist radiation treatment planners in contouring structures within medical images in preparation for radiation therapy treatment planning.
AutoContour consists of 3 main components:
- An "agent" service designed to run on the Windows Operating System that is configured by the user to monitor a network storage location for new CT datasets that are to be automatically uploaded to:
- A cloud-based AutoContour automatic contouring service that produces initial contours and
- A web application accessed via web browser which allows the user to perform registration with other image sets as well as review, edit, and export the structure set containing the contours.
The provided text describes the acceptance criteria and study proving the device meets those criteria. Here's a breakdown of the requested information:
1. Table of Acceptance Criteria & Reported Device Performance
The document states that formal acceptance criteria and reported device performance are detailed in "Radformation's AutoContour Complete Test Protocol and Report." However, this specific report is not included in the provided text. The summary only generally states that "Nonclinical tests were performed... which demonstrates that AutoContour performs as intended per its indications for use" and "Verification and validation tests were performed to ensure that the software works as intended and pass/fail criteria were used to verify requirements."
Therefore, a table of acceptance criteria and reported device performance cannot be constructed from the provided text.
2. Sample Size Used for the Test Set and Data Provenance
The document mentions that "tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." However, the sample size for the test set is not explicitly stated.
Regarding data provenance:
- The document implies the data used was medical image data (specifically CT, and for registration purposes, MR and PET).
- The country of origin is not specified.
- The terms "training and validation sets" and "independent datasets" suggest these were retrospective datasets used for model development and evaluation. There is no mention of prospective data collection.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
The document does not provide any information about the number of experts used to establish ground truth for the test set or their qualifications.
4. Adjudication Method for the Test Set
The document does not specify any adjudication method (e.g., 2+1, 3+1, none) used for the test set.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance?
The document explicitly states: "As with the Predicate Devices, no clinical trials were performed for AutoContour." This indicates that an MRMC comparative effectiveness study involving human readers and AI assistance was not conducted. Therefore, no effect size for human reader improvement is reported.
6. If a Standalone (i.e. algorithm only without human-in-the-loop performance) was done
The document mentions "tests were performed on independent datasets from those included in training and validation sets in order to validate the generalizability of the machine learning model." This strongly suggests that standalone performance of the algorithm was evaluated. Although specific metrics for this standalone performance are not detailed in the provided text, the validation of a machine learning model against independent datasets implies a standalone evaluation.
7. The Type of Ground Truth Used
The document mentions that AutoContour is intended to "assist radiation treatment planners in contouring structures within medical images." Given this, the ground truth for the contours would typically be expert consensus or expert-annotated contours. However, the document itself does not explicitly state the type of ground truth used (e.g., expert consensus, pathology, outcomes data).
8. The Sample Size for the Training Set
The document mentions "training and validation sets" but does not provide the sample size for the training set.
9. How the Ground Truth for the Training Set Was Established
The document mentions "training and validation sets" but does not detail how the ground truth for the training set was established. Similar to the test set, it would likely involve expert contouring, but this is not explicitly stated.