AI Segmentation uses CT images to segment patient anatomy for use in radiation therapy treatment planning. AI Segmentation utilizes a pre-defined set of organ structures in the following regions: head and neck, thorax, pelvis, abdomen. Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners. Results of AI Segmentation are utilized in the Eclipse Treatment Planning System where it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure.
AI Segmentation is a web-based application, running in the cloud, that provides a combined deep learning and classical approach for automated segmentation of organs at risk, along with tools for structure visualization. This software medical device is used by trained medical professionals and consists of a web application user interface where the results of the automated segmentation can be reviewed, edited, and selected for export into a compatible treatment planning system. AI Segmentation is not intended to provide clinical decisions, medical advice, or evaluations of radiation plans or treatment procedures.
Here's an analysis of the acceptance criteria and study detailed in the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The text doesn't provide a direct, explicit table of acceptance criteria with corresponding performance metrics for all AI models. Instead, it describes a general approach for evaluating performance, focusing on the DICE similarity index for automated contouring and a qualitative expert assessment.
| Acceptance Criterion (Implicit) | Reported Device Performance |
|---|---|
| Automated Contour Quality (Quantitative) | Evaluated using the DICE similarity index (a computational sketch follows this table). Aggregated DICE scores were compared to literature values or against the performance of the prior model (for updated algorithms). Specific numerical scores are not provided in this document. |
| Automated Contour Quality (Qualitative Expert Assessment) | A qualitative scoring system was used to measure the acceptability of auto-generated contours. The target was 80% of expert scores designating the contours as "acceptable with minor or no adjustments". The document states that the "AI models in the subject device [demonstrated] equivalent performance to the predicate." |
| Software Verification and Validation (Safety and Conformance) | Conducted, with documentation provided as recommended by FDA guidance. The software was considered a "major" level of concern. Overall test results demonstrated conformance to applicable requirements and specifications. |
| Conformance to Standards | The subject device conforms, in whole or in part, to IEC 62304, IEC 62366-1, IEC 62083, and IEC 82304-1. |
| Resolution of Discrepancy Reports (DRs) | There were no remaining DRs classified as Safety or Customer Intolerable. |
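
The document names the DICE similarity index as the quantitative metric but does not show how it is computed. The metric itself is standard, so the following is a minimal Python sketch of how per-case DICE scores might be computed and aggregated for comparison against a benchmark; the function name, the case scores, and the benchmark value are illustrative assumptions, not figures from the submission.

```python
import numpy as np

def dice_score(auto_mask: np.ndarray, truth_mask: np.ndarray) -> float:
    """DICE similarity index between two binary segmentation masks.

    DICE = 2 * |A ∩ B| / (|A| + |B|), ranging from 0.0 (no overlap)
    to 1.0 (perfect overlap with the ground truth contour).
    """
    auto = auto_mask.astype(bool)
    truth = truth_mask.astype(bool)
    intersection = np.logical_and(auto, truth).sum()
    denominator = auto.sum() + truth.sum()
    if denominator == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / denominator

# Hypothetical aggregation across test cases for one organ structure;
# the benchmark is an assumed literature value, not from the document.
case_scores = [0.91, 0.88, 0.93, 0.85, 0.90]
literature_benchmark = 0.87
print(f"aggregated DICE: {np.mean(case_scores):.3f}")
print(f"meets benchmark: {np.mean(case_scores) >= literature_benchmark}")
```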
2. Sample Size Used for the Test Set and Data Provenance
The document does not explicitly state the sample size (number of patients or scans) used for the test set. It mentions "non-clinical performance tests for automated contouring AI models" but lacks the specific number of cases.
- Data Provenance: The document does not specify the country of origin of the data. It indicates that the study was a non-clinical performance evaluation, implying retrospective data was likely used, but this is not explicitly stated.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
The document states: "Clinical experts also evaluated the performance of these AI models during validation testing." However, it does not specify:
- The exact number of experts used.
- The specific qualifications of these experts (e.g., "radiologist with 10 years of experience"). It generally refers to them as "qualified, expert radiation therapy treatment planners" and "qualified physicians" in the Indications for Use, which implies relevant expertise.
4. Adjudication Method for the Test Set
The document mentions that "Each AI model was assessed using the DICE similarity index as a comparative measure of the auto-generated contours against ground truth contours for a given structure." and that "Clinical experts also evaluated the performance of these AI models during validation testing."
However, it does not explicitly detail an adjudication method (e.g., 2+1 or 3+1) for establishing the ground truth or resolving discrepancies between experts. The primary comparison appears to be against "ground truth contours" rather than against potentially varying expert opinions, which would otherwise necessitate an adjudication method for the ground truth itself. The qualitative expert assessment appears to be a separate evaluation of the AI output against the target of "acceptable with minor or no adjustments", not a process for establishing the ground truth.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No MRMC study was done. The document explicitly states: "No animal studies or clinical tests have been included in this pre-market submission." and "The predicate device was cleared based only on non-clinical testing, and no animal or clinical studies were performed for the subject device."
- Therefore, there is no reported effect size of how much human readers improve with AI vs. without AI assistance. The device is intended to be reviewed and edited by human experts, but a comparative study on this effect was not performed for this submission.
6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study
- Yes, a standalone performance study was done. The performance evaluation focused on the AI models' ability to generate contours, as measured by the DICE similarity index against ground truth and qualitative expert assessment of the auto-generated contours. This indicates a standalone assessment of the algorithm's output before human editing.
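
For the qualitative arm of this standalone assessment, the document gives only the acceptance target (80% of expert scores at "acceptable with minor or no adjustments"). As a minimal sketch of how such a target could be checked, assuming a hypothetical three-level scoring scale that the document does not specify:

```python
def acceptability_rate(scores: list[str]) -> float:
    """Fraction of expert scores rating a contour as acceptable with
    minor or no adjustments (the score categories are assumed)."""
    acceptable = {"acceptable_no_edits", "acceptable_minor_edits"}
    return sum(s in acceptable for s in scores) / len(scores)

# Hypothetical expert scores for five auto-generated contours.
scores = [
    "acceptable_no_edits",
    "acceptable_minor_edits",
    "acceptable_minor_edits",
    "acceptable_no_edits",
    "major_edits_needed",
]
rate = acceptability_rate(scores)
print(f"acceptability rate: {rate:.0%}")   # 80%
print(f"meets 80% target: {rate >= 0.80}")  # True
```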
7. Type of Ground Truth Used
The ground truth used was expert consensus / expert-generated contours. The text states:
- "Each AI model was assessed using the DICE similarity index as a comparative measure of the auto-generated contours against ground truth contours for a given structure."
- The Indications for Use also mention that "Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners" and "it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure," which implies that the ground truth would be established by such experts.
8. Sample Size for the Training Set
The document does not specify the sample size used for the training set of the AI models.
9. How the Ground Truth for the Training Set Was Established
The document does not specify how the ground truth for the training set was established. It describes the evaluation of the AI models but not their development. However, given it's an AI segmentation tool for radiation therapy, it's highly probable that the training data ground truth was also established through expert contouring.
§ 892.5050 Medical charged-particle radiation therapy system.
(a) Identification. A medical charged-particle radiation therapy system is a device that produces by acceleration high energy charged particles (e.g., electrons and protons) intended for use in radiation therapy. This generic type of device may include signal analysis and display equipment, patient and equipment supports, treatment planning computer programs, component parts, and accessories.

(b) Classification. Class II. When intended for use as a quality control system, the film dosimetry system (film scanning system) included as an accessory to the device described in paragraph (a) of this section, is exempt from the premarket notification procedures in subpart E of part 807 of this chapter subject to the limitations in § 892.9.