Search Results
Found 4 results
510(k) Data Aggregation
(147 days)
Contour+ (MVision AI Segmentation)
Contour+ (MVision AI Segmentation) is a software system for image analysis algorithms to be used in radiation therapy treatment planning workflows. The system includes processing tools for automatic contouring of CT and MR images using machine learning based algorithms. The produced segmentation templates for regions of interest must be transferred to appropriate image visualization systems as an initial template for a medical professional to visualize, review, modify and approve prior to further use in clinical workflows.
The system creates initial contours of pre-defined structures of common anatomical sites, i.e., Head and Neck, Brain, Breast, Lung and Abdomen, Male Pelvis, and Female Pelvis.
Contour+ (MVision AI Segmentation) is not intended to detect lesions or tumors. The device is not intended for use with real-time adaptive planning workflows.
Contour+ (MVision AI Segmentation) is a software-only medical device (software system) that can be used to accelerate region of interest (ROI) delineation in radiotherapy treatment planning by automatic contouring of predefined ROIs and the creation of segmentation templates on CT and MR images.
The Contour+ (MVision AI Segmentation) software system is integrated with a customer IT network and configured to receive DICOM CT and MR images, e.g., from a CT or MRI scanner or a treatment planning system (TPS). Automatic contouring of predefined ROIs is performed by pre-trained, locked, and static models that are based on machine learning using deep artificial neural networks. The models have been trained on several anatomical sites, including the brain, head and neck, bones, breast, lung and abdomen, male pelvis, and female pelvis using hundreds of scans from a diverse patient population. The user does not have to provide any contouring atlases. The resulting segmentation structure set is connected to the original DICOM images and can be transferred to an image visualization system (e.g., a TPS) as an initial template for a medical professional to visualize, modify and approve prior to further use in clinical workflows.
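The summary stresses that the resulting structure set stays "connected" to the original DICOM images. Purely as an illustration of that linkage (a minimal pydicom sketch with assumed file names and layout, not MVision's implementation), an exported RT Structure Set can be checked against the CT series it references:

```python
# Minimal sketch: verify an auto-generated RT Structure Set references the original CT slices.
# File/directory names are hypothetical; assumes pydicom is installed and that each contour
# carries a ContourImageSequence (typical for CT-based RTSTRUCT exports).
from pathlib import Path
import pydicom

ct_dir = Path("ct_series")                            # original CT slices from the scanner/TPS
rtstruct = pydicom.dcmread("RS_auto_contours.dcm")    # structure set produced by auto-segmentation

# SOP Instance UIDs of the CT slices the contours should reference.
ct_uids = {pydicom.dcmread(p).SOPInstanceUID for p in ct_dir.glob("*.dcm")}

# Names of the auto-generated ROIs offered as the initial template.
print("Auto-contoured ROIs:", [roi.ROIName for roi in rtstruct.StructureSetROISequence])

# Each contour names the SOP Instance UID of the CT slice it was drawn on; checking these
# references confirms the structure set is tied to the original images.
for roi_contour in rtstruct.ROIContourSequence:
    for contour in roi_contour.ContourSequence:
        ref_uid = contour.ContourImageSequence[0].ReferencedSOPInstanceUID
        assert ref_uid in ct_uids, f"Contour references a slice outside the series: {ref_uid}"
```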
The provided text does not include a table of acceptance criteria and the reported device performance, nor does it specify the sample sizes used for the test set, the number of experts for ground truth, or details on comparative effectiveness studies (MRMC).
However, based on the available information, here is a description of the acceptance criteria and study details:
Acceptance Criteria and Study for Contour+ (MVision AI Segmentation)
The study evaluated the performance of automatic segmentation models by comparing them to ground truth segmentations using Dice Score (DSC) and Surface-Dice Score (S-DSC@2mm) as metrics. The acceptance criteria were based on a "set level of minimum agreement against ground truth segmentations determined through clinically relevant similarity metrics DSC and S-DSC@2mm." While specific numerical thresholds for these metrics are not provided, the submission states that the device fulfills "the same acceptance criteria" as the predicate device.
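For context, the commonly used definitions of these two similarity metrics (standard formulations from the segmentation literature; the submission does not spell out its exact variants) are:

$$
\mathrm{DSC}(A,B) = \frac{2\,|A \cap B|}{|A| + |B|},
\qquad
\text{S-DSC@}\tau(A,B) = \frac{\bigl|\{a \in \partial A : d(a,\partial B) \le \tau\}\bigr| + \bigl|\{b \in \partial B : d(b,\partial A) \le \tau\}\bigr|}{|\partial A| + |\partial B|},
$$

where $A$ is the automatic segmentation, $B$ the ground-truth segmentation, $\partial A$ and $\partial B$ their surfaces, $d(\cdot,\cdot)$ the distance to the nearest surface point, and $\tau = 2\,\mathrm{mm}$ the tolerance used here.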
It's important to note that the provided document is an FDA 510(k) clearance letter and not the full study report. As such, it summarizes the findings and affirms the device's substantial equivalence without detailing every specific test result or acceptance threshold.
1. A table of acceptance criteria and the reported device performance
Metric | Acceptance Criteria | Reported Device Performance |
---|---|---|
Dice Score (DSC) | Based on a "set level of minimum agreement against ground truth segmentations" (specific thresholds not provided). | "Performance verification and validation results for various subsets of the golden dataset show the generalizability and robustness of the device..." |
Surface-Dice Score (S-DSC@2mm) | Based on a "set level of minimum agreement against ground truth segmentations" (specific thresholds not provided). | "...Contour+ (MVision AI Segmentation) fulfills the same acceptance criteria, provides the intended benefits, and it is as safe and as effective as the predicate software version." |
2. Sample sizes used for the test set and the data provenance (e.g. country of origin of the data, retrospective or prospective)
- Sample Size for Test Set: The exact sample size for the test (golden) dataset is not specified, but it's referred to as "various subsets of the golden dataset" and chosen to "achieve high granularity in performance evaluation tests."
- Data Provenance: The datasets originate from "multiple EU and US clinical sites (with over 50% of data coming from US sites)." It is described as containing "hundreds of scans from a diverse patient population," ensuring representation of the "US population and medical practice." The text does not explicitly state if the data was retrospective or prospective, but the description of "hundreds of scans" from multiple sites suggests it is likely retrospective.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)
The number of experts used to establish the ground truth for the test set is not specified in the provided text. The qualifications are vaguely mentioned as "radiotherapy experts" who performed "Performance validation of machine learning-based algorithms for automatic segmentation." No specific years of experience or board certifications are detailed.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
The adjudication method for establishing ground truth on the test set is not specified in the provided text. The text only states that the auto-segmentations were compared to "ground truth segmentations."
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
A multi-reader multi-case (MRMC) comparative effectiveness study focusing on the improvement of human readers with AI assistance versus without AI assistance is not explicitly described in the provided text.
The text states: "Performance validation of machine learning-based algorithms for automatic segmentation was also carried out by radiotherapy experts. The results show that Contour+ (MVision AI Segmentation) assists in reducing the upfront effort and time required for contouring CT and MR images, which can instead be devoted by clinicians on refining and reviewing the software-generated contours." This indicates that experts reviewed the output and perceived a benefit in efficiency, but it does not detail a formal MRMC study comparing accuracy or time, with a specific effect size.
6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done
Yes, a standalone performance evaluation of the algorithm was conducted. The primary performance metrics (DSC and S-DSC@2mm) were calculated by directly comparing the "produced auto-segmentations to ground truth segmentations," which is a standalone assessment of the algorithm's output. The statement "Performance verification and validation results for various subsets of the golden dataset show the generalizability and robustness of the device" further supports this.
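As an illustration of what such a standalone metric computation looks like (a minimal sketch assuming voxelized binary masks with known voxel spacing; it is not the manufacturer's evaluation code, and the exact surface-Dice formulation used in the submission is not disclosed):

```python
import numpy as np
from scipy import ndimage

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Volumetric Dice score (DSC) between two boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def surface_dice(pred: np.ndarray, gt: np.ndarray, spacing_mm, tol_mm: float = 2.0) -> float:
    """Surface Dice at a tolerance (S-DSC@tol): fraction of surface voxels of each mask
    that lie within tol_mm of the other mask's surface."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    # One-voxel-thick boundary: mask minus its erosion.
    surf_pred = pred & ~ndimage.binary_erosion(pred)
    surf_gt = gt & ~ndimage.binary_erosion(gt)
    # Distance (in mm) from every voxel to the nearest surface voxel of the other mask.
    dist_to_gt = ndimage.distance_transform_edt(~surf_gt, sampling=spacing_mm)
    dist_to_pred = ndimage.distance_transform_edt(~surf_pred, sampling=spacing_mm)
    overlap = (dist_to_gt[surf_pred] <= tol_mm).sum() + (dist_to_pred[surf_gt] <= tol_mm).sum()
    total = surf_pred.sum() + surf_gt.sum()
    return overlap / total if total else 1.0

# Hypothetical acceptance check against an assumed (not disclosed) per-structure threshold:
# passed = dice(auto, gt) >= 0.80 and surface_dice(auto, gt, spacing_mm=(3.0, 1.0, 1.0)) >= 0.90
```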
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)
The ground truth used was expert consensus segmentations. The text repeatedly refers to comparing the device's output to "ground truth segmentations" established by "radiotherapy experts." There is no mention of pathology or outcomes data being used for ground truth.
8. The sample size for the training set
The exact sample size for the training set is not specified, but the models were "trained on several anatomical sites... using hundreds of scans from a diverse patient population."
9. How the ground truth for the training set was established
The text states that the machine learning models were "trained on several anatomical sites... using hundreds of scans from a diverse patient population." While it doesn't explicitly detail the process for establishing ground truth for the training set, it is implied to be through expert contouring/segmentation, as the validation uses "ground truth segmentations" which are established by "radiotherapy experts." Given the extensive training data required for machine learning, it's highly probable that these "hundreds of scans" also had expert-derived segmentations as their ground truth for training.
(232 days)
MVision AI Segmentation
MVision AI Segmentation is a software system for image analysis algorithms to be used in radiation therapy treatment planning workflows. The system includes processing tools for automatic contouring of CT images using machine learning based algorithms. The produced segmentation templates for regions of interest must be transferred to appropriate image visualization systems as an initial template for a medical professional to visualize, review, modify and approve prior to further use in clinical workflows.
The system creates initial contours of pre-defined structures of common anatomical sites, i.e. Head and Neck, Brain, Breast, Lung and Abdomen, Male Pelvis, and Female Pelvis in adult patients.
MVision AI Segmentation is not intended to detect lesions or tumors. The device is not intended for use with real-time adaptive planning workflows.
MVision AI Segmentation is a software-only medical device which can be used to accelerate region of interest (ROI) delineation in radiotherapy treatment planning by creating automatic segmentation templates on CT images for predefined ROIs.
The segmentations are produced by pre-trained, locked, and static models that are based on deep artificial neural networks. The produced structure is intended to be used as a template for medical professionals to visualize, modify and approve prior to further use in clinical workflows.
The system is integrated with the customer IT network to receive DICOM images. CT images from, for example, a scanner or a treatment planning system (TPS) are exported to the device. A structure set is created in the device, and the created segmentation results are connected to the original images. These data are sent to the destination DICOM import folder for import into, for example, a treatment planning system. The produced structures can then be used as a template for manual ROI editing, review and approval workflow. The segmentations are produced by pre-trained and locked models that are based on deep artificial neural networks. To take the device into use, the user does not have to provide any contouring atlases. The models have been trained with the order of hundreds of scans, depending on the ROI in question. The MVision AI Segmentation device creates initial contours of pre-defined structures of common anatomical sites, i.e. Head and Neck, Brain, Breast, Lung and Abdomen, Male Pelvis, and Female Pelvis.
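For readers unfamiliar with this kind of DICOM integration, the sketch below shows, in generic terms, how a networked node can accept CT images pushed from a scanner or TPS (a hypothetical pynetdicom example with assumed AE title, port, and storage folder; it is not the vendor's code and says nothing about MVision's actual deployment):

```python
# Generic sketch of a DICOM Storage SCP that accepts pushed CT images and saves them
# to a local folder for downstream processing. AE title, port, and paths are assumptions.
from pathlib import Path
from pynetdicom import AE, evt, AllStoragePresentationContexts

INCOMING = Path("incoming")
INCOMING.mkdir(exist_ok=True)

def handle_store(event):
    """Save each received DICOM object (e.g., a CT slice) to disk."""
    ds = event.dataset
    ds.file_meta = event.file_meta
    ds.save_as(INCOMING / f"{ds.SOPInstanceUID}.dcm", write_like_original=False)
    return 0x0000  # DICOM "Success" status

ae = AE(ae_title="AUTOSEG")
ae.supported_contexts = AllStoragePresentationContexts
ae.start_server(("0.0.0.0", 11112), block=True,
                evt_handlers=[(evt.EVT_C_STORE, handle_store)])
```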
Here's a breakdown of the acceptance criteria and study information for the MVision AI Segmentation device, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The provided text does not explicitly state specific numerical acceptance criteria for evaluation metrics (e.g., a minimum Dice Similarity Coefficient (DSC) or Hausdorff Distance (HD)). Instead, it generally states that the device's performance will "reflect the real clinical performance" and that it produces "usable contours" that "save clinicians' time."
Therefore, I will extract the relevant performance statements and structure them as best I can, acknowledging the lack of specific thresholds.
Criterion Type | Acceptance Criteria (Conceptual from text) | Reported Device Performance (from text) |
---|---|---|
Clinical Performance | Segmentation performance should reflect real clinical performance in any radiotherapy clinic following consensus guidelines. | "Performance verification results for various subsets of the golden dataset show the generalizability and robustness of the device for the US patient population and US medical practice." "MVision AI Segmentation assists in reducing the upfront effort and time on typical contouring which can be spent on refining and reviewing the results." "Performance validation data further suggests that the subject device produces usable contours (ROIs) as a starting point that will save clinicians' time and it will lead to sooner proceeding to essential parts of radiotherapy treatment planning stages." |
Generalizability | Models should be generalizable and robust across different patient populations and medical practices. | "Performance verification results for various subsets of the golden dataset show the generalizability and robustness of the device for the US patient population and US medical practice." |
Clinical Utility | Device should provide usable contours that contribute to efficiency and reduce effort in the radiotherapy workflow. | "Performance validation data further suggests that the subject device produces usable contours (ROIs) as a starting point that will save clinicians' time and it will lead to sooner proceeding to essential parts of radiotherapy treatment planning stages." "MVision AI Segmentation assists in reducing the upfront effort and time on typical contouring which can be spent on refining and reviewing the results." |
Safety and Effectiveness | The device should be non-inferior, safe, and effective compared to the predicate device. | "Software verification and validation and Performance evaluation tests for machine learning based algorithms establish that the subject medical device is non-inferior, performs safely and effectively as the listed predicate device." |
2. Sample Sizes Used for the Test Set and Data Provenance
- Test Set ("Golden Dataset") Sample Size: The exact number of cases in the test set is not explicitly stated. The document mentions "various subsets of the golden dataset."
- Data Provenance: The data originates from "multiple different sources" to ensure generalizability. It is collected to reflect "the US patient population and US medical practice." The text does not specify countries of origin beyond "US patient population." The data type is implied to be CT images for use in radiotherapy. The text does not specify if the data is retrospective or prospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- Number of Experts: The number of experts used to establish ground truth for the test set is not explicitly stated.
- Qualifications of Experts: The ground truth for the test set was established by "radiotherapy experts." No further specific qualifications (e.g., years of experience, specific subspecialty) are provided.
4. Adjudication Method for the Test Set
- The text does not describe a specific adjudication method (e.g., 2+1, 3+1). It only states that the ground truth was established by "radiotherapy experts."
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No MRMC study comparing human readers with and without AI assistance was reported. The document focuses on the device's performance and its ability to "assist in reducing the upfront effort and time" for clinicians, implying an improvement in efficiency, but not a formal MRMC study demonstrating a quantified effect size of human improvement with AI vs without.
6. Standalone Performance Study (Algorithm Only)
- Yes, a standalone performance evaluation was clearly done. The entire "Performance Evaluation Summary" Section (Pages 7-8) describes the evaluation of the "model performance" and "machine learning based algorithms" on "training and test sets (golden dataset)." The results refer to the device producing contours and assisting in reducing effort, indicating an algorithm-only evaluation.
7. Type of Ground Truth Used
- The ground truth used is expert consensus, established by "radiotherapy experts" following "segmentation consensus guidelines."
8. Sample Size for the Training Set
- The models were trained with "the order of hundreds of scans, depending on the ROI in question."
9. How the Ground Truth for the Training Set Was Established
- The ground truth for the training set was established following "segmentation consensus guidelines" as the models were "trained to comply with" these guidelines. This implies expert-derived ground truth, consistent with the test set's ground truth methodology.
(73 days)
AI Segmentation
AI Segmentation uses CT images to segment patient anatomy for use in radiation therapy treatment planning. AI Segmentation utilizes a pre-defined set of organ structures in the following regions: head and neck, thorax, pelvis, abdomen. Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners. Results of AI Segmentation are utilized in the Eclipse Treatment Planning System where it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure.
AI Segmentation is a web-based application, running in the cloud, that provides a combined deep learning and classical-based approach for automated segmentation of organs at risk, along with tools for structure visualization. This software medical device product is used by trained medical professionals and consists of a web application user interface where the results from the automated segmentation can be reviewed, edited, and selected for export into the compatible treatment planning system. AI Segmentation is not intended to provide clinical decisions, medical advice, or evaluations of radiation plans or treatment procedures.
Here's an analysis of the acceptance criteria and study detailed in the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The text doesn't provide a direct, explicit table of acceptance criteria with corresponding performance metrics for all AI models. Instead, it describes a general approach for evaluating performance, focusing on the DICE similarity index for automated contouring and a qualitative expert assessment.
Acceptance Criterion (Implicit) | Reported Device Performance |
---|---|
Automated Contour Quality (Quantitative) | Evaluated using the DICE similarity index. Aggregated DICE scores were compared to literature values or against the performance of the prior model (for updated algorithms). Specific numerical scores are not provided in this document. |
Automated Contour Quality (Qualitative Expert Assessment) | A qualitative scoring system was used to measure the acceptability of auto-generated contours. The target was 80% of expert scores designating the contours as "acceptable with minor or no adjustments" (see the illustrative score-aggregation sketch after this table). The document states that the AI models in the subject device demonstrated equivalent performance to the predicate. |
Software Verification and Validation (Safety and Conformance) | Conducted and documentation provided as recommended by FDA guidance. The software was considered a "major" level of concern. Overall test results demonstrated conformance to applicable requirements and specifications. |
Conformance to Standards | The subject device conforms, in whole or in part, to IEC 62304, IEC 62366-1, IEC 62083, and IEC 82304-1. |
Resolution of Discrepancy Reports (DRs) | There were no remaining DRs classified as Safety or Customer Intolerable. |
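To make the 80% qualitative target concrete (illustrative numbers and a hypothetical four-point rubric, not Varian's actual scoring scheme or data), aggregating expert scores against such a threshold is straightforward:

```python
# Hypothetical expert scores per structure on an assumed 1-4 rubric:
# 1 = unusable, 2 = major edits needed, 3 = minor edits needed, 4 = acceptable as-is.
scores = {
    "Parotid_L":  [4, 3, 4],
    "Parotid_R":  [3, 3, 4],
    "SpinalCord": [4, 4, 4],
    "Mandible":   [2, 3, 3],
}

ACCEPTABLE = {3, 4}   # "acceptable with minor or no adjustments"
TARGET = 0.80         # target fraction stated in the summary

all_scores = [s for per_structure in scores.values() for s in per_structure]
fraction = sum(s in ACCEPTABLE for s in all_scores) / len(all_scores)
print(f"{fraction:.0%} of expert scores acceptable "
      f"({'meets' if fraction >= TARGET else 'below'} the 80% target)")
```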
2. Sample Size Used for the Test Set and Data Provenance
The document does not explicitly state the sample size (number of patients or scans) used for the test set. It mentions "non-clinical performance tests for automated contouring AI models" but lacks the specific number of cases.
- Data Provenance: The document does not specify the country of origin of the data. It indicates that the study was a non-clinical performance evaluation, implying retrospective data was likely used, but this is not explicitly stated.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
The document states: "Clinical experts also evaluated the performance of these AI models during validation testing." However, it does not specify:
- The exact number of experts used.
- The specific qualifications of these experts (e.g., "radiologist with 10 years of experience"). It generally refers to them as "qualified, expert radiation therapy treatment planners" and "qualified physicians" in the Indications for Use, which implies relevant expertise.
4. Adjudication Method for the Test Set
The document mentions that "Each AI model was assessed using the DICE similarity index as a comparative measure of the auto-generated contours against ground truth contours for a given structure." and that "Clinical experts also evaluated the performance of these AI models during validation testing."
However, it does not explicitly detail the adjudication method used for establishing the ground truth or resolving expert discrepancies (e.g., 2+1, 3+1). The primary comparison seems to be against "ground truth contours" rather than against potentially varying expert opinions that would necessitate an adjudication method for the ground truth itself. The qualitative expert assessment seems to be a separate evaluation of the AI output against the target of "acceptable with minor or no adjustments," rather than a process to establish the ground truth.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No MRMC study was done. The document explicitly states: "No animal studies or clinical tests have been included in this pre-market submission." and "The predicate device was cleared based only on non-clinical testing, and no animal or clinical studies were performed for the subject device."
- Therefore, there is no reported effect size of how much human readers improve with AI vs. without AI assistance. The device is intended to be reviewed and edited by human experts, but a comparative study on this effect was not performed for this submission.
6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study
- Yes, a standalone performance study was done. The performance evaluation focused on the AI models' ability to generate contours, as measured by the DICE similarity index against ground truth and qualitative expert assessment of the auto-generated contours. This indicates a standalone assessment of the algorithm's output before human editing.
7. Type of Ground Truth Used
The ground truth used was expert consensus / expert-generated contours. The text states:
- "Each AI model was assessed using the DICE similarity index as a comparative measure of the auto-generated contours against ground truth contours for a given structure."
- The Indications for Use also mention that "Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners" and "it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure," which implies that the ground truth would be established by such experts.
8. Sample Size for the Training Set
The document does not specify the sample size used for the training set of the AI models.
9. How the Ground Truth for the Training Set Was Established
The document does not specify how the ground truth for the training set was established. It describes the evaluation of the AI models but not their development. However, given it's an AI segmentation tool for radiation therapy, it's highly probable that the training data ground truth was also established through expert contouring.
(145 days)
AI Segmentation
AI Segmentation uses CT images to segment patient anatomy for use in radiation therapy treatment planning. AI Segmentation utilizes a pre-defined set of organ structures in the following regions: head and neck, thorax, pelvis, abdomen. Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners. Results of AI Segmentation are utilized in the Eclipse Treatment Planning System where it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure.
AI segmentation is a web-based application, running in the cloud, that provides a combined deep learning and classical-based approach for automated segmentation of organs at risk, along with tools for structure visualization. This software medical device product is used by trained medical professionals and consists of a web application user interface where the results from the automated segmentation can be reviewed and selected for export into the compatible treatment planning system. AI Segmentation is not intended to provide clinical decisions, medical advice, or evaluations of radiation plans or treatment procedures.
The provided text describes that the AI Segmentation device does not include clinical data in its premarket submission. The document explicitly states: "No animal studies or clinical tests have been included in this pre-market submission." Therefore, it is not possible to provide acceptance criteria or a study proving the device meets the acceptance criteria using the requested information (e.g., sample size for the test set, number of experts, adjudication method, MRMC study, standalone performance, type of ground truth for test and training sets, and training set size).
Instead, the submission for AI Segmentation focused on non-clinical data, specifically software verification and validation testing, and conformance to relevant standards.
Here is what can be extracted from the document regarding the non-clinical evaluation:
1. Table of Acceptance Criteria and Reported Device Performance:
Since no clinical study data is available, a table of acceptance criteria and reported device performance in terms of clinical accuracy (e.g., Dice score, sensitivity, specificity) cannot be provided. The performance data presented is focused on software quality and adherence to regulatory standards.
Acceptance Criterion (Non-Clinical) | Reported Device Performance |
---|---|
Conformance to applicable software requirements and specifications | "Test results demonstrate conformance to applicable requirements and specifications." (Page 5) |
Software level of concern assessment | Assessed as "major" level of concern. (Page 5) |
Conformance to IEC 62304 Edition 1.1 2015-06 (Medical device software - Software life cycle processes) | Conforms in whole or in part. (Page 5) |
Conformance to IEC 62366-1 Edition 1.0 2015-02 (Application of usability engineering to medical devices) | Conforms in whole or in part. (Page 6) |
Conformance to IEC 62083 Edition 2.0 2009-09 (Requirements for the safety of radiotherapy treatment planning systems) | Conforms in whole or in part. (Page 6) |
Conformance to IEC 82304-1 Edition 1.0 2016-10 (Health software Part 1: General requirements for product safety) | Conforms in whole or in part. (Page 6) |
Absence of Safety or Customer Intolerable Discrepancy Reports (DRs) | "There were no remaining discrepancy reports (DRs) which could be classified as Safety or Customer Intolerable." (Page 6) |
2. Sample size used for the test set and the data provenance: Not applicable, as no clinical test set data from patients was submitted. The evaluation was based on software testing.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts: Not applicable, as no clinical test set requiring expert ground truth was submitted.
4. Adjudication method (e.g., 2+1, 3+1, none) for the test set: Not applicable, as no clinical test set requiring adjudication was submitted.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance: No MRMC study was done; the submission explicitly states that no clinical tests were included.
6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done: The document states, "Segmentation results are subject to review and editing by qualified, expert radiation therapy treatment planners. Results of AI Segmentation are utilized in the Eclipse Treatment Planning System where it is the responsibility of a qualified physician to further review, edit as needed, and approve each structure." This indicates the device is intended for human-in-the-loop use. However, no clinical performance data (standalone or otherwise) was provided. The software verification and validation would have tested the algorithm's output without human intervention, but these were functional tests, not clinical performance studies.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.): Not applicable for clinical ground truth, as no clinical studies were submitted. For software verification, ground truth would be against predetermined functional requirements and expected outputs established during software development.
8. The sample size for the training set: Not provided in the document.
9. How the ground truth for the training set was established: Not provided in the document.