510(k) Data Aggregation
ART-Plan (128 days)
ART-Plan's indicated target population is cancer patients for whom radiotherapy treatment has been prescribed; within this population, the device may be used for any patient for whom relevant modality imaging data is available.
ART-Plan is not intended for patients less than 18 years of age.
The indicated users are trained medical professionals including, but not limited to, radiotherapists, radiation oncologists, medical physicists, dosimetrists and medical professionals involved in the radiation therapy process.
The indicated use environments are, but not limited to, hospitals, clinics and any health facility involved in radiation therapy.
The ART-Plan application consists of three key modules: SmartFuse, Annotate and AdaptBox, allowing the user to display and visualize 3D multi-modal medical image data. The user may process, render, review, store, display and distribute DICOM 3.0 compliant datasets within the system and/or across computer networks.
Compared to the primary predicate (Ethos Treatment v2.1; Ethos Treatment Planning v1.1), the following feature has been added to ART-Plan v2.1.0:
- Generation of synthetic CT from MR images. This does not represent an additional claim, as the technological characteristics are the same and it does not raise different questions of safety and effectiveness. This feature is also already covered by the reference device and by the previous version of the device, ART-Plan v1.10.1.
The ART-Plan technical functionalities claimed by TheraPanacea are the following:
- Proposing automatic solutions to the user, such as automatic delineation and automatic multimodal image fusion, toward improving standardization of processes and performance and reducing tedious, time-consuming user involvement.
- Offering the user a set of tools for semi-automatic delineation and semi-automatic registration: manually modifying/editing automatically generated structures, adding/removing new/undesired structures, or imposing user-provided correspondence constraints on the fusion of multimodal images.
- Presenting to the user a set of visualization methods for the delineated structures and registration fusion maps.
- Saving the delineated structures / fusion results for use in the dosimetry process.
- Enabling rigid and deformable registration of patient image sets to combine information contained in the same or different modalities.
- Allowing the users to generate, visualize, evaluate and modify pseudo-CT from MRI and CBCT images.
- Allowing the users to generate, visualize and analyze dose on images of CT modality (only within the AdaptBox workflow).
- Presenting to the user metrics to determine whether replanning is needed.
The provided document describes the acceptance criteria and the study that proves the ART-Plan device meets these criteria across its various modules (Autosegmentation, SmartFuse, AdaptBox, Synthetic-CT generation, and Dose Engine).
Here's a breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
Autosegmentation Tool
| Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Quantitative (DSC) | a) DSC (mean) ≥ 0.8 (AAPM criterion), OR b) DSC (mean) ≥ 0.54 (inter-expert variability), OR c) DSC (mean) ≥ mean(inter-expert DSC) + 5% | Duodenum: DICE diff inter-expert = 1.32% (Passed); Large bowel: DICE diff inter-expert = 1.19% (Passed); Small bowel: DICE diff inter-expert = 2.44% (Passed) |
| Qualitative (A+B%) | A+B % ≥ 85% (A: acceptable without modification; B: acceptable with minor modifications/corrections; C: requires major modifications) | Right lacrimal gland: A+B = 100% (Passed); Left lacrimal gland: A+B = 100% (Passed); Cervical lymph nodes VIA: A+B = 97% (Passed); Cervical lymph nodes VIB: A+B = 100% (Passed); Pharyngeal constrictor muscle: A+B = 100% (Passed); Anal canal: A+B = 98.68% (Passed); Bladder: A+B = 93.42% (Passed); Left femoral head: A+B = 100% (Passed); Right femoral head: A+B = 100% (Passed); Penile bulb: A+B = 96.05% (Passed); Prostate: A+B = 92.10% (Passed); Rectum: A+B = 100% (Passed); Seminal vesicle: A+B = 94.59% (Passed); Sigmoid: A+B = 98.68% (Passed) |
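For reference, the DSC in these criteria is the standard Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|). Below is a minimal numpy sketch of how the quantitative criterion could be checked; the mask arrays, the `inter_expert_mean_dsc` input, and the literal reading of criterion c) are illustrative assumptions, not the submission's actual evaluation code.

```python
import numpy as np

def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks:
    DSC = 2 * |A and B| / (|A| + |B|)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def passes_dsc_criteria(mean_dsc: float, inter_expert_mean_dsc: float) -> bool:
    """One literal reading of the table's a)/b)/c) alternatives (assumed)."""
    return (mean_dsc >= 0.8                               # a) AAPM criterion
            or mean_dsc >= 0.54                           # b) inter-expert variability floor
            or mean_dsc >= inter_expert_mean_dsc + 0.05)  # c) 5% above inter-expert mean
```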
SmartFuse Module (Image Registration)
| Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Quantitative (DSC) | a) DSC (mean) ≥ 0.81 (AAPM criterion), OR b) DSC (mean) ≥ 0.65 (benchmark device) | No specific DSC performance values are directly listed for SmartFuse, but the qualitative evaluations imply successful registration leading to acceptable contours. |
| Qualitative (A+B%) | Propagated contours: A+B% ≥ 85% for deformable, A+B% ≥ 50% for rigid. Overall registration output: A+B% ≥ 85% for deformable, A+B% ≥ 50% for rigid. | tCBCT - sCT (overall registration output): Rigid A+B% = 95.56% (Passed); Deformable A+B% = 97.78% (Passed). tsynthetic-CT - sCT (propagated contours): Deformable A+B% = 94.06% (Passed). tCT - sSCT (overall registration output): Rigid A+B% = 70.37% (Passed) |
| Geometric | Jacobian determinant must be positive; Target Registration Error (TRE) < 2 mm (POPI database) | Not explicitly detailed in the performance summary table, but the document states "The results show that both types of fusion algorithms (Rigid & Deformable) in SmartFuse pass the performed tests, and provide valid results for further clinical use in radiotherapy." |
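The geometric criteria lend themselves to direct computation. The sketch below shows one way to check Jacobian-determinant positivity of a dense displacement field (i.e., no folding) and to compute landmark TRE; the (Z, Y, X, 3) field layout, the spacing convention, and the landmark arrays are assumptions, and this is not SmartFuse's internal implementation.

```python
import numpy as np

def jacobian_determinant(dvf: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> np.ndarray:
    """Voxel-wise Jacobian determinant of a dense displacement field `dvf`
    of shape (Z, Y, X, 3), displacements in mm. For the transform
    x -> x + u(x), the Jacobian is J = I + grad(u)."""
    # grads[i][j] = d u_i / d x_j, each of shape (Z, Y, X)
    grads = [np.gradient(dvf[..., i], *spacing, axis=(0, 1, 2)) for i in range(3)]
    J = np.stack([np.stack(g, axis=-1) for g in grads], axis=-2)  # (Z, Y, X, 3, 3)
    J = J + np.eye(3)
    return np.linalg.det(J)

def target_registration_error(fixed_pts: np.ndarray, mapped_pts: np.ndarray) -> np.ndarray:
    """TRE: Euclidean distance (mm) between fixed-image landmarks and the
    corresponding moving-image landmarks mapped through the registration."""
    return np.linalg.norm(fixed_pts - mapped_pts, axis=1)

# Acceptance checks mirroring the table (whether the 2 mm limit applies to
# the mean or to each landmark is not stated; the mean is assumed here):
# ok = (jacobian_determinant(dvf, spacing) > 0).all()
# ok &= target_registration_error(fixed, mapped).mean() < 2.0
```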
Synthetic-CT Generation (from MR images)
| Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Gamma Passing Criteria | a) Median 2%/2mm gamma passing rate ≥ 95%; b) Median 3%/3mm gamma passing rate ≥ 99.0% | Not explicitly listed in the performance table, but the document implies meeting criteria based on overall "Passed" status for AdaptBox functionality. |
| Mean Dose Deviation | c) Mean dose deviation (synthetic-CT compared to standard CT) ≤ 2% in ≥ 88% of patients | Not explicitly listed in the performance table, but the document implies meeting criteria based on overall "Passed" status for AdaptBox functionality. |
Synthetic-CT Generation (from CBCT images)
| Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Gamma Passing Criteria | a) Median 2%/2mm gamma passing rate ≥ 92%; b) Median 3%/3mm gamma passing rate ≥ 93.57% | Gamma index: 2%/2mm = 98.85% (Passed); 3%/3mm = 99.43% (Passed) |
| Mean Dose Deviation | c) Mean dose deviation (synthetic-CT compared to standard CT) ≤ 2% in ≥ 76.7% of patients | DVH parameters (PTV): < 0.199% in 100% of the cases (Passed) |
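The gamma criteria in the two synthetic-CT tables combine a dose-difference tolerance (e.g., 2%) with a distance-to-agreement (DTA) tolerance (e.g., 2 mm). Below is a brute-force, grid-sampled sketch of a global gamma analysis for illustration only; real evaluations typically use validated tools with sub-voxel interpolation, and the global normalization and low-dose cutoff here are assumptions, not the submission's stated method.

```python
import numpy as np

def gamma_passing_rate(ref: np.ndarray, eval_: np.ndarray, spacing,
                       dose_pct: float = 2.0, dta_mm: float = 2.0,
                       cutoff_pct: float = 10.0) -> float:
    """Brute-force global gamma analysis on co-registered 3D dose grids.
    Returns the percentage of voxels (above the low-dose cutoff) with
    gamma <= 1 for the given dose-difference / DTA tolerances."""
    tol_dose = dose_pct / 100.0 * ref.max()        # global normalization (assumed)
    reach = [int(np.ceil(dta_mm / s)) for s in spacing]
    mask = ref > cutoff_pct / 100.0 * ref.max()    # ignore low-dose region (assumed)
    gamma = np.full(ref.shape, np.inf)
    for z, y, x in np.argwhere(mask):
        best = np.inf
        for dz in range(-reach[0], reach[0] + 1):
            for dy in range(-reach[1], reach[1] + 1):
                for dx in range(-reach[2], reach[2] + 1):
                    zz, yy, xx = z + dz, y + dy, x + dx
                    if not (0 <= zz < ref.shape[0] and
                            0 <= yy < ref.shape[1] and
                            0 <= xx < ref.shape[2]):
                        continue
                    dist2 = ((dz * spacing[0]) ** 2 + (dy * spacing[1]) ** 2 +
                             (dx * spacing[2]) ** 2)
                    if dist2 > dta_mm ** 2:
                        continue
                    ddose = eval_[zz, yy, xx] - ref[z, y, x]
                    best = min(best, dist2 / dta_mm ** 2 + (ddose / tol_dose) ** 2)
        gamma[z, y, x] = np.sqrt(best)
    return 100.0 * float((gamma[mask] <= 1.0).mean())
```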
Dose Engine Function (within AdaptBox)
| Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Dose Deviation | a) DVH or median dose deviation ≤ 24.4% for lung tissue or ≤ 4.4% for other organs | DVH parameters (PTV): < 1.29% (Passed); DVH parameters (OARs): < 3.9% (Passed) |
| Gamma Passing Criteria | b) Median 2%/2mm gamma passing rate ≥ 86.3%; c) Median 3%/3mm gamma passing rate ≥ 91.75% | Gamma index: 2%/2mm = 97.22% (Passed); 3%/3mm = 99.50% (Passed) |
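The DVH-based criterion compares dose-volume parameters between the AdaptBox dose engine and a reference calculation. The sketch below computes relative deviations for a few common DVH parameters inside a structure; the parameter set and the normalization to prescription dose are assumptions, since the document does not state the exact convention used.

```python
import numpy as np

def dvh_parameter_deviation(ref_dose: np.ndarray, eval_dose: np.ndarray,
                            struct_mask: np.ndarray, prescription: float) -> dict:
    """Relative deviation (%) of common DVH parameters inside a structure,
    expressed as a percentage of the prescription dose (one common
    convention; assumed here, not stated in the document)."""
    r, e = ref_dose[struct_mask], eval_dose[struct_mask]
    params = {
        "Dmean": (r.mean(), e.mean()),
        "D95": (np.percentile(r, 5), np.percentile(e, 5)),  # dose covering 95% of volume
        "Dmax": (r.max(), e.max()),
    }
    return {name: 100.0 * abs(ev - rv) / prescription
            for name, (rv, ev) in params.items()}

# Example check against the table's 4.4% limit for non-lung organs
# (hypothetical inputs):
# devs = dvh_parameter_deviation(ref, eval_, oar_mask, prescription=60.0)
# assert all(d <= 4.4 for d in devs.values())
```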
2. Sample sizes used for the test set and data provenance
The document provides "Min sample size for evaluation method" for various tests, and then states the actual "Sample size" used, which generally exceeds the minimum.
- Autosegmentation Tool:
- Quantitative (DSC) tests (duodenum, large bowel, small bowel): Sample size = 25 for each. Data provenance: retrospective clinical data (implied from "real-world retrospective data which were initially used for treatment of cancer patients"). Country of origin is not explicitly stated, but is implied to be diverse based on training data representing the "market share of the different vendors in EU & USA".
- Qualitative (A+B%) tests (various organs): Sample sizes range from 20 to 38 depending on the organ. Data provenance: retrospective clinical data. Country of origin: not explicitly stated, but includes both US and non-US data for performance evaluation.
- SmartFuse Module:
- tCBCT - sCT: Sample size = 45. Data Provenance: Retrospective, clinical data.
- tsynthetic-CT - sCT: Sample size = 30. Data Provenance: Retrospective, clinical data.
- tCT - sSCT: Sample size = 27. Data Provenance: Retrospective, clinical data.
- Synthetic-CT Generation (from CBCT):
- Sample size = 20. Data Provenance: Retrospective, clinical data.
- Dose Engine Function:
- Sample size = 272 total (Brain: 42, H&N: 70, Chest: 44, Breast: 26, Pelvis: 90). Data Provenance: Retrospective, clinical data.
All data for testing derived from "real-world retrospective data which were initially used for treatment of cancer patients." The document emphasizes that the data was pseudo-anonymized and collected from various centers, with efforts to ensure representation of "market share of the different vendors in EU & USA" and equivalent performance between non-US and US populations, suggesting a multi-national provenance for both training and testing.
3. Number of experts used to establish the ground truth for the test set and their qualifications
The document states that ground-truth contours were produced by "different delineators (clinical experts)", that "intervariability" among them was assessed, and that "ground truth contours provided by the centers" were "validated by a second expert of the center." It also mentions "qualitative evaluation and validation of the contours."
- Number of Experts: Not a fixed number; the description implies multiple "clinical experts" for initial contouring and a "second expert" for validation. For the qualitative evaluations in the performance tests, a "Qualitative evaluation by experts" column is used, implying that multiple experts participate in the A/B/C scoring. For instance, the usability testers are described as follows: "European medical physicists who have participated in the evaluation have at least an equivalent expertise level compared to a junior US medical physicist (MP), and responsibilities in the radiotherapy clinical workflow are equivalent."
- Qualifications: "Clinical experts," "medical physicists," "radiation oncologists," and "dosimetrists." The specific number for each test isn't enumerated, but the description implies a panel of qualified professionals for qualitative assessment and ground truth establishment.
4. Adjudication method for the test set
The ground truth establishment involved a mix of:
- Contouring guidelines confirmed with data-providing centers.
- Data created by "different delineators (clinical experts)."
- Assessment of inter-variability among experts.
- Ground truth contours provided by centers and "validated by a second expert of the center."
- Qualitative evaluation and validation of contours by experts (A, B, C scale). This qualitative assessment serves as a form of adjudication for the performance of the device against expert opinion.
This suggests a consensus-based approach with a "2+1" or similar structure where two initial delineations (or one and a validation by a second expert) contribute to building the ground truth, which is then further evaluated qualitatively. No explicit numerical "2+1" or "3+1" is given, but the description aligns with a multi-reader, consensus-driven process.
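To make the described process concrete, here is a minimal sketch of one plausible consensus scheme (majority vote across delineators) together with a mean pairwise inter-expert DSC of the kind the acceptance criteria reference. The document does not state that these exact formulas were used; they are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def majority_vote_consensus(expert_masks: list) -> np.ndarray:
    """Consensus mask: a voxel is foreground when more than half of the
    expert delineations include it (one plausible adjudication rule)."""
    stack = np.stack([m.astype(np.uint8) for m in expert_masks])
    return stack.sum(axis=0) > (len(expert_masks) / 2)

def inter_expert_mean_dsc(expert_masks: list) -> float:
    """Mean pairwise DSC across experts, i.e., the kind of inter-expert
    variability figure the DSC acceptance criteria compare against."""
    def dsc(a, b):
        a, b = a.astype(bool), b.astype(bool)
        return 2.0 * np.logical_and(a, b).sum() / max(a.sum() + b.sum(), 1)
    return float(np.mean([dsc(a, b) for a, b in combinations(expert_masks, 2)]))
```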
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and its effect size
The document does not explicitly present a formal MRMC comparative effectiveness study demonstrating how much human readers improve with AI assistance versus without it. The studies focus on the performance of the device itself (standalone) and on how its output (e.g., auto-segmentations, registrations) is perceived by human experts.
The qualitative evaluations done by experts (A+B% scores) are a form of assessment of the AI's output by multiple readers/experts on multiple cases, but they do not directly measure improvement of human reader performance with AI assistance compared to performance without AI assistance. The qualitative evaluations are on the AI's output, not on the human reader's workflow with AI.
Therefore, no effect size for human reader improvement with AI assistance is provided.
6. If a standalone study (i.e., algorithm-only performance without a human in the loop) was done
Yes, standalone performance was extensively evaluated. The entire set of acceptance criteria and performance study results (DSC scores, gamma passing rates, mean dose deviations, and qualitative evaluation of the AI-generated contours/registrations) directly pertain to the algorithm's performance without a human in the loop during the generation process. The human element comes into play for evaluating the algorithm's output, not for assisting its real-time operation in these specific performance metrics.
For example, the auto-segmentation tests measure the quality of contours generated solely by the AI model. The synthetic-CT generation and dose engine also measure the performance of the algorithm itself.
7. The type of ground truth used
The ground truth used is a combination of:
- Expert Consensus/Delineation: For auto-segmentation, ground truth contours were established by "clinical experts," often validated by "a second expert of the center," and confirmed with "contouring guidelines." This aligns with expert consensus.
- Imaging Metrics/Physical Measurement Comparisons: For synthetic-CT generation, ground truth involved comparison to "real planning CTs for the same patients," coupled with dose calculations and gamma passing criteria, which are objective quantitative metrics. For the dose engine, measurements on a Linac were compared with the dose engine results.
8. The sample size for the training set
The training set sizes are provided separately for different functionalities:
- Auto-segmentation tool: 246,226 samples (corresponding to 8,950 total patients).
- Synthetic-CT from MR images: 6,195 samples.
- Synthetic-CT from CBCT images: 1,467 samples.
The document clarifies that "one patient can be associated with more images (e.g. CT, MR) and that each image (anatomy) has the delineation of several structures (OARs and lymph nodes) which increases the number of samples used for training and validation."
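A toy example (with hypothetical record fields) of why the sample count far exceeds the patient count under this counting rule: each (image, structure) pair contributes one training sample.

```python
# Hypothetical records illustrating the counting rule described above.
records = [
    {"patient": "P001", "image": "CT_1", "structures": ["prostate", "bladder", "rectum"]},
    {"patient": "P001", "image": "MR_1", "structures": ["prostate", "bladder"]},
    {"patient": "P002", "image": "CT_1", "structures": ["duodenum", "small_bowel"]},
]
n_patients = len({r["patient"] for r in records})       # 2 patients
n_samples = sum(len(r["structures"]) for r in records)  # 7 samples
print(n_patients, n_samples)
```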
9. How the ground truth for the training set was established
The ground truth for the training set was established through a rigorous process involving:
- Clinical Expert Delineation: Contours for auto-segmentation were produced by "different delineators (clinical experts)."
- Guideline Adherence: The contouring guidelines followed were confirmed with the centers providing the data, ensuring consistency and adherence to established medical practices.
- Expert Validation: The ground truth contours were provided by the centers and "validated by a second expert of the center."
- Qualitative Evaluation: There was also a qualitative evaluation and validation of the contours to ensure clinical acceptability.
- Retrospective Real-World Data: The data came from "real-world retrospective data which were initially used for treatment of cancer patients," ensuring clinical relevance.
- For Synthetic-CTs: "Clinical evaluation as part of the 'truthing-process' guidelines followed to produce and validate the synthetic-CTs were extracted from the literature and confirmed with the centers which provided the data and helped in the performance evaluation." This involved "imaging metrics based comparison between synthetic-CTs and real planning CTs for the same patients."
Smart Segmentation - Knowledge Based Contouring (144 days)
Smart Segmentation Knowledge Based Contouring provides a combined atlas- and model-based approach for automated and manual segmentation of structures, including target volumes and organs at risk, to support the radiation therapy treatment planning process.
Smart Segmentation - Knowledge Based Contouring is a software-only product that provides a combined atlas- and model-based approach to automated segmentation of structures, together with tools for manual contouring or editing of structures. A library of already-contoured expert cases is provided, searchable by anatomy, staging, or free text. Users also have the ability to add or modify expert cases to suit their clinical needs. Expert cases are registered to the target image and selected structures propagated. Smart Segmentation Knowledge Based Contouring supports inter- and intra-user consistency in contouring. This product also provides an anatomy atlas which gives examples of delineated organs for the whole upper body, as well as anatomy images and functional descriptions for selectable structures.
The provided 510(k) summary for Varian's Smart Segmentation Knowledge Based Contouring (K133227) is primarily focused on demonstrating substantial equivalence to a predicate device (K112778 and K102011) due to changes in existing features and the addition of new ones (support for 4D-CT data and a new algorithm for mandible segmentation). The document does not contain a detailed study demonstrating specific acceptance criteria with reported performance metrics in the format requested.
The document states "Verification testing was performed to demonstrate that the performance and functionality of the new and existing features met the design input requirements" and "Validation testing was performed on a production equivalent device, under clinically representative conditions by qualified personnel." However, the specific acceptance criteria, performance results, and details of these tests (like sample sizes, ground truth establishment, expert qualifications, etc.) are not included in the provided text.
Therefore, for most of the requested information, a direct answer cannot be extracted from the given input.
Here's a breakdown of what can and cannot be answered based on the provided text:
1. Table of acceptance criteria and the reported device performance
- Cannot be provided. The document states that "performance and functionality of the new and existing features met the design input requirements" and "Results from Verification and Validation testing demonstrate that the product met defined user needs and defined design input requirements." However, specific numerical acceptance criteria (e.g., Dice similarity coefficient > 0.8) and the corresponding reported device performance values are not detailed.
2. Sample sizes used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective)
- Cannot be provided. The document mentions "Validation testing was performed... under clinically representative conditions," but it does not specify the sample size of the test set, the country of origin of the data, or whether it was retrospective or prospective.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)
- Cannot be provided. The document refers to "expert cases" in the context of the device's functionality (a library of already contoured expert cases), but it does not detail the number or qualifications of experts used to establish ground truth for validation testing of the device itself.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
- Cannot be provided. The document does not describe any adjudication methods used for establishing ground truth or evaluating the test set.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
- Cannot be provided. The document does not mention an MRMC comparative effectiveness study or the effect size of AI assistance on human readers. The device is described as "supporting inter and intra user consistency in contouring," but no study is detailed to quantify this improvement.
6. If a standalone study (i.e., algorithm-only performance without a human in the loop) was done
- Implicitly yes, but no details are provided. The device is described as having "Automated Structure Delineation" and a "new algorithm for segmentation of the mandible." The "Verification testing" and "Validation testing" would logically evaluate the performance of these automated functions, implying a standalone evaluation. However, no specific performance metrics or study details for this standalone performance are given.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
- Implicitly expert contoured data, but no specific details for validation. The device itself uses a "library of already contoured expert cases." It is reasonable to infer that the ground truth for validation testing would also be based on expert contoured data, but the document does not explicitly state this for the validation set, nor does it specify if this was expert consensus, single expert, or another method.
8. The sample size for the training set
- Cannot be provided. The document mentions a "library of already contoured expert cases" which is central to a "knowledge based" system. This library would constitute the training data (or knowledge base). However, the sample size of this library or training set is not specified.
9. How the ground truth for the training set was established
- Implicitly by experts, but no specific details. The device uses a "library of already contoured expert cases." This implies the ground truth for these training cases was established by "experts." However, details on how these experts established this ground truth (e.g., number of experts, consensus process, qualifications) are not provided.