Search Results
Found 6 results
510(k) Data Aggregation
(160 days)
ART-Plan+ (v.3.0.0)
ART-Plan+'s indicated target population is cancer patients for whom radiotherapy treatment has been prescribed. Within this population, ART-Plan+ may be used for any patient for whom relevant modality imaging data is available.
ART-Plan+ includes several modules:
- SmartPlan, which allows automatic generation of radiotherapy treatment plans that users import into their own Treatment Planning System (TPS) for dose calculation, review and approval. This module is available only for supported prostate prescriptions.
- Annotate, which allows automatic generation of contours for organs at risk, lymph nodes and tumors, based on medical practices, on medical images such as CT and MR images.
ART-Plan+ is not intended to be used for patients less than 18 years of age.
The indicated users are trained medical professionals including, but not limited to, radiotherapists, radiation oncologists, medical physicists, dosimetrists and medical professionals involved in the radiation therapy process.
The indicated use environments include, but are not limited to, hospitals, clinics and any health facility offering radiation therapy.
ART-Plan+ is a software platform that allows users to contour regions of interest on 3D images and to generate an automatic treatment plan. It includes several modules:
- Home: tasks
- Annotate and TumorBox: contouring of regions of interest
- SmartPlan: creation of an automatic treatment plan based on a planning CT and an RTSS
- Administration and settings: preferences management, user account management, etc.
- Institute Management: institute information, including licenses, list of users, etc.
- About: information about the software and its use, as well as contact details.
Annotate, TumorBox and SmartPlan are partially based on a batch mode, which allows the user to launch autocontouring and autoplanning operations without having to use the interface or the viewers. In this way, the software is fully integrated into the radiotherapy workflow and offers the user maximum flexibility.
ART-Plan+ offers deep-learning based automatic segmentation of OARs and LNs for the following localizations:
- Head and neck (on CT images)
- Thorax/breast (on CT images)
- Abdomen (on CT images; on MR images for male patients)
- Pelvis male (on CT and MR images)
- Pelvis female (on CT images)
- Brain (on CT images and MR images)
ART-Plan+ offers deep-learning based automatic segmentation of targets for the following localizations:
- Brain (on MR images)
Based on the provided text, here's a detailed breakdown of the acceptance criteria and the study that proves the device meets them:
1. Table of Acceptance Criteria and the Reported Device Performance:
The document describes five distinct types of evaluations with their respective acceptance criteria. While the exact "reported device performance" (i.e., the specific numerical results obtained for each metric) is not explicitly stated, the document uniformly concludes, "All validation tests were carried out using datasets representative of the worldwide population receiving radiotherapy treatments. Finally, all tests passed their respective acceptance criteria, thus showing ART-Plan+ v3.0.0 clinical acceptability." This implies all reported device performances met or exceeded the criteria.
Study Type | Acceptance Criteria | Reported Device Performance (Implied) |
---|---|---|
Non-regression Testing of Autosegmentation of OARs | Mean DSC must not regress between the current and last validated version of Annotate beyond a maximum tolerance margin of -5% relative error. | Met |
Qualitative Evaluation of Autosegmentation of OARs | Clinicians' qualitative evaluation of the auto-segmentation is acceptable for clinical use without modifications (A) or with minor modifications/corrections (B), with A+B % ≥ 85%. | Met |
Quantitative Evaluation of Autosegmentation of OARs | Mean DSC (Annotate) ≥ 0.8 | Met |
Inter-expert Variability Evaluation of Autosegmentation of OARs | Mean DSC (Annotate) ≥ Mean DSC (inter-expert), with a tolerance margin of -5% relative error. | Met |
Quantitative Evaluation of Autosegmentation of Brain Metastasis | Lesion-wise sensitivity ≥ 0.86 AND lesion-wise precision ≥ 0.70 AND lesion-wise DSC ≥ 0.78 AND patient-wise DSC ≥ 0.83 AND patient-wise false positives (FP) ≤ 2.1 | Met |
Quantitative Evaluation of Autosegmentation of Glioblastoma | Sensitivity ≥ 0.80 AND DSC ≥ 0.76 | Met |
Quantitative and Qualitative Evaluation of Automatic Treatment Plan Generation | Quantitative: effectiveness difference (%) in DVH achieved goals between manual and automatic plans ≤ 5% AND Qualitative: % of clinically acceptable automatic plans ≥ 93% after expert review. | Met |
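For intuition, the quantitative criteria in the table reduce to threshold checks on the Dice similarity coefficient, DSC(A, B) = 2|A∩B| / (|A| + |B|). Below is a minimal illustrative sketch in Python of the mean-DSC and non-regression checks, using toy masks and hypothetical values (not the submission's actual validation code):

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# Toy masks: auto-contour vs. expert contour.
auto = np.zeros((4, 4), bool); auto[1:3, 1:3] = True
expert = np.zeros((4, 4), bool); expert[1:3, 1:4] = True
assert abs(dice(auto, expert) - 0.8) < 1e-9  # 2*4 / (4 + 6) = 0.8

# Hypothetical per-organ mean DSCs, current vs. prior release.
dsc_current, dsc_previous = 0.870, 0.873

# Quantitative criterion: mean DSC >= 0.8.
assert dsc_current >= 0.8
# Non-regression criterion: no drop beyond -5% relative error.
assert (dsc_current - dsc_previous) / dsc_previous >= -0.05
```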
2. Sample Sizes Used for the Test Set and the Data Provenance:
- Non-regression Testing (Autosegmentation of OARs): Minimum sample size of 24 patients.
- Qualitative Evaluation (Autosegmentation of OARs): Minimum sample size of 18 patients.
- Quantitative Evaluation (Autosegmentation of OARs): Minimum sample size of 24 patients.
- Inter-expert Variability Evaluation (Autosegmentation of OARs): Minimum sample size of 13 patients.
- Quantitative Evaluation (Brain Metastasis, MR images): Minimum sample size of 51 patients.
- Quantitative Evaluation (Glioblastoma, MR images): Minimum sample size of 43 patients.
- Quantitative and Qualitative Evaluation (Automatic Treatment Plans): Minimum sample size of 20 patients.
Data Provenance: The document states, "All validation tests were carried out using datasets representative of the worldwide population receiving radiotherapy treatments." It does not specify the country of origin or whether the data was retrospective or prospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and the Qualifications of Those Experts:
The document refers to "medical experts" or "clinicians" for establishing ground truth and performing evaluations.
- For the non-regression testing of autosegmentation, "manual contours performed by medical experts" were used.
- For qualitative evaluation of autosegmentation, "medical experts" performed the qualitative evaluation.
- For inter-expert variability evaluation of autosegmentation, "two independent medical experts" were asked to contour the same images.
- For brain metastasis and glioblastoma segmentation, "contours provided by medical experts" were used for comparison.
- For the evaluation of automatic treatment plans, "medical experts" determined the clinical acceptability.
The specific number of experts beyond "two independent" for inter-expert variability is not consistently provided, nor are their exact qualifications (e.g., specific specialties like "radiation oncologist" or years of experience). However, the stated users of the device include "trained medical professionals including, but not limited to, radiotherapists, radiation oncologists, medical physicists, dosimetrists and medical professionals involved in the radiation therapy process," implying these are the types of professionals who would serve as experts.
4. Adjudication Method for the Test Set:
- For the inter-expert variability test, it involved comparing contours between two independent medical experts and with the software's contours. This implies a comparison rather than an explicit formal adjudication method (like 2+1 voting).
- For other segmentation evaluations, the ground truth was "manual contours performed by medical experts" or "contours provided by medical experts." It's not specified if these were consensus readings, or if an adjudication method was used if multiple experts contributed to a single ground truth contour for a case.
- For the automatic treatment plan qualitative evaluation, "expert review" is mentioned, but the number of reviewers or their adjudication process is not detailed.
5. Whether a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of Human Reader Improvement with AI vs. without AI Assistance:
The document describes studies that evaluate the standalone performance of the AI for segmentation and treatment planning, and how its performance compares to expert-generated contours/plans or inter-expert variability. It does not explicitly describe an MRMC comparative effectiveness study designed to measure the improvement of human readers with AI assistance versus without AI assistance. The focus is on the AI's performance relative to expert-defined ground truths or benchmarks.
6. Whether a Standalone (i.e., Algorithm-Only, Without Human-in-the-Loop) Performance Evaluation Was Done:
Yes, the studies are largely focused on standalone algorithm performance.
- The "Non-regression testing," "Quantitative evaluation," and "Inter-expert variability evaluation" of autosegmentation explicitly compare the software's generated contours (algorithm only) against manual contours or inter-expert contours.
- The "Quantitative evaluation of autosegmentation of Brain metastasis" and "Glioblastoma" assess the algorithm's performance (sensitivity, precision, DSC, FP) against expert-provided contours.
- For "Automatic Treatment Plan Generations," the quantitative evaluation compares the algorithm's plans to manual plans, and the qualitative evaluation assesses the acceptance of the automatic plans by experts.
7. The Type of Ground Truth Used:
The primary ground truth relied upon in these studies is:
- Expert Consensus/Manual Contours: This is repeatedly stated as "manual contours performed by medical experts" or "contours provided by medical experts."
- Inter-expert Variability: For one specific study, the variability between two independent experts was used as a benchmark for comparison.
- Manual Treatment Plans: For the treatment plan evaluation, manual plans served as a benchmark for quantitative comparison.
No mention of pathology or outcomes data as ground truth is provided.
8. The Sample Size for the Training Set:
The document does not specify the sample size for the training set. It only mentions the training of the algorithm (e.g., "retraining or algorithm improvement").
9. How the Ground Truth for the Training Set Was Established:
The document does not explicitly describe how the ground truth for the training set was established. It only states that the device uses "deep-learning based automatic segmentation," implying that it would have been trained on curated data with established ground truth, likely also generated by medical experts, but the specifics are not detailed in this excerpt.
(122 days)
ART-Plan (v.2.2.0)
ART-Plan's indicated target population is cancer patients for whom radiotherapy treatment has been prescribed. Within this population, ART-Plan may be used for any patient for whom relevant modality imaging data is available.
ART-Plan is not intended to be used for patients less than 18 years of age.
The indicated users are trained medical professionals including, but not limited to, radiotherapists, radiation oncologists, medical physicists, dosimetrists and medical professionals involved in the radiation therapy process.
The indicated use environments include, but are not limited to, hospitals, clinics and any health facility involved in radiation therapy.
ART-Plan is a software platform allowing users with an account to contour regions of interest on 3D images, perform multi-modal registration of images, and help in the decision on the need for replanning based on contours and doses on daily images. It includes several modules:
- Home,
- Annotate,
- SmartFuse,
- AdaptBox,
- Administration,
- About
ART-Plan asks the user to work by project. It is necessary to create a "project" entity in the Patient page by associating it with a reference volume (preferably the positioning CT or the positioning MR) in order to create the contours on this volume in the Annotate module, or to use this image as a reference to compare with CBCT images in the AdaptBox module. It is possible to create several projects for a given patient.
The Home module allows the user to search for a patient already present in the software's database or to import it from the structure's imaging servers or from another external source and to manage the different projects of this patient.
The Annotate module allows the user to contour regions of interest on the reference volume. It also allows generation of pseudo-CTs from MRI images. Users are able to visualize, evaluate and modify the HU values of the associated structures on the pseudo-CT, if needed. After validation, manual and automatic contours can be generated on the pseudo-CT images. Registration can also be performed using the pseudo-CT either as a target or source image. All results can be exported upon approval.
The SmartFuse module allows the user to fuse the primary volume of a project with secondary volumes.
The AdaptBox module helps the user decide whether replanning is necessary. For this purpose, the module allows the user to generate a pseudo-CT from a CBCT image, to auto-delineate regions of interest on the pseudo-CT, to compute the dose on both the planning CT and the pseudo-CT, and then to determine whether replanning is needed by comparing volume and dose metrics computed on both images and over the course of the treatment. Those metrics are defined by the user.
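As an illustration of the kind of user-defined metric comparison described above, a center might flag a case for replanning when a dose-volume metric on the daily pseudo-CT drifts beyond a chosen tolerance from the planning CT. The rule, metric names and values below are hypothetical, not the module's actual logic:

```python
# Hypothetical replanning trigger: user-defined tolerances on DVH metrics.
planned = {"PTV_D95_Gy": 59.4, "rectum_V60_pct": 18.0}   # from planning CT
daily   = {"PTV_D95_Gy": 57.9, "rectum_V60_pct": 22.5}   # from daily pseudo-CT
tolerance = {"PTV_D95_Gy": 1.0, "rectum_V60_pct": 3.0}   # user-defined

needs_replanning = any(
    abs(daily[m] - planned[m]) > tolerance[m] for m in planned
)
print(needs_replanning)  # True: both metrics exceed their tolerances
```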
The Administration module allows users with specific rights to manage the platform's usage parameters. Some more restricted rights are also accessible in the drop-down menu linked to the user account through the Settings menu.
The About module allows the user to obtain information about the software and its use, as well as to contact TheraPanacea.
Here's a summary of the acceptance criteria and study details for ART-Plan (v.2.2.0), based on the provided text:
Acceptance Criteria and Reported Device Performance
Note: The provided text lists multiple acceptance criteria for different aspects of the device (auto-segmentation, synthetic-CT generation, dose engine validation), and not all reported performance metrics are explicitly linked one-for-one to a single acceptance criterion in a consolidated table within the document. The table below attempts to synthesize the acceptance criteria and the general statement of "all tests passed their respective acceptance criteria" for the reported performance.
Test Type/Metric | Acceptance Criterion | Reported Device Performance |
---|---|---|
Auto-segmentation (Quantitative - DSC) | DSC (mean) ≥ 0.8 (AAPM) OR DSC (mean) ≥ 0.54 OR DSC (mean) ≥ mean (DSC inter-expert) + 5% | All organs included in the model passed at least one acceptance criterion. All tests passed respective acceptance criteria. |
Auto-segmentation (Quantitative - HD95) | HD95 (mean) ≤ 5.75 mm | All tests passed respective acceptance criteria. |
Auto-segmentation (Qualitative) | A+B % ≥ 85% (A: acceptable w/o modification, B: acceptable w/ minor modification) | All tests passed respective acceptance criteria. |
Auto-segmentation (Non-regression) | Mean DSC should not regress negatively beyond -5% relative error | All tests passed respective acceptance criteria. |
Auto-segmentation (US vs. nUS data) | Mean DSC (US) ≥ Mean DSC (nUS) AND/OR Mean HD95 (US) ≤ Mean HD95 (nUS) | All tests passed respective acceptance criteria. |
Contour Propagation (Qualitative) | A+B % ≥ 85% (deformable) / ≥ 50% (rigid) | All tests passed respective acceptance criteria. |
Synthetic-CT Dosimetric Validation | DVH parameters (PTV): 76.7%; Median Gamma Index 2%/2mm: ≥ 92%; Median Gamma Index 3%/3mm: ≥ 93.57% | All tests passed respective acceptance criteria. |
Synthetic-CT Geometric/Anatomical | Jacobian Determinant = 1 +/- 5% | All tests passed respective acceptance criteria. |
Dose Engine Validation | Relative differences on DVH parameters (PTV/OARs): ≤ 4.4% (Lungs ≤ 24.4%); Median Gamma Index 2%/2mm: ≥ 86.3%; Median Gamma Index 3%/3mm: ≥ 91.75% | All tests passed respective acceptance criteria. |
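Of the metrics above, HD95 (the 95th-percentile Hausdorff distance) is the boundary-distance counterpart to the overlap-based DSC. Below is a minimal sketch of one common formulation, using SciPy's Euclidean distance transform; it is an illustrative implementation with an assumed voxel spacing, not the code used in the submission:

```python
import numpy as np
from scipy import ndimage

def hd95(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """95th-percentile symmetric Hausdorff distance (mm) between binary masks."""
    def surface(mask: np.ndarray) -> np.ndarray:
        return mask & ~ndimage.binary_erosion(mask)  # boundary voxels

    sa, sb = surface(a.astype(bool)), surface(b.astype(bool))
    # Distance from each surface voxel of one mask to the nearest surface voxel of the other.
    d_a_to_b = ndimage.distance_transform_edt(~sb, sampling=spacing)[sa]
    d_b_to_a = ndimage.distance_transform_edt(~sa, sampling=spacing)[sb]
    return float(np.percentile(np.hstack([d_a_to_b, d_b_to_a]), 95))

# Acceptance check from the table: mean HD95 over the test set <= 5.75 mm.
```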
Study Details
- Sample size used for the test set and the data provenance:
- Auto-segmentation (Quantitative & Non-regression): Minimum sample size of 17 patients per anatomical region (where applicable).
- Auto-segmentation (Qualitative): Minimum sample size of 15 patients per anatomical region.
- Auto-segmentation (US patient data performance comparison): Minimum sample size of 17 patients.
- Contour Propagation: Minimum sample size of 15 patients.
- Synthetic-CT Dosimetric Validation: 19 patients per supported anatomy.
- Synthetic-CT Geometric and Anatomic Validation: 19 patients per supported anatomy.
- Dose Engine Validation: 45 patients per supported anatomy.
- Data Provenance: "datasets representative of the worldwide population receiving radiotherapy treatments." (No specific countries or retrospective/prospective information is given, beyond "US patient data" for one specific comparison.)
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- The term "medical experts" is used generally. No specific number of experts or detailed qualifications (e.g., years of experience, subspecialty certification) are provided for the ground truth establishment.
- Adjudication method for the test set:
- Not explicitly stated. The ground truth seems to be established by "manual contours performed by medical experts," implying a single expert creation or a consensus if multiple experts were involved, but the method for consensus (e.g., 2+1, 3+1) is not detailed.
- Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of human reader improvement with AI vs. without AI assistance:
- No MRMC comparative effectiveness study involving human readers improving with AI assistance is described. The studies focus on the performance of the AI algorithm (auto-segmentation, synthetic-CT generation, dose calculation) against expert-generated ground truth or established clinical benchmarks.
- Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done:
- Yes, the described studies are primarily standalone performance evaluations of the ART-Plan (v2.2.0) algorithms. This includes auto-segmentation, synthetic-CT generation, and dose engine validation, all assessed against a ground truth. While there are "qualitative evaluations" by clinicians, these assess the output of the auto-segmentation or contour propagation, not a measure of human performance with or without the device.
- The type of ground truth used:
- Expert Consensus: "manual contours performed by medical experts" is the primary ground truth for auto-segmentation and contour propagation assessments.
- Clinical Dosimetric Criteria: For dose engine validation and synthetic-CT dosimetric validation, comparisons are made against established clinical dosimetric criteria and dose distributions from the planning CT.
- Reference Imaging: For geometric/anatomic validation of synthetic-CT, comparisons seem to be made against the original CBCT using metrics like the Jacobian determinant.
- The sample size for the training set:
- Not specified in the provided text. The document focuses on the validation of the device (ART-Plan v2.2.0) and mentions "retraining or algorithm improvement" but does not give details about the training set size or composition.
- How the ground truth for the training set was established:
- Not specified in the provided text.
(128 days)
ART-Plan
ART-Plan's indicated target population is cancer patients for whom radiotherapy treatment has been prescribed. Within this population, ART-Plan may be used for any patient for whom relevant modality imaging data is available.
ART-Plan is not intended for patients less than 18 years of age.
The indicated users are trained medical professionals including, but not limited to, radiotherapists, radiation oncologists, medical physicists, dosimetrists and medical professionals involved in the radiation therapy process.
The indicated use environments include, but are not limited to, hospitals, clinics and any health facility involved in radiation therapy.
The ART-Plan application consists of three key modules: SmartFuse, Annotate and AdaptBox, allowing the user to display and visualize 3D multi-modal medical image data. The user may process, render, review, store, display and distribute DICOM 3.0 compliant datasets within the system and/or across computer networks.
Compared to Ethos Treatment, 2.1; Ethos Treatment Planning, 1.1 (primary predicate), the following additional feature has been added to ART-Plan v2.1.0:
- generation of synthetic CT from MR images. This does not represent an additional claim, as the technological characteristics are the same and it does not raise different questions of safety and effectiveness. This feature is also already covered by the reference device and by the previous version of the device, ART-Plan v1.10.1.
The ART-Plan technical functionalities claimed by TheraPanacea are the following:
- Proposing automatic solutions to the user, such as automatic delineation, automatic multimodal image fusion, etc., towards improving standardization of processes/performance and reducing tedious/time-consuming user involvement.
- Offering the user a set of tools to assist semi-automatic delineation and semi-automatic registration, towards manually modifying/editing automatically generated structures, adding/removing new/undesired structures, or imposing user-provided correspondence constraints on the fusion of multimodal images.
- Presenting to the user a set of visualization methods for the delineated structures and registration fusion maps.
- Saving the delineated structures/fusion results for use in the dosimetry process.
- Enabling rigid and deformable registration of patient image sets to combine information contained in different or same modalities.
- Allowing users to generate, visualize, evaluate and modify pseudo-CT from MRI and CBCT images.
- Allowing users to generate, visualize and analyze dose on CT-modality images (only within the AdaptBox workflow).
- Presenting to the user metrics to define whether there is a need for replanning.
The provided document describes the acceptance criteria and the study that proves the ART-Plan device meets these criteria across its various modules (Autosegmentation, SmartFuse, AdaptBox, Synthetic-CT generation, and Dose Engine).
Here's a breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
Autosegmentation Tool
Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
---|---|---|
Quantitative (DSC) | a) DSC (mean) ≥ 0.8 (AAPM criterion) OR b) DSC (mean) ≥ 0.54 (inter-expert variability) OR c) DSC (mean) ≥ mean(DSC inter-expert) + 5% | Duodenum: DICE diff inter-expert = 1.32% (Passed); Large bowel: DICE diff inter-expert = 1.19% (Passed); Small bowel: DICE diff inter-expert = 2.44% (Passed) |
Qualitative (A+B%) | A+B % ≥ 85% (A: acceptable without modification; B: acceptable with minor modifications/corrections; C: requires major modifications) | Right lacrimal gland: A+B = 100% (Passed); Left lacrimal gland: A+B = 100% (Passed); Cervical lymph nodes VIA: A+B = 97% (Passed); Cervical lymph nodes VIB: A+B = 100% (Passed); Pharyngeal constrictor muscle: A+B = 100% (Passed); Anal canal: A+B = 98.68% (Passed); Bladder: A+B = 93.42% (Passed); Left femoral head: A+B = 100% (Passed); Right femoral head: A+B = 100% (Passed); Penile bulb: A+B = 96.05% (Passed); Prostate: A+B = 92.10% (Passed); Rectum: A+B = 100% (Passed); Seminal vesicle: A+B = 94.59% (Passed); Sigmoid: A+B = 98.68% (Passed) |
SmartFuse Module (Image Registration)
Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance |
---|---|---|
Quantitative (DSC) | a) DSC (mean) ≥ 0.81 (AAPM criterion) OR b) DSC (mean) ≥ 0.65 (benchmark device) | No specific DSC values are listed for SmartFuse, but the qualitative evaluations imply successful registration leading to acceptable contours. |
Qualitative (A+B%) | Propagated contours: A+B% ≥ 85% for deformable, ≥ 50% for rigid. Overall registration output: A+B% ≥ 85% for deformable, ≥ 50% for rigid. | tCBCT - sCT (overall registration output): rigid A+B% = 95.56% (Passed), deformable A+B% = 97.78% (Passed); tsynthetic-CT - sCT (propagated contours): deformable A+B% = 94.06% (Passed); tCT - sSCT (overall registration output): rigid A+B% = 70.37% (Passed) |
Geometric | Jacobian determinant must be positive; Target Registration Error (TRE) | |
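The Jacobian-determinant criterion above is a fold-detection check: at every voxel, the determinant of the Jacobian of the deformation φ(x) = x + u(x) must stay positive (and, per the synthetic-CT criterion cited earlier, near 1 ± 5%). Below is a minimal finite-difference sketch in Python with a hypothetical displacement field in voxel units; it is not the vendor's implementation:

```python
import numpy as np

def jacobian_determinant(disp: np.ndarray) -> np.ndarray:
    """Voxel-wise Jacobian determinant of a 3D displacement field.

    disp has shape (3, Z, Y, X); the transform is phi(x) = x + disp(x),
    so the Jacobian is J = I + grad(disp).
    """
    # grads[c, d] = d(disp_c)/d(axis_d), each of shape (Z, Y, X).
    grads = np.stack([np.stack(np.gradient(disp[c])) for c in range(3)])
    jac = grads.transpose(2, 3, 4, 0, 1) + np.eye(3)  # (Z, Y, X, 3, 3)
    return np.linalg.det(jac)

disp = np.zeros((3, 8, 8, 8))            # identity transform as a toy case
det = jacobian_determinant(disp)
assert (det > 0).all()                    # no folding anywhere
assert np.allclose(det, 1.0, atol=0.05)   # within the 1 +/- 5% tolerance
```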
(105 days)
ART-Plan
ART-Plan is indicated for cancer patients for whom radiation treatment has been planned. It is intended to be used by trained medical professionals including, but not limited to, radiologists, radiation oncologists, dosimetrists, and medical physicists.
ART-Plan is a software application intended to display and visualize 3D multi-modal medical image data. The user may import, define, display, transform and store DICOM 3.0 compliant datasets (including regions of interest structures). These images, contours and objects can subsequently be exported/distributed within the system, across computer networks and/or to radiation treatment planning systems. Supported modalities include CT, PET-CT, CBCT, 4D-CT and MR images.
ART-Plan supports AI-based contouring on CT and MR images and offers semi-automatic and manual tools for segmentation.
To help the user assess changes in image data and to obtain combined multi-modal image information, ART-Plan allows the registration of anatomical and functional images and display of fused and non-fused images to facilitate the comparison of patient image data by the user.
With ART-Plan, users are also able to generate, visualize, evaluate and modify pseudo-CT from MRI images.
The ART-Plan application consists of two key modules: SmartFuse and Annotate, allowing the user to display and visualize 3D multi-modal medical image data. The user may process, render, review, store, display and distribute DICOM 3.0 compliant datasets within the system and/or across computer networks. Supported modalities cover static and gated CT (computerized tomography including CBCT and 4D-CT), PET (positron emission tomography) and MR (magnetic resonance).
The ART-Plan technical functionalities claimed by TheraPanacea are the following:
- Proposing automatic solutions to the user, such as automatic delineation, automatic multimodal image fusion, etc., towards improving standardization of processes/performance and reducing tedious/time-consuming user involvement.
- Offering the user a set of tools to assist semi-automatic delineation and semi-automatic registration, towards manually modifying/editing automatically generated structures, adding/removing new/undesired structures, or imposing user-provided correspondence constraints on the fusion of multimodal images.
- Presenting to the user a set of visualization methods for the delineated structures and registration fusion maps.
- Saving the delineated structures/fusion results for use in the dosimetry process.
- Enabling rigid and deformable registration of patient image sets to combine information contained in different or same modalities.
- Allowing users to generate, visualize, evaluate and modify pseudo-CT from MRI images.
ART-Plan offers deep-learning based automatic segmentation for the following localizations:
- head and neck (on CT images)
- thorax/breast (for male/female, on CT images)
- abdomen (on CT images and MR images)
- pelvis male (on CT images and MR images)
- pelvis female (on CT images)
- brain (on CT images and MR images)
ART-Plan offers deep-learning based synthetic CT-generation from MR images for the following localizations:
- pelvis male
- brain
The provided text describes the acceptance criteria and the study conducted to prove that the ART-Plan v1.10.1 device meets these criteria. Note that this submission is a Special 510(k) for modifications to an already cleared device (ART-Plan v1.10.0), focusing on the addition of 48 new structures to existing localizations and 8 bug fixes. The performance studies primarily validate these new structures.
Here's the detailed breakdown:
1. Table of Acceptance Criteria and Reported Device Performance
The device ART-Plan v1.10.1 is an AI-based contouring tool. The acceptance criteria and reported performance for the new structures are categorized into two main types: quantitative (using Dice Similarity Coefficient) and qualitative.
For Auto-segmentation Models (New Structures):
Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance (Examples from Table 4) | Pass/Fail |
---|---|---|---|
Quantitative | a) DSC (mean) ≥ 0.8 (AAPM standard) | (Not explicitly shown for new structures, but implied passed) | Pass |
Quantitative | b) DSC (mean) ≥ 0.54 OR DSC (mean) ≥ mean(DSC inter-expert) + 5% | Carina: DICE diff inter-expert = 6.58%; Lad coronary: DICE diff inter-expert = 15.56%; Left bronchia: DICE diff inter-expert = 14.75%; Right cochlea: DICE diff inter-expert = 29.22% | Pass |
Qualitative | c) A+B % ≥ 85% (clinically acceptable without modifications or with minor corrections) | Ascending aorta: A+B = 100%; Left atrium: A+B = 100%; Left main coronary artery: A+B = 93%; Sigmoid: A+B = 100% | Pass |
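The qualitative A+B% figure is simple proportion arithmetic over per-case expert ratings: A (clinically acceptable as-is), B (minor corrections), C (major corrections). A toy illustration with hypothetical ratings:

```python
ratings = ["A"] * 14 + ["B"] * 4 + ["C"] * 2     # 20 hypothetical expert ratings
ab_percent = 100 * sum(r in ("A", "B") for r in ratings) / len(ratings)
print(ab_percent)  # 90.0 -> meets the >= 85% acceptance threshold
```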
For Synthetic-CT Generation Tool (General, not specifically for new features in this submission):
Acceptance Criteria Type | Acceptance Criteria | Reported Device Performance | Pass/Fail |
---|---|---|---|
Quantitative | a) A median 2%/2mm gamma passing criterion of ≥ 95% | (Not explicitly shown in this document, but implied passed for prior clearance) | Pass |
Quantitative | b) A median 3%/3mm gamma passing criterion of ≥ 99.0% | (Not explicitly shown in this document, but implied passed for prior clearance) | Pass |
Quantitative | c) A mean dose deviation (pseudo-CT compared to standard CT) of ≤ 2% in ≥ 88% of patients | (Not explicitly shown in this document, but implied passed for prior clearance) | Pass |
2. Sample Size Used for the Test Set and Data Provenance
The document indicates that for the new structures, the sample sizes for the test set varied:
- For quantitative evaluations (Dice difference inter-expert):
- Minimum sample size for evaluation method: 20
- Reported sample size for most structures (e.g., Carina, Lad coronary, Left bronchia, Right cochlea): 33
- Reported sample size for some Brain T1 (MR) structures (e.g., Anterior cerebellum, Left cochlea): 30
- For qualitative evaluations (A+B %):
- Minimum sample size for evaluation method: 15
- Reported sample size for most structures (e.g., Ascending aorta, Left atrium, Left main coronary artery): 20
- Reported sample size for Left cervical lymph node IVB, Right cervical lymph node IVB: 15
- Reported sample size for Sigmoid: 30
Data Provenance: The data used for training and testing are described as "real-world retrospective data which were initially used for treatment of cancer patients." The document mentions that the data originated from various centers, with a statistical analysis of imaging vendors in EU & USA to represent the market share. It also states that the data demographic distribution (gender, age) aligns with cancer incidence statistics in the US, UK, and globally.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
The document explicitly states that the "truthing" process "includes a mix of data created by different delineators (clinical experts) and assessment of intervariability, ground truth contours provided by the centers and validated by a second expert of the center, and qualitative evaluation and validation of the contours."
- Number of Experts: For the inter-expert variability comparison, at least two experts are implied (one for ground truth, and comparison to other delineators or a second expert validation). For qualitative evaluations, "experts" (plural) are mentioned.
- Qualifications of Experts: The document states "trained medical professionals including, but not limited to, radiation oncologists, dosimetrists, and medical physicists." The ground truth contours were "provided by the centers and validated by a second expert of the center," indicating a high level of clinical expertise.
4. Adjudication Method for the Test Set
The adjudication method is implied to be a form of expert consensus or validation. The "truthing process" includes:
- "data created by different delineators (clinical experts)"
- "assessment of intervariability"
- "ground truth contours provided by the centers and validated by a second expert of the center"
- "qualitative evaluation and validation of the contours"
This suggests that for creating the reference standard, multiple experts contributed, and a validation step often involving a second expert was performed. For comparing the AI model's performance to human experts, it was compared to "inter-expert variability" or validated qualitatively by "experts." This is not a strict "2+1" or "3+1" for every single case, but rather a process involving consensus, validation, and inter-variability analysis among clinical experts for establishing the ground truth and for evaluating the AI's performance against that truth and against other expert interpretations.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
No explicit Multi-Reader Multi-Case (MRMC) comparative effectiveness study comparing human readers with AI assistance vs. without AI assistance is detailed in the provided text. The studies focus on the standalone performance of the AI model against the established ground truth and inter-expert variability.
6. Standalone Performance Study
Yes, a standalone performance study was done for the algorithm without human-in-the-loop. The tables and descriptions of acceptance criteria and results (Dice Similarity Coefficient, A+B% for qualitative evaluation) directly assess the performance of the AI-based contouring (Annotate module) in generating contours.
7. Type of Ground Truth Used
The ground truth used is primarily expert consensus/delineation. It is described as:
- "data created by different delineators (clinical experts)"
- "ground truth contours provided by the centers and validated by a second expert of the center"
- "qualitative evaluation and validation of the contours"
The contouring guidelines followed were confirmed with the data-providing centers, and the process aimed to be representative of delineation practice across centers and international guidelines.
8. Sample Size for the Training Set
- Training samples: 299,142
- Validation samples: 75,018
- Total samples: 374,160
Although the total number of samples is 374,160, the document clarifies that "The total number of patients used for training (8736) is lower than the number of samples (374160)." This indicates that one patient can contribute multiple images and multiple structures, leading to a higher number of "samples" for training an AI model (on average roughly 374,160 / 8,736 ≈ 43 samples per patient).
9. How the Ground Truth for the Training Set Was Established
The ground truth for the training set was established through "real-world retrospective data," where contours were generated by clinical experts. The process included:
- Contouring guidelines confirmed with data-providing centers.
- A mix of data created by different delineators (clinical experts).
- Ground truth contours provided by the centers and validated by a second expert of the center.
- Qualitative evaluation and validation of the contours to ensure representativeness of delineation practice and adherence to international guidelines.
This rigorous process aimed to account for expert annotation variability and ensure the training data was clinically relevant and accurate.
(88 days)
ART-PLAN
ART-Plan is indicated for cancer patients for whom radiation treatment has been planned. It is intended to be used by trained medical professionals including, but not limited to, radiation oncologists, dosimetrists, and medical physicists.
ART-Plan is a software application intended to display and visualize 3D multi-modal medical image data. The user may import, define, display, transform and store DICOM 3.0 compliant datasets (including regions of interest structures). These images, contours and objects can subsequently be exported/distributed within the system, across computer networks and/or to radiation treatment planning systems. Supported modalities include CT, PET-CT, CBCT, 4D-CT and MR images.
ART-Plan supports AI-based contouring on CT and MR images and offers semi-automatic and manual tools for segmentation.
To help the user assess changes in image data and to obtain combined multi-modal image information, ART-Plan allows the registration of anatomical and functional images and display of fused and non-fused images to facilitate the comparison of patient image data by the user.
With ART-Plan, users are also able to generate, visualize, evaluate and modify pseudo-CT from MRI images.
The ART-Plan application comprises two key modules: SmartFuse and Annotate, allowing the user to display and visualize 3D multi-modal medical image data. The user may process, render, review, store, display and distribute DICOM 3.0 compliant datasets within the system and/or across computer networks. Supported modalities cover static and gated CT (computerized tomography including CBCT and 4D-CT), PET (positron emission tomography) and MR (magnetic resonance).
Compared to ART-Plan v1.6.1 (primary predicate), the following additional features have been added to ART-Plan v1.10.0:
- an improved version of the existing automatic segmentation tool
- automatic segmentation on more anatomies and organs-at-risk
- image registration on 4D-CT and CBCT images
- automatic segmentation on MR images
- generation of synthetic CT from MR images
- a cloud-based deployment
The ART-Plan technical functionalities claimed by TheraPanacea are the following:
- . Proposing automatic solutions to the user, such as an automatic delineation, automatic multimodal image fusion, etc. towards improving standardization of processes/ performance / reducing user tedious / time consuming involvement.
- . Offering to the user a set of tools to assist semi-automatic delineation, semi-automatic registration towards modifying/editing manually automatically generated structures and adding/removing new/undesired structures or imposing user-provided correspondences constraints on the fusion of multimodal images.
- . Presenting to the user a set of visualization methods of the delineated structures, and registration fusion maps.
- . Saving the delineated structures / fusion results for use in the dosimetry process.
- . Enabling rigid and deformable registration of patients images sets to combine information contained in different or same modalities.
- Allowing the users to generate, visualize, evaluate and modify pseudo-CT from MRI images.
ART-Plan offers deep-learning based automatic segmentation for the following localizations:
- head and neck (on CT images)
- thorax/breast (for male/female, on CT images)
- abdomen (on CT images and MR images)
- pelvis male (on CT images and MR images)
- pelvis female (on CT images)
- brain (on CT images and MR images)
ART-Plan offers deep-learning based synthetic CT-generation from MR images for the following localizations:
- pelvis male
- brain
Here's a summary of the acceptance criteria and study details for the ART-Plan device, extracting information from the provided text:
Acceptance Criteria and Device Performance
Criterion Category | Acceptance Criteria | Reported Device Performance |
---|---|---|
Auto-segmentation - Dice Similarity Coefficient (DSC) | DSC (mean) ≥ 0.8 (AAPM standard) OR DSC (mean) ≥ 0.54 or DSC (mean) ≥ mean(DSC inter-expert) + 5% (inter-expert variability) | Multiple tests passed demonstrating acceptable contours, exceeding AAPM standards in some cases (e.g., Abdo MRI auto-segmentation), and meeting or exceeding inter-expert variability for others (e.g., Brain MR, Pelvis MRI). For Brain MRI, initially some organs did not meet 0.8 but eventually passed with further improvements and re-evaluation against inter-expert variability. All organs for all anatomies met at least one acceptance criterion. |
Auto-segmentation - Qualitative Evaluation | Clinicians' qualitative evaluation of auto-segmentation is considered acceptable for clinical use without modifications (A) or with minor modifications/corrections (B), with A+B % ≥ 85%. | For all tested organs and anatomies, the qualitative evaluation resulted in A+B % ≥ 85%, indicating that clinicians found the contours acceptable for clinical use with minor or no modifications. For example, Pelvis Truefisp model achieved ≥ 85% A or B, and H&N Lymph nodes also met this. |
Synthetic-CT Generation | A median 2%/2mm gamma passing criteria of ≥ 95% OR A median 3%/3mm gamma passing criteria of ≥ 99.0% OR A mean dose deviation (pseudo-CT compared to standard CT) of ≤ 2% in ≥ 88% of patients. | For both pelvis and brain synthetic-CT, the performance met these acceptance criteria and demonstrated non-inferiority to previously cleared devices. |
Fusion Performance | Not explicitly stated with numerical thresholds, but evaluated qualitatively. | Both rigid and deformable fusion algorithms provided clinically acceptable results for major clinical use cases in radiotherapy workflows, receiving "Passed" in all relevant studies. |
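The gamma passing criteria cited for synthetic-CT validation follow the standard gamma-index formulation (Low et al., 1998), which combines a dose-difference tolerance ΔD (e.g. 2%) with a distance-to-agreement tolerance Δd (e.g. 2 mm):

```latex
\gamma(\mathbf{r}_e) = \min_{\mathbf{r}_r}
\sqrt{\frac{\lVert \mathbf{r}_r - \mathbf{r}_e \rVert^{2}}{\Delta d^{2}}
      + \frac{\bigl(D_r(\mathbf{r}_r) - D_e(\mathbf{r}_e)\bigr)^{2}}{\Delta D^{2}}}\,,
\qquad \text{pass if } \gamma(\mathbf{r}_e) \le 1 .
```

Here D_r is the reference dose (e.g. computed on the planning CT) and D_e the evaluated dose (e.g. computed on the synthetic CT). The reported "passing rate" is the fraction of evaluated points with γ ≤ 1, so a "median 2%/2mm gamma passing rate ≥ 95%" means that, for at least half the patients, 95% of evaluated points satisfy γ ≤ 1 at those tolerances.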
Study Details
- Sample size used for the test set and the data provenance:
- Test Set Sample Size: The exact number of patients in the test set is not explicitly given as a single number but is stated that for structures of a given anatomy and modality, two non-overlapping datasets were separated: test patients and train data. The number of test patients was "selected based on thorough literature review and statistical power."
- Data Provenance: Real-world retrospective data, initially used for treatment of cancer patients. Pseudo-anonymized by the centers providing data before transfer. Data was sourced from both non-US and US populations.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Varies. For some tests (e.g., Abdo MRI auto-segmentation, Brain MRI autosegmentation, Pelvis MRI auto-segmentation), at least 3 different experts were involved for inter-expert variability calculations. For the qualitative evaluations, it implies multiple clinicians or medical physicists.
- Qualifications of Experts: Clinical experts, medical physicists (for validation of usability and performance tests) with expertise level comparable to a junior US medical physicist and responsibilities in the radiotherapy clinical workflow.
- Adjudication method for the test set:
- The document describes a "truthing process [that] includes a mix of data created by different delineators (clinical experts) and assessment of intervariability, ground truth contours provided by the centers and validated by a second expert of the center, and qualitative evaluation and validation of the contours." This suggests a multi-reader approach, potentially with consensus or an adjudicator for ground truth, but a specific "2+1" or "3+1" method is not detailed. The "inter-expert variability" calculation implies direct comparison between multiple experts' delineations of the same cases.
- Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of human reader improvement with AI vs. without AI assistance:
- A direct MRMC comparative effectiveness study with human readers improving with AI vs without AI assistance is not explicitly described in the provided text. The studies focus on the standalone performance of the AI algorithm against established criteria (AAPM, inter-expert variability, qualitative acceptance) and non-inferiority to other cleared devices.
- Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done:
- Yes, a standalone performance evaluation of the algorithm was done. The acceptance criteria and performance data are entirely based on the algorithm's output (e.g., DSC, gamma passing criteria, dose deviation) compared to ground truth or existing standards, and qualitative assessment by experts of the algorithm's generated contours.
- The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- The ground truth used primarily involved:
- Expert Consensus/Delineation: Contours created by different clinical experts and assessed for inter-variability.
- Validated Ground Truth Contours: Contours provided by the centers and validated by a second expert from the same center.
- Qualitative Evaluation: Clinical review and validation of contours.
- Dosimetric Measures: For synthetic-CT; comparison to standard CT dose calculations.
- The sample size for the training set:
- Training Patients: 8,736 patients.
- Training Samples (Images/Anatomies/Structures): 299,142 samples. (One patient can have multiple images, and each image multiple delineated structures).
- How the ground truth for the training set was established:
- "The contouring guidelines followed to produce the contours were confirmed with the centers which provided the data. Our truthing process includes a mix of data created by different delineators (clinical experts) and assessment of intervariability, ground truth contours provided by the centers and validated by a second expert of the center, and qualitative evaluation and validation of the contours." This indicates that the ground truth for the training set was established through a combination of expert delineation, internal validation by a second expert, adherence to established guidelines, and assessment of variability among experts.
(120 days)
ART-Plan
ART-Plan is a software designed to assist the contouring process of the target anatomical regions on 3D-images of cancer patients for whom radiotherapy treatment has been planned.
The SmartFuse module allows the user to register combinations of anatomical and functional images and display them with fused and non-fused displays to facilitate the comparison and delineation of image data by the user.
The images created with rigid or elastic registrations require verification, potential modifications, and then the validation of a trained user with professional qualifications in anatomy and radiotherapy.
With the Annotate module, users can edit the contours for the regions of interest manually and semi-automatically. It also allows automatic generation, based on medical practices, of the contours for the organs at risk and healthy lymph nodes on CT images.
The contours created automatically, semi-automatically or manually require verifications, and then the validation of a trained user with professional qualifications in anatomy and radiotherapy.
The device is intended to be used in a clinical setting, by trained professionals only.
The ART-Plan application comprises two key modules: SmartFuse and Annotate, allowing the user to display and visualize 3D multi-modal medical image data. The user may process, render, review, store, display and distribute DICOM 3.0 compliant datasets within the system and/or across computer networks. Supported modalities include static and gated CT (computerized tomography), PET (positron emission tomography), and MR (magnetic resonance).
The overview of the product covers its inputs/outputs, functionalities, and integration within the current clinical workflow for radiation therapy planning.
The ART-Plan technical functionalities claimed by TheraPanacea are the following:
- Proposing automatic solutions to the user, such as automatic delineation, automatic multimodal image fusion, etc., towards improving standardization of processes/performance and reducing tedious/time-consuming user involvement.
- Offering the user a set of tools to assist semi-automatic delineation and semi-automatic registration, towards manually modifying/editing automatically generated structures, adding/removing new/undesired structures, or imposing user-provided correspondence constraints on the fusion of multimodal images.
- Presenting to the user a set of visualization methods for the delineated structures and registration fusion maps.
- Saving the delineated structures/fusion results for use in the dosimetry process.
- Enabling rigid and deformable registration of patient image sets to combine information contained in different or same modalities.
Here's an analysis of the acceptance criteria and supporting studies for the ART-Plan device, based on the provided FDA 510(k) summary:
1. Table of Acceptance Criteria and Reported Device Performance
The provided document details various performance tests but does not explicitly state quantitative "acceptance criteria" alongside "reported device performance" in a structured table. Instead, it describes tests performed and their general outcome ("Passed"). The "results" column indicates whether the device met the implicit expectations for each test.
Therefore, I've created a table summarizing the tests described and their reported outcomes, which implicitly serve as the device meeting performance expectations:
Test Name | Test Description | Reported Device Performance (Implicit Acceptance) |
---|---|---|
Usability Testing | Assessment for compliance with IEC 62366. | Passed |
Autosegmentation performances (European data) | Study gathering information on 3 tests performed on automatic segmentation performances on European data. | Passed |
Autosegmentation performances according to AAPM requirements | Demonstrated that the auto-segmentation algorithm of the Annotate module provides acceptable contours for the concerned structures on an image of a patient. | Passed |
Autosegmentation performances against MIM | Demonstrated that the auto-segmentation algorithm of the Annotate module provides acceptable contours for the concerned structures on an image of a patient. | Passed |
Qualitative validation of autosegmentation performances | Demonstrated that the auto-segmentation algorithm of the Annotate module provides acceptable contours for the concerned structures on an image of a patient. | Passed |
External Contour performances according to AAPM requirements | Demonstrated that the External Contour Automatic Segmentation algorithm of the Annotate module provides acceptable contours for the patient's body on an image of a patient. | Passed |
Fusion performances according to AAPM recommendations | Evaluated the quality of the rigid and deformable registration tools of the SmartFuse module on retrospective intra-patient images and inter-patient images of different modalities, to ensure the safety of the device for clinical use. | Passed |
Registration performances on POPI-model | Evaluated the quality of the deformable registration tools of the SmartFuse module on intra-patient CT images. Testing was conducted according to POPI-model protocol on corresponding public data. | Passed |
Autosegmentation performances on US data | Demonstrated that the autosegmentation algorithm of the Annotate module provides clinically acceptable contours for the concerned structures when applied to US patients. | Passed |
Pilot study for sample size estimation - literature review | Pilot study estimating a consistent dataset sample size for performance testing, considering state-of-the-art studies in image registration and segmentation. Literature review completed on the most cited articles in the field of medical vision. | Passed |
System Verification and Validation Testing | Verification and validation testing performed to verify the software of the ART-Plan, following FDA's Guidance for Industry and FDA Staff, "Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices." The software was considered a "major" level of concern due to potential for serious injury or death from failure or misapplication. | Passed |
2. Sample Size Used for the Test Set and Data Provenance
- Autosegmentation performances (European data): The document mentions "3 tests performed on the automatic segmentation performances on European data." No specific sample size (number of patients or images) is given for these tests. The provenance is explicitly stated as European data. The retrospective/prospective nature is not specified, but typically such performance evaluations on existing data are retrospective.
- Autosegmentation performances on US data: This test explicitly states that the algorithm was applied to US patients. No specific sample size (number of patients or images) is provided. The retrospective/prospective nature is not specified.
- Fusion performances according to AAPM recommendations & Registration performances on POPI-model: These studies evaluated various registration qualities. No specific sample size (number of patients or images) is provided for these tests, although the POPI-model test indicates "corresponding public data." The provenance for the AAPM fusion test includes "retrospective intra-patient images and inter-patient images," suggesting retrospective data.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
The document does not explicitly state the number of experts used to establish ground truth for the test sets or their specific qualifications (e.g., "radiologist with 10 years of experience").
It does mention that the device's output (contours created automatically, semi-automatically, or manually) "require verifications, and then the validation of a trained user with professional qualifications in anatomy and radiotherapy." This implies that qualified professionals are expected to review and validate results, which is a key part of the human-in-the-loop workflow. However, this is about clinical use, not the ground truth generation for the performance studies themselves.
4. Adjudication Method for the Test Set
The document does not specify any adjudication method (e.g., 2+1, 3+1, none) used for establishing the ground truth of the test sets in the performance studies.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
No, a multi-reader multi-case (MRMC) comparative effectiveness study comparing human readers with and without AI assistance is not explicitly mentioned or described in the provided non-clinical testing section. The studies focus on the standalone performance of the AI algorithms (autosegmentation, fusion).
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
Yes, standalone performance studies were done. The "Autosegmentation performances" tests, "External Contour performances," "Fusion performances," and "Registration performances" all evaluate the algorithm's output directly. The device description explicitly states it is a software designed "to assist the contouring process," and the contours generated "require verifications, potential modifications, and then the validation of a trained user." This confirms that the AI provides an initial output (standalone performance) that is then subject to human review.
7. The Type of Ground Truth Used
The document refers to the AI providing "acceptable contours" or "clinically acceptable contours." While it doesn't explicitly state the method of ground truth generation (e.g., expert consensus, pathology, outcomes data), the context of "acceptability" for radiotherapy planning strongly implies that the ground truth for these contoured structures would be established by expert consensus (likely radiation oncologists or dosimetrists) or highly correlated with clinical consensus/guidelines for radiotherapy planning. Pathology or outcomes data are less likely to be used directly for geometric contour ground truth.
8. The Sample Size for the Training Set
The document does not provide any information regarding the sample size used for the training set of the AI models.
9. How the Ground Truth for the Training Set Was Established
The document does not provide any information on how the ground truth for the training set was established.