Search Results
Found 4 results
510(k) Data Aggregation
(214 days)
QDOSE® Multi-purpose Voxel Dosimetry is indicated for use to provide estimates of radiation absorbed dose to organs and tissues of the body from medically administered radiopharmaceuticals, and to calculate total-body effective dose. Radiation absorbed dose calculations are based on clinical measurements of radioactivity biodistributions and biokinetics. QDOSE® is intended for applications in clinical nuclear medicine, molecular radiotherapy, radiation safety evaluations, risk assessment, record-keeping, and regulatory compliance. QDOSE® is indicated for use by professionals (medical physicists, radiologists and oncologists including nuclear medicine physicians), radiologic imaging technologists, health physicists and radiation safety officers and administrators, students in training, and others having interest in ability to calculate internal radiation doses from medically administered radiopharmaceuticals.
QDOSE® is a software package for calculating internal radiation doses from clinically administered radiopharmaceuticals. Patient time-activity data may be imported to QDOSE® in DICOM files from nuclear medicine clinical imaging. Dosimetry performed within QDOSE® is based on calculated S values, determined for patient-like phantoms using a Monte Carlo method. The S values provide the average absorbed dose to a target organ generated by a unit of activity in a source organ. Time-activity curves from quantitative nuclear medicine imaging data are integrated to yield an estimate of the number of radionuclide decays (the area under the time-activity function), similar to the mathematical process used by OLINDA/EXM. QDOSE® dose calculations multiply each source organ time-activity curve integral by the corresponding Monte Carlo-derived S value; the output is the radiation absorbed dose to each specified target organ per unit administered activity.
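The MIRD-schema calculation described above can be sketched in a few lines: the absorbed dose to a target organ is the sum, over source organs, of the time-integrated activity in each source multiplied by the corresponding S value. The S values, administered activities, and half-lives below are purely illustrative assumptions, not values from QDOSE® or its submission.

```python
import math

def time_integrated_activity(a0_mbq: float, effective_half_life_h: float) -> float:
    """Analytic integral of a mono-exponential time-activity curve
    A(t) = A0 * exp(-lambda_eff * t), from t = 0 to infinity. Returns MBq*h."""
    lambda_eff = math.log(2) / effective_half_life_h
    return a0_mbq / lambda_eff

# Illustrative S values in mGy / (MBq*h): dose to target per unit cumulated
# activity in a source organ (hypothetical numbers, not from any phantom).
s_values = {("liver", "liver"): 3.2e-2, ("liver", "kidneys"): 1.1e-3}

def absorbed_dose(target: str, cumulated: dict) -> float:
    """Sum the S-value-weighted contributions from every source organ (mGy)."""
    return sum(s_values[(target, src)] * a for src, a in cumulated.items())

# Illustrative kinetics: 100 MBq in liver (24 h eff. half-life), 40 MBq in kidneys (12 h).
a_tilde = {"liver": time_integrated_activity(100.0, 24.0),
           "kidneys": time_integrated_activity(40.0, 12.0)}
dose_liver = absorbed_dose("liver", a_tilde)
```

In a real workflow the time-activity curves come from serial quantitative imaging rather than an assumed mono-exponential, but the dose step is the same multiply-and-sum.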
The provided text describes the QDOSE® Multi-purpose Voxel Dosimetry device and its substantial equivalence to a predicate device (OLINDA/EXM v.2.0) for regulatory approval (K230221). It includes information on performance testing and comparison, but it does not explicitly state specific acceptance criteria in a quantitative manner (e.g., "Accuracy must be within X%"). Instead, it describes performance in terms of favorable comparison and small or insignificant differences relative to theoretical values and predicate/reference devices.
Therefore, the "acceptance criteria" are inferred from the demonstrated performance and the conclusion of substantial equivalence.
Here's an attempt to structure the information based on the provided text, acknowledging the limitations regarding explicit acceptance criteria:
Acceptance Criteria and Study Proving Device Performance: QDOSE® Multi-purpose Voxel Dosimetry
The acceptance criteria for the QDOSE® device are implicitly defined by its demonstrated ability to perform internal radiation dosimetry in a manner "similar" or "comparable" to established predicate and reference devices, and to produce results that are quantitatively close to theoretical values where applicable. The study aims to demonstrate substantial equivalence to the predicate device and other reference devices, indicating that the new device is as safe and effective.
1. Table of Acceptance Criteria (Inferred) and Reported Device Performance:
Acceptance Criteria (Inferred from Performance Goals) | Reported Device Performance and Comparison |
---|---|
Accuracy of Time-Integrated Cumulated Activities: | |
- Planar Workflow: Effective half-lives and calculated activities should compare favorably to theoretical values and between QDOSE® and Hermes Voxel Dosimetry. | Planar Workflow: Average deviation of measured effective half-lives was ~0.2% (between QDOSE® and Hermes Voxel Dosimetry). Average difference of calculated activities was ~1.3%. |
- Hybrid Workflow: Cumulated activities should compare favorably to theoretical values. | Hybrid Workflow: Deviation of cumulated activities was ~0.3%. |
- Volumetric Workflow: Cumulated activities should compare favorably to theoretical values. | Volumetric Workflow: Deviation of cumulated activities was ~0.04%. |
Accuracy of Organ Absorbed Dose Calculations: | |
- Mean relative difference for beta/gamma-emitting radionuclides (adult male/female) compared to OLINDA/EXM 2.0 (Note: Anatomical phantom differences acknowledged). | Pooled Beta/Gamma Emitters: Mean relative difference was 7% for adult male phantom and 8.8% for adult female phantom (QDOSE® IDAC-Dose 2.1 vs. OLINDA/EXM 2.0). These differences reflect known anatomical model variations. |
- Mean relative difference for alpha-emitting radionuclides (adult male/female) compared to OLINDA/EXM 2.0. | Alpha Emitters: Mean relative difference was 10.7% for adult male phantom and 11.6% for adult female phantom (QDOSE® IDAC-Dose 2.1 vs. OLINDA/EXM 2.0). These differences reflect known anatomical model variations. |
- Organ-specific relative differences compared to OLINDA/EXM 2.0 should be within acceptable ranges for clinical use. | Organ-Specific Differences: Varied from ~1% for kidneys, liver, spleen, and thyroid, to ~25% for red marrow. These differences are attributed to variations in assumed anatomical geometry, mass, shape, position, and tissue composition between the software packages. |
- Agreement with spherical model calculations, compared to OLINDA/EXM. | Spherical Model: Agreement with less than 5% difference for absorbed dose values. |
- Agreement with Voxel S method calculations, compared to OLINDA/EXM. | Voxel S Method: Mean difference relative to OLINDA/EXM of about 6%. |
Intercomparison with other established dosimetry software: observed variability in reported doses should be generally small and within accepted clinical ranges (e.g., within ±20% for organ walls with contents). | Nonclinical Intercomparison by Others: QDOSE® (IDAC-Dose 2.1) compared favorably with OLINDA 1 and 2, ICRP Publication 128, and MIRDcalc 1. Observed variability was generally small, and for organ walls with contents, all results were still within ±20% and within the standard error usually assumed for medical internal radiation dose estimates. |
Safety and Effectiveness: Differences from predicate should not raise new questions regarding safety and effectiveness. | The submission concludes that differences between QDOSE® and predicate/reference devices "do not raise new questions regarding safety and effectiveness of the device," and it is "as safe and effective as are its predicate devices." Minor differences in embedded methodology are considered normal and not critical because basic physics equations and nuclear data remain constant. User-related factors (training, calibration, ROI delineation) are acknowledged as main sources of small numerical differences. The device allows for patient-specific internal dosimetry based on fundamental science principles and internationally accepted methods. |
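The percent relative differences quoted throughout the table compare a test result against a reference result. A minimal sketch of that metric (the values are illustrative; the submission's exact formula is not disclosed):

```python
def relative_difference_pct(test_value: float, reference_value: float) -> float:
    """Percent difference of a test result relative to a reference result."""
    return 100.0 * abs(test_value - reference_value) / reference_value

# e.g., a kidney dose estimate of 0.101 mGy/MBq vs. a reference of 0.100 mGy/MBq
diff = relative_difference_pct(0.101, 0.100)  # ~1%
```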
2. Sample Size Used for the Test Set and Data Provenance:
- Test Set Sample Size: The document refers to "phantom datasets" for the applicant's nonclinical testing. No specific number for the test set "sample size" in terms of unique phantom cases is provided. The tests involved calculating time-integrated cumulated activities across planar, hybrid, and volumetric workflows, and comparing organ absorbed doses for various radionuclides and phantoms (adult male/female).
- Data Provenance: The data for the nonclinical tests was generated from "phantom datasets." The document does not specify the country of origin of this data, but the company (Versant Medical Physics and Radiation Safety) is based in Kalamazoo, Michigan, USA, and the software developer (ABX-CRO) is from Dresden, Germany. The tests conducted by "others" refer to intercomparison studies published by groups like the Medical Internal Radiation Dose (MIRD) Committee of the Society of Nuclear Medicine and Medical Imaging. This suggests a mix of internal company data and external, possibly multi-center or widely accepted, phantom-based benchmark data. The studies are nonclinical, using phantom data, not patient data (retrospective or prospective).
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications:
- Number of Experts: Not explicitly stated. The "ground truth" for the nonclinical phantom tests appears to be established by comparison with theoretical values (for time-integrated activities) and established, validated software packages (OLINDA/EXM, ICRP Publication 128, MIRDcalc 1).
- Qualifications of Experts: The document emphasizes that dose calculations are based on "fundamental science principles and internationally accepted methods and phantom models" (Page 10). The internal dose calculational engine (IDAC-Dose 2.1) is used by the International Commission on Radiological Protection (ICRP) to generate dose estimates. This implies that the 'ground truth' or comparators are derived from well-established scientific communities and their endorsed methodologies rather than individual expert adjudication on a case-by-case basis.
4. Adjudication Method for the Test Set:
- Adjudication Method: Not applicable in the sense of human expert consensus for a clinical test set. The validation is primarily through comparison against theoretical values and results from established, previously validated software (OLINDA/EXM, ICRP Publication 128, MIRDcalc 1). Any "adjudication" is implicitly integrated within the established scientific and regulatory standards for dosimetry software validation.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done:
- No, an MRMC comparative effectiveness study was not done. The studies described are nonclinical, using phantom data and software comparisons, not human readers interpreting medical images. The device is a dosimetry calculation software, not an AI for image interpretation or diagnosis.
6. If a Standalone (algorithm only without human-in-the-loop performance) was done:
- Yes, the primary evaluation is a standalone (algorithm only) performance assessment. The nonclinical tests evaluate the QDOSE® software's computational results (cumulated activities, absorbed doses) against theoretical values and outputs from other dosimetry software. While the software takes "clinical nuclear medicine diagnostic imaging" as input, the performance evaluation itself focuses on the accuracy of the algorithm's calculations, assuming correct data input. The statement "The software device includes all processing and calculation steps required for an internal dosimetry evaluation" (Page 4) and its comparison to other software further supports this.
7. The Type of Ground Truth Used:
- Theoretical Values / Computational Benchmarks: For time-integrated cumulated activities, QDOSE® results were compared against "theoretical values from calculations" (Page 13).
- Established Software Outputs: For absorbed dose calculations, the ground truth was predominantly the outputs from predicate and reference software devices (e.g., OLINDA/EXM 1.1 and 2.0, Hermes Voxel Dosimetry, ICRP Publication 128, MIRDcalc 1). The document acknowledges that "absolute ground truth in medical internal radiation dosimetry is not known" (Page 12, footnote 1), implying that the established software outputs serve as the best available benchmark.
8. The Sample Size for the Training Set:
- Not specified. The document focuses on performance testing (validation) and comparison to predicate devices. It does not mention a "training set" as would be typical for a machine learning model, suggesting that the software relies on established physics models and algorithms rather than statistical learning from a large dataset for its core functionality.
9. How the Ground Truth for the Training Set was Established:
- Not Applicable. Since a distinct "training set" in the context of a machine learning model is not described, the concept of establishing ground truth for it also does not apply. The "internal calculational algorithm" (IDAC-Dose 2.1) is based on a "phantom-based approach" and incorporates the "ICRP computational framework" and "MIRD schema" (Pages 6, 9). This points to an engineering- and physics-based development rather than a data-driven training approach.
(209 days)
The MOZI Treatment Planning System (MOZI TPS) is used to plan radiotherapy treatments for patients with malignant or benign diseases. MOZI TPS is used to plan external beam irradiation with photon beams.
The proposed device, MOZI Treatment Planning System (MOZI TPS), is a standalone software which is used to plan radiotherapy treatments (RT) for patients with malignant or benign diseases. Its core functions include image processing, structure delineation, plan design, optimization and evaluation. Other functions include user login, graphical interface, system and patient management. It can provide a platform for completing the related work of the whole RT plan.
The provided text describes the performance data for the MOZI TPS device, focusing on its automatic contouring (structure delineation) feature. Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided document:
1. A table of acceptance criteria and the reported device performance
The primary acceptance criterion mentioned for structure delineation (automatic contouring) is based on the Mean Dice Similarity Coefficient (DSC). The study aimed to demonstrate non-inferiority compared to a reference device (AccuContour™ - K191928). While explicit thresholds for "acceptable" Mean DSC values are not given as numerical acceptance criteria in the table below, the text states "The result demonstrated that they have equivalent performance," implying that the reported DSC values met the internal non-inferiority standard set by the manufacturer against the performance of the reference device.
For every OAR listed below, the implicit acceptance criterion is "equivalent performance" (Mean DSC non-inferior) to the reference device AccuContour™ (K191928).

| Body Part | OAR | Mean DSC | Mean Standard Deviation |
|---|---|---|---|
| Head&Neck | Brainstem | 0.88 | 0.03 |
| | BrachialPlexus_L | 0.61 | 0.05 |
| | BrachialPlexus_R | 0.64 | 0.05 |
| | Esophagus | 0.84 | 0.02 |
| | Eye-L | 0.93 | 0.02 |
| | Eye-R | 0.93 | 0.02 |
| | InnerEar-L | 0.78 | 0.06 |
| | InnerEar-R | 0.82 | 0.04 |
| | Larynx | 0.87 | 0.02 |
| | Lens-L | 0.77 | 0.07 |
| | Lens-R | 0.72 | 0.08 |
| | Mandible | 0.90 | 0.02 |
| | MiddleEar_L | 0.73 | 0.04 |
| | MiddleEar_R | 0.74 | 0.04 |
| | OpticNerve_L | 0.61 | 0.07 |
| | OpticNerve_R | 0.62 | 0.08 |
| | OralCavity | 0.90 | 0.03 |
| | OpticChiasm | 0.64 | 0.10 |
| | Parotid-L | 0.83 | 0.03 |
| | Parotid-R | 0.83 | 0.04 |
| | PharyngealConstrictors_U | 0.87 | 0.03 |
| | PharyngealConstrictors_M | 0.88 | 0.02 |
| | PharyngealConstrictors_L | 0.87 | 0.03 |
| | Pituitary | 0.74 | 0.14 |
| | SpinalCord | 0.85 | 0.04 |
| | Submandibular_L | 0.86 | 0.04 |
| | Submandibular_R | 0.87 | 0.03 |
| | TemporalLobe_L | 0.89 | 0.03 |
| | TemporalLobe_R | 0.89 | 0.03 |
| | Thyroid | 0.86 | 0.03 |
| | TMJ_L | 0.79 | 0.06 |
| | TMJ_R | 0.74 | 0.06 |
| | Trachea | 0.90 | 0.02 |
| Thorax | Esophagus | 0.80 | 0.05 |
| | Heart | 0.98 | 0.01 |
| | Lung_L | 0.99 | 0.00 |
| | Lung_R | 0.99 | 0.00 |
| | Spinal Cord | 0.97 | 0.02 |
| | Trachea | 0.95 | 0.02 |
| Abdomen | Duodenum | 0.64 | 0.05 |
| | Kidney_L | 0.96 | 0.02 |
| | Kidney_R | 0.97 | 0.01 |
| | Liver | 0.95 | 0.02 |
| | Pancreas | 0.79 | 0.04 |
| | SpinalCord | 0.82 | 0.02 |
| | Stomach | 0.89 | 0.02 |
| Pelvic-Man | Bladder | 0.92 | 0.03 |
| | BowelBag | 0.89 | 0.04 |
| | FemurHead_L | 0.96 | 0.02 |
| | FemurHead_R | 0.95 | 0.02 |
| | Marrow | 0.90 | 0.02 |
| | Prostate | 0.85 | 0.04 |
| | Rectum | 0.88 | 0.03 |
| | SeminalVesicle | 0.72 | 0.07 |
| Pelvic-Female | Bladder | 0.88 | 0.02 |
| | BowelBag | 0.87 | 0.02 |
| | FemurHead_L | 0.96 | 0.02 |
| | FemurHead_R | 0.95 | 0.02 |
| | Marrow | 0.89 | 0.02 |
| | Rectum | 0.77 | 0.04 |
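The Mean DSC values reported above are computed per structure from binary segmentation masks. A minimal NumPy sketch of the Dice similarity coefficient (the mask shapes are illustrative):

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for boolean segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy 2D masks: a 2x2 block vs. an overlapping 2x3 block.
a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True   # 4 voxels
b = np.zeros((4, 4), dtype=bool); b[1:3, 1:4] = True   # 6 voxels, 4 overlapping
score = dice(a, b)  # 2*4 / (4+6) = 0.8
```

In practice the masks are 3D voxel arrays rasterized from the RTSTRUCT contours, but the formula is identical.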
2. Sample size used for the test set and the data provenance
- Test Set Sample Size: 187 image sets (CT structure models).
- Data Provenance: The testing image source is from the United States. The data is retrospective, as it consists of existing CT datasets.
- Patient demographics: 57% male, 43% female. Ages: 21-30 (0.3%), 31-50 (31%), 51-70 (51.3%), 71-100 (14.4%). Race: 78% White, 12% Black or African American, 10% Other.
- Anatomical regions: Head and Neck (20.3%), Esophageal and Lung (Thorax, 20.3%), Gastrointestinal (Abdomen, 20.3%), Prostate (Male Pelvis, 20.3%), Female Pelvis (18.7%).
- Scanner models: GE (28.3%), Philips (33.7%), Siemens (38%).
- Slice thicknesses: 1mm (5.3%), 2mm (28.3%), 2.5mm (2.7%), 3mm (23%), 5mm (40.6%).
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of experts: Six
- Qualifications of experts: Clinically experienced radiation therapy physicists.
4. Adjudication method for the test set
- Adjudication method: Consensus. The ground truth was "generated manually using consensus RTOG guidelines as appropriate by six clinically experienced radiation therapy physicists." This implies that the experts agreed upon the ground truth for each case.
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance
- MRMC Study: No, a multi-reader, multi-case comparative effectiveness study was not performed to assess human reader improvement with AI assistance. The study focused on the standalone performance of the AI algorithm (automatic contouring) and its comparison to a reference device.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Standalone Performance: Yes, a standalone performance evaluation of the automatic segmentation algorithm was performed. The reported Mean DSC values are for the MOZI TPS device's auto-segmentation function without direct human-in-the-loop interaction during the segmentation process. The comparison to the reference device AccuContour™ (K191928) was also a standalone comparison.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
- Type of Ground Truth: Expert consensus. The ground truth was "generated manually using consensus RTOG guidelines as appropriate by six clinically experienced radiation therapy physicists."
8. The sample size for the training set
- Training Set Sample Size: 560 image sets (CT structure models).
9. How the ground truth for the training set was established
- The document states that the training image set source is from China. It does not explicitly detail the method for establishing ground truth for the training set. However, given that the ground truth for the test set was established by "clinically experienced radiation therapy physicists" using "consensus RTOG guidelines," it is highly probable that a similar methodology involving expert delineation and review was used for the training data to ensure high-quality labels for the deep learning model. The statement that "They are independent of each other" (training and testing sets) implies distinct data collection and ground truth establishment processes, but the specific details for the training set are not provided.
(269 days)
It is used by radiation oncology department to register multi-modality images and segment (non-contrast) CT images, to generate needed information for treatment planning, treatment evaluation and treatment adaptation.
The proposed device, AccuContour, is a standalone software which is used by radiation oncology department to register multi-modality images and segment (non-contrast) CT images, to generate needed information for treatment planning, treatment evaluation and treatment adaptation.
The product has three image processing functions:
- (1) Deep learning contouring: it can automatically contour organs-at-risk, in head and neck, thorax, abdomen and pelvis (for both male and female) areas,
- (2) Automatic registration: rigid and deformable registration, and
- (3) Manual contouring.
It also has the following general functions:
- Receive, add/edit/delete, transmit, input/export, medical images and DICOM data;
- Patient management;
- Review of processed images;
- Extension tool;
- Plan evaluation and plan comparison;
- Dose analysis.
This document (K221706) is a 510(k) Premarket Notification for the AccuContour device by Manteia Technologies Co., Ltd. It declares substantial equivalence to a predicate device and several reference devices. The focus here is on the performance data related to the "Deep learning contouring" feature and the "Automatic registration" feature.
Based on the provided document, here's a detailed breakdown of the acceptance criteria and the study proving the device meets them:
I. Acceptance Criteria and Reported Device Performance
The document does not explicitly provide a clear table of acceptance criteria and the reported device performance for the deep learning contouring in the format requested. Instead, it states that "Software verification and regression testing have been performed successfully to meet their previously determined acceptance criteria as stated in the test plans." This implies that internal acceptance criteria were met, but these specific criteria and the detailed performance results (e.g., dice scores, Hausdorff distance for contours) are not disclosed in this summary.
However, for the deformable registration, it provides a comparative statement:
Feature | Acceptance Criteria (Implied) | Reported Device Performance |
---|---|---|
Deformable Registration | Non-inferiority to reference device (K182624) based on Normalized Mutual Information (NMI) | The NMI value of the proposed device was non-inferior to that of the reference device. |
It's important to note:
- For Deep Learning Contouring: No specific performance metrics or acceptance criteria are listed in this 510(k) summary. The summary only broadly mentions that the software "can automatically contour organs-at-risk, in head and neck, thorax, abdomen and pelvis (for both male and female) areas." The success is implicitly covered by the "Software verification and validation testing" section.
- For Automatic Registration: The criterion is non-inferiority in NMI compared to a reference device. The specific NMI values are not provided, only the conclusion of non-inferiority.
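NMI, the metric used for the registration comparison, can be estimated from the joint intensity histogram of the fixed and moving images. A minimal sketch follows; binning and normalization conventions vary between implementations, so this is one common definition, not necessarily the one used in the submission:

```python
import numpy as np

def normalized_mutual_information(fixed: np.ndarray, moving: np.ndarray,
                                  bins: int = 32) -> float:
    """NMI = (H(F) + H(M)) / H(F, M), estimated from a joint histogram."""
    joint, _, _ = np.histogram2d(fixed.ravel(), moving.ravel(), bins=bins)
    p_joint = joint / joint.sum()
    p_f = p_joint.sum(axis=1)  # marginal over moving-image bins
    p_m = p_joint.sum(axis=0)  # marginal over fixed-image bins

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    return (entropy(p_f) + entropy(p_m)) / entropy(p_joint)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
self_nmi = normalized_mutual_information(img, img)  # identical images maximize NMI (= 2)
```

Higher NMI indicates better statistical alignment of intensities, which is why a non-inferior NMI supports equivalent registration quality.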
II. Sample Size and Data Provenance
- Test Set (for Deformable Registration):
- Sample Size: Not explicitly stated as a number, but described as "multi-modality image sets from different patients."
- Data Provenance: "All fixed images and moving images are generated in healthcare institutions in U.S." This establishes U.S. origin; the document does not state whether the image sets were collected retrospectively or prospectively.
- Training Set (for Deep Learning Contouring):
- Sample Size: Not explicitly stated in the provided document.
- Data Provenance: Not explicitly stated in the provided document.
III. Number of Experts and Qualifications for Ground Truth
- For the Test Set (Deformable Registration): The document does not mention the use of experts or ground truth establishment for the deformable registration test beyond the use of NMI for "evaluation." NMI is an image similarity metric and does not typically require human expert adjudication of registration quality in the same way contouring might.
- For the Training Set (Deep Learning Contouring): The document does not specify the number of experts or their qualifications for establishing ground truth for the training set.
IV. Adjudication Method for the Test Set
- For Deformable Registration: Not applicable in the traditional sense, as NMI is an objective quantitative metric. There's no mention of human adjudication for registration quality here.
- For Deep Learning Contouring (Test Set): The document notes there was no clinical study included in this submission. This implies that if a test set for the deep learning contouring was used, its ground truth (and any adjudication process for it) is not described in this 510(k) summary. Given the absence of a clinical study, it's highly probable that ground truth for performance evaluation of deep learning contouring was established internally through expert consensus or other methods, but details are not provided.
V. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done?: No, an MRMC comparative effectiveness study was not reported. The document explicitly states: "No clinical study is included in this submission."
- Effect Size: Not applicable, as no such study was performed or reported.
VI. Standalone (Algorithm Only) Performance Study
- Was it done?: Yes, for the deformable registration feature. The NMI evaluation was "on two sets of images for both the proposed device and reference device (K182624), respectively." This is an algorithm-only (standalone) comparison.
- For Deep Learning Contouring: While the deep learning contouring is a standalone feature, the document does not provide details of its standalone performance evaluation (e.g., against expert ground truth). It only states that software verification and validation were performed to meet acceptance criteria.
VII. Type of Ground Truth Used
- Deformable Registration: The "ground truth" for the deformable registration evaluation was implicitly the images themselves, with NMI being used as a metric to compare the alignment achieved by the proposed device versus the reference device. It's an internal consistency/similarity metric rather than a "gold standard" truth established by external means like pathology or expert consensus.
- Deep Learning Contouring: Not explicitly stated in the provided document. Given that it's an AI-based contouring tool and no clinical study was performed, the ground truth for training and internal testing would typically be established by expert consensus (e.g., radiologist or radiation oncologist contours) or pathology, but the document does not specify.
VIII. Sample Size for the Training Set
- Not explicitly stated in the provided document for either the deep learning contouring or the automatic registration.
IX. How Ground Truth for the Training Set was Established
- Not explicitly stated in the provided document for either the deep learning contouring or the automatic registration. For deep learning, expert-annotated images are the typical method, but details are absent here.
(185 days)
DV.Target is a software application that enables the routing of DICOM-compliant data (CT Images) to automatic image processing workflows, using machine learning-based algorithms to automatically delineate organs-at-risk (OARs). Contours generated by DV.Target may be used as an input to clinical workflows for treatment planning in radiation therapy.
DV.Target is intended to be used by trained medical professionals including radiologists, radiation oncologists, dosimetrists, and physicists.
DV.Target does not provide a user interface for data visualization. Image data uploaded, auto-contouring results, and other functionalities are managed via an administration interface. Thus, it is required that DV.Target be used in conjunction with appropriate software, such as a treatment planning system (TPS), to review, edit, and approve for all contours generated by DV.Target.
DV.Target is only intended for normal organ contouring, not for tumor or clinical target volume contouring.
The proposed device, DV.Target, is a standalone software that is designed to be used by trained medical professionals to automatically delineate organs-at-risk (OARs) on CT images. This OAR delineation function, often referred to as auto-contouring, is intended to facilitate radiation therapy workflows. Supported image modalities include CT and RTSTRUCT.
DV.Target can automatically delineate major OARs in three anatomical sites --- Head & Neck, Thorax, and Abdomen & Pelvis. It receives CT images in DICOM format as input and automatically generates the contours of OARs, which are stored in DICOM format and in RTSTRUCT modality.
The deployment environment of the proposed device is recommended to be a local network with an existing hospital-grade IT system in place. DV.Target should be installed on a specialized server supporting deep learning processing. After installation, users can login to the DV.Target administration interface via browsers from their local computers. All activities, including autocontouring, are operated by users through the administration interface.
In addition to auto-contouring, DV.Target also has the following auxiliary functions:
- User interface for receiving, updating and transmitting medical images in DICOM format;
- User management;
- Processed image management and output (RTSTRUCT) file management.
Once data is routed to DV.Target auto-contouring workflows, no user interaction is required, nor provided. The image data, auto-contouring results, and other functionalities can be managed by DV.Target users via an administration user interface. Third-party image visualization and editing software, such as a treatment planning system (TPS), must be used to facilitate the review and editing of contours generated by DV.Target.
Here's a breakdown of the acceptance criteria and study details for the DV.Target device, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criterion for DV.Target is non-inferiority to the predicate and reference devices, as measured by the Dice-Sørensen coefficient (DICE score) for auto-contouring accuracy.
| Metric (Acceptance Criteria) | Reported Device Performance |
|---|---|
| Non-inferiority to predicate device (Mirada) on 19 overlapping OARs (DICE score) | Achieved: "DV.Target is non-inferior to the predicate device Mirada on all 19 overlapping OARs." (Supported by Comparison Studies 1 & 2) |
| Non-inferiority to reference device (MIM) on 30 non-overlapping OARs (DICE score) | Achieved: "DV.Target is non-inferior to the reference device MIM on the 30 non-overlapping OARs." (Supported by Comparison Studies 3a & 3b) |
| Performance on non-overlapping OARs similar to performance on overlapping OARs (DICE score) | Achieved: "The performance of DV.Target on the non-overlapping OARs is similar to its performance on the overlapping OARs." (Supported by Comparison Study 3b) |
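The summary does not describe the statistical procedure behind the non-inferiority claims. A common setup is a one-sided test on paired per-case DICE differences against a pre-specified margin; the sketch below is illustrative only (the margin, the normal-approximation method, and the data are assumptions, not from the 510(k) summary):

```python
import math
import statistics

def noninferior(dice_new, dice_ref, margin=0.05, z=1.645):
    """One-sided non-inferiority check on paired per-case DICE scores.

    Declares non-inferiority if the lower bound of a ~95% one-sided
    normal-approximation confidence interval for the mean difference
    (new - reference) lies above -margin. Margin and method are
    illustrative assumptions, not taken from the submission.
    """
    diffs = [n - r for n, r in zip(dice_new, dice_ref)]
    mean = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return (mean - z * se) > -margin

# Hypothetical per-case DICE scores for one OAR
new = [0.86, 0.88, 0.84, 0.87, 0.85, 0.89, 0.86, 0.88]
ref = [0.85, 0.87, 0.85, 0.86, 0.84, 0.88, 0.86, 0.87]
print(noninferior(new, ref))
```

In practice such a test would be run per OAR, with the margin justified clinically and multiplicity handled across structures.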
2. Sample Sizes Used for the Test Set and Data Provenance
The study utilized two independent datasets for testing:
- Public Validation Dataset:
  - Data Provenance: The Cancer Imaging Archive (TCIA), described as "a large medical images archive"; 64% of this data is from the US.
  - Approximate Sample Size (implied): Used for Comparison Studies 1 and 3. A specific case count is not given, but the dataset was used to evaluate 19 overlapping OARs and 30 non-overlapping OARs, implying a dataset large enough for statistical analysis across multiple organs.
  - Retrospective/Prospective: Retrospective (implied, as the data come from an archive).
- In-house Clinical Dataset:
  - Data Provenance: Collected "retrospectively from the City of Hope (our primary validation site)."
  - Approximate Sample Size (implied): Used for Comparison Study 2 to evaluate the overlapping OARs. As with the public dataset, a specific case count is not given, but the dataset supported statistical evaluation.
  - Retrospective/Prospective: Retrospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Public Validation Dataset:
  - Number of Experts: Three.
  - Qualifications: "three board-certified physicians."
- In-house Clinical Dataset:
  - Number of Experts: Not specified; the ground truth was "based on actual clinical contouring results," implying it was established by clinical personnel.
  - Qualifications: Not specified, but presumably consistent with standard clinical practice for contouring.
4. Adjudication Method for the Test Set
- Public Validation Dataset: "The ground truth OARs contours on the public validation data were generated from the consensus of three board-certified physicians." This indicates an expert-consensus method, likely meaning the three experts agreed on the final contours.
- In-house Clinical Dataset: "The ground truth contours on the in-house clinical data (collected retrospectively) were based on actual clinical contouring results." This implies adjudication through established clinical practice; no specific multi-expert adjudication method (such as 2+1 or 3+1) is mentioned.
5. If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study Was Done
No, an MRMC comparative effectiveness study was not done. The studies compared the algorithm's performance (DV.Target) against other algorithms (the predicate and reference devices) and against ground truth established by human experts. There is no mention of human readers using the AI, or of comparing reader performance with and without AI assistance to measure reader improvement.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study Was Done
Yes, a standalone study was done. The entire set of "Comparison Studies" (Studies 1, 2, and 3) involved evaluating the "auto-contouring accuracy" of the DV.Target device. The text explicitly states, "Once data is routed to DV.Target auto-contouring workflows, no user interaction is required, nor provided." This confirms that the reported performance metrics (DICE scores) are solely based on the algorithm's output without human intervention.
7. The Type of Ground Truth Used
- Public Validation Dataset: Expert consensus (from three board-certified physicians).
- In-house Clinical Dataset: Actual clinical contouring results. While derived from clinical practice, this can be considered a form of "expert" ground truth, as it represents the accepted clinical standard for those cases.
8. The Sample Size for the Training Set
The sample size for the training set is not provided in the given text. The document only mentions that the "validation data used in these studies... were invisible in model training."
9. How the Ground Truth for the Training Set Was Established
The method for establishing ground truth for the training set is not specified in the provided text.