AATMA™ is a medical image processing library intended to produce derived data sets for use as input into radiation therapy treatment planning systems or other intermediate pre-treatment-planning applications. AATMA™ does not provide a user interface and is designed to be accessed through its application programming interface (API) by other devices. The data sets created by AATMA™ must be reviewed and validated by a qualified healthcare professional prior to clinical use.
AATMA™ is an optional accessory to treatment planning systems and intermediate pre-treatment planning applications. The auto-segmentation algorithm in AATMA™ is based on machine-learning convolutional neural networks and includes pre-trained models that will be used to automatically segment image sets. The algorithm itself functions as a computational engine and does not store any input data, output data, or logs. The available models have been pre-trained on specific datasets that exhibit similar characteristics (e.g., body site and imaging modality).
As a medical image processing library, AATMA™ is designed to produce derived datasets in standard formats (e.g., DICOM) that can be utilized by other applications. AATMA™ does not have a user interface and, as such, calling applications must execute the auto-segmentation algorithms via AATMA™'s application programming interface (API).
AATMA™ must be used in conjunction with appropriate software to review and edit results generated automatically by the auto-segmentation algorithm. A pre-treatment planning system or treatment planning system must be used to facilitate the review and edit of contours generated by the auto-segmentation algorithm within AATMA™.
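AATMA™'s actual API is not published in this summary, so the call pattern it implies can only be sketched with entirely hypothetical names: a calling application passes an image set and a model identifier to the engine, which stores nothing and returns derived contour data that must still be reviewed by a qualified professional. A minimal stand-in illustrating that pattern (the class, function, model name, and structure names below are all invented for illustration):

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins: AATMA™'s real API, class names, model
# identifiers, and structure names are not published in this summary.
@dataclass
class SegmentationResult:
    model: str
    contours: dict = field(default_factory=dict)
    reviewed: bool = False  # must be set by a qualified professional downstream

def auto_segment(image_set: list, model: str) -> SegmentationResult:
    """Placeholder for a call into a segmentation engine.

    Per the summary, the engine is a pure computational step that stores
    no input data, output data, or logs, so everything is returned to
    the calling application for review and editing.
    """
    # A real engine would run a pre-trained CNN here; this stub returns
    # empty contours for two illustrative structure names.
    return SegmentationResult(
        model=model,
        contours={name: [] for name in ("parotid_l", "parotid_r")},
    )

result = auto_segment(image_set=[], model="head_and_neck")
assert not result.reviewed  # contours are not clinically usable until reviewed
```

The key design point this mirrors is that the engine is stateless: review and approval state lives entirely in the calling pre-treatment or treatment planning system.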
The provided text describes the 510(k) premarket notification for Elekta Solutions AB's Advanced Algorithms for Treatment Applications (AATMA™). This device is a medical image processing library designed to produce derived datasets for radiation therapy treatment planning systems or intermediate pre-treatment planning applications, primarily through auto-segmentation using machine learning convolutional neural networks.
Here's an analysis of the acceptance criteria and the study demonstrating that the device meets them, based solely on the provided text:
Acceptance Criteria and Reported Device Performance
The provided text defines acceptance criteria only implicitly: the reported average DICE coefficients for each model are stated to have met predefined (but unreported) numerical thresholds.
| Criterion Type | Acceptance Criterion | Reported Device Performance |
| --- | --- | --- |
| Software Validation | The device "meets the user needs and requirements" and is "substantially equivalent to those of the listed predicate device," demonstrating "compliance with the requirements of CFR 21 Part 820 and in adherence to the DICOM standard," and "does not introduce any new potential safety risks." | "The results of performance, functional and algorithmic testing demonstrate that AATMA™ meets the user needs and requirements of the device, which are demonstrated to be substantially equivalent to those of the listed predicate device." "Verification and Validation for AATMA™ has been carried out in compliance with the requirements of CFR 21 Part 820 and in adherence to the DICOM standard." "AATMA™ meets the requirements for safety and effectiveness as applicable to radiological image processing software and does not introduce any new potential safety risks." |
| Head & Neck Model | The average DICE coefficient over all structures must meet the defined acceptance criteria (specific numerical threshold not stated, but implied to be met). | For verification: "the average DICE coefficient over all structures was determined to be 0.84 which met the defined acceptance criteria." For validation: "A different set of 13 3D CT image sets were used for validation and these met the acceptance criteria as well." |
| Male Pelvis Model | The average DICE coefficient over all structures must meet the defined acceptance criteria (specific numerical threshold not stated, but implied to be met). | For verification: "the average DICE coefficient over all structures was determined to be 0.93 which met the defined acceptance criteria." For validation: "A different set of 20 3D CT image sets were used for validation and these met the acceptance criteria as well." |
| Clinical Use Requirement | The data sets created by AATMA™ must be reviewed and validated by a qualified healthcare professional prior to clinical use. (This is a constraint on use rather than a performance metric, but it is an important part of the acceptance for safe use.) | The device's "Indications for Use" and "Intended Use" state this requirement: "The data sets created by AATMA™ must be reviewed and validated by a qualified healthcare professional prior to clinical use." Additionally, "AATMA™ must be used in conjunction with appropriate software to review and edit results generated automatically by the auto-segmentation algorithm." |
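The DICE (Sørensen–Dice) coefficient reported above measures volumetric overlap between an auto-generated contour and its expert reference, 2|A∩B| / (|A| + |B|). A minimal sketch of the per-structure computation and the "average over all structures" summary (using NumPy boolean masks; this is illustrative, not AATMA™ code):

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Sørensen–Dice coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def average_dice(per_structure: dict) -> float:
    """Average DICE over all structures, as in the reported verification results.

    `per_structure` maps a structure name to a (predicted, reference) mask pair.
    """
    return float(np.mean([dice(p, t) for p, t in per_structure.values()]))

# Toy 2D masks standing in for 3D CT segmentations
auto = np.zeros((4, 4), dtype=bool); auto[1:3, 1:3] = True
expert = np.zeros((4, 4), dtype=bool); expert[1:3, 1:4] = True
print(round(dice(auto, expert), 2))  # overlap 4, sizes 4 + 6 → 0.8
```

In practice the masks would be 3D voxel arrays rasterized from DICOM RT Structure Set contours, but the overlap arithmetic is identical.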
Study Proving Device Meets Acceptance Criteria (Non-Clinical Performance Testing):
The document details non-clinical performance testing for two specific models: Head & Neck and Male Pelvis.
1. Sample sizes used for the test set and the data provenance:
- Head & Neck Model:
- Verification Set: 6 unique patient 3D CT image sets.
- Validation Set: 13 unique 3D CT image sets.
- Data Provenance: The training data (from which these test sets are distinct but of similar characteristics) came "from a variety of institutions and equipment." The document does not specify the country of origin or whether the data was retrospective or prospective, but the nature of the training implies existing, likely retrospective, clinical data.
- Male Pelvis Model:
- Verification Set: 5 unique patient CT image sets.
- Validation Set: 20 unique 3D CT image sets.
- Data Provenance: The training data (from which these test sets are distinct) came "from a global variety of institutions and equipment from patients undergoing RT." Again, the document does not specify the exact countries or whether it was retrospective/prospective, but implies existing clinical data.
2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- The text states that the verification sets for both models had "expert contours." However, it does not specify the number of experts or their qualifications (e.g., "radiologist with 10 years of experience").
3. Adjudication method (e.g., 2+1, 3+1, none) for the test set:
- The document mentions "expert contours" were used for the verification sets. It does not specify an adjudication method used if multiple experts were involved (e.g., 2+1, 3+1). If only one expert reviewed each, then no adjudication would be necessary.
4. Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance:
- No, an MRMC comparative effectiveness study was not done. The document explicitly states: "No animal or clinical tests were performed to establish substantial equivalence with the predicate device." The study focused on the algorithm's performance against expert contours, not on human reader improvement with AI assistance.
5. Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance assessment was done:
- Yes, a standalone (algorithm only) performance assessment was done. The described testing ("average DICE coefficient over all structures was determined") measures the algorithm's output (auto-segmented contours) directly against the established ground truth (expert contours), without human intervention in the loop during the performance measurement itself. The device is designed as an API-only computational engine.
6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- The ground truth for the test (verification) sets was established using "expert contours." It is not specified if this was a single expert per case or expert consensus.
7. The sample size for the training set:
- Head & Neck Model: Trained on 66 unique clinical patient 3D CT image sets.
- Male Pelvis Model: Trained on 205 unique patient 3D CT image sets.
8. How the ground truth for the training set was established:
- The document states the models were "pre-trained on specific datasets." It does not explicitly describe how the ground truth within these training datasets was established. It is implied that these datasets contained "expert contours" (similar to the verification data), but this is not explicitly stated for the training data itself.
§ 892.2050 Medical image management and processing system.
(a)
Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.
(b)
Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).