The Aurora system is a medical tool intended for use by appropriately trained healthcare professionals to aid in detecting, localizing, and diagnosing diseases and in assessing organ function for the evaluation of diseases, trauma, abnormalities, and disorders such as, but not limited to, cardiovascular disease, neurological disorders, and cancer. The system output can also be used by the physician for staging and restaging of tumors and for planning, guiding, and monitoring therapy, including the nuclear medicine part of theragnostic procedures.
GEHC's Aurora is a SPECT-CT system that combines an all-purpose Nuclear Medicine imaging system with the commercially available Revolution Ascend CT system. It is intended for general-purpose Nuclear Medicine imaging procedures as well as head, whole body, cardiac, and vascular CT applications, and for CT-based corrections and anatomical localization of SPECT images. Aurora does not introduce any new Intended Use.
Aurora consists of two back-to-back gantries (i.e. one for the NM sub-system and another for the CT subsystem), patient table, power distribution unit (PDU), operator console with a computer for both the NM acquisition and SmartConsole software and another for the CT software, interconnecting cables, and associated accessories (e.g. NM collimator carts, cardiac trigger monitor, head holder). The CT sub-system main components include the CT gantry, PDU, and CT operator console. All components are from the commercially available GEHC Revolution Ascend CT system.
Here's a breakdown of the acceptance criteria and study details for the Aurora system's deep-learning Automatic Kidney Segmentation algorithm, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Reported Device Performance
| Acceptance Criteria | Reported Device Performance |
|---|---|
| Bench Testing: Average DICE similarity score above the predefined success criterion (specific score not provided) | Bench Testing: The DL Automatic Kidney Segmentation algorithm produced an average DICE score above the predefined success criterion. |
| Clinical Testing: Generated segmentation is of acceptable utility and requires minimal user interaction. | Clinical Testing: Readers' evaluation demonstrated that the generated segmentation was of acceptable utility and required minimal user interaction. |
| Clinical Testing: Quality of the kidneys' segmentation generated by the algorithm is acceptable. | Clinical Testing: All readers attested that the quality of the kidneys' segmentation generated by the algorithm was acceptable. |
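The bench-testing metric above, the DICE similarity score, can be computed directly from a pair of binary masks. The following sketch is illustrative only (the function name and mask representation are assumptions, not GEHC's implementation):

```python
# Illustrative sketch: DICE similarity between a predicted kidney mask
# and a ground-truth mask, both as binary NumPy arrays.
# DICE = 2|A ∩ B| / (|A| + |B|); 1.0 is perfect overlap, 0.0 is none.
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray) -> float:
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```

A bench test like the one described would average this score over all test studies and compare the mean against the predefined success criterion.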
Study Details for Deep-Learning Automatic Kidney Segmentation Algorithm
1. Sample size used for the test set and the data provenance:
* Sample Size: 70 planar NM renal studies.
* Data Provenance: Acquired using GEHC systems from:
* 2 hospitals in the United States
* 1 hospital in Europe
* Nature: Retrospective (the studies were "segregated, and not used in any stage of the algorithm development," implying they were pre-existing data).
* Diversity: Served a diverse patient population including a range of ethnicities and demographics, encompassing a range of dynamic renal clinical scenarios, detection technologies, collimators, tracers, scan parameters, and patient age.
2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
* Number of Experts for Bench Testing Ground Truth: One (1).
* Qualifications: "An experienced Nuclear Medicine physician."
* Number of Experts for Clinical Testing Evaluation: Three (3) qualified U.S. readers.
* Qualifications: "Qualified U.S. readers" (further specific qualifications like years of experience or board certification are not detailed).
3. Adjudication method for the test set:
* For Bench Testing Ground Truth: The ground truth contours were reviewed and confirmed by a single experienced Nuclear Medicine physician. This suggests a form of expert consensus, but without multiple experts, it's not a multi-expert adjudication like 2+1 or 3+1. It's best described as single expert confirmation.
* For Clinical Testing: The three qualified U.S. readers independently assessed the quality of segmentation using a 4-point Likert scale. There is no mention of an adjudication process among these three readers, implying their individual assessments contributed to the overall evaluation.
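An independent-reader evaluation like the one described can be summarized without any adjudication step. This is a hypothetical sketch (the 4-point scale, acceptability threshold, and function names are assumptions not stated in the 510(k)):

```python
# Hypothetical sketch: summarizing independent 4-point Likert ratings
# from three readers per case. A case is "acceptable" here only if
# every reader rates it at or above an assumed threshold.
from statistics import mean

LIKERT_THRESHOLD = 3  # assumed cutoff on a 1-4 scale (not from the source)

def case_acceptable(ratings, threshold=LIKERT_THRESHOLD):
    return all(r >= threshold for r in ratings)

def summarize(cases):
    """cases: list of per-case rating tuples -> (acceptance rate, mean rating)."""
    accepted = sum(case_acceptable(c) for c in cases)
    all_ratings = [r for c in cases for r in c]
    return accepted / len(cases), mean(all_ratings)
```

Because each reader's rating stands on its own, there is no 2+1 or 3+1 tie-breaking; the per-reader assessments feed directly into the overall summary.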
4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done:
* No, a multi-reader multi-case (MRMC) comparative effectiveness study comparing human readers with AI assistance vs. without AI assistance was not explicitly described.
* The clinical testing involved multiple readers evaluating the quality of the algorithm's segmentation itself, rather than assessing their own diagnostic performance with and without AI. The focus was on the utility and acceptability of the AI output for the readers.
5. Effect size of how much human readers improve with AI vs without AI assistance:
* This information is not provided as a comparative effectiveness study was not explicitly conducted. The study assessed the acceptability of the AI's output, not the improvement in human reader performance.
6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:
* Yes, a standalone performance evaluation of the algorithm was done. This is described as "Bench Testing" where the algorithm's generated contours were compared directly against the ground truth (GT) contours using the DICE similarity score. The "clinical testing" involved human readers evaluating the AI output, but the bench testing was algorithm-only.
7. The type of ground truth used:
* Expert Consensus: The ground truth for the bench testing (GT contours) was established by an "experienced Nuclear Medicine physician." While only one physician is mentioned, it's considered an expert-derived ground truth.
8. The sample size for the training set:
* The document does not explicitly state the sample size used for the training set of the deep learning algorithm. It only mentions that the 70 test studies "were segregated, and not used in any stage of the algorithm development," which implies they were distinct from the training data.
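The segregation described ("not used in any stage of the algorithm development") is typically achieved by holding out whole patients rather than individual images, so no test patient leaks into training or tuning. A minimal sketch, assuming a patient-to-studies mapping (the data structure and function name are illustrative, not from the document):

```python
# Illustrative sketch: hold out entire patients so that no study from a
# test patient appears in any stage of algorithm development.
import random

def split_by_patient(studies_by_patient, n_test_patients, seed=0):
    """studies_by_patient: dict of patient id -> list of study ids.
    Returns (train_study_ids, test_study_ids) with no patient overlap."""
    patients = sorted(studies_by_patient)
    rng = random.Random(seed)  # fixed seed for a reproducible split
    test_patients = set(rng.sample(patients, n_test_patients))
    train, test = [], []
    for patient, studies in studies_by_patient.items():
        (test if patient in test_patients else train).extend(studies)
    return train, test
```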
9. How the ground truth for the training set was established:
* The document does not explicitly state how the ground truth for the training set was established. It is only mentioned for the test set.
§ 892.1200 Emission computed tomography system.
(a) Identification. An emission computed tomography system is a device intended to detect the location and distribution of gamma ray- and positron-emitting radionuclides in the body and produce cross-sectional images through computer reconstruction of the data. This generic type of device may include signal analysis and display equipment, patient and equipment supports, radionuclide anatomical markers, component parts, and accessories.
(b) Classification. Class II.