(133 days)
syngo.CT Extended Functionality is intended to provide advanced visualization tools to prepare and process medical images for diagnostic purpose. The software package is designed to support technicians and physicians in qualitative and quantitative measurements and in the analysis of clinical data that was acquired and reconstructed by Computed Tomography (CT) scanners, and possibly other medical imaging modalities (e.g. MR scanners).
An interface shall enable the connection between the syngo.CT Extended Functionality software package and the interconnected CT Scanner system.
Resulting images created with the syngo.CT Extended Functionality software package can be used to assist trained technicians or physicians in diagnosis.
syngo.CT Extended Functionality is a software bundle that offers tools to support special clinical evaluations. The "tools" are represented by the so-called Extensions. syngo.CT Extended Functionality can be used to create advanced visualizations and measurements on clinical data that was acquired and reconstructed by Computed Tomography (CT) scanners or other medical imaging modalities (e.g. MR scanners) by using the Extensions. Advanced visualizations and measurements are listed as follows. The subject device in the current software version SOMARIS/8 VB51 has been extended by the Extension Pulmonary Density.
This feature provides the possibility to segment opacity regions of CT images of the lungs using an AI algorithm. Pulmonary Density counts image voxels inside opacity regions and calculates the percentages of these voxels relative to the total number of voxels per lobe, per lung, and in total. Afterwards, each of the five lung lobes is assigned a score ranging from 0 to 4 based on the percentage of opacity as follows: 0 (0%), 1 (1%-25%), 2 (26%-50%), 3 (51%-75%), or 4 (76%-100%). Then a summation of the five lobe scores (range of possible scores, 0–20) is generated in the device outputs.
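The scoring rule described above can be illustrated with a short sketch. This is not the device's implementation: the function names, the lobe-keyed dictionary, and the voxel counts are hypothetical. The text lists only whole-number bands, so the handling of fractional percentages (for example 25.4%) is an assumption here: any value above a band's upper limit falls into the next band.

```python
import math

# Lobe abbreviations as used for the five lung lobes.
LOBES = ("RUL", "RML", "RLL", "LUL", "LLL")


def opacity_percentage(opacity_voxels: int, total_voxels: int) -> float:
    """Percentage of voxels in a region that lie inside opacity regions."""
    if total_voxels <= 0:
        raise ValueError("total_voxels must be positive")
    if not 0 <= opacity_voxels <= total_voxels:
        raise ValueError("opacity_voxels must be between 0 and total_voxels")
    return 100.0 * opacity_voxels / total_voxels


def lobe_score(percent: float) -> int:
    """Map a percentage of opacity to a score from 0 to 4.

    Bands from the text: 0 (0%), 1 (1%-25%), 2 (26%-50%),
    3 (51%-75%), 4 (76%-100%).
    Assumption: fractional values are assigned to the half-open bands
    (0, 25], (25, 50], (50, 75], (75, 100].
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0.0:
        return 0
    return min(4, math.ceil(percent / 25.0))


def total_score(percent_by_lobe: dict) -> int:
    """Sum of the five lobe scores (possible range 0 to 20)."""
    return sum(lobe_score(percent_by_lobe[lobe]) for lobe in LOBES)


# Hypothetical voxel counts: (opacity voxels, total voxels) per lobe.
example_counts = {
    "RUL": (0, 1000),
    "RML": (100, 1000),
    "RLL": (300, 1000),
    "LUL": (600, 1000),
    "LLL": (900, 1000),
}
example_percent = {
    lobe: opacity_percentage(opaque, total)
    for lobe, (opaque, total) in example_counts.items()
}
example_total = total_score(example_percent)
print(example_percent, example_total)
```

With these made-up counts the lobe scores are 0, 1, 2, 3, and 4, giving a total of 10 out of a possible 20.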
Here's a breakdown of the acceptance criteria and study information for the syngo.CT Extended Functionality device, specifically focusing on the new "Pulmonary Density" feature.
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria provided in the document are somewhat implicit, primarily relying on achieving "equivalent performance in comparison to the secondary predicate device" and demonstrating performance within established statistical limits.
Acceptance Criterion (Implicit) | Reported Device Performance |
---|---|
Lung Lobe Segmentation Algorithm: | |
High accuracy in computing segmentation masks of the five lung lobes (RUL, RML, RLL, LUL, LLL). Compared against a "secondary predicate device." | Average DICE coefficients ranged from 0.94 to 0.96. This indicates a very high degree of overlap between the algorithm's segmentation and the ground truth. The document states this demonstrates "equivalent performance in comparison to the secondary predicate device." |
Identification of Opaque Regions (AI-based) Algorithm: | |
Agreement with human reads for the percentage of opacity (PO) on a lung lobe level, established through 95%-Limits of Agreement (LoA). Compared against a "secondary predicate device." | 93.0% of the PO values were found within the 95%-Limits of Agreement (LoA) established when comparing the algorithm's performance against human reads. This indicates a strong agreement with human expert assessment. The document states this demonstrates "equivalent performance in comparison to the secondary predicate device." Additionally, "consistent performance has been found for both algorithms across all subgroups" when additional analysis was performed for population-specific subgroups and various technical parameters. |
Overall equivalent performance to the secondary predicate device for segmentation and lung parenchyma categorization. | The detailed performance metrics (DICE coefficients and % within LoA) are presented as evidence of this equivalence. |
Conformance with special controls for medical devices containing software. | Non-clinical tests (integration and functional) were conducted and found acceptable. |
Meeting voluntary standards listed in the document. | Siemens certifies the device will meet standards such as DICOM, IEC 62304, ISO 14971, and IEC 62366-1. |
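For readers unfamiliar with the DICE coefficient reported in the table, the following is a minimal sketch of how such an overlap score is commonly computed for one lobe label. It is not taken from the submission: the function name, the flat label-list representation of a mask, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], ref: Sequence[int], label: int) -> float:
    """Dice overlap for one label between two flattened label masks.

    Dice = 2 * |A intersect B| / (|A| + |B|), where A and B are the sets
    of voxels carrying `label` in the predicted and reference masks.
    """
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of voxels")
    pred_count = sum(1 for p in pred if p == label)
    ref_count = sum(1 for r in ref if r == label)
    overlap = sum(1 for p, r in zip(pred, ref) if p == label and r == label)
    if pred_count + ref_count == 0:
        # Assumed convention: a label absent from both masks counts as
        # perfect agreement.
        return 1.0
    return 2.0 * overlap / (pred_count + ref_count)


# Toy flattened masks: 0 = background, 1 and 2 = two hypothetical lobe labels.
predicted = [1, 1, 1, 2, 2, 0]
reference = [1, 1, 2, 2, 2, 0]

dice_label_1 = dice_coefficient(predicted, reference, label=1)
dice_label_2 = dice_coefficient(predicted, reference, label=2)
print(dice_label_1, dice_label_2)
```

A value of 1.0 means the two masks coincide exactly for that label and 0.0 means they do not overlap at all, which is why averages of 0.94 to 0.96 are read as close agreement.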
2. Sample Size Used for the Test Set and Data Provenance
- Lung Lobe Segmentation Test Set Size: 250 datasets
- Identification of Opaque Regions Test Set Size: 150 datasets
- Data Provenance: "multiple sites across the US and Europe." The document implicitly suggests these are retrospective, as they are referred to as "clinical data."
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
The document states that the ground truth for the "Identification of opaque regions" algorithm's validation was established by "human reads" which were then used to assess "inter-reader variability" and establish "95%-Limits of Agreement (LoA)." However, the number of experts and their specific qualifications (e.g., number of years of experience, specific subspecialty) are not explicitly stated in the provided text. It can be inferred that these were qualified medical professionals, given the context of "human reads" for diagnostic imagery, but specific details are missing.
4. Adjudication Method for the Test Set
The document mentions "inter-reader variability" was assessed for the opaque region identification, implying that multiple human readers provided assessments. However, the specific adjudication method (e.g., 2+1, 3+1 consensus, averaging, etc.) used to establish the final ground truth from these multiple readers, or how the comparison to "human reads" was performed beyond establishing LoA, is not explicitly detailed.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
The document mentions assessing "inter-reader variability" of the percentage of opacity against "human reads" and comparing the algorithm's performance to these human reads. This suggests an element of comparing AI output to human performance. However, it does not explicitly describe a traditional MRMC comparative effectiveness study designed to measure the effect size of human readers improving with AI vs. without AI assistance. The study focuses on the standalone performance of the AI compared to human consensus/variability, rather than the "human-in-the-loop" improvement.
6. If a Standalone Study Was Done
Yes, a standalone study was done. The performance metrics (DICE coefficients for segmentation, and percentage of PO values within LoA compared to human reads for opacity detection) represent the performance of the algorithm itself without human intervention in the interpretation of the AI's output for diagnosis. The study validates the algorithm's output against a defined ground truth derived from human experts.
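The document does not spell out how the 95%-Limits of Agreement were derived or applied. One common approach, shown here purely as an assumption, is the Bland-Altman method: compute the mean and standard deviation of the differences between two readers, take mean ± 1.96 standard deviations as the limits, and then count how many algorithm-versus-reader differences fall inside them. All names and numbers below are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def limits_of_agreement(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Bland-Altman 95% limits of agreement for paired measurements."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    centre = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    return centre - spread, centre + spread


def fraction_within(diffs: Sequence[float], lower: float, upper: float) -> float:
    """Fraction of differences lying inside [lower, upper]."""
    if not diffs:
        raise ValueError("diffs must not be empty")
    return sum(1 for d in diffs if lower <= d <= upper) / len(diffs)


# Hypothetical percentage-of-opacity (PO) values per lobe.
reader_a = [10.0, 20.0, 30.0, 40.0]
reader_b = [12.0, 18.0, 33.0, 37.0]
algorithm = [11.0, 19.0, 31.5, 50.0]

# Limits derived from inter-reader differences (assumed procedure).
lower, upper = limits_of_agreement(reader_a, reader_b)

# Compare the algorithm against the mean of the two readers.
reader_mean = [(x + y) / 2 for x, y in zip(reader_a, reader_b)]
algo_diffs = [alg - ref for alg, ref in zip(algorithm, reader_mean)]
share_inside = fraction_within(algo_diffs, lower, upper)
print(lower, upper, share_inside)
```

In this toy data three of the four algorithm values fall inside the limits, so the share is 0.75; the 93.0% figure in the document would be the analogous share over the actual test set.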
7. The Type of Ground Truth Used
- Lung Lobe Segmentation: The document does not state how the ground truth was produced. It describes the "segmentation of lung lobes" algorithm as similar to that of a previously cleared device but "trained with more data." The reported DICE coefficients imply a reference mask against which the algorithm's mask was compared. Such masks are typically expert-annotated, although the document does not say so explicitly.
- Identification of Opaque Regions: The ground truth for opaque region identification was established through "human reads." This suggests expert consensus or expert interpretation as the basis for the ground truth. "Inter-reader variability" was also assessed, further supporting expert-derived ground truth.
8. The Sample Size for the Training Set
The document states that the lung segmentation algorithm has been "trained with more data" than the secondary predicate device and mentions a "Training cohort: size and properties of data used for training." However, the specific sample size for the training set is not provided in the excerpt.
9. How the Ground Truth for the Training Set Was Established
The document mentions a section for "Description of ground truth / annotations generation" for the training cohort, but this excerpt does not say how the ground truth for the training set was actually established. Given the nature of the algorithms, it would typically involve expert manual annotation or labelling of images.
§ 892.1750 Computed tomography x-ray system.
(a) Identification. A computed tomography x-ray system is a diagnostic x-ray system intended to produce cross-sectional images of the body by computer reconstruction of x-ray transmission data from the same axial plane taken at different angles. This generic type of device may include signal analysis and display equipment, patient and equipment supports, component parts, and accessories.
(b) Classification. Class II.