
Prostate MR AI is a plug-in Radiological Computer Assisted Detection and Diagnosis Software device intended to be used with a separate hosting application as a concurrent reading aid to assist radiologists in the interpretation of a prostate MRI examination acquired according to the PI-RADS standard in adult men (40 years and older) with suspected cancer in treatment naïve prostate glands. The plug-in software analyzes non-contrast T2 weighted (T2W) and diffusion weighted image (DWI) series to segment the prostate gland and to provide an automatic detection and segmentation of regions suspicious for cancer. For each suspicious region detected, the algorithm moreover provides a lesion Score, by way of PI-RADS interpretation suggestion. Outputs of the device should be interpreted consistently with ACR recommendations using all available MR data (e.g., dynamic contrast enhanced images [if available]). Patient management decisions should not be made solely based on analysis by the Prostate MR AI algorithm.

Device Description

This premarket notification addresses the Siemens Healthineers Prostate MR AI (VA10A) Radiological Computer Assisted Detection and Diagnosis Software (CADe/CADx). Prostate MR AI is a Computer Assisted Detection and Diagnosis algorithm designed to plug into a hosting workflow that assists radiologists in the detection of suspicious lesions and their classification. It is used as a concurrent reading aid to assist radiologists in the interpretation of a prostate MRI examination acquired according to the PI-RADS standard. The automatic lesion detection requires transversal T2W and DWI series as inputs. The device automatically exports a list of detected prostate regions that are suspicious for cancer (each list entry consists of contours and a classification by Score and Level of Suspicion (LoS)), a computed suspicion map, and a per-case LoS. The results of the Prostate MR AI plug-in (with the case-level LoS, lesion center points, lesion diameters, lesion ADC median, lesion 10th percentile, suspicion map, and non-PZ segmentation considered optional) are to be shown in a hosting application that allows the radiologist to view the original case, as well as confirm, reject, or edit lesion candidates with their contours and Scores as generated by the Prostate MR AI plug-in. Moreover, the radiologist can add lesions with contours and PI-RADS scores and finalize the case. In addition, the outputs include an automatically computed prostate segmentation, as well as sub-segmentations of the peripheral zone and the rest of the prostate (non-PZ). The algorithm will augment the prostate workflow of currently cleared syngo.MR General Engine if activated via a separate license on the General Engine.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

Acceptance Criteria and Reported Device Performance

Acceptance Criteria	Reported Device Performance
Automatic Prostate Segmentation
Median Dice score between AI algorithm results and ground truth masks exceeds 0.9.	The median of the Dice score between the AI algorithm results and the corresponding ground truth masks exceeds the threshold of 0.9.
Median normalized volume difference between algorithm results and ground truth masks is within ±5%.	The median of the normalized volume difference between the algorithm results and the corresponding ground truth masks is within a ±5% range.
AI algorithm results are statistically non-inferior to individual reader variability (5% margin of error, 5% significance level).	The AI algorithm results as compared to any individual reader are statistically non-inferior based on variabilities that existed among the individual readers within the 5% margin of error and 5% significance level.
Prostate Lesion Detection and Classification
Case-level sensitivity of lesion detection ≥ 0.80 for both radiology and pathology ground truth.	The case-level sensitivity of the lesion detection is equal or greater than 0.80 for both radiology and pathology ground truth.
False positive rate per case of lesion detection < 1 false positive per case for radiology ground truth.	The false positive rate per case of the lesion detection is smaller than one false positive per case for radiology ground truth.
Accuracy of PI-RADS classification of radiology ground truth lesions (detected by algorithm) ≥ 0.8.	The accuracy of the PI-RADS classification of radiology ground truth lesions detected by the algorithm is equal or greater than 0.8.
Non-inferior performance in GE vs Siemens and African American vs non-African American cases, and in cases with peripheral zone vs non-peripheral lesions.	The non-inferior performance of the subject device in GE vs Siemens and African American vs non-African American cases, and in cases with peripheral zone vs non-peripheral lesions was demonstrated. (Note: Specific metrics for this non-inferiority are not explicitly stated as distinct numerical criteria but are stated as "met".)
Clinical Performance (Reader Study - Case-level discrimination of Gleason Grade Group ≥ 1)
Statistically significant improvement in AUROC for aided reading vs unaided reading.	Fully Inclusive Analysis: AUROC improved from 0.6758 (unaided) to 0.7010 (aided), difference of 0.0252 (95% C.I. [0.0011, 0.0493]; P=0.040). Maximally Restrictive Analysis: AUROC improved from 0.6579 (unaided) to 0.6948 (aided), difference of 0.0368 (95% C.I. [0.0108, 0.0628]; P=0.006). In both analyses, the improvement was statistically significant and the primary endpoint thus met.
Clinical Performance (Reader Study - Lesion-level reading performance)
Statistically significant improvement in AUwAFROC for aided reading vs unaided reading.	Fully Inclusive Analysis: AUwAFROC improved in aided vs. unaided reading by 0.0350 (95% C.I. [0.0020, 0.0681]; P=0.037). Maximally Restrictive Analysis: AUwAFROC improved in aided vs. unaided reading by 0.0302 (95% C.I. [0.0080, 0.0520]; P=0.008). In both analyses, the improvement was statistically significant and the secondary endpoint thus met.
Statistically significant improvement in Fleiss' Kappa for interreader agreement in per-case PI-RADS scores for aided reading vs unaided reading.	Fleiss' Kappa improved from 0.283 (unaided) to 0.371 (aided), with a difference of 0.087 (95% C.I. [0.051, 0.125]). The improvement was statistically significant (P<0.0001).
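
The two segmentation criteria in the table (median Dice above 0.9, median normalized volume difference within ±5%) can be illustrated with a minimal sketch. The summary does not define how the volume difference is normalized; the code below assumes normalization by the ground truth volume and equal voxel sizes. The toy masks and all function names are made up for illustration and are not study data or the manufacturer's method.

```python
from statistics import median

def dice_score(pred, truth):
    """Dice = 2|A∩B| / (|A| + |B|) for two flat binary masks (0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total

def normalized_volume_difference(pred, truth):
    """(V_pred - V_truth) / V_truth; normalizing by V_truth is an assumption."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground truth mask is empty")
    return (v_pred - v_truth) / v_truth

# Toy cases (made-up masks): (algorithm mask, ground truth mask)
cases = [
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0]),
    ([1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 0, 0]),
    ([1, 1, 1, 1, 1, 0], [1, 1, 1, 1, 0, 0]),
]
dice_values = [dice_score(p, t) for p, t in cases]
nvd_values = [normalized_volume_difference(p, t) for p, t in cases]

median_dice = median(dice_values)
median_nvd = median(nvd_values)
# Apply the two thresholds from the acceptance criteria table.
passes = median_dice > 0.9 and abs(median_nvd) <= 0.05
print(f"median Dice = {median_dice:.3f}, median NVD = {median_nvd:+.1%}, passes = {passes}")
```

Both criteria are stated on the median across cases, so a few poorly segmented cases do not by themselves cause a failure.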

Study Information

2. Sample size used for the test set and the data provenance:

Automatic Prostate Segmentation: 222 transversal T2 series.
- Provenance: More than 10 clinical sites.
- Retrospective/Prospective: Not explicitly stated; the comparison against radiologist-generated ground truth suggests retrospective use of existing scans.
Prostate Lesion Detection and Classification (Standalone Performance):
- 105 cases from 6 sites (against radiology ground truth).
- 115 cases from 6 sites (against pathology ground truth).
- 340 cases from the multi-reader multi-case study, also used for the standalone evaluation (these cases were selected retrospectively for the reader study).
- Provenance: 6 sites (for 105 and 115 cases), and two US sites (for 340 cases).
- Retrospective/Prospective: The cases for the lesion detection and classification evaluation were used to compare against established ground truths, suggesting retrospective analysis of existing data. The cases for the reader study were retrospectively selected.
Multi-Reader Multi-Case (MRMC) Study: 340 cases.
- Provenance: Two US sites. Cases were consecutive and specifically included additional consecutive patient cases from men of African descent to ensure at least 13% Black or African American ethnicity.
- Retrospective/Prospective: Cases were selected retrospectively.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

Automatic Prostate Segmentation: 3 expert radiologists. Beyond the description "expert", no years of experience or subspecialty are stated.
Prostate Lesion Detection and Classification (Radiology Ground Truth): 3 expert radiologists in prostate MRI reading.
MRMC Study (Lesion-level reference standard): 3 experienced radiologists acting as Truthers.

4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:

Automatic Prostate Segmentation: Pixel-wise consensus among the 3 expert radiologists.
Prostate Lesion Detection and Classification (Radiology Ground Truth): Consensus reading of the 3 expert radiologists.
MRMC Study (Case-level reference standard): Biopsy results (Gleason Grade Group GGG ≥ 1), or for cases without biopsy, PSA density and follow-up data.
MRMC Study (Lesion-level reference standard): Consensus lesions with a consensus PI-RADS of at least 3, determined by majority voting among the 3 experienced radiologists.
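
As a rough illustration of how three readers' annotations can be merged, the sketch below forms a pixel-wise mask by majority vote. The summary states only that the segmentation ground truth was a "pixel-wise consensus" and does not say how that consensus was formed, so the majority rule here is an assumption (a strict intersection would be another possible reading). The function name and the toy masks are hypothetical.

```python
def majority_vote_mask(masks):
    """Pixel-wise majority vote over an odd number of flat binary masks."""
    if not masks or len(masks) % 2 == 0:
        raise ValueError("need an odd, non-zero number of masks")
    if len({len(m) for m in masks}) != 1:
        raise ValueError("all masks must have the same length")
    needed = len(masks) // 2 + 1  # 2 of 3 readers in this example
    return [1 if sum(votes) >= needed else 0 for votes in zip(*masks)]

# Toy annotations from three readers (made up, not study data)
reader_a = [1, 1, 1, 0, 0]
reader_b = [1, 1, 0, 0, 0]
reader_c = [0, 1, 1, 1, 0]

consensus = majority_vote_mask([reader_a, reader_b, reader_c])
print(consensus)  # [1, 1, 1, 0, 0]
```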

5. Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs without AI assistance:

Yes, an MRMC study was done with a paired split-plot design, combining two fully-crossed MRMC sub-studies.

Case-level AUROC improvement (discriminating Gleason Grade Group ≥ 1):
- Fully Inclusive Analysis: +0.0252 (from 0.6758 unaided to 0.7010 aided).
- Maximally Restrictive Analysis: +0.0368 (from 0.6579 unaided to 0.6948 aided).
Lesion-level AUwAFROC improvement:
- Fully Inclusive Analysis: +0.0350.
- Maximally Restrictive Analysis: +0.0302.
Fleiss' Kappa (interreader agreement in per-case PI-RADS scores) improvement: +0.087 (from 0.283 unaided to 0.371 aided).
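
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of the standard Fleiss' kappa formula. The rating tables are made-up toy data with three readers and five PI-RADS categories; they are not the study's data, and the sketch does not reproduce the study's significance testing.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table where ratings[i][j] is the number of readers
    who assigned case i to category j. Every case needs the same reader count."""
    n_cases = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("each case must be rated by the same number (>= 2) of readers")
    n_categories = len(ratings[0])
    # Share of all assignments that fall into each category.
    p_cat = [sum(row[j] for row in ratings) / (n_cases * n_raters)
             for j in range(n_categories)]
    # Per-case agreement: fraction of reader pairs that agree on the case.
    p_case = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in ratings]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)

# Toy tables: rows are cases, columns are PI-RADS 1..5, entries are reader counts.
unaided = [[1, 1, 1, 0, 0], [0, 1, 1, 1, 0], [0, 0, 2, 1, 0], [0, 0, 0, 2, 1]]
aided = [[0, 3, 0, 0, 0], [0, 0, 2, 1, 0], [0, 0, 3, 0, 0], [0, 0, 0, 1, 2]]

kappa_unaided = fleiss_kappa(unaided)
kappa_aided = fleiss_kappa(aided)
print(f"unaided = {kappa_unaided:.3f}, aided = {kappa_aided:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so the reported values of 0.283 and 0.371 both sit in the lower part of the scale.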

6. Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done:

Yes, standalone performance was evaluated for:

Automatic Prostate Segmentation: Compared algorithm results to ground truth generated by radiologists.
Prostate Lesion Detection and Classification: Compared automatic detection and classification results to radiology ground truth and pathology ground truth.
MRMC Study (AI Standalone reference): The ROC curves shown graphically include a "grey curve [that] denotes AI standalone performance."
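
The standalone detection criteria (case-level sensitivity of at least 0.80, fewer than one false positive per case) can be sketched as below. The summary does not spell out how a case counts as detected or how detections are matched to ground truth lesions; this sketch assumes a positive case is detected when at least one ground truth lesion is hit. The `CaseResult` type, its fields, and the toy numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CaseResult:
    """Hypothetical per-case summary; field names are illustrative only."""
    truth_lesions: int    # lesions in the ground truth
    matched_lesions: int  # ground truth lesions hit by a detection
    false_positives: int  # detections matching no ground truth lesion

def case_level_sensitivity(cases):
    """Fraction of positive cases with at least one matched lesion (assumed definition)."""
    positives = [c for c in cases if c.truth_lesions > 0]
    if not positives:
        raise ValueError("no positive cases")
    detected = sum(1 for c in positives if c.matched_lesions > 0)
    return detected / len(positives)

def false_positives_per_case(cases):
    """Total false-positive detections divided by the number of cases."""
    if not cases:
        raise ValueError("no cases")
    return sum(c.false_positives for c in cases) / len(cases)

# Toy results (made up, not study data)
cases = [
    CaseResult(truth_lesions=1, matched_lesions=1, false_positives=0),
    CaseResult(truth_lesions=2, matched_lesions=1, false_positives=1),
    CaseResult(truth_lesions=1, matched_lesions=0, false_positives=2),
    CaseResult(truth_lesions=0, matched_lesions=0, false_positives=0),
    CaseResult(truth_lesions=1, matched_lesions=1, false_positives=0),
]

sensitivity = case_level_sensitivity(cases)  # 3 of 4 positive cases detected
fp_rate = false_positives_per_case(cases)    # 3 false positives over 5 cases
meets_criteria = sensitivity >= 0.80 and fp_rate < 1.0
print(sensitivity, fp_rate, meets_criteria)
```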

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

For Automatic Prostate Segmentation: Pixel-wise consensus from 3 expert radiologists.
For Prostate Lesion Detection and Classification:
- Consensus reading of 3 expert radiologists (radiology ground truth).
- Biopsy results for the same patient (pathology ground truth).
For MRMC Study (Case-level): Biopsy results (Gleason Grade Group GGG ≥ 1), and in cases where biopsy was unavailable, PSA density and follow-up (12 months negative by PSA or MRI).
For MRMC Study (Lesion-level): Consensus lesions with a consensus PI-RADS of at least 3 from majority voting among 3 experienced radiologists.
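
The case-level reference standard reduces each case to a binary label (Gleason Grade Group ≥ 1 or not), which is what an AUROC is computed against. The sketch below uses the rank-based (Mann-Whitney) form of AUROC for a single reader. The rating scale and the toy values are assumptions for illustration; the study's MRMC analysis, with multiple readers and confidence intervals, is more involved than this.

```python
def auroc(labels, scores):
    """Probability that a randomly chosen positive case scores higher than a
    randomly chosen negative case; ties count as one half."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (made up): Gleason Grade Group per case and one reader's per-case score.
ggg = [0, 0, 1, 2, 0, 3]
labels = [g >= 1 for g in ggg]      # positive when GGG >= 1
reader_scores = [2, 3, 3, 4, 1, 5]  # assumed 1..5 suspicion scale

area = auroc(labels, reader_scores)
print(f"AUROC = {area:.4f}")
```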

8. The sample size for the training set:

The document states: "The cases for the reader study were kept completely separate from those used for the training of the Prostate MR AI algorithm." However, it does not specify the sample size for the training set. It only mentions that the AI algorithm was "trained on a database of prostate MR image series acquired according to the PI-RADS standard (non-contrast T2W and DWI image series), and corresponding radiological and/or biopsy findings."

9. How the ground truth for the training set was established:

The ground truth for the training set was established based on "corresponding radiological and/or biopsy findings." Specific details on the adjudication method (e.g., number of experts, consensus process) for the training set are not provided in this document, only the source of the ground truth.
