510(k) Data Aggregation (182 days)
MammoScreen® 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks identifying findings as soft-tissue lesions or calcifications on mammograms, along with their level-of-suspicion scores. Each detected finding is characterized by lesion type as mass/asymmetry, distortion, or calcifications. The level-of-suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of each finding, including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 3.
MammoScreen is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen outputs detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
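The max-driven roll-up described above (finding score → breast score → case score) can be sketched as follows. This is an illustrative sketch only; `Finding` and `aggregate_scores` are hypothetical names, not the vendor's API:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    lesion_type: str   # "mass/asymmetry", "distortion", or "calcifications"
    score: int         # MammoScreen Score, 1 (low suspicion) to 10 (high)
    breast: str        # "left" or "right"

def aggregate_scores(findings):
    """Roll finding-level scores up to breast-level and case-level scores,
    each driven by the highest finding score, as described above.
    Returning 0 when there are no findings is an illustrative convention."""
    breast_scores = {}
    for f in findings:
        breast_scores[f.breast] = max(breast_scores.get(f.breast, 0), f.score)
    case_score = max(breast_scores.values(), default=0)
    return breast_scores, case_score
```

For example, findings scored 7 (left) and 3 (right) yield breast scores `{"left": 7, "right": 3}` and a case score of 7.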
MammoScreen is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen can use prior examinations in the analysis.
MammoScreen can also aid in the reporting process by populating an initial report with chosen findings, including lesion type and position (quadrant, depth, and distance from the nipple).
The results indicating potential breast cancer, identified by MammoScreen, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report. Additionally, certain outputs like the case score can be reported into the patient management worklist.
Note that the MammoScreen outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
Here's a summary of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
The acceptance criteria are not explicitly listed in a separate table within the document. However, the clinical and standalone performance studies establish benchmarks and demonstrate achievement of certain levels of accuracy, sensitivity, and specificity. The criteria are implied through the statement "MammoScreen 3 achieved superior performance compared to the predicate device" and the detailed statistical results provided.
Table of Performance Results
Given that specific "acceptance criteria" (e.g., "AUROC must be > X") are not explicitly stated, I will present the reported performance of MammoScreen 3 in both co-reading and standalone modes, along with improvements (effect sizes) in the co-reading scenario.
| Performance Metric | Acceptance Criteria (Implied) | MammoScreen 3 (Co-reading with Radiologists) | MammoScreen 3 (Standalone) | Notes |
|---|---|---|---|---|
| Radiologist Performance (Co-reading) | Superior to unaided radiologist performance | | | |
| Average AUROC (aided) | Higher than unaided | 0.871 [0.829, 0.912] | N/A | Unaided: 0.797 [0.752, 0.843] |
| Average Sensitivity (aided) | Higher than unaided | 0.793 [0.725, 0.860] | N/A | Unaided: 0.706 [0.633, 0.780] |
| Average Specificity (aided) | Higher than unaided | 0.836 [0.805, 0.867] | N/A | Unaided: 0.815 [0.782, 0.848] |
| Standalone AUROC (overall mammogram level) | Superior to unaided radiologists; non-inferior to aided radiologists | N/A | 0.883 [0.837, 0.929] | Superior to unaided: ΔAUROC = +0.085 (p < 0.0001) |
| Standalone Sensitivity | N/A | N/A | 0.833 [0.756, 0.911] | |
| Standalone Specificity | N/A | N/A | 0.793 [0.728, 0.858] | |
| Standalone AUROC (detailed study, overall mammogram level) | N/A | N/A | 0.927 (0.911, 0.942) | For breast cancer detection, overall |
| Lesion Type Assessment (standalone) | PPA & NPA | | | |
| Overall PPA | N/A | N/A | 0.784 (0.758, 0.811) | |
| Overall NPA | N/A | N/A | 0.893 (0.880, 0.906) | |
| Mass/asymmetry PPA | N/A | N/A | 0.868 (0.838, 0.894) | |
| Mass/asymmetry NPA | N/A | N/A | 0.783 (0.752, 0.815) | |
| Distortion PPA | N/A | N/A | 0.544 (0.475, 0.611) | |
| Distortion NPA | N/A | N/A | 0.947 (0.932, 0.962) | |
| Calcifications PPA | N/A | N/A | 0.941 (0.911, 0.967) | |
| Calcifications NPA | N/A | N/A | 0.950 (0.934, 0.964) | |
| CC Quadrant Assessment (standalone) | PPA & NPA | | | |
| Overall PPA | N/A | N/A | 0.765 (0.726, 0.810) | |
| Overall NPA | N/A | N/A | 0.963 (0.951, 0.965) | |
| MLO Quadrant Assessment (standalone) | PPA & NPA | | | |
| Overall PPA | N/A | N/A | 0.471 (0.425, 0.523) | |
| Overall NPA | N/A | N/A | 0.889 (0.878, 0.902) | |
| Depth Assessment (standalone) | PPA & NPA | | | |
| Overall PPA | N/A | N/A | 0.617 (0.587, 0.644) | |
| Overall NPA | N/A | N/A | 0.943 (0.932, 0.953) | |
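The PPA and NPA figures in the table are standard agreement metrics against the reference standard. As a hedged sketch (the counts below are illustrative, not taken from the submission):

```python
def ppa_npa(tp, fn, tn, fp):
    """Positive and negative percentage agreement against a reference standard.
    PPA = TP / (TP + FN)  -- fraction of reference-positive cases the device agrees on.
    NPA = TN / (TN + FP)  -- fraction of reference-negative cases the device agrees on.
    These are the standard definitions; the counts passed in are illustrative."""
    ppa = tp / (tp + fn)
    npa = tn / (tn + fp)
    return ppa, npa

# Illustrative example: 80 of 100 reference-positive and 90 of 100
# reference-negative cases in agreement gives PPA 0.80 and NPA 0.90.
ppa, npa = ppa_npa(tp=80, fn=20, tn=90, fp=10)
```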
1. Sample sizes used for the test set and data provenance:
- MRMC Study (AI-aided reading):
  - Sample Size: 240 combined DBT/2D mammograms (DBT+FFDM or DBT+2DSM), each with a prior examination.
  - Data Provenance: Not explicitly stated, but the use of MQSA-qualified and ABR-certified radiologists suggests US-based data or a study conducted under US regulatory standards. The study was retrospective (pre-collected cases).
- Standalone Performance Study:
  - Sample Size: 7,544 exams from 4,429 patients.
  - Data Provenance: Prospective, from 3 US centers. The demographics table provides the distribution of race, age, and imaging modalities (Hologic is the only manufacturer represented), explicitly confirming US origin. Exam dates range from 2005 to 2023.
2. Number of experts used to establish the ground truth for the test set and their qualifications:
- MRMC Study (AI-aided reading): The method for establishing ground truth for the 240 cases is not explicitly stated, but these served as truth cases for evaluating reader and system performance; given the study type, ground truth was most likely based on expert radiologist consensus or pathology confirmation.
- Standalone Performance Study: The method for the 7,544 cases is likewise not explicitly stated, but the description "Cancer status: Malignant: 23% / Normal/benign: 77%" implies a confirmed ground truth, likely through pathology reports or long-term follow-up. The reference to a "reference standard" for lesion type, quadrant, and depth assessment also suggests expert review or confirmed diagnoses.
3. Adjudication method for the test set:
- MRMC Study: Not explicitly mentioned.
- Standalone Performance Study: Not explicitly mentioned.
4. If a multi reader multi case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance:
- Yes, an MRMC study was done.
- Effect Size of Improvement with AI:
- Average AUROC: Increase of +0.074 [0.047 - 0.101] (p-value < 0.001). (From 0.797 unaided to 0.871 aided).
- Average Sensitivity: Increase of +0.086 [0.040 - 0.133] (p-value < 0.001). (From 0.706 unaided to 0.793 aided).
- Average Specificity: Increase of +0.021 [0.006 - 0.036] (p-value 0.007). (From 0.815 unaided to 0.836 aided).
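As a sanity check, the effect sizes above are simply the aided-minus-unaided differences of the reported averages; any residual discrepancy at the third decimal (e.g. 0.793 − 0.706 = 0.087 versus the reported +0.086) reflects rounding of the underlying unrounded averages:

```python
# Aided-minus-unaided differences for the reported reader averages.
# Values are copied from the MRMC results above; small discrepancies
# (<= 0.001) versus the reported effect sizes reflect rounding.
metrics = {
    "AUROC":       (0.797, 0.871, 0.074),
    "Sensitivity": (0.706, 0.793, 0.086),
    "Specificity": (0.815, 0.836, 0.021),
}
for name, (unaided, aided, reported) in metrics.items():
    delta = aided - unaided
    print(f"{name}: {delta:+.3f} (reported {reported:+.3f})")
```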
5. If a standalone (i.e., algorithm only without human-in-the loop performance) was done:
- Yes, a standalone performance study was done.
- Overall AUROC at mammogram level: 0.883 [0.837 - 0.929].
- This was found to be superior to radiologists in unaided reading conditions (ΔAUROC = +0.085 [0.044 - 0.127], p-value <0.0001).
- It was also non-inferior to radiologists in aided reading conditions (ΔAUROC = +0.012 [-0.015, 0.039], p-value < 0.0001).
- Standalone sensitivity was 0.833 [0.756 – 0.911] and specificity was 0.793 [0.728 – 0.858].
- Detailed standalone performance by subgroup (density, race, source, age, lesion type, lesion size, lesion severity, imaging combination, prior image combination, prior time difference) was also provided, along with lesion type, quadrant, and depth assessment performance (PPA and NPA).
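The AUROC figures in items 4 and 5 have a simple probabilistic reading: the chance that a randomly chosen cancer case receives a higher score than a randomly chosen non-cancer case. A minimal sketch of the empirical (Mann-Whitney) estimator, using toy scores rather than study data:

```python
def auroc(pos_scores, neg_scores):
    """Empirical AUROC via the Mann-Whitney U statistic: the probability
    that a randomly chosen positive case scores higher than a randomly
    chosen negative one, with ties counted as half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Toy example: three cancer scores vs. three non-cancer scores.
value = auroc([9, 7, 8], [2, 7, 3])  # 8.5 of 9 pairwise comparisons won
```

This pairwise-comparison view makes the reported standalone AUROC of 0.883 directly interpretable as a discrimination probability.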
6. The type of ground truth used:
- MRMC Study: Not explicitly detailed, but usually based on pathology or rigorous follow-up.
- Standalone Performance Study: The ground truth for cancer status is indicated by "Malignant: 23% / Normal/benign: 77%," implying pathology confirmation or long-term follow-up. For lesion type, quadrant, and depth assessments, it refers to a "reference standard," which typically indicates expert consensus or pathology correlation.
7. The sample size for the training set:
- The training set sample size is not explicitly stated in the provided text. The document mentions that the deep learning modules are "trained with large databases of biopsy-proven examples of breast cancer and normal tissue," but specific numbers are not given.
8. How the ground truth for the training set was established:
- The ground truth for the training set was established using "large databases of biopsy-proven examples of breast cancer and normal tissue." This implies that the training data included cases with definitive diagnostic outcomes (e.g., via biopsy with histopathological confirmation).