MammoScreen® is intended for use as a concurrent reading aid for interpreting physicians, to help identify findings on screening FFDM and DBT acquired with compatible mammography systems and assess their level of suspicion. Output of the device includes marks placed on findings on the mammogram and level of suspicion scores. The findings could be soft tissue lesions or calcifications. The level of suspicion score is expressed at the finding level, for each breast and overall for the mammogram. Patient management decisions should not be made solely on the basis of analysis by MammoScreen®.

Device Description

MammoScreen 2.0 automatically processes the four views (one CC and one MLO per breast) of standard screening FFDM or DBT, and outputs a corresponding report on a separate screen, alongside the monitors used for reading. The report is designed to be easily readable with very few interactions: it provides an overall level of suspicion for each exam and gives explicit visual indications when highly suspicious exams are detected.

MammoScreen 2.0 detects and characterizes findings on a scale from one to ten, referred to as the MammoScreen score. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion.

Furthermore, MammoScreen 2.0 provides a high level of interpretability. Results are by construction consistent at the finding, breast and mammogram level. A breast takes on the highest score of its detected findings, and the level of suspicion for the exam is driven by the breast(s) with the highest score. Therefore, it is always possible to track a high suspicion of malignancy for an exam to the corresponding breast(s), and to a specific finding within the breast(s).
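
The aggregation rule described above (finding to breast to exam) can be sketched in a few lines of Python. This is only an illustration of the stated max-based rule, not the device's implementation: the function names, the dictionary layout, and the choice to give a breast with no findings the lowest score of 1 are all assumptions made for the example.

```python
from typing import Dict, List

LOWEST_SCORE = 1  # assumption: a breast with no detected findings gets the minimum score


def breast_score(finding_scores: List[int]) -> int:
    """A breast takes on the highest score of its detected findings."""
    return max(finding_scores, default=LOWEST_SCORE)


def exam_score(findings_by_breast: Dict[str, List[int]]) -> int:
    """The exam-level score is driven by the breast(s) with the highest score."""
    return max(
        (breast_score(scores) for scores in findings_by_breast.values()),
        default=LOWEST_SCORE,
    )


def trace_top_findings(findings_by_breast: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Trace the exam score back to the breast(s) and finding(s) that produced it."""
    top = exam_score(findings_by_breast)
    return {
        breast: [score for score in scores if score == top]
        for breast, scores in findings_by_breast.items()
        if breast_score(scores) == top
    }


# Hypothetical exam: two findings in the left breast, one in the right.
example = {"left": [2, 7], "right": [4]}
print(exam_score(example))          # exam-level score
print(trace_top_findings(example))  # breast(s) and finding(s) behind that score
```

Because every level is a maximum over the level below it, a high exam score can always be traced to at least one breast and one finding, which is the consistency property the description refers to.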

AI/ML Overview

The acceptance criteria and the study supporting them, as described in the provided text, are summarized below.

1. Table of Acceptance Criteria and Reported Device Performance

Performance Metric	Acceptance Criteria (Implicit)	Reported Device Performance (FFDM)	Reported Device Performance (DBT)
Radiologist Performance with AI Aid (AUC)	Superior to unaided radiologist performance	Increased from 0.77 to 0.80	Increased from 0.79 to 0.83
Standalone Performance (AUC)	Non-inferior to unaided radiologist performance	0.79 (non-inferior to 0.77 unaided)	0.84 (superior to 0.79 unaided)
Standalone Performance vs. Predicate (FFDM)	Non-inferior to predicate device	Achieved non-inferior performance	Not applicable

2. Sample Size Used for the Test Set and Data Provenance

Sample Size (FFDM & DBT): 240 cases (enriched sample set)
Data Provenance: The country of origin is not stated. The text describes the evaluations as "reader studies" on an enriched sample set, but it does not specify whether the cases were collected prospectively or selected retrospectively.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

Number of Experts: 14 for the 2D (FFDM) study and 20 for the 3D (DBT) study.
Qualifications: "MQSA-qualified and ABR-certified readers." (MQSA is the U.S. Mammography Quality Standards Act and ABR is the American Board of Radiology, suggesting a US context for the experts.)

4. Adjudication Method for the Test Set

The provided text does not explicitly state the adjudication method used to establish the ground truth for the test set. It mentions an "enriched sample set" and "MQSA-qualified and ABR-certified readers," suggesting expert consensus, but the specific process (e.g., 2+1, 3+1) is not detailed.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance

Yes, an MRMC study was done. Clinical validation included two reader studies (one for FFDM and one for DBT) using a multi-reader multi-case (MRMC) cross-over design.
Effect Size of Improvement:
- FFDM: Average AUC for radiologists increased from 0.77 (without AI) to 0.80 (with AI). (Improvement: 0.03 AUC)
- DBT: Average AUC for radiologists increased from 0.79 (without AI) to 0.83 (with AI). (Improvement: 0.04 AUC)
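
The improvements quoted above are simple differences of the reported average AUCs. A minimal Python check of that arithmetic follows; the dictionary layout and function name are illustrative, and only the four AUC values come from the text.

```python
# Average reader AUCs as reported in the text (unaided vs. aided by the device).
reported_auc = {
    "FFDM": {"unaided": 0.77, "aided": 0.80},
    "DBT": {"unaided": 0.79, "aided": 0.83},
}


def auc_gain(unaided: float, aided: float) -> float:
    """Absolute AUC improvement, rounded to the two decimals the text reports."""
    return round(aided - unaided, 2)


gains = {
    modality: auc_gain(values["unaided"], values["aided"])
    for modality, values in reported_auc.items()
}
print(gains)
```

These are point differences only. The text gives no confidence intervals or p-values, so the sketch says nothing about statistical significance.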

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

Yes, standalone performance was evaluated. The objectives of the studies included determining: "Whether the performance of MammoScreen standalone is superior to unaided radiologist performance" and "Whether the performance of MammoScreen standalone is non-inferior to aided radiologist performance."
Standalone Performance Results:
- FFDM: AUC = 0.79 (found to be non-inferior to the average unaided radiologists' performance of 0.77).
- DBT: AUC = 0.84 (found to be superior to the average unaided radiologists' performance of 0.79).
- Additionally, standalone performance tests for MammoScreen 2.0 (FFDM) demonstrated non-inferiority compared to the predicate device.
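
For readers unfamiliar with the terms, the sketch below shows one common way "superior" and "non-inferior" conclusions are read off a confidence interval for an AUC difference (device minus comparator). It is an assumption-laden illustration: the text reports neither confidence intervals nor a non-inferiority margin, so the margin and interval bounds in the usage lines are hypothetical placeholders, and this convention is not confirmed to be the statistical method used in the submission.

```python
def classify_comparison(ci_lower: float, ci_upper: float, margin: float) -> str:
    """Classify an AUC difference (device minus comparator) from its confidence interval.

    Conventional reading, assumed here:
      - superior:      the whole interval lies above 0
      - non-inferior:  the whole interval lies above -margin
      - otherwise:     non-inferiority is not demonstrated
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "superior"
    if ci_lower > -margin:
        return "non-inferior"
    return "not demonstrated"


# Hypothetical inputs for illustration only; none of these numbers are from the text.
HYPOTHETICAL_MARGIN = 0.03
print(classify_comparison(0.01, 0.09, HYPOTHETICAL_MARGIN))
print(classify_comparison(-0.02, 0.06, HYPOTHETICAL_MARGIN))
print(classify_comparison(-0.05, 0.02, HYPOTHETICAL_MARGIN))
```

Under this convention a superior result is also non-inferior, which is consistent with the DBT result being reported as superior while the FFDM result is reported only as non-inferior.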

7. The Type of Ground Truth Used

The text implicitly suggests expert consensus based on the mention of "MQSA-qualified and ABR-certified readers." It also references the training of deep learning modules with "biopsy-proven examples of breast cancer and normal tissue," indicating that biopsy (pathology) results were used as the ultimate ground truth to establish the benign/malignant status of lesions in the training data, and likely in the test set's ground truth development as well. The study assesses performance in the "detection of breast cancer," linking the ground truth directly to malignancy.

8. The Sample Size for the Training Set

The document states that the deep learning modules were "trained with very large databases of biopsy-proven examples of breast cancer and normal tissue." However, a specific numerical sample size for the training set is not provided.

9. How the Ground Truth for the Training Set Was Established

The ground truth for the training set was established using "biopsy-proven examples of breast cancer and normal tissue." This indicates that histopathological (pathology) results from biopsies served as the definitive ground truth for classifying cases as cancerous or normal during the training of the AI model.
