MammoScreen™ is intended for use as a concurrent reading aid for interpreting physicians, to help identify findings on screening FFDM acquired with compatible mammography systems and assess their level of suspicion. Output of the device includes marks placed on findings on the mammogram and level of suspicion scores. The findings could be soft tissue lesions or calcifications. The level of suspicion score is expressed at the finding level, for each breast and overall for the mammogram. Patient management decisions should not be made solely on the basis of analysis by MammoScreen™.

Device Description

MammoScreen is a software-only device that aids interpreting physicians in identifying focal findings suspicious for breast cancer in screening FFDM (full-field digital mammography) acquired with compatible mammography systems. The product consists of a processing server and a web interface. The software applies algorithms for recognition of suspicious calcifications and soft tissue lesions. These algorithms have been trained on large databases of biopsy proven examples of breast cancer, benign lesions and normal tissue. MammoScreen automatically processes FFDM, and radiologists can use the output of the device concurrently with the reading of mammograms.

The user interface of MammoScreen has several functions:
a) Activation of computer aided detection (CAD) marks to highlight locations, known as findings, where the device detected calcifications or soft tissue lesions suspicious for cancer.
b) Association of findings with a score, known as the MammoScreen Score, which characterizes findings on a 1-10 scale, with increasing level of suspicion. Only the most suspicious findings (with a MammoScreen Score equal to or greater than 5) are initially marked, to limit the number of findings to review. The user shall also review findings with a score of 4 or lower.
c) Indication, with matching markers, when the same finding is detected in multiple views of the FFDM.

MammoScreen is configured as a DICOM Web compliant node in a network and receives its input images from another DICOM node, called "the DICOM Web Server". The MammoScreen output is displayed on the screen of a personal computer compliant with the requirements specified in the User Manual. The image analysis unit includes machine learning components trained to detect positive findings (calcifications and soft tissue lesions).
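
The default display rule described above (findings scoring 5 or higher are marked initially, findings scoring 4 or lower remain available for review) can be illustrated with a minimal sketch. The `Finding` record, its field names, and `split_for_display` are hypothetical names chosen for illustration; they are not part of the device or its documentation.

```python
from dataclasses import dataclass

# From the device description: scores >= 5 are marked initially.
DEFAULT_DISPLAY_THRESHOLD = 5


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one CAD finding (not the device's data model)."""
    finding_id: str
    view: str    # assumed view label, e.g. "L-CC" or "L-MLO"
    kind: str    # "calcification" or "soft_tissue_lesion"
    score: int   # MammoScreen Score, 1 (least suspicious) to 10 (most)

    def __post_init__(self):
        if not 1 <= self.score <= 10:
            raise ValueError(f"score must be in 1..10, got {self.score}")


def split_for_display(findings, threshold=DEFAULT_DISPLAY_THRESHOLD):
    """Return (initially_marked, on_request), preserving input order."""
    marked = [f for f in findings if f.score >= threshold]
    on_request = [f for f in findings if f.score < threshold]
    return marked, on_request


if __name__ == "__main__":
    demo = [
        Finding("a", "L-CC", "calcification", 7),
        Finding("b", "L-MLO", "soft_tissue_lesion", 4),
        Finding("c", "R-CC", "calcification", 5),
    ]
    marked, on_request = split_for_display(demo)
    print([f.finding_id for f in marked])      # findings shown by default
    print([f.finding_id for f in on_request])  # findings shown on user request
```

The threshold is kept as a parameter only to make the rule explicit; the description gives 5 as the fixed default.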

AI/ML Overview

The acceptance criteria, and the study offered as evidence that the device meets them, are summarized below based on the provided text:

Acceptance Criteria and Device Performance

The provided document defines acceptance criteria primarily through comparison with a predicate device and through the results of a clinical reader study. The core acceptance criterion for the clinical study appears to be an improvement in radiologist performance when using MammoScreen assistance compared to unaided reading.

Table of Acceptance Criteria and Reported Device Performance

Criterion Type	Specific Criterion	Reported Device Performance (MammoScreen)	Met?
Premarket Equivalence (vs. Predicate Device K181704 Transpara)
Classification Regulation	21 CFR 892.2090	SAME	Yes
Medical Device Class	Class II	SAME	Yes
Product Code	QDQ	SAME	Yes
Level of Concern	Moderate	SAME	Yes
Intended Use	Concurrent reading aid for physicians interpreting screening FFDM to identify findings and assess their level of suspicion.	SAME	Yes
Target patient population	Women undergoing FFDM screening mammography.	SAME	Yes
Target user population	Physicians interpreting FFDM screening mammograms.	SAME	Yes
Design	Software-only device.	SAME	Yes
Scoring System	While not identical, the principle (level of suspicion from low to high) should be substantially equivalent.	10-point scale vs. predicate's 1-100. Manufacturer claims interpretability benefits. Exam-level score provided. Deemed "substantially equivalent."	Yes
Finding Discovery	Reducing the number of findings the user has to review.	Default display for scores ≥ 5, user request for scores ≤ 4. Deemed "equivalent."	Yes
Performance Comparison	Overall performance gains should be comparable and not raise new safety/effectiveness questions.	AUC: unaided = 0.769, assisted = 0.798 (Difference: 0.028; P = 0.035). Predicate reported unaided = 0.866, assisted = 0.887. Deemed "still comparable."	Yes
Fundamental Scientific Technology	Involves medical image processing and machine learning, particularly deep learning for suspicious findings.	SAME	Yes
Clinical Performance (Reader Study)
Radiologist Performance	Radiologist performance with MammoScreen assistance is superior to unaided performance (main objective).	Average AUC improved from 0.769 (unaided) to 0.798 (with MammoScreen) (Difference = 0.028; P = 0.035).	Yes
Reading Time	Should not significantly increase.	Average reading time increased by 14% for scores > 4, but decreased by 2% for scores ≤ 4 in the second session. Overall, maximum increase did not exceed 15s.	Yes
Standalone Performance	Non-inferior to average unaided radiologist performance.	Standalone AUC = 0.790, non-inferior to the average unaided radiologist AUC of 0.770 (no statistically significant difference, p > 0.05, and lower bound of the confidence interval of the AUC difference > -0.03).	Yes
Sensitivity	Sensitivity of readers tended to increase with the use of MammoScreen without decreasing specificity (conclusion statement).	Reported overall performance improvement was statistically significant at breast (AUC) and lesion (pAUC) level, confirming trend. Specific values not explicitly in acceptance criteria here.	Yes

Study Details for Device Acceptance

Sample Size Used for the Test Set and Data Provenance:
- Test Set Size: 240 mammographic screening images (cases).
- Data Provenance: Acquired at a US center. The text states "US FFDM acquired on Hologic® devices, and performance comparison with FFDM acquired on GE® devices," indicating images from at least two major mammography system manufacturers in the US.
- Retrospective/Prospective: Retrospective. The study "collected" images after they were acquired, and "For each exam, the cancer status has been verified... and used as gold standard."
Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts:
- The document does not explicitly state the number of experts used to establish the ground truth or their specific qualifications (e.g., years of experience). It only states that "the cancer status has been verified by either biopsy results (for all cancer positive cases and some of the negative cases) or an adequate follow-up (for negative cases only) and used as gold standard." This implies that clinical data and follow-up were the primary ground truth, not the consensus of a specific number of experts.
Adjudication Method for the Test Set:
- The document does not explicitly describe an adjudication method for establishing ground truth from multiple expert reads. Ground truth was established via biopsy or adequate follow-up, which are objective clinical outcomes, not subjective reader interpretations.
Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- Yes, an MRMC study was performed.
- Effect Size of Human Reader Improvement With vs. Without AI Assistance:
  - Average AUC: Increased from 0.769 (unaided) to 0.798 (with MammoScreen assistance).
  - Difference: 0.028 (P = 0.035), indicating a statistically significant improvement.
  - The AUC was higher with MammoScreen aid for 11 of the 14 radiologists.
  - Performance improvement was also statistically significant at the breast (in terms of AUC) and lesion (in terms of pAUC) level.
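
As a rough illustration of how figures of this kind are obtained, the sketch below computes an empirical ROC AUC from suspicion ratings and then summarizes per-reader AUCs for the two reading conditions. It is not the study's analysis: the provided text does not name the MRMC statistical method, and the function names and sample numbers here are hypothetical.

```python
from statistics import mean


def empirical_auc(pos_scores, neg_scores):
    """Empirical ROC AUC: P(positive rated higher) + 0.5 * P(tie)."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def summarize_reader_gain(unaided_aucs, aided_aucs):
    """Average per-reader AUCs and count readers whose AUC improved."""
    if not unaided_aucs or len(unaided_aucs) != len(aided_aucs):
        raise ValueError("need one unaided and one aided AUC per reader")
    diffs = [a - u for u, a in zip(unaided_aucs, aided_aucs)]
    return {
        "mean_unaided": mean(unaided_aucs),
        "mean_aided": mean(aided_aucs),
        "mean_difference": mean(diffs),
        "readers_improved": sum(d > 0 for d in diffs),
    }


if __name__ == "__main__":
    # Hypothetical ratings for one reader (cancer cases vs. non-cancer cases).
    print(empirical_auc([8, 6, 9], [2, 6, 3, 1]))
    # Hypothetical per-reader AUCs, NOT the study's data.
    print(summarize_reader_gain([0.70, 0.80], [0.75, 0.78]))
```

The rounded averages quoted above differ by 0.029 rather than the reported 0.028; one plausible explanation is that the difference was computed from unrounded values, but the provided text does not say.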
Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study:
- Yes, a standalone performance study was conducted.
- Standalone Performance: MammoScreen's standalone performance (AUC = 0.790) was found to be non-inferior to the average performance of unaided radiologists (AUC = 0.770). The lower bound of the confidence interval for the AUC difference was at or above the -0.03 threshold (called the "effect size" in the text, used here as the non-inferiority margin), and the P-value was > 0.05; the document presents these two results as confirming non-inferiority.
- Detailed standalone performance metrics were also provided for mammogram, breast, and finding levels (soft tissue lesions and calcifications), including ROC AUC, sensitivity, and specificity for Hologic, GE, and combined datasets.
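
The non-inferiority rule described above reduces to comparing the lower confidence bound of the AUC difference against the -0.03 threshold. The sketch below shows that comparison. The two AUC values and the 0.03 margin come from the text; the confidence bounds passed in the demo are hypothetical, because the provided text does not report the interval itself.

```python
REPORTED_STANDALONE_AUC = 0.790   # from the text
REPORTED_UNAIDED_AUC = 0.770      # from the text
NON_INFERIORITY_MARGIN = 0.03     # the text's -0.03 threshold, as a magnitude


def is_non_inferior(ci_lower_bound, margin=NON_INFERIORITY_MARGIN):
    """True if the lower CI bound of (standalone - unaided) AUC is >= -margin."""
    if margin < 0:
        raise ValueError("margin must be given as a non-negative magnitude")
    return ci_lower_bound >= -margin


# Point estimate of the difference implied by the reported AUCs (about +0.020).
point_estimate = REPORTED_STANDALONE_AUC - REPORTED_UNAIDED_AUC

if __name__ == "__main__":
    print(round(point_estimate, 3))
    # Hypothetical lower bounds, for illustration only.
    print(is_non_inferior(-0.01))   # bound above -0.03
    print(is_non_inferior(-0.05))   # bound below -0.03
```

The comparison uses `>=` to match the "equal to or superior to" wording; the table row states the same condition with a strict inequality.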
Type of Ground Truth Used:
- Clinical Outcomes Data: The primary ground truth was established by:
  - Biopsy results (for all cancer-positive cases and some negative cases).
  - Adequate follow-up (for negative cases only).
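
The reference-standard rule above can be written as a small decision function. This is a sketch under stated assumptions: the `Truth` labels, the `"malignant"`/`"benign"` strings, and the `adequate_follow_up` flag are hypothetical, and the provided text does not define what counts as adequate follow-up.

```python
from enum import Enum
from typing import Optional


class Truth(Enum):
    CANCER = "cancer"
    NEGATIVE = "negative"
    UNVERIFIED = "unverified"   # would not qualify for the test set


def reference_label(biopsy_result: Optional[str], adequate_follow_up: bool) -> Truth:
    """Assign a reference label from biopsy outcome and follow-up status.

    biopsy_result is "malignant", "benign", or None when no biopsy was done
    (assumed encoding, not taken from the source).
    """
    if biopsy_result not in ("malignant", "benign", None):
        raise ValueError(f"unexpected biopsy result: {biopsy_result!r}")
    if biopsy_result == "malignant":
        return Truth.CANCER        # all cancer-positive cases are biopsy verified
    if biopsy_result == "benign" or adequate_follow_up:
        return Truth.NEGATIVE      # negative by biopsy or by follow-up
    return Truth.UNVERIFIED


if __name__ == "__main__":
    print(reference_label("malignant", False))
    print(reference_label(None, True))
    print(reference_label(None, False))
```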
Sample Size for the Training Set:
- The document states that the algorithms "have been trained on large databases of biopsy proven examples of breast cancer, benign lesions and normal tissue." However, it does not specify the exact sample size of the training set.
How the Ground Truth for the Training Set Was Established:
- The ground truth for the training set was established using "biopsy proven examples of breast cancer, benign lesions and normal tissue." This implies a similar methodology to the test set, relying on objective clinical outcomes (histopathology from biopsy) rather than expert consensus on images.
