(124 days)
MammoScreen® BD is a software application intended for use with compatible full field digital mammography and digital breast tomosynthesis systems. MammoScreen BD evaluates the breast tissue composition to provide an ACR BI-RADS 5th Edition breast density category. The device is intended to be used in the population of asymptomatic women undergoing screening mammography who are at least 40 years old.
MammoScreen BD only produces adjunctive information to aid interpreting physicians in the assessment of breast tissue composition. It is not a diagnostic software.
Patient management decisions should not be made solely based on analysis by MammoScreen BD.
MammoScreen BD is a software-only device (SaMD) that uses artificial intelligence to assist radiologists in the interpretation of mammograms. The software automatically processes a mammogram to assess the density of the breasts.
For each examination, MammoScreen BD outputs the breast density in accordance with the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) Atlas 5th Edition breast density categories "A" through "D".
MammoScreen BD takes as input a folder of images in DICOM format and outputs a breast density assessment as a JSON file. MammoScreen BD outputs can be integrated with compatible third-party software such as the MammoScreen Web-UI interface, PACS viewer (using DICOM Structured Report or DICOM Secondary Capture SOP Class UIDs), patient worklists, or within reporting software.
Here is a detailed breakdown of the acceptance criteria and study information for MammoScreen BD, based on the provided document:
1. Table of Acceptance Criteria and Reported Device Performance
The primary acceptance criteria for the initial clearance of MammoScreen BD were related to the accuracy and agreement with ground truth established by radiologists for classifying breast density into four BI-RADS categories.
Acceptance Criteria (from PCCP section for future modifications) | Reported Device Performance, Primary Objective (4-class task) | Reported Device Performance, Primary Objective (Binary task) |
---|---|---|
Quadratic Kappa on GE mammograms greater than 0.85 | Quadratic Cohen's Kappa: 89.03 (95% CI: 87.43, 90.56) | Quadratic Cohen's Kappa: 84.50 (95% CI: 81.46, 87.36) |
Linear Kappa, Accuracy, and Density Bins (A, B, C, D) | Accuracy: 84.68 (95% CI: 82.68, 86.67) | Accuracy: 92.29 (95% CI: 90.82, 93.77) |
Note: The document explicitly states "Acceptance criteria of the updated device" under the PCCP for future modifications. While the document does not explicitly state the acceptance criteria for the initial clearance in a separate section, the reported performance metrics (Quadratic Cohen's Kappa and Accuracy for both 4-class and binary classification) are implicitly the metrics against which the device's performance was judged for its initial clearance, demonstrating its effectiveness based on comparison to the ground truth.
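The document reports 95% confidence intervals for each metric but does not say how they were computed. One common choice for this kind of study is a case-level percentile bootstrap. The sketch below illustrates that approach under stated assumptions: the function names (`accuracy`, `bootstrap_ci`), the number of resamples, the seed, and the synthetic labels are all hypothetical and are not taken from the submission.

```python
import random


def accuracy(truth, pred):
    """Fraction of cases where the predicted category equals the ground truth."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def bootstrap_ci(truth, pred, metric, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement, recompute the metric."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(truth)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([truth[i] for i in idx], [pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic example (NOT study data): 200 cases, every 7th prediction is wrong.
truth = ["A", "B", "C", "D"] * 50
pred = list(truth)
for i in range(0, len(pred), 7):
    pred[i] = "C" if truth[i] == "B" else "B"

point = accuracy(truth, pred)
ci_low, ci_high = bootstrap_ci(truth, pred, accuracy)
print(f"accuracy = {point:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Resampling whole cases keeps the ground truth and prediction for each exam paired, so the same routine can be reused with any agreement metric that takes the two label lists.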
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for Test Set: 922 women/exams (922 exams in total, with 4 views each).
- Data Provenance: Retrospectively collected from two US screening centers and one French screening center.
- 52.6% of cases (485 patients) originated from the USA.
- 47.4% of cases (437 patients) originated from France.
- None of the test-set centers were used for algorithm development, which mitigates center-induced bias.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: 5 breast radiologists.
- Qualifications of Experts: The document specifies "5 breast radiologists" but does not provide details on their years of experience or specific board certifications.
4. Adjudication Method for the Test Set
- Adjudication Method: "Consensus among the visual assessment of 5 breast radiologists." The exact method (e.g., majority vote, sequential review with tie-breaking) is not explicitly detailed beyond "consensus."
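Because the document does not define "consensus," the rule below is purely illustrative. It is a minimal sketch of one plausible way to reduce five ordinal ratings to a single category: take the strict majority if there is one, otherwise fall back to the median rating. The function name `consensus` and the fallback rule are assumptions, not the method used in the study.

```python
from collections import Counter
from statistics import median_low

ORDER = "ABCD"  # BI-RADS density categories in ordinal order


def consensus(ratings):
    """Hypothetical consensus rule: strict majority, else median of the ordinal ratings."""
    label, count = Counter(ratings).most_common(1)[0]
    if count > len(ratings) / 2:
        return label
    # No strict majority: use the median rank (median_low always returns an observed rank).
    ranks = [ORDER.index(r) for r in ratings]
    return ORDER[median_low(ranks)]


print(consensus(["B", "B", "C", "B", "A"]))  # three of five readers agree
print(consensus(["A", "B", "C", "C", "D"]))  # no majority, falls back to the median
```

A median fallback respects the ordering of the categories, which a plain plurality vote would ignore; other rules (for example a consensus meeting between readers) are equally compatible with the wording in the document.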
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No, a multi-reader multi-case (MRMC) comparative effectiveness study evaluating human readers with AI assistance versus without AI assistance was not conducted or reported in this document. The study focuses on the standalone performance of the AI algorithm against expert consensus.
6. Standalone (Algorithm Only) Performance Study
- Yes, a standalone performance study was conducted. The "Primary Objectives" and "Performance Data" sections directly evaluate "the accuracy and the reproducibility of MammoScreen BD algorithm in assessing the breast density category" in terms of agreement with the ground truth established by the consensus of 5 radiologists.
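- For the 4-class task, the algorithm achieved a quadratic Cohen's kappa of 89.03 and an accuracy of 84.68%. The kappa is reported on a 0 to 100 scale, so it corresponds to 0.8903 on the conventional 0 to 1 scale.
- For the binary classification task (dense vs. non-dense), the algorithm achieved a quadratic Cohen's kappa of 84.50 (0.8450 on the 0 to 1 scale) and an accuracy of 92.29%.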
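The agreement metrics quoted above can be computed directly from paired ground-truth and algorithm labels. The following is a minimal sketch, not the manufacturer's evaluation code: the function names (`quadratic_weighted_kappa`, `accuracy`, `to_binary`) and the toy labels are assumptions for illustration. The binary mapping assumes the usual convention that categories A and B are non-dense and C and D are dense.

```python
CATEGORIES = ["A", "B", "C", "D"]


def quadratic_weighted_kappa(truth, pred, labels=CATEGORIES):
    """Cohen's kappa with quadratic disagreement weights w_ij = (i - j)^2 / (k - 1)^2."""
    k = len(labels)
    index = {label: i for i, label in enumerate(labels)}
    n = len(truth)
    observed = [[0] * k for _ in range(k)]  # confusion matrix: rows = truth, cols = pred
    for t, p in zip(truth, pred):
        observed[index[t]][index[p]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]        # weighted observed disagreement
            den += w * row[i] * col[j] / n   # weighted disagreement expected by chance
    # den is zero only if every label in both lists is the same single category
    return 1.0 - num / den


def accuracy(truth, pred):
    """Fraction of exact category matches."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def to_binary(label):
    """Assumed convention: A/B are non-dense, C/D are dense."""
    return "dense" if label in ("C", "D") else "non-dense"


# Toy labels (NOT study data): one B case is predicted as C.
truth = ["A", "B", "B", "C", "C", "D"]
pred = ["A", "B", "C", "C", "C", "D"]

kappa_4 = quadratic_weighted_kappa(truth, pred)
kappa_2 = quadratic_weighted_kappa(
    [to_binary(t) for t in truth],
    [to_binary(p) for p in pred],
    labels=["non-dense", "dense"],
)
print(f"4-class: kappa = {kappa_4:.4f}, accuracy = {accuracy(truth, pred):.4f}")
print(f"binary:  kappa = {kappa_2:.4f}")
```

The function returns kappa on the 0 to 1 scale; multiply by 100 to compare with the figures reported in the document. With only two categories the quadratic weights reduce to 0 or 1, so the binary value is the same as the unweighted Cohen's kappa.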
7. Type of Ground Truth Used
- Type of Ground Truth: Expert Consensus. Specifically, "ground truth (GT) established by consensus among the visual assessment of 5 breast radiologists."
8. Sample Size for the Training Set
- Sample Size for Training Set: 32,368 patients, comprising 108,775 studies.
9. How Ground Truth for the Training Set Was Established
- The document states that the training data was derived from "De-identified screening mammograms... retrospectively collected from 32,368 patients in 2 different US sites."
- It does not explicitly state how the ground truth for the training set was established. It only describes the density distribution (A: 12.79%, B: 34.58%, C: 42.94%, D: 9.38%) within the training data, implying these were pre-existing labels. It's common for such labels to be derived from radiologist reports or existing clinical records, but the specific method of ground truth establishment for the training set is not detailed.
§ 892.2050 Medical image management and processing system.
(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.
(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).