(182 days)
MammoScreen® 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/ asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 3.
MammoScreen is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen outputs detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
MammoScreen is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen can use prior examinations in the analysis.
MammoScreen can also aid in the reporting process by populating an initial report with chosen findings, including lesion type and position (quadrant, depth and distance to nipple).
The results indicating potential breast cancer, identified by MammoScreen, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report. Additionally, certain outputs like the case score can be reported into the patient management worklist.
Note that the MammoScreen outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
Summary of the acceptance criteria and the study supporting them, based on the provided text:
Acceptance Criteria and Reported Device Performance
The acceptance criteria are not explicitly listed in a separate table within the document. However, the clinical and standalone performance studies establish benchmarks and demonstrate achievement of certain levels of accuracy, sensitivity, and specificity. The criteria are implied through the statement "MammoScreen 3 achieved superior performance compared to the predicate device" and the detailed statistical results provided.
Table of Performance Results
Because specific acceptance criteria (e.g., "AUROC must be > X") are not explicitly stated, the table below reports the performance of MammoScreen 3 in both co-reading and standalone modes, along with improvements (effect sizes) in the co-reading scenario.
| Performance Metric | Acceptance Criteria (Implied) | MammoScreen 3 (Co-reading with Radiologists) | MammoScreen 3 (Standalone) | Notes |
|---|---|---|---|---|
| Radiologist Performance (Co-reading) | Superior to unaided radiologist performance | | | |
| Average AUROC (aided) | Higher than unaided | 0.871 [0.829 - 0.912] | N/A | Unaided: 0.797 [0.752 - 0.843] |
| Average Sensitivity (aided) | Higher than unaided | 0.793 [0.725 - 0.860] | N/A | Unaided: 0.706 [0.633 - 0.780] |
| Average Specificity (aided) | Higher than unaided | 0.836 [0.805 - 0.867] | N/A | Unaided: 0.815 [0.782 - 0.848] |
| Standalone Performance (overall mammogram level) | Superior to unaided radiologists; Non-inferior to aided radiologists | N/A | 0.883 [0.837 - 0.929] | Superior to unaided: ΔAUROC = +0.085 (p < 0.0001) |
| Standalone Sensitivity | | N/A | 0.833 [0.756 - 0.911] | |
| Standalone Specificity | | N/A | 0.793 [0.728 - 0.858] | |
| Standalone Performance (Detailed - Overall Mammogram Level) | | N/A | 0.927 (0.911, 0.942) | For breast cancer detection, overall. |
| Standalone Performance (Lesion Type Assessment) | Positive Percentage Agreement (PPA) & Negative Percentage Agreement (NPA) | | | |
| Overall PPA | | N/A | 0.784 (0.758, 0.811) | |
| Overall NPA | | N/A | 0.893 (0.880, 0.906) | |
| Mass/asymmetry PPA | | N/A | 0.868 (0.838, 0.894) | |
| Mass/asymmetry NPA | | N/A | 0.783 (0.752, 0.815) | |
| Distortion PPA | | N/A | 0.544 (0.475, 0.611) | |
| Distortion NPA | | N/A | 0.947 (0.932, 0.962) | |
| Calcifications PPA | | N/A | 0.941 (0.911, 0.967) | |
| Calcifications NPA | | N/A | 0.950 (0.934, 0.964) | |
| Standalone Performance (CC quadrant assessment) | PPA & NPA | | | |
| Overall PPA | | N/A | 0.765 (0.726, 0.810) | |
| Overall NPA | | N/A | 0.963 (0.951, 0.965) | |
| Standalone Performance (MLO quadrant assessment) | PPA & NPA | | | |
| Overall PPA | | N/A | 0.471 (0.425, 0.523) | |
| Overall NPA | | N/A | 0.889 (0.878, 0.902) | |
| Standalone Performance (Depth assessment) | PPA & NPA | | | |
| Overall PPA | | N/A | 0.617 (0.587, 0.644) | |
| Overall NPA | | N/A | 0.943 (0.932, 0.953) | |
1. Sample sizes used for the test set and data provenance:
- MRMC Study (AI-aided reading):
  - Sample Size: 240 combined DBT/2D mammograms (DBT+FFDM or DBT+2DSM) with a prior.
  - Data Provenance: Not explicitly stated, but the inclusion of "MQSA qualified and ABR certified radiologists" suggests US-based data or a study conducted under US regulatory standards. It's retrospective (pre-collected cases).
- Standalone Performance Study:
  - Sample Size: 7,544 exams from 4,429 patients.
  - Data Provenance: 3 US centers; the document does not state whether collection was prospective or retrospective. The demographics table provides a distribution of race, age, and imaging modalities (Hologic is the only manufacturer listed). Exam dates range from 2005 to 2023.
2. Number of experts used to establish the ground truth for the test set and their qualifications:
- MRMC Study (AI-aided reading): Not explicitly stated how the ground truth was established for the 240 cases, but it's implied that these were "truth" cases used to evaluate reader and system performance. Given the type of study, it's highly likely to have been based on a consensus of expert radiologists or pathology confirmation.
- Standalone Performance Study: Not explicitly stated how the ground truth for the 7,544 cases was established, but the description "Cancer status: Malignant: 23% / Normal/benign: 77%" implies a confirmed ground truth, likely through pathology reports or long-term follow-up. The reference to "reference standard" for lesion type, quadrant, and depth assessment also suggests expert review or confirmed diagnoses.
3. Adjudication method for the test set:
- MRMC Study: Not explicitly mentioned.
- Standalone Performance Study: Not explicitly mentioned.
4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance:
- Yes, an MRMC study was done.
- Effect Size of Improvement with AI:
- Average AUROC: Increase of +0.074 [0.047 - 0.101] (p-value < 0.001). (From 0.797 unaided to 0.871 aided).
- Average Sensitivity: Increase of +0.086 [0.040 - 0.133] (p-value < 0.001). (From 0.706 unaided to 0.793 aided).
- Average Specificity: Increase of +0.021 [0.006 - 0.036] (p-value 0.007). (From 0.815 unaided to 0.836 aided).
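As a quick arithmetic check, the reported effect sizes can be compared with the differences of the rounded aided and unaided averages. This is a sketch only: the dictionary layout and the `delta` helper are illustrative names, and the values are copied from the figures above.

```python
# (unaided average, aided average, reported difference), as quoted in the summary.
reported = {
    "AUROC": (0.797, 0.871, 0.074),
    "Sensitivity": (0.706, 0.793, 0.086),
    "Specificity": (0.815, 0.836, 0.021),
}


def delta(unaided: float, aided: float) -> float:
    """Difference of the rounded averages, rounded to three decimals."""
    return round(aided - unaided, 3)


for metric, (unaided, aided, stated) in reported.items():
    computed = delta(unaided, aided)
    gap = round(abs(computed - stated), 3)
    print(f"{metric}: computed {computed:+.3f}, reported {stated:+.3f}, gap {gap:.3f}")
```

The sensitivity difference computed from the rounded averages is 0.087 against a reported 0.086; a gap of 0.001 is what one would expect if the difference was computed before the averages were rounded.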
5. If a standalone (i.e., algorithm only, without human-in-the-loop) performance study was done:
- Yes, a standalone performance study was done.
- Overall AUROC at mammogram level: 0.883 [0.837 - 0.929].
- This was found to be superior to radiologists in unaided reading conditions (ΔAUROC = +0.085 [0.044 - 0.127], p-value <0.0001).
- It was also non-inferior to radiologists in aided reading conditions (ΔAUROC = +0.012 [-0.015 - 0.039], p-value <0.0001).
- Standalone sensitivity was 0.833 [0.756 – 0.911] and specificity was 0.793 [0.728 – 0.858].
- Detailed standalone performance by subgroup (density, race, source, age, lesion type, lesion size, lesion severity, imaging combination, prior image combination, prior time difference) was also provided, along with lesion type, quadrant, and depth assessment performance (PPA and NPA).
6. The type of ground truth used:
- MRMC Study: Not explicitly detailed, but usually based on pathology or rigorous follow-up.
- Standalone Performance Study: The ground truth for cancer status is indicated by "Malignant: 23% / Normal/benign: 77%," implying pathology confirmation or long-term follow-up. For lesion type, quadrant, and depth assessments, it refers to a "reference standard," which typically indicates expert consensus or pathology correlation.
7. The sample size for the training set:
- The training set sample size is not explicitly stated in the provided text. The document mentions that the deep learning modules are "trained with large databases of biopsy-proven examples of breast cancer and normal tissue," but specific numbers are not given.
8. How the ground truth for the training set was established:
- The ground truth for the training set was established using "large databases of biopsy-proven examples of breast cancer and normal tissue." This implies that the training data included cases with definitive diagnostic outcomes (e.g., via biopsy with histopathological confirmation).
August 1, 2024
Therapixel Quentin De Snoeck RA/QA Director/CISO 455 Promenade des Anglais NICE. 06200 FRANCE
Re: K240301
Trade/Device Name: MammoScreen® (3)
Regulation Number: 21 CFR 892.2090
Regulation Name: Radiological Computer Assisted Detection And Diagnosis Software
Regulatory Class: Class II
Product Code: QDQ
Dated: July 2, 2024
Received: July 2, 2024
Dear Quentin De Snoeck:
We have reviewed your section 510(k) premarket notification of intent to market the device referenced above and have determined the device is substantially equivalent (for the indications for use stated in the enclosure) to legally marketed predicate devices marketed in interstate commerce prior to May 28, 1976, the enactment date of the Medical Device Amendments, or to devices that have been reclassified in accordance with the provisions of the Federal Food, Drug, and Cosmetic Act (the Act) that do not require approval of a premarket approval application (PMA). You may, therefore, market the device, subject to the general controls provisions of the Act. Although this letter refers to your product as a device, please be aware that some cleared products may instead be combination products. The 510(k) Premarket Notification Database available at https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/pmn.cfm identifies combination product submissions. The general controls provisions of the Act include requirements for annual registration, listing of devices, good manufacturing practice, labeling, and prohibitions against misbranding and adulteration. Please note: CDRH does not evaluate information related to contract liability warranties. We remind you, however, that device labeling must be truthful and not misleading.
If your device is classified (see above) into either class II (Special Controls) or class III (PMA), it may be subject to additional controls. Existing major regulations affecting your device can be found in the Code of Federal Regulations, Title 21, Parts 800 to 898. In addition, FDA may publish further announcements concerning your device in the Federal Register.
Additional information about changes that may require a new premarket notification are provided in the FDA guidance documents entitled "Deciding When to Submit a 510(k) for a Change to an Existing Device" (https://www.fda.gov/media/99812/download) and "Deciding When to Submit a 510(k) for a Software Change to an Existing Device" (https://www.fda.gov/media/99785/download).
Your device is also subject to, among other requirements, the Quality System (QS) regulation (21 CFR Part 820), which includes, but is not limited to, 21 CFR 820.30, Design controls; 21 CFR 820.90, Nonconforming product; and 21 CFR 820.100, Corrective and preventive action. Please note that regardless of whether a change requires premarket review, the QS regulation requires device manufacturers to review and approve changes to device design and production (21 CFR 820.30 and 21 CFR 820.70) and document changes and approvals in the device master record (21 CFR 820.181).
Please be advised that FDA's issuance of a substantial equivalence determination does not mean that FDA has made a determination that your device complies with other requirements of the Act or any Federal statutes and regulations administered by other Federal agencies. You must comply with all the Act's requirements, including, but not limited to: registration and listing (21 CFR Part 807); labeling (21 CFR Part 801); medical device reporting (reporting of medical device-related adverse events) (21 CFR Part 803) for devices or postmarketing safety reporting (21 CFR Part 4, Subpart B) for combination products (see https://www.fda.gov/combination-products/guidance-regulatory-information/postmarketing-safety-reporting-combination-products); good manufacturing practice requirements as set forth in the quality systems (QS) regulation (21 CFR Part 820) for devices or current good manufacturing practices (21 CFR Part 4, Subpart A) for combination products; and, if applicable, the electronic product radiation control provisions (Sections 531-542 of the Act); 21 CFR Parts 1000-1050.
Also, please note the regulation entitled, "Misbranding by reference to premarket notification" (21 CFR 807.97). For questions regarding the reporting of adverse events under the MDR regulation (21 CFR Part 803), please go to https://www.fda.gov/medical-devices/medical-device-safety/medical-device-reporting-mdr-how-report-medical-device-problems.
For comprehensive regulatory information about medical devices and radiation-emitting products, including information about labeling regulations, please see Device Advice (https://www.fda.gov/medical-devices/device-advice-comprehensive-regulatory-assistance) and CDRH Learn (https://www.fda.gov/training-and-continuing-education/cdrh-learn). Additionally, you may contact the Division of Industry and Consumer Education (DICE) to ask a question about a specific regulatory topic. See the DICE website (https://www.fda.gov/medical-devices/device-advice-comprehensive-regulatory-assistance/contact-us-division-industry-and-consumer-education-dice) for more information or contact DICE by email (DICE@fda.hhs.gov) or phone (1-800-638-2041 or 301-796-7100).
Sincerely,
Yanna S. Kang -S
Yanna Kang, Ph.D. Assistant Director Mammography and Ultrasound Team DHT8C: Division of Radiological Imaging and Radiation Therapy Devices OHT8: Office of Radiological Health Office of Product Evaluation and Quality Center for Devices and Radiological Health
DEPARTMENT OF HEALTH AND HUMAN SERVICES Food and Drug Administration
Indications for Use
Submission Number (if known)
Device Name
MammoScreen® (3)
Indications for Use (Describe)
MammoScreen® 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 3.
Type of Use (Select one or both, as applicable)
Prescription Use (Part 21 CFR 801 Subpart D)
Over-The-Counter Use (21 CFR 801 Subpart C)
CONTINUE ON A SEPARATE PAGE IF NEEDED.
This section applies only to requirements of the Paperwork Reduction Act of 1995.
DO NOT SEND YOUR COMPLETED FORM TO THE PRA STAFF EMAIL ADDRESS BELOW.
The burden time for this collection of information is estimated to average 79 hours per response, including the time to review instructions, search existing data sources, gather and maintain the data needed and complete and review the collection of information. Send comments regarding this burden estimate or any other aspect of this information collection, including suggestions for reducing this burden, to:
Department of Health and Human Services Food and Drug Administration Office of Chief Information Officer Paperwork Reduction Act (PRA) Staff PRAStaff@fda.hhs.gov
"An agency may not conduct or sponsor, and a person is not required to respond to, a collection of information unless it displays a currently valid OMB number."
Form Approved: OMB No. 0910-0120 Expiration Date: 07/31/2026 See PRA Statement below.
510(k) Summary
This 510(k) summary is prepared in accordance with the requirements of 21 CFR § 807.92.
Applicant Information:
Therapixel
455 Promenade des Anglais, 06200 Nice, France
Phone: +33 9 72 55 20 39
Submission Correspondent: Quentin de Snoeck, RAQA Director/CISO
Email: qdesnoeck@therapixel.com
Phone: +33 9 72 55 20 39
Date Summary Prepared: August 01, 2024
Device Information:
Trade Name: MammoScreen®
Model: 3
Common Name: Computer-Assisted Detection Device
Device Classification Name: Radiological Computer Assisted Detection/Diagnosis Software For Lesions Suspicious For Cancer
Regulation Number: 892.2090
Regulation Class: Class II
Product Code: QDQ
Submission type: Traditional 510(k)
510(k) number:
Predicate Device:
The predicate device is MammoScreen, cleared under K211541 (Product code QDQ).
Device Description
MammoScreen is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen outputs detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
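The relationship between finding, breast, and case scores can be sketched as follows. This is an illustration only, not the device's implementation: it assumes "driven by the highest finding score" means the breast and case scores equal that maximum, and the `Finding` tuple layout, the side labels, and the `None` result for a case with no findings are all hypothetical choices.

```python
from typing import Dict, List, Optional, Tuple

# (breast side, finding score on the 1-10 scale); layout is hypothetical.
Finding = Tuple[str, int]


def breast_scores(findings: List[Finding]) -> Dict[str, int]:
    """Per-breast score, assumed here to equal the highest finding score in that breast."""
    scores: Dict[str, int] = {}
    for side, score in findings:
        if not 1 <= score <= 10:
            raise ValueError(f"score out of the 1-10 range: {score}")
        scores[side] = max(scores.get(side, score), score)
    return scores


def case_score(findings: List[Finding]) -> Optional[int]:
    """Case score, assumed here to equal the highest finding score overall."""
    if not findings:
        return None  # behaviour without findings is not described in the document
    return max(score for _, score in findings)


findings = [("L", 3), ("L", 7), ("R", 2)]
print(breast_scores(findings), case_score(findings))
```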
MammoScreen is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen can use prior examinations in the analysis. The following table describes the supported combinations:
| Prior \ Current | FFDM | DBT | 2DSM | DBT+FFDM | DBT+2DSM |
|---|---|---|---|---|---|
| / (no prior) | Supported | Unsupported | Unsupported | Supported | Supported |
| FFDM | Unsupported | Unsupported | Unsupported | Supported | Unsupported |
| DBT | Unsupported | Unsupported | Unsupported | Unsupported | Unsupported |
| 2DSM | Unsupported | Unsupported | Unsupported | Unsupported | Unsupported |
| DBT + FFDM | Unsupported | Unsupported | Unsupported | Unsupported | Unsupported |
| DBT + 2DSM | Unsupported | Unsupported | Unsupported | Unsupported | Supported |
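The supported combinations can be expressed as a small lookup. This is a sketch for reading the table, not part of the device; the `SUPPORTED` set, the `is_supported` helper, and the use of `None` for an exam without a prior (the "/" row) are assumptions made for illustration.

```python
from typing import Optional

# (prior, current) pairs marked "Supported" in the table; None stands for no prior.
SUPPORTED = {
    (None, "FFDM"),
    (None, "DBT+FFDM"),
    (None, "DBT+2DSM"),
    ("FFDM", "DBT+FFDM"),
    ("DBT+2DSM", "DBT+2DSM"),
}


def is_supported(prior: Optional[str], current: str) -> bool:
    """Return True when the prior/current pairing is listed as supported."""
    return (prior, current) in SUPPORTED


print(is_supported("FFDM", "DBT+FFDM"))  # listed as supported
print(is_supported("FFDM", "FFDM"))      # listed as unsupported
```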
MammoScreen can also aid in the reporting process by populating an initial report with chosen findings, including lesion type and position (quadrant, depth and distance to nipple).
The results indicating potential breast cancer, identified by MammoScreen, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report. Additionally, certain outputs like the case score can be reported into the patient management worklist.
Note that the MammoScreen outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
Indication for Use
MammoScreen 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings, including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 3.
| | Predicate device (MammoScreen 2) | Subject device (MammoScreen 3) |
|---|---|---|
| Manufacturer | Therapixel | Therapixel |
| Regulation number | 892.2090 | 892.2090 |
| Product Code | QDQ | QDQ |
| Intended Use | MammoScreen 2 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. Output of the device includes marks of findings on mammograms along with their type and level of suspicion scores. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram. | MammoScreen 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis. Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram. The location of findings, including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting. Patient management decisions should not be made solely based on the analysis by MammoScreen 3. |
| Intended user population | Physicians qualified to read mammograms. | Physicians qualified to read mammograms. |
| Intended patient population | Women undergoing mammography. | Women undergoing mammography. |
| Anatomical Location | Breast | Breast |
| Design | Software-only device | Software-only device |
Summary of Substantial Equivalence
Predicate Device Comparison
The indication for use of MammoScreen 3 is similar to that of the predicate device. Both devices are intended for concurrent use by physicians interpreting breast images to help them with localizing and characterizing findings. The devices are not intended as a replacement for the review of a physician or their clinical judgment.
The predicate device and the subject device are two software versions of MammoScreen. They both rely on the same fundamental scientific technology. For both devices, a selection of medical image processing and machine learning techniques is implemented. The systems include 'deep learning' modules for the detection of suspicious findings. These modules are trained with large databases of biopsy-proven examples of breast cancer and normal tissue.
Compared to the predicate device, MammoScreen 3 provides more detailed categories of finding characterization and a more detailed location of findings as a reporting aid component.
In addition, the algorithmic components have been updated to improve detection accuracy. Moreover, some inputs have been added to MammoScreen 3 compared to the predicate device. MammoScreen 3 takes as input a current mammogram that can be FFDM, FFDM & DBT, or 2DSM & DBT. MammoScreen 3 can also incorporate in its analysis one prior mammogram, acquired between 6 and 60 months before the current mammogram, whether it is FFDM or 2DSM & DBT.
The design changes of this new version of MammoScreen have been assessed and do not raise different questions of safety and effectiveness than the previous version.
| Technological characteristic | Predicate device (MammoScreen 2) | Subject device (MammoScreen 3) |
|---|---|---|
| Type of artificial intelligence | MammoScreen 2 is powered by artificial intelligence/machine learning-based software algorithm | Same. |
| Level of suspicion | MammoScreen 2 outputs a level of suspicion at the finding, breast and case level. | Same. |
| Lesion type | For each detected finding, MammoScreen 2 classifies it as soft tissue lesion or calcifications. | For each detected finding, MammoScreen 3 classifies it as mass/asymmetry, distortion or calcifications. |
| Localization | MammoScreen 2 doesn't provide a localization. | For each finding, MammoScreen 3 provides a quadrant, a depth and a distance to the nipple. |
| Inputs | FFDM or DBT | FFDM or 2DSM & DBT or FFDM & DBT, with an optional prior (FFDM or 2DSM & DBT) except for the former. |
Non-Clinical Testing
MammoScreen is a software-only device.
Tests have been performed in compliance with the following recognized consensus standards:
- IEC 62304:2006/A1:2016 - Medical device software - Software life-cycle processes
- IEC 62366-1:2015+AMD1:2020 - Medical devices - Application of usability engineering to medical devices.
MammoScreen 3 has successfully completed integration testing and beta validation. In addition, potential hazards have been evaluated and mitigated to acceptable levels.
Clinical Performance Data
A multi-reader multi-case (MRMC) study has been conducted to establish the clinical performance of MammoScreen 3. The study applied a fully crossed design, so that each case was read by each reader both with and without the aid of MammoScreen 3.
The study aimed to determine:
- Whether the performance of radiologists when using MammoScreen 3 is superior to unaided radiologist performance for interpretation of screening mammograms (primary objective).
- Whether the performance of MammoScreen 3 standalone is superior to unaided radiologist performance.
- Whether the performance of MammoScreen 3 standalone is non-inferior to aided radiologist performance.
The dataset included 240 combined DBT/2D mammograms (DBT+FFDM or DBT+2DSM) with a prior. Readers were all MQSA qualified and ABR certified radiologists (18 breast specialists and 5 general radiologists). Performance was evaluated at the mammogram, breast and finding levels and assessed in terms of Area Under the Receiver Operating Characteristic Curve (AUROC).
Results:
- Radiologists improved their diagnostic performance in breast cancer detection when using MammoScreen.
  - The average value of AUC went from 0.797 [0.752 - 0.843] in unaided reading conditions to 0.871 [0.829 - 0.912] in reading conditions assisted by MammoScreen, leading to a statistically significant difference of +0.074 [0.047 - 0.101] (p-value < 0.001).
  - The average sensitivity went from 0.706 [0.633 - 0.780] in unaided reading conditions to 0.793 [0.725 - 0.860] in reading conditions assisted by MammoScreen, leading to a statistically significant difference of +0.086 [0.040 - 0.133] (p-value < 0.001).
  - The average specificity went from 0.815 [0.782 - 0.848] in unaided reading conditions to 0.836 [0.805 - 0.867] in reading conditions assisted by MammoScreen, leading to a statistically significant difference of +0.021 [0.006 - 0.036] (p-value 0.007).
- The AUC value at the mammogram level of MammoScreen as a standalone system was 0.883 [0.837 - 0.929] and was found to be superior to radiologists in unaided reading conditions (ΔAUC = +0.085 [0.044 - 0.127], p-value < 0.0001) and non-inferior to radiologists in aided reading conditions (ΔAUC = +0.012 [-0.015 - 0.039], p-value < 0.0001). The sensitivity and specificity values at the mammogram level of MammoScreen as a standalone system were respectively 0.833 [0.756 - 0.911] and 0.793 [0.728 - 0.858].
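One common way to read the superiority and non-inferiority statements is through the confidence interval of the AUC difference. The summary does not describe the statistical test or the non-inferiority margin, so the sketch below is an assumption on both counts: `ASSUMED_MARGIN` is a hypothetical placeholder, and the helper names are illustrative. Only the interval values are taken from the results above.

```python
from typing import NamedTuple


class DeltaAUC(NamedTuple):
    estimate: float
    ci_lower: float
    ci_upper: float


# Hypothetical margin: the summary does not state the one actually used.
ASSUMED_MARGIN = 0.05


def is_superior(d: DeltaAUC) -> bool:
    """Superiority reading: the whole interval lies above zero."""
    return d.ci_lower > 0.0


def is_non_inferior(d: DeltaAUC, margin: float = ASSUMED_MARGIN) -> bool:
    """Non-inferiority reading: the interval's lower bound lies above -margin."""
    return d.ci_lower > -margin


standalone_vs_unaided = DeltaAUC(0.085, 0.044, 0.127)
standalone_vs_aided = DeltaAUC(0.012, -0.015, 0.039)

print(is_superior(standalone_vs_unaided))    # interval excludes zero
print(is_superior(standalone_vs_aided))      # interval includes zero
print(is_non_inferior(standalone_vs_aided))  # depends on the assumed margin
```

Under this reading, the comparison with aided radiologists cannot support a superiority claim because its interval spans zero, which is consistent with the document claiming only non-inferiority there.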
Gains in performance have been established overall and for all the considered subgroups. Subgroup analyses were performed for breast density, lesion type, patient age, lesion size, lesion severity, current image combination (DBT+2DSM, DBT+FFDM), prior image combination (current DBT+2DSM with prior DBT+2DSM, current DBT+FFDM with prior FFDM), data sources, reader experience, reader background, and prior time difference. The performance in assessing descriptive characteristics of lesions (i.e., lesion type, quadrant and depth) has also been evaluated.
Standalone Performance Data
7,544 exams from 4,429 patients screened in 3 US centers were used during this study. Demographic information about the test population is provided in the table below:
| Characteristic | Value |
|---|---|
| Cancer status | Normal/benign: 77% |
| | Malignant: 23% |
| Breast density | A: 10% |
| | B: 44% |
| | C: 31% |
| | D: 15% |
| Age (years old) | Minimum: 32 |
| | Maximum: 93 |
| | Mean: 58 |
| | 25th percentile: 49 |
| | 75th percentile: 65 |
| Race | White: 37% |
| | Black: 16% |
| | Asian: 18% |
| | American Indian / Alaska native: 0.8% |
| | Native Hawaiian / Pacific islander: 0.2% |
| | Unspecified: 28% |
| Imaging modality | DBT (inc. FFDM or 2DSM): 64% |
| | FFDM: 36% |
| Modality manufacturer | Hologic: 100% |
| Exam dates (range) | 2005 - 2023 |
- Standalone performances in breast cancer detection at the mammogram level overall and for all subgroups evaluated are summarized in the following table:
| Group | Subgroup | Number of positive cases | Number of negative cases | AUROC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|---|
| Overall | | 1,053 | 6,420 | 0.927 (0.911, 0.942) | | |
| By density | A | 71 | 658 | 0.953 (0.924, 0.976) | 0.958 (0.901, 1.000) | 0.739 (0.701, 0.778) |
| | B | 489 | 2,682 | 0.933 (0.922, 0.945) | 0.933 (0.910, 0.954) | 0.690 (0.670, 0.709) |
| | C | 415 | 1,822 | 0.921 (0.905, 0.936) | 0.896 (0.867, 0.925) | 0.727 (0.702, 0.750) |
| | D | 77 | 1,005 | 0.891 (0.846, 0.929) | 0.816 (0.727, 0.896) | 0.815 (0.787, 0.841) |
| By race | White | 211 | 2,780 | 0.901 (0.876, 0.924) | 0.857 (0.806, 0.900) | |
| | Black | 77 | 1,085 | 0.897 (0.862, 0.929) | 0.896 (0.831, 0.961) | 0.680 (0.650, 0.708) |
| | Asian | 41 | 1,406 | 0.899 (0.849, 0.940) | 0.854 (0.732, 0.951) | 0.763 (0.736, 0.789) |
| | American Indian or Alaska Native / Native Hawaiian or other Pacific Islander | 3 | 76 | 0.886 (0.701, 1.000) | 0.663 (0.000, 1.000) | 0.672 (0.556, 0.784) |
| By source | US 1 | 75 | 2,616 | 0.862 (0.814, 0.905) | 0.772 (0.667, 0.867) | 0.752 (0.732, 0.771) |
| | US 2 | 717 | 915 | 0.951 (0.941, 0.961) | 0.935 (0.917, 0.952) | 0.787 (0.757, 0.817) |
| | US 3 | 261 | 2,889 | 0.908 (0.888, 0.925) | 0.889 (0.851, 0.927) | 0.689 (0.670, 0.709) |
| By age | ≤ 50 years old | 273 | 3,879 | 0.912 (0.894, 0.929) | 0.879 (0.839, 0.916) | 0.720 (0.704, 0.736) |
| | 50 < age ≤ 65 | 397 | 1,630 | 0.942 (0.929, 0.954) | 0.932 (0.907, 0.957) | 0.756 (0.732, 0.780) |
| | > 65 | 382 | 911 | 0.921 (0.902, 0.938) | 0.914 (0.885, 0.942) | 0.705 (0.669, 0.740) |
| By lesion type | Mass | 180 | 6,420 | 0.945 (0.927, 0.961) | 0.939 (0.902, 0.973) | 0.727 (0.715, 0.740) |
| | Calcifications | 271 | 6,420 | 0.926 (0.909, 0.942) | 0.912 (0.878, 0.941) | 0.727 (0.715, 0.740) |
| | Asymmetries | 117 | 6,420 | 0.884 (0.854, 0.911) | 0.835 (0.768, 0.897) | 0.727 (0.715, 0.740) |
| | Focal asymmetries | 262 | 6,420 | 0.921 (0.901, 0.938) | 0.902 (0.861, 0.936) | 0.727 (0.715, 0.740) |
| | Distortions | 223 | 6,420 | 0.945 (0.929, 0.960) | 0.938 (0.906, 0.969) | 0.727 (0.715, 0.740) |
| By lesion size | < 20 mm | 380 | 6,420 | 0.905 (0.887, 0.922) | 0.887 (0.852, 0.924) | 0.727 (0.715, 0.740) |
| | 20 ≤ size ≤ 30 mm | 355 | 6,420 | 0.933 (0.915, 0.948) | 0.916 (0.881, 0.946) | 0.727 (0.715, 0.740) |
| | > 30 mm | 318 | 6,420 | 0.959 (0.947, 0.971) | 0.959 (0.934, 0.983) | 0.727 (0.715, 0.740) |
| By lesion severity | BI-RADS 4 | 525 | 6,420 | 0.924 (0.910, 0.936) | 0.910 (0.884, 0.935) | 0.727 (0.715, 0.740) |
| | BI-RADS 5 | 178 | 6,420 | 0.972 (0.960, 0.982) | 0.983 (0.961, 1.000) | 0.727 (0.715, 0.740) |
| By current image combination | FFDM | 886 | 4,303 | 0.908 (0.896, 0.919) | 0.915 (0.895, 0.933) | 0.632 (0.615, 0.647) |
| | DBT + FFDM | 823 | 3,744 | 0.929 (0.919, 0.939) | 0.925 (0.905, 0.942) | 0.710 (0.694, 0.725) |
| | DBT + 2DSM | 161 | 2,114 | 0.904 (0.878, 0.927) | 0.858 (0.806, 0.903) | 0.762 (0.739, 0.785) |
| By prior image combination | FFDM | 782 | 4,216 | 0.926 (0.916, 0.936) | 0.901 (0.880, 0.921) | 0.761 (0.746, 0.776) |
| | DBT + FFDM | 534 | 1,833 | 0.930 (0.918, 0.942) | 0.907 (0.882, 0.930) | 0.775 (0.755, 0.795) |
| | DBT + 2DSM | 80 | 921 | 0.903 (0.861, 0.937) | 0.838 (0.750, 0.912) | 0.767 (0.735, 0.797) |
| By current & prior image combination | No prior | 1,037 | 6,211 | 0.925 (0.915, 0.933) | 0.933 (0.918, 0.948) | 0.667 (0.653, 0.681) |
| | Current FFDM – Prior FFDM | 679 | 2,135 | 0.905 (0.891, 0.920) | 0.908 (0.887, 0.929) | 0.664 (0.643, 0.683) |
| | Current DBT+FFDM – Prior FFDM | 630 | 1,717 | 0.932 (0.920, 0.943) | 0.914 (0.894, 0.935) | 0.767 (0.746, 0.788) |
| | Current DBT+FFDM – Prior DBT+FFDM | 482 | 1,586 | 0.928 (0.913, 0.941) | 0.906 (0.880, 0.932) | 0.771 (0.750, 0.791) |
| | Current DBT+FFDM – Prior DBT+2DSM | 13 | 3 | 0.945 (0.769, 1.000) | 1.000 (1.000, 1.000) | 0.663 (0.000, 1.000) |
| | Current DBT+2DSM – Prior FFDM | 120 | 1,196 | 0.908 (0.875, 0.936) | 0.875 (0.817, 0.933) | 0.764 (0.737, 0.789) |
| | Current DBT+2DSM – Prior DBT+FFDM | 75 | 164 | 0.939 (0.905, 0.967) | 0.935 (0.867, 0.987) | 0.755 (0.680, 0.823) |
| | Current DBT+2DSM – Prior DBT+2DSM | 65 | 878 | 0.890 (0.844, 0.931) | 0.817 (0.723, 0.908) | 0.770 (0.738, 0.799) |
| | Current FFDM – No prior | 878 | 4,167 | 0.899 (0.887, 0.911) | 0.911 (0.892, 0.930) | 0.611 (0.594, 0.628) |
| | Current DBT+FFDM – No prior | 807 | 3,606 | 0.927 (0.917, 0.937) | 0.941 (0.924, 0.957) | 0.659 (0.642, 0.676) |
| | Current DBT+2DSM – No prior | 158 | 2,040 | 0.902 (0.874, 0.928) | 0.893 (0.842, 0.943) | 0.689 (0.665, 0.712) |
| By prior time difference | 6 months ≤ prior < 18 months | 237 | 1,866 | 0.913 (0.893, 0.932) | 0.869 (0.823, 0.912) | 0.774 (0.755, 0.793) |
| | 18 ≤ prior < 30 months | 436 | 1,735 | 0.931 (0.916, 0.944) | 0.917 (0.889, 0.942) | 0.752 (0.728, 0.777) |
| | 30 months ≤ prior < 60 months | 177 | 628 | 0.935 (0.912, 0.955) | 0.910 (0.865, 0.949) | 0.754 (0.718, 0.789) |
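
The three metrics reported in this table can be illustrated on toy data. The following is a minimal sketch, not the submitter's evaluation code: the scores, the `threshold` value, and the function names are hypothetical, and the source does not state which operating point or estimator was used. AUROC is computed here as the probability that a positive case receives a higher score than a negative case, with ties counted as one half.

```python
from typing import Sequence, Tuple


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """A case is called positive when its score is >= threshold (assumed rule)."""
    true_pos = sum(1 for s in pos_scores if s >= threshold)
    true_neg = sum(1 for s in neg_scores if s < threshold)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Hypothetical case-level scores on a 1-10 scale (illustrative only, not study data).
positives = [9, 7, 4, 8]
negatives = [1, 3, 5, 2, 6]

area = auroc(positives, negatives)
sens, spec = sensitivity_specificity(positives, negatives, threshold=5)
print(f"AUROC={area:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be preferable at the scale of the dataset described here.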
- Standalone performances in lesion type assessment were measured in terms of Positive Percentage Agreement (PPA) and Negative Percentage Agreement (NPA) against the reference standard. Results are summarized in the following table:
| Lesion type | Counts | PPA (95% CI) | NPA (95% CI) |
|---|---|---|---|
| Overall | 1,039 | 0.784 (0.758, 0.811) | 0.893 (0.880, 0.906) |
| Mass/asymmetry | 548 | 0.868 (0.838, 0.894) | 0.783 (0.752, 0.815) |
| Distortion | 221 | 0.544 (0.475, 0.611) | 0.947 (0.932, 0.962) |
| Calcifications | 270 | 0.941 (0.911, 0.967) | 0.950 (0.934, 0.964) |
The associated confusion matrix is provided for completeness:
Figure: confusion matrix of lesion type assessment, with true labels in rows and predicted labels in columns for three classes: 'mass/asym.', 'distortion', and 'calcification'. The diagonal shows 476, 120, and 254 correctly classified instances respectively; off the diagonal, 94 instances of 'distortion' were classified as 'mass/asym.'.
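
Per-class PPA and NPA can be derived from a confusion matrix in a one-vs-rest fashion. The sketch below assumes that PPA for a class is the fraction of reference-positive findings the device also labels with that class, and that NPA is the fraction of reference-negative findings the device does not label with that class. The matrix values and the function name `ppa_npa` are hypothetical and are not the study data; the source does not spell out how the "Overall" row is aggregated, so it is not reproduced here.

```python
from typing import Dict, List, Sequence, Tuple


def ppa_npa(
    matrix: Sequence[Sequence[int]], labels: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """matrix[i][j] counts findings with reference label i and device label j."""
    total = sum(sum(row) for row in matrix)
    results: Dict[str, Tuple[float, float]] = {}
    for k, name in enumerate(labels):
        ref_pos = sum(matrix[k])                   # reference says class k
        true_pos = matrix[k][k]                    # device agrees on class k
        pred_pos = sum(row[k] for row in matrix)   # device says class k
        ref_neg = total - ref_pos                  # reference says another class
        true_neg = ref_neg - (pred_pos - true_pos) # device also says another class
        results[name] = (true_pos / ref_pos, true_neg / ref_neg)
    return results


# Hypothetical 3-class confusion matrix (illustrative counts only).
classes: List[str] = ["mass/asym.", "distortion", "calcification"]
toy_matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

for name, (ppa, npa) in ppa_npa(toy_matrix, classes).items():
    print(f"{name}: PPA={ppa:.3f} NPA={npa:.3f}")
```

The same function applies unchanged to the quadrant and depth assessments, provided any "unassigned" predictions are handled explicitly, since this sketch assumes a square matrix.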
- Standalone performances in CC/MLO quadrant assessment were measured in terms of Positive Percentage Agreement (PPA) and Negative Percentage Agreement (NPA) against the reference standard. Results are provided in the following tables separately for CC and MLO quadrants:
CC quadrant:

| Quadrant | Counts | PPA (95% CI) | NPA (95% CI) |
|---|---|---|---|
| Overall | 736 | 0.765 (0.726, 0.810) | 0.963 (0.951, 0.965) |
| Inner / Medial | 149 | 0.866 (0.812, 0.919) | 0.978 (0.966, 0.988) |
| Central | 179 | 0.782 (0.715, 0.844) | 0.925 (0.903, 0.943) |
| Outer / Lateral | 372 | 0.884 (0.852, 0.914) | 0.953 (0.929, 0.973) |
| Retroareolar | 36 | 0.528 (0.361, 0.667) | 0.994 (0.987, 0.999) |
Associated confusion matrix of CC quadrant assessment:
Figure: confusion matrix of CC quadrant assessment (true vs. predicted labels) over the labels inner/medial, central, outer/lateral, retroareolar, and unassigned. For example, 129 instances were correctly classified as inner/medial.
MLO quadrant:
| Quadrant | Counts | PPA (95% CI) | NPA (95% CI) |
|---|---|---|---|
| Overall | 749 | 0.471 (0.425, 0.523) | 0.889 (0.878, 0.902) |
| Lower / Inferior | 130 | 0.623 (0.538, 0.708) | 0.861 (0.835, 0.884) |
| Central | 151 | 0.351 (0.272, 0.430) | 0.766 (0.731, 0.796) |
| Upper / Superior | 428 | 0.458 (0.300, 0.600) | 0.941 (0.910, 0.966) |
| Retroareolar | 40 | 0.450 (0.300, 0.575) | 0.990 (0.983, 0.997) |
Associated confusion matrix of MLO quadrant assessment:
Figure: confusion matrix of MLO quadrant assessment, with true labels in rows and predicted labels in columns. For example, 81 instances of the 'lower/inferior' class were correctly classified, while 14 instances were misclassified as 'central'.
- Standalone performances in depth assessment were measured in terms of Positive Percentage Agreement (PPA) and Negative Percentage Agreement (NPA) against the reference standard. Results are provided in the following table:
| Depth | Counts | PPA (95% CI) | NPA (95% CI) |
|---|---|---|---|
| Overall | 775 | 0.617 (0.587, 0.644) | 0.943 (0.932, 0.953) |
| Posterior | 346 | 0.907 (0.873, 0.936) | 0.891 (0.865, 0.918) |
| Middle | 355 | 0.852 (0.814, 0.887) | 0.904 (0.879, 0.931) |
| Anterior | 74 | 0.699 (0.595, 0.797) | 0.992 (0.984, 0.997) |
Associated confusion matrix:
Figure: confusion matrix of depth assessment (true vs. predicted labels) over four categories: posterior, middle, anterior, and unassigned. For example, 314 posterior labels were correctly predicted, while 45 middle labels were incorrectly predicted as posterior.
Conclusions
Standalone performance tests on FFDM and DBT demonstrated that MammoScreen 3 achieved superior performance compared to the predicate device.
Therapixel applied a risk management process following FDA-recognized standards to identify, evaluate, and mitigate all known hazards related to MammoScreen 3. These hazards may occur when the accuracy of diagnosis is potentially affected, causing either false positives or false negatives. All identified risks are effectively mitigated, and it can be concluded that the benefits outweigh the residual risk.
The indication for use of MammoScreen 3 is substantially the same as that of the predicate device: both devices are intended for use by physicians in interpreting mammograms to help them identify and characterize findings. The devices are not intended as a replacement for the review of a physician or their clinical judgment. The overall design of MammoScreen 3 is the same as that of the predicate device: both devices detect and characterize findings in radiological breast images and provide information about the presence, location, and characteristics of the findings to the user.
The new features introduced in MammoScreen 3 do not raise different questions of safety and effectiveness compared to the predicate device. The standalone performances, clinical validation, risk analysis, and usability studies confirm that MammoScreen 3 maintains its intended use and risk profile, and demonstrates substantial equivalence to the predicate device, MammoScreen 2 (K211541).
§ 892.2090 Radiological computer-assisted detection and diagnosis software.
(a) Identification. A radiological computer-assisted detection and diagnostic software is an image processing device intended to aid in the detection, localization, and characterization of fracture, lesions, or other disease-specific findings on acquired medical images (e.g., radiography, magnetic resonance, computed tomography). The device detects, identifies, and characterizes findings based on features or information extracted from images, and provides information about the presence, location, and characteristics of the findings to the user. The analysis is intended to inform the primary diagnostic and patient management decisions that are made by the clinical user. The device is not intended as a replacement for a complete clinician's review or their clinical judgment that takes into account other relevant information from the image or patient history.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Design verification and validation must include:
(i) A detailed description of the image analysis algorithm, including a description of the algorithm inputs and outputs, each major component or block, how the algorithm and output affects or relates to clinical practice or patient care, and any algorithm limitations.
(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide improved assisted-read detection and diagnostic performance as intended in the indicated user population(s), and to characterize the standalone device performance for labeling. Performance testing includes standalone test(s), side-by-side comparison(s), and/or a reader study, as applicable.
(iii) Results from standalone performance testing used to characterize the independent performance of the device separate from aided user performance. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Devices with localization output must include localization accuracy testing as a component of standalone testing. The test dataset must be representative of the typical patient population with enrichment made only to ensure that the test dataset contains a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant disease, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.
(iv) Results from performance testing that demonstrate that the device provides improved assisted-read detection and/or diagnostic performance as intended in the indicated user population(s) when used in accordance with the instructions for use. The reader population must be comprised of the intended user population in terms of clinical training, certification, and years of experience. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Test datasets must meet the requirements described in paragraph (b)(1)(iii) of this section.
(v) Appropriate software documentation, including device hazard analysis, software requirements specification document, software design specification document, traceability analysis, system level test protocol, pass/fail criteria, testing results, and cybersecurity measures.
(2) Labeling must include the following:
(i) A detailed description of the patient population for which the device is indicated for use.
(ii) A detailed description of the device instructions for use, including the intended reading protocol and how the user should interpret the device output.
(iii) A detailed description of the intended user, and any user training materials or programs that address appropriate reading protocols for the device, to ensure that the end user is fully aware of how to interpret and apply the device output.
(iv) A detailed description of the device inputs and outputs.
(v) A detailed description of compatible imaging hardware and imaging protocols.
(vi) Warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.
(vii) A detailed summary of the performance testing, including test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as anatomical characteristics, patient demographics and medical history, user experience, and imaging equipment.
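
The diagnostic accuracy measures named in paragraphs (b)(1)(iii) and (b)(1)(iv) are related by standard formulas. As a minimal sketch, the code below derives predictive values and likelihood ratios from a sensitivity, a specificity, and a prevalence. The function names are hypothetical, the input values are illustrative rather than taken from the study, and the prevalence in particular is an assumption: predictive values depend on it, so figures computed for an enriched test set do not transfer to a screening population.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a given disease prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-); requires 0 < specificity < 1."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Illustrative inputs only (assumed values, not results from this submission).
ppv, npv = predictive_values(sensitivity=0.9, specificity=0.8, prevalence=0.5)
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.9, specificity=0.8)
print(f"PPV={ppv:.3f} NPV={npv:.3f} LR+={lr_pos:.2f} LR-={lr_neg:.3f}")
```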