510(k) Data Aggregation
(266 days)
MammoScreen BD
MammoScreen® BD is a software application intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. MammoScreen BD evaluates the breast tissue composition to provide an ACR BI-RADS 5th Edition breast density category. The device is intended to be used in the population of asymptomatic women undergoing screening mammography who are at least 40 years old.
MammoScreen BD only produces adjunctive information to aid interpreting physicians in the assessment of breast tissue composition. It is not diagnostic software.
Patient management decisions should not be made solely based on analysis by MammoScreen BD.
MammoScreen BD is a software-only device (SaMD) using artificial intelligence to assist radiologists in the interpretation of mammograms. The purpose of the MammoScreen BD software is to automatically process a mammogram to assess the density of the breasts.
MammoScreen BD processes the standard views of 2D mammograms (CC and/or MLO from FFDM, and/or the 2D synthesized mammogram (2DSM) from DBT) to assess breast density.
For each examination, MammoScreen BD outputs the breast density following the ACR BI-RADS 5th Edition breast density category.
MammoScreen BD outputs can be integrated with compatible third-party software such as MammoScreen Suite. Results may be displayed in a web UI, as a DICOM Structured Report, a DICOM Secondary Capture Image, or within patient worklists by the third-party software.
MammoScreen BD takes as input a folder of images in DICOM format and outputs the breast density assessment in the form of a JSON file.
Note that the MammoScreen BD outputs should be used as complementary information by radiologists while interpreting breast density. Patient management decisions should not be made solely on the basis of analysis by MammoScreen BD; the medical professional interpreting the mammogram remains the sole decision-maker.
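The input/output contract described above (a folder of DICOM images in, a JSON density assessment out) can be illustrated with a short sketch. The JSON field names and the use of pydicom below are assumptions made for illustration; the device's actual schema and implementation are not described in the document.

```python
# Minimal sketch of the I/O contract described above: a folder of DICOM images in,
# a JSON breast density assessment out. The JSON field names and the use of pydicom
# are illustrative assumptions, not the device's actual schema or implementation.
import json
from pathlib import Path

import pydicom  # third-party: pip install pydicom


def write_density_report(dicom_dir: str, density_category: str, out_path: str) -> dict:
    """Collect basic view metadata from a DICOM folder and write a JSON summary.

    `density_category` stands in for the model's BI-RADS output ("A".."D");
    the proprietary classifier itself is not reproduced here.
    """
    views = []
    for f in sorted(Path(dicom_dir).glob("*.dcm")):
        ds = pydicom.dcmread(f, stop_before_pixels=True)  # header only, no pixel data
        views.append({
            "file": f.name,
            "laterality": getattr(ds, "ImageLaterality", None),   # e.g. "L" / "R"
            "view_position": getattr(ds, "ViewPosition", None),   # e.g. "CC" / "MLO"
        })

    result = {"views": views, "breast_density": density_category}  # hypothetical schema
    with open(out_path, "w") as fp:
        json.dump(result, fp, indent=2)
    return result
```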
Here's a breakdown of the acceptance criteria and the study that proves MammoScreen BD meets them, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Device Performance Study
The study primarily focuses on the standalone performance of MammoScreen BD in assessing breast density against an expert consensus ground truth. The key metric for performance is the quadratically weighted Cohen's kappa ($\kappa$).
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria | Reported Device Performance |
---|---|
Primary Objective: Superiority in standalone performance for density assignment of MammoScreen BD compared to a pre-determined reference value ($\kappa_{\text{reference}} = 0.85$). | Hologic: $\kappa_{\text{quadratic}} = 89.03$ [95% CI: 87.43 – 90.56] |
Acceptance Criteria (Statistical): The one-sided p-value for the test $H_0: \kappa \leq 0.85$ is less than the significance level ($\alpha = 0.05$) AND the lower bound of the 95% confidence interval for kappa is greater than 0.85, indicating that the observed weighted kappa is statistically significantly greater than 0.85. | Hologic Envision: $\kappa_{\text{quadratic}} = 89.54$ [95% CI: 86.88 – 91.69] |
 | GE: $\kappa_{\text{quadratic}} = 93.19$ [95% CI: 90.50 – 94.92] |
All reported kappa values (expressed here on a ×100 scale, so 89.03 corresponds to $\kappa = 0.8903$) exceed the reference value of 0.85, and the lower bounds of their 95% confidence intervals are likewise above it, satisfying the acceptance criteria.
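As a worked illustration of the acceptance check above, the sketch below computes the quadratically weighted Cohen's kappa between the algorithm's density labels and the consensus labels, together with a bootstrap 95% confidence interval whose lower bound is compared against the 0.85 reference. The percentile bootstrap and the scikit-learn implementation are assumptions; the submission's exact statistical procedure is not reproduced in the document.

```python
# Illustrative check of the acceptance criterion: quadratically weighted Cohen's kappa
# between the algorithm's density labels and the consensus labels, with a bootstrap
# 95% CI whose lower bound must exceed the 0.85 reference. The percentile bootstrap is
# an assumption; the submission's exact statistical procedure is not reproduced here.
import numpy as np
from sklearn.metrics import cohen_kappa_score

LABELS = ["A", "B", "C", "D"]  # ACR BI-RADS 5th Edition density categories


def quadratic_kappa_with_ci(algo_labels, consensus_labels, n_boot=2000, seed=0):
    algo = np.asarray(algo_labels)
    consensus = np.asarray(consensus_labels)
    kappa = cohen_kappa_score(algo, consensus, labels=LABELS, weights="quadratic")

    rng = np.random.default_rng(seed)
    n = len(algo)
    boot = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample studies with replacement
        boot.append(cohen_kappa_score(algo[idx], consensus[idx],
                                      labels=LABELS, weights="quadratic"))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return kappa, (lo, hi)


# Acceptance check against the reference value:
# kappa, (lo, hi) = quadratic_kappa_with_ci(algo_labels, consensus_labels)
# passed = lo > 0.85
```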
2. Sample Size and Data Provenance
Test Set:
- Hologic (original dataset): 922 patients / 1,155 studies
- Hologic Envision (new system for subject device): 500 patients / 500 studies
- GE (new system for subject device): 376 patients / 490 studies
Data Provenance:
- Hologic (original dataset):
- USA: 658 studies (distributed as A:85, B:269, C:241, D:63)
- EU: 447 studies (distributed as A:28, B:169, C:214, D:86)
- Hologic Envision: USA: 500 studies (distributed as A:50, B:200, C:200, D:50)
- GE:
- USA: 359 studies (distributed as A:38, B:155, C:139, D:31)
- EU: 129 studies (distributed as A:4, B:45, C:61, D:19)
All test-set data appear to be retrospective; the document states that the "Data used for the standalone performance testing only belongs to the test group" and that it is distinct from the training data.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: 5 breast radiologists
- Qualifications: At least 10 years of experience in breast imaging interpretation.
4. Adjudication Method for the Test Set
The ground truth was established by majority rule among the assessments of the 5 breast radiologists. This implies that at least 3 of the 5 readers had to agree on a breast density category for it to be assigned as the ground truth.
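A minimal sketch of that majority rule follows; the document does not say how cases without 3-reader agreement were handled, so returning no consensus in that situation is an assumption.

```python
# Sketch of the 3-of-5 majority rule described above. The document does not say how
# cases with no 3-reader agreement were resolved, so returning None here is an assumption.
from collections import Counter
from typing import List, Optional


def consensus_density(reads: List[str]) -> Optional[str]:
    """reads: five BI-RADS density categories, e.g. ["B", "B", "C", "B", "C"]."""
    category, votes = Counter(reads).most_common(1)[0]
    return category if votes >= 3 else None  # None = no majority (handling unspecified)


assert consensus_density(["B", "B", "C", "B", "C"]) == "B"
```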
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
There is no mention of an MRMC comparative effectiveness study being performed to assess how much human readers improve with AI vs. without AI assistance. The study focuses solely on the standalone performance of the AI algorithm. The device is described as "adjunctive information to aid interpreting physicians," but its effect on radiologist performance isn't quantified in this document.
6. Standalone Performance (Algorithm Only)
Yes, a standalone performance study was explicitly conducted. The results for the quadratically weighted Cohen's Kappa presented in the table above (89.03 for Hologic, 89.54 for Hologic Envision, and 93.19 for GE) are all for the algorithm's performance only ("MammoScreen BD against the radiologist consensus assessment").
7. Type of Ground Truth Used
The ground truth used was expert consensus based on the visual assessment of 5 breast radiologists.
8. Sample Size for the Training Set
- Total number of studies: 108,775
- Total number of patients: 32,368
9. How the Ground Truth for the Training Set was Established
The document states that the training modules are "trained with very large databases of annotated mammograms." While "annotated" implies ground truth was established, the specific method for establishing ground truth for the training set is not detailed in the provided text. It only specifies the ground truth establishment method for the test set (majority rule of 5 radiologists). It's common for training data to use various methods for annotation, which might differ from the rigorous expert consensus used for the test set.
(216 days)
MammoScreen® (4)
MammoScreen® 4 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings, including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 4.
MammoScreen 4 is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen 4 processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen 4 outputs the detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
MammoScreen 4 is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen 4 can use prior examinations in the analysis.
The results indicating potential breast cancer, identified by MammoScreen 4, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report.
Note that the MammoScreen 4 outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
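The scoring scheme described above (a 1-10 MammoScreen Score per finding, a per-breast score equal to the highest finding score in that breast, and a case score equal to the highest score overall) can be sketched as a simple roll-up; the `Finding` fields below are hypothetical and purely illustrative.

```python
# Sketch of the score roll-up described above: each finding carries a 1-10 MammoScreen
# Score, each breast takes the highest score among its findings, and the case score is
# the highest breast score. The Finding fields are hypothetical and purely illustrative.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    laterality: str   # "L" or "R"
    lesion_type: str  # "mass/asymmetry", "distortion", or "calcifications"
    score: int        # MammoScreen Score, 1 (low suspicion) to 10 (high suspicion)


def roll_up_scores(findings: List[Finding]) -> Dict[str, object]:
    breast_scores: Dict[str, int] = {}
    for f in findings:
        breast_scores[f.laterality] = max(breast_scores.get(f.laterality, 0), f.score)
    case_score = max(breast_scores.values(), default=0)
    return {"per_breast": breast_scores, "case": case_score}


# Example: a score-7 mass on the left and a score-3 calcification cluster on the right
# give breast scores {"L": 7, "R": 3} and a case score of 7.
```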
The provided text describes the acceptance criteria and a study to prove that MammoScreen® 4 meets these criteria. Here is a breakdown of the requested information:
Acceptance Criteria and Device Performance
1. Table of Acceptance Criteria and Reported Device Performance
Rationale for using "MammoScreen 2" data for comparison: The document states that the standalone testing for MammoScreen 4 compared its performance against "MammoScreen 2 on Dimension". While MammoScreen 3 is the predicate device, the provided performance data in the standalone test section specifically refers to MammoScreen 2. The PCCP section later references performance targets for MammoScreen versions 1, 2, and 3, but the actual "Primary endpoint" results for the current device validation are given in comparison to MammoScreen 2. Therefore, the table below uses the reported performance against MammoScreen 2 as per the "Primary endpoint" section.
Metric | Acceptance Criteria | Reported Device Performance (MammoScreen 4 vs. MammoScreen 2) |
---|---|---|
Primary Objective | Non-inferiority in standalone cancer detection performance compared to the previous version of MammoScreen (specifically MammoScreen 2 on Dimension). | Achieved. |
AUC at the mammogram level | Positive lower bound of the 95% CI of the difference in endpoints between MammoScreen 4 and MammoScreen 2. | MS4: 0.894 (0.870, 0.919); MS2: 0.867 (0.839, 0.896); Δ: 0.027 (0.002, 0.052), p |
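As a worked illustration of the primary endpoint in the table above, the sketch below estimates the difference in mammogram-level AUC between two algorithm versions scored on the same cases, together with a 95% confidence interval for that difference; acceptance corresponds to a positive lower bound. The paired percentile bootstrap is an assumption, since the submission's exact confidence-interval method is not reproduced here.

```python
# Illustrative version of the primary-endpoint check above: difference in mammogram-level
# AUC between the new and the previous algorithm version on the same cases, with a paired
# bootstrap 95% CI. Acceptance corresponds to a positive lower bound of that CI. The
# percentile bootstrap is an assumption; the submission's exact CI method is not given here.
import numpy as np
from sklearn.metrics import roc_auc_score


def auc_difference_ci(y_true, scores_new, scores_old, n_boot=2000, seed=0):
    y_true = np.asarray(y_true)          # 1 = cancer, 0 = non-cancer, per mammogram
    scores_new = np.asarray(scores_new)  # case scores from the new version
    scores_old = np.asarray(scores_old)  # case scores from the previous version

    delta = roc_auc_score(y_true, scores_new) - roc_auc_score(y_true, scores_old)

    rng = np.random.default_rng(seed)
    n = len(y_true)
    deltas = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample cases, keeping the pairing intact
        if y_true[idx].min() == y_true[idx].max():
            continue  # skip degenerate resamples containing only one class
        deltas.append(roc_auc_score(y_true[idx], scores_new[idx])
                      - roc_auc_score(y_true[idx], scores_old[idx]))
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    return delta, (lo, hi)


# Acceptance: lo > 0, i.e. the new version is not worse at the mammogram level.
```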
(124 days)
MammoScreen BD
MammoScreen® BD is a software application intended for use with compatible full field digital mammography and digital breast tomosynthesis systems. MammoScreen BD evaluates the breast tissue composition to provide an ACR BI-RADS 5th Edition breast density category. The device is intended to be used in the population of asymptomatic women undergoing screening mammography who are at least 40 years old.
MammoScreen BD only produces adjunctive information to aid interpreting physicians in the assessment of breast tissue composition. It is not diagnostic software.
Patient management decisions should not be made solely based on analysis by MammoScreen BD.
MammoScreen BD is a software-only device (SaMD) using artificial intelligence to assist radiologists in the interpretation of mammograms. The MammoScreen BD software automatically processes a mammogram to assess the density of the breasts.
For each examination, MammoScreen BD outputs the breast density in accordance with the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) Atlas 5th Edition breast density categories "A" through "D".
MammoScreen BD takes as input a folder of images in DICOM format and outputs a breast density assessment in the form of a JSON file. MammoScreen BD outputs can be integrated with compatible third-party software such as the MammoScreen Web-UI interface, a PACS viewer (using DICOM Structured Report or DICOM Secondary Capture SOP Class UIDs), patient worklists, or reporting software.
Here is a detailed breakdown of the acceptance criteria and study information for MammoScreen BD, based on the provided document:
1. Table of Acceptance Criteria and Reported Device Performance
The primary acceptance criteria for the initial clearance of MammoScreen BD were related to the accuracy and agreement with ground truth established by radiologists for classifying breast density into four BI-RADS categories.
Acceptance Criteria (from PCCP section for future modifications) | Primary Objective Reported Device Performance (4-class task) | Primary Objective Reported Device Performance (Binary task) |
---|---|---|
Quadratic Kappa on GE mammograms superior to 0.85 | Quadratic Cohen's Kappa: 89.03 (95% CI: 87.43 - 90.56) | Quadratic Cohen's Kappa: 84.50 (95% CI: 81.46, 87.36) |
Linear Kappa, Accuracy, and Density Bins (A, B, C, D) | Accuracy: 84.68 (95% CI: 82.68, 86.67) | Accuracy: 92.29 (95% CI: 90.82, 93.77) |
Note: The document explicitly states "Acceptance criteria of the updated device" only under the PCCP for future modifications and does not state separate acceptance criteria for the initial clearance. The reported performance metrics (quadratic Cohen's kappa and accuracy for both the 4-class and binary classification tasks) are therefore implicitly the metrics against which the device's performance was judged for its initial clearance, demonstrating its effectiveness against the ground truth.
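For the binary (dense vs. non-dense) task reported above, the conventional ACR grouping is assumed below: categories A and B count as non-dense, and C and D as dense (the document itself does not spell out the mapping). A minimal sketch:

```python
# Sketch of the binary (dense vs. non-dense) task reported above. The document does not
# spell out the mapping, so the conventional ACR grouping is assumed: categories A and B
# count as "non-dense", C and D as "dense".
from sklearn.metrics import accuracy_score, cohen_kappa_score

DENSE = {"C", "D"}


def to_binary(categories):
    return ["dense" if c in DENSE else "non-dense" for c in categories]


def binary_metrics(algo_labels, consensus_labels):
    algo_b = to_binary(algo_labels)
    gt_b = to_binary(consensus_labels)
    return {
        "accuracy": accuracy_score(gt_b, algo_b),
        "kappa": cohen_kappa_score(gt_b, algo_b),
    }
```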
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for Test Set: 922 women / 922 exams (each exam comprising 4 views).
- Data Provenance: Retrospectively collected from two US screening centers and one French screening center.
- 52.6% of cases (485 patients) originated from the USA.
- 47.4% of cases (437 patients) originated from France.
- The test data came from clinical centers that did not contribute to algorithm development, mitigating center-induced bias.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
- Number of Experts: 5 breast radiologists.
- Qualifications of Experts: The document specifies "5 breast radiologists" but does not provide details on their years of experience or specific board certifications.
4. Adjudication Method for the Test Set
- Adjudication Method: "Consensus among the visual assessment of 5 breast radiologists." The exact method (e.g., majority vote, sequential review with tie-breaking) is not explicitly detailed beyond "consensus."
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No, a multi-reader multi-case (MRMC) comparative effectiveness study evaluating human readers with AI assistance versus without AI assistance was not conducted or reported in this document. The study focuses on the standalone performance of the AI algorithm against expert consensus.
6. Standalone (Algorithm Only) Performance Study
- Yes, a standalone performance study was conducted. The "Primary Objectives" and "Performance Data" sections directly evaluate "the accuracy and the reproducibility of MammoScreen BD algorithm in assessing the breast density category" in terms of agreement with the ground truth established by the consensus of 5 radiologists.
- For the 4-class task, the algorithm achieved a quadratic Cohen's kappa of 89.03 and an accuracy of 84.68%.
- For the binary classification task (dense vs. non-dense), the algorithm achieved a quadratic Cohen's kappa of 84.50 and an accuracy of 92.29%.
7. Type of Ground Truth Used
- Type of Ground Truth: Expert Consensus. Specifically, "ground truth (GT) established by consensus among the visual assessment of 5 breast radiologists."
8. Sample Size for the Training Set
- Sample Size for Training Set: 32,368 patients, comprising 108,775 studies.
9. How Ground Truth for the Training Set Was Established
- The document states that the training data was derived from "De-identified screening mammograms... retrospectively collected from 32,368 patients in 2 different US sites."
- It does not explicitly state how the ground truth for the training set was established. It only describes the density distribution (A: 12.79%, B: 34.58%, C: 42.94%, D: 9.38%) within the training data, implying these were pre-existing labels. It's common for such labels to be derived from radiologist reports or existing clinical records, but the specific method of ground truth establishment for the training set is not detailed.
(182 days)
MammoScreen® (3)
MammoScreen® 3 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 3.
MammoScreen is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen outputs detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
MammoScreen is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen can use prior examinations in the analysis.
MammoScreen can also aid in the reporting process by populating an initial report with chosen findings, including lesion type and position (quadrant, depth and distance to nipple).
The results indicating potential breast cancer, identified by MammoScreen, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report. Additionally, certain outputs like the case score can be reported into the patient management worklist.
Note that the MammoScreen outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
Here's a summary of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
The acceptance criteria are not explicitly listed in a separate table within the document. However, the clinical and standalone performance studies establish benchmarks and demonstrate achievement of certain levels of accuracy, sensitivity, and specificity. The criteria are implied through the statement "MammoScreen 3 achieved superior performance compared to the predicate device" and the detailed statistical results provided.
Table of Performance Results
Given that specific "acceptance criteria" (e.g., "AUROC must be > X") are not explicitly stated, I will present the reported performance of MammoScreen 3 in both co-reading and standalone modes, along with improvements (effect sizes) in the co-reading scenario.
Performance Metric | Acceptance Criteria (Implied) | MammoScreen 3 (Co-reading with Radiologists) | MammoScreen 3 (Standalone) | Notes |
---|---|---|---|---|
Radiologist Performance (Co-reading) | Superior to unaided radiologist performance | |||
Average AUROC (aided) | Higher than unaided | 0.871 [0.829 - 0.912] | N/A | Unaided: 0.797 [0.752 - 0.843] |
Average Sensitivity (aided) | Higher than unaided | 0.793 [0.725 - 0.860] | N/A | Unaided: 0.706 [0.633 - 0.780] |
Average Specificity (aided) | Higher than unaided | 0.836 [0.805 - 0.867] | N/A | Unaided: 0.815 [0.782 - 0.848] |
Standalone Performance (overall mammogram level) | Superior to unaided radiologists; Non-inferior to aided radiologists | N/A | 0.883 [0.837 - 0.929] | Superior to unaided: ΔAUROC = +0.085 (p |
(191 days)
MammoScreen 2.0
MammoScreen® is intended for use as a concurrent reading aid for interpreting physicians, to help identify findings on screening FFDM and DBT acquired with compatible mammography systems and assess their level of suspicion. Output of the device includes marks placed on findings on the mammogram and level of suspicion scores. The findings could be soft tissue lesions or calcifications. The level of suspicion score is expressed at the finding level, for each breast and overall for the mammogram. Patient management decisions should not be made solely on the basis of analysis by MammoScreen®.
MammoScreen 2.0 automatically processes the four views (one CC and one MLO per breast) of standard screening FFDM or DBT, and outputs a corresponding report on a separate screen, alongside the monitors used for reading. This report is designed to be easily readable with very few interactions required by providing an overall level of suspicion of each exam and giving explicit visual indications when highly suspicious exams are detected.
MammoScreen 2.0 detects and characterizes findings on a scale from one to ten, referred to as the MammoScreen score. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion.
Furthermore, MammoScreen 2.0 provides a high level of interpretability. Results are by construction consistent at the finding, breast and mammogram level. A breast takes on the highest score of its detected findings, and the level of suspicion for the exam is driven by the breast(s) with the highest score. Therefore, it is always possible to track a high suspicion of malignancy for an exam to the corresponding breast(s), and to a specific finding within the breast(s).
Here's a breakdown of the acceptance criteria and the study that proves the device meets them based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
Performance Metric | Acceptance Criteria (Implicit) | Reported Device Performance (FFDM) | Reported Device Performance (DBT) |
---|---|---|---|
Radiologist Performance with AID (AUC) | Superior to unaided radiologist performance | Increased from 0.77 to 0.80 | Increased from 0.79 to 0.83 |
Standalone Performance (AUC) | Non-inferior to unaided radiologist performance | 0.79 (non-inferior to 0.77 unaided) | 0.84 (superior to 0.79 unaided) |
Standalone Performance vs. Predicate (FFDM) | Non-inferior to predicate device | Achieved non-inferior performance | Not applicable |
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size (FFDM & DBT): 240 cases (enriched sample set)
- Data Provenance: Not explicitly stated regarding country of origin. The studies are described as "reader studies," implying prospective collection for the purpose of the study or a curated retrospective selection. The text doesn't specify if it's purely retrospective or prospective.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- Number of Experts: 14 for the 2D (FFDM) study and 20 for the 3D (DBT) study.
- Qualifications: "MOSA-qualified and ABR-certified readers." (MOSA and ABR are common certifications for radiologists in the US, suggesting a US context for the experts).
4. Adjudication Method for the Test Set
The provided text does not explicitly state the adjudication method used to establish the ground truth for the test set. It mentions an "enriched sample set" and "MQSA-qualified and ABR-certified readers," suggesting expert consensus, but the specific process (e.g., 2+1, 3+1) is not detailed.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance
- Yes, an MRMC study was done. Clinical validation included two reader studies (one for FFDM and one for DBT) using a multi-reader multi-case (MRMC) cross-over design.
- Effect Size of Improvement:
- FFDM: Average AUC for radiologists increased from 0.77 (without AI) to 0.80 (with AI). (Improvement: 0.03 AUC)
- DBT: Average AUC for radiologists increased from 0.79 (without AI) to 0.83 (with AI). (Improvement: 0.04 AUC)
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Yes, standalone performance was evaluated. The objectives of the studies included determining: "Whether the performance of MammoScreen standalone is superior to unaided radiologist performance" and "Whether the performance of MammoScreen standalone is non-inferior to aided radiologist performance."
- Standalone Performance Results:
- FFDM: AUC = 0.79 (found to be non-inferior to the average unaided radiologists' performance of 0.77).
- DBT: AUC = 0.84 (found to be superior to the average unaided radiologists' performance of 0.79).
- Additionally, standalone performance tests for MammoScreen 2.0 (FFDM) demonstrated non-inferiority compared to the predicate device.
7. The Type of Ground Truth Used
The text implicitly suggests expert consensus based on the mention of "MOSA-qualified and ABR-certified readers." It also references the training of deep learning modules with "biopsy-proven examples of breast cancer and normal tissue," indicating that biopsy (pathology) results were used as the ultimate ground truth to establish the benign/malignant status of lesions in the training data, and likely in the test set's ground truth development as well. The study assesses performance in the "detection of breast cancer," linking the ground truth directly to malignancy.
8. The Sample Size for the Training Set
The document states that the deep learning modules were "trained with very large databases of biopsy-proven examples of breast cancer and normal tissue." However, a specific numerical sample size for the training set is not provided.
9. How the Ground Truth for the Training Set Was Established
The ground truth for the training set was established using "biopsy-proven examples of breast cancer and normal tissue." This indicates that histopathological (pathology) results from biopsies served as the definitive ground truth for classifying cases as cancerous or normal during the training of the AI model.
(173 days)
MammoScreen
MammoScreen™ is intended for use as a concurrent reading aid for interpreting physicians, to help identify findings on screening FFDM acquired with compatible mammography systems and assess their level of suspicion. Output of the device includes marks placed on findings on the mammogram and level of suspicion scores. The findings could be soft tissue lesions or calcifications. The level of suspicion score is expressed at the finding level, for each breast and overall for the mammogram. Patient management decisions should not be made solely on the basis of analysis by MammoScreen™.
MammoScreen is a software-only device for aiding interpreting physicians in identifying focal findings suspicious for breast cancer in screening FFDM (full-field digital mammography) acquired with compatible mammography systems. The product consists of a processing server and a web interface. The software applies algorithms for recognition of suspicious calcifications and soft tissue lesions. These algorithms have been trained on large databases of biopsy-proven examples of breast cancer, benign lesions, and normal tissue. MammoScreen automatically processes FFDM, and the output of the device can be used by radiologists concurrently with the reading of mammograms.
The user interface of MammoScreen has several functions:
a) Activation of computer-aided detection (CAD) marks to highlight locations, known as findings, where the device detected calcifications or soft tissue lesions suspicious for cancer.
b) Association of findings with a score, known as the MammoScreen Score, which characterizes findings on a 1-10 scale with increasing level of suspicion. Only the most suspicious findings (with a MammoScreen Score of 5 or greater) are initially marked to limit the number of findings to review. The user shall also review findings with a score of 4 or lower.
c) Indication, with matching markers, when the same finding is detected in multiple views of the FFDM.
MammoScreen is configured as a DICOM Web compliant node in a network and receives its input images from another DICOM node, called "the DICOM Web Server". The MammoScreen output is displayed on the screen of a personal computer compliant with requirements specified in the User Manual. The image analysis unit includes machine learning components trained to detect positive findings (calcifications and soft tissue lesions).
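The default-display rule described above (findings with a MammoScreen Score of 5 or more are marked initially, lower-scoring findings only on request) can be sketched as follows; the finding representation is a hypothetical illustration.

```python
# Sketch of the default-display rule described above: findings scoring 5 or more are
# marked initially, while findings scoring 4 or lower are shown only on user request.
# The (score, description) tuple representation is a hypothetical illustration.
DEFAULT_DISPLAY_THRESHOLD = 5


def marks_to_display(findings, show_low_scores=False):
    """findings: iterable of (score, description) tuples, scores on the 1-10 scale."""
    findings = list(findings)
    if show_low_scores:
        return findings
    return [f for f in findings if f[0] >= DEFAULT_DISPLAY_THRESHOLD]
```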
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
The provided document defines acceptance criteria primarily through comparison with a predicate device and through the results of a clinical reader study. The core acceptance criterion for the clinical study appears to be an improvement in radiologist performance when using MammoScreen assistance compared to unaided reading.
Table of Acceptance Criteria and Reported Device Performance
Criterion Type | Specific Criterion | Reported Device Performance (MammoScreen) | Met? |
---|---|---|---|
Premarket Equivalence (vs. Predicate Device K181704 Transpara) | |||
Classification Regulation | 21 CFR 892.2090 | SAME | Yes |
Medical Device Class | Class II | SAME | Yes |
Product Code | QDQ | SAME | Yes |
Level of Concern | Moderate | SAME | Yes |
Intended Use | Concurrent reading aid for physicians interpreting screening FFDM to identify findings and assess their level of suspicion. | SAME | Yes |
Target patient population | Women undergoing FFDM screening mammography. | SAME | Yes |
Target user population | Physicians interpreting FFDM screening mammograms. | SAME | Yes |
Design | Software-only device. | SAME | Yes |
Scoring System | While not identical, the principle (level of suspicion from low to high) should be substantially equivalent. | 10-point scale vs. predicate's 1-100. Manufacturer claims interpretability benefits. Exam-level score provided. Deemed "substantially equivalent." | Yes |
Finding Discovery | Reducing the number of findings the user has to review. | Default display for scores ≥ 5, user request for scores ≤ 4. Deemed "equivalent." | Yes |
Performance Comparison | Overall performance gains should be comparable and not raise new safety/effectiveness questions. | AUC: unaided = 0.769, assisted = 0.798 (Difference: 0.028; P = 0.035). Predicate reported unaided = 0.866, assisted = 0.887. Deemed "still comparable." | Yes |
Fundamental Scientific Technology | Involves medical image processing and machine learning, particularly deep learning for suspicious findings. | SAME | Yes |
Clinical Performance (Reader Study) | |||
Radiologist Performance | Radiologist performance with MammoScreen assistance is superior to unaided performance (main objective). | Average AUC improved from 0.769 (unaided) to 0.798 (with MammoScreen) (Difference = 0.028; P = 0.035). | Yes |
Reading Time | Should not significantly increase. | Average reading time increased by 14% for scores > 4, but decreased by 2% for scores ≤ 4 in the second session. Overall, maximum increase did not exceed 15s. | Yes |
Standalone Performance | Non-inferior to average unaided radiologist performance. | Standalone AUC = 0.790; Non-inferior to average unaided radiologist AUC = 0.770 (absence of statistical effect (p>0.05) and lower CI of diff > -0.03). | Yes |
Sensitivity | Sensitivity of readers tended to increase with the use of MammoScreen without decreasing specificity (conclusion statement). | Reported overall performance improvement was statistically significant at breast (AUC) and lesion (pAUC) level, confirming trend. Specific values not explicitly in acceptance criteria here. | Yes |
Study Details for Device Acceptance
Sample Size Used for the Test Set and Data Provenance:
- Test Set Size: 240 mammographic screening images (cases).
- Data Provenance: Acquired at a US center. The text states "US FFDM acquired on Hologic® devices, and performance comparison with FFDM acquired on GE® devices," indicating images from at least two major mammography system manufacturers in the US.
- Retrospective/Prospective: Retrospective. The study "collected" images after they were acquired, and "For each exam, the cancer status has been verified... and used as gold standard."
Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts:
- The document does not explicitly state the number of experts used to establish the ground truth or their specific qualifications (e.g., years of experience). It only states that "the cancer status has been verified by either biopsy results (for all cancer positive cases and some of the negative cases) or an adequate follow-up (for negative cases only) and used as gold standard." This implies clinical data and follow-up was the primary ground truth, not consensus of a specific number of experts.
Adjudication Method for the Test Set:
- The document does not explicitly describe an adjudication method for establishing ground truth from multiple expert reads. Ground truth was established via biopsy or adequate follow-up, which are objective clinical outcomes, not subjective reader interpretations.
Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- Yes, an MRMC study was performed.
- Effect Size of Human Readers Improvement with AI vs. Without AI Assistance:
- Average AUC: Increased from 0.769 (unaided) to 0.798 (with MammoScreen assistance).
- Difference: 0.028 (P = 0.035), indicating a statistically significant improvement.
- The AUC was higher with MammoScreen aid for 11 of the 14 radiologists.
- Performance improvement was also statistically significant at the breast (in terms of AUC) and lesion (in terms of pAUC) level.
Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study:
- Yes, a standalone performance study was conducted.
- Standalone Performance: MammoScreen's standalone performance (AUC = 0.790) was found to be non-inferior to the average performance of unaided radiologists (AUC = 0.770). The lower confidence interval of the difference of AUC was equal to or superior to the effect size (-0.03), and the P-value was >0.05, confirming non-inferiority.
- Detailed standalone performance metrics were also provided for mammogram, breast, and finding levels (soft tissue lesions and calcifications), including ROC AUC, sensitivity, and specificity for Hologic, GE, and combined datasets.
Type of Ground Truth Used:
- Clinical Outcomes Data: The primary ground truth was established by:
- Biopsy results (for all cancer-positive cases and some negative cases).
- Adequate follow-up (for negative cases only).
Sample Size for the Training Set:
- The document states that the algorithms "have been trained on large databases of biopsy proven examples of breast cancer, benign lesions and normal tissue." However, it does not specify the exact sample size of the training set.
How the Ground Truth for the Training Set Was Established:
- The ground truth for the training set was established using "biopsy proven examples of breast cancer, benign lesions and normal tissue." This implies a similar methodology to the test set, relying on objective clinical outcomes (histopathology from biopsy) rather than expert consensus on images.