K Number
K210670
Device Name
BU-CAD
Date Cleared
2021-12-21 (291 days)

Product Code
Regulation Number
892.2090
Panel
RA
Reference & Predicate Devices
Intended Use

BU-CAD is a software application indicated to assist trained interpreting physicians in analyzing the breast ultrasound images of patients with soft tissue breast lesions suspicious for breast cancer who are being referred for further diagnostic ultrasound examination.

Output of the device includes regions of interest (ROIs) and lesion contours placed on breast ultrasound images assisting physicians to identify suspicious soft tissue lesions from up to two orthogonal views of a single lesion, and region-based analysis of lesion malignancy upon the physician's query. The region-based analysis indicates the score of lesion characteristics (SLC), and corresponding BI-RADS categories in user-selected ROIs or ROIs automatically identified by the software. In addition, BU-CAD also automatically classifies lesion shape, orientation, margin, echo pattern, and posterior features according to BI-RADS descriptors.

BU-CAD may also be used as an image viewer of multi-modality digital images, including ultrasound and mammography. The software includes tools that allow users to adjust, measure and document images, and output into a structured report (SR).

Patient management decisions should not be made solely on the basis of analysis by BU-CAD.

Limitations: BU-CAD is not to be used on sites of post-surgical excision, or images with Doppler, elastography, or other overlays present in them. BU-CAD is not intended for the primary interpretation of digital mammography images. BU-CAD is not intended for use on mobile devices.

Device Description

BU-CAD developed by TaiHao Medical Inc. is a software system designed to assist users in analyzing breast ultrasound images including identification of regions suspicious for breast cancer and assessment of their malignancy. The system consists of a Viewer, a Lesion Identification Module, and a Lesion Analysis Module. The Viewer loads breast ultrasound and mammography images from local storage or PACS for review, and includes tools for measurement and image adjustment. The Lesion Identification Module identifies automated ROIs and generates lesion contours on breast ultrasound images. The Lesion Analysis Module analyzes given ROIs and generates a score of lesion characteristics (SLC), BI-RADS category, and BI-RADS descriptors. Users can replace automated ROIs with re-delineated rectangular ROIs for analysis. The last analysis results are displayed and modifiable by the user. BU-CAD also supports exporting CAD results to third-party reporting software.
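As an illustration of the viewer-side workflow described above (not TaiHao's implementation, which is proprietary), here is a minimal Python sketch that loads an ultrasound frame from DICOM and overlays a user-delineated rectangular ROI. The file name and ROI coordinates are hypothetical, and a single-frame grayscale image is assumed:

```python
# Hypothetical sketch: load a breast ultrasound frame from DICOM and overlay a
# rectangular ROI, mirroring the "re-delineated rectangular ROI" workflow.
import pydicom
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

ds = pydicom.dcmread("breast_us_frame.dcm")  # hypothetical file path
frame = ds.pixel_array                       # pixel data (grayscale assumed)

fig, ax = plt.subplots()
ax.imshow(frame, cmap="gray")
# Hypothetical user-delineated ROI: (x, y, width, height) in pixel coordinates.
x, y, w, h = 120, 80, 96, 64
ax.add_patch(Rectangle((x, y), w, h, fill=False, edgecolor="red", linewidth=1.5))
ax.set_title("User-delineated ROI (illustrative)")
plt.show()
```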

AI/ML Overview

This summary describes the acceptance criteria for BU-CAD and the study evidence demonstrating that the device meets them.

The information breaks down as follows:

1. Table of Acceptance Criteria and Reported Device Performance

The document implies its acceptance criteria through a comparative study against human readers rather than stating explicit quantitative thresholds for each metric. Success is defined by statistically significant improvement over unaided human performance and performance comparable to the predicate devices. The primary acceptance criterion appears to be superiority of aided over unaided performance on AUC_LROC.

| Metric / Feature | Acceptance Criteria (Implied) | Reported Device Performance (BU-CAD) |
| --- | --- | --- |
| **MRMC Study (Aided vs. Unaided)** | | |
| AUC_LROC (mean shift) | Statistically significant improvement over unaided performance | +0.0374 (95% CI: 0.0190, 0.0557), p = 0.0001 (unaided AUC 0.7786, aided AUC 0.8160) |
| Sensitivity | Higher when aided | Unaided: 0.9225 (0.8896, 0.9554); aided: 0.9353 (0.9050, 0.9655) |
| Specificity | Higher when aided | Unaided: 0.3165 (0.2694, 0.3636); aided: 0.3611 (0.3124, 0.4098) |
| NPV (unadjusted) | Higher when aided | Unaided: 0.8623 (0.8048, 0.9198); aided: 0.8945 (0.8456, 0.9434) |
| PPV (unadjusted) | Higher when aided | Unaided: 0.4876 (0.4433, 0.5319); aided: 0.5056 (0.4607, 0.5505) |
| False positives (FP → TN) | Positive net benefit (reduction in FPs) | Net benefit of +267 events across 16 readers (790 FP→TN vs. 523 TN→FP transitions on benign cases) |
| Interpretation time | Decrease in interpretation time | Statistically significant decrease of approximately 40% |
| BI-RADS descriptor accuracy | Improvement for at least one subcategory of each descriptor | Unaided vs. aided accuracy: Shape 78.14% vs. 78.92%; Orientation 82.15% vs. 82.20%; Margin 79.22% vs. 77.34%; Echo Pattern 76.49% vs. 66.52%; Posterior Features 66.51% vs. 67.53%. Aided Margin and Echo Pattern accuracy decreased, but an overall benefit was claimed from subcategory-level improvements in each descriptor |
| **Standalone Study** | | |
| AUC_LROC (628 reader-study cases) | Higher than unaided reading performance on the same cases | 0.7987 (0.7626, 0.8348) vs. unaided 0.7786 |
| AUC_LROC (1,139 standalone cases) | Acceptable discrimination (AUC > 0.7) and robust subgroup performance | 0.8203 (0.7947, 0.8458); "excellent" or "outstanding" discrimination (AUC_LROC > 0.8 or > 0.9) across most subgroups, with some "acceptable" (0.7 to 0.8) |
| Lesion Identification Module (CADe) accuracy | High accuracy for automated ROI identification | 93.24% (1062/1139) met the objective criterion (auto ROI center inside the ground-truth ROI with >=50% overlap) |
| Lesion Analysis Module (CADx) robustness | Stable AUC despite ROI variations (see the sketch after this table) | AUC remained stable (0.840 to 0.846) under 20% random ROI shifts and above 0.8 with systematic ROI shrinking up to 16% |
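The robustness row above describes a perturbation experiment: jitter or shrink the input ROIs and confirm that the CADx AUC barely moves. Below is a minimal sketch of that procedure, assuming a scoring callable `cadx_score(image, roi)` that stands in for the proprietary Lesion Analysis Module (the name and signature are hypothetical, not TaiHao's API):

```python
# Sketch of an ROI-perturbation robustness check: shift each ROI randomly by up
# to 20% of its size (or shrink it symmetrically), rescore every case, and
# compare the resulting AUC with the unperturbed one. ROIs are (x, y, w, h)
# rectangles in pixel coordinates.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def shift_roi(roi, max_frac=0.20):
    """Translate an ROI by up to max_frac of its width/height, at random."""
    x, y, w, h = roi
    return (x + rng.uniform(-max_frac, max_frac) * w,
            y + rng.uniform(-max_frac, max_frac) * h, w, h)

def shrink_roi(roi, frac=0.16):
    """Shrink an ROI symmetrically about its center by fraction `frac`."""
    x, y, w, h = roi
    return (x + w * frac / 2, y + h * frac / 2, w * (1 - frac), h * (1 - frac))

def perturbed_auc(cadx_score, images, rois, labels, perturb):
    """AUC of the CADx scores after applying `perturb` to every ROI."""
    scores = [cadx_score(img, perturb(roi)) for img, roi in zip(images, rois)]
    return roc_auc_score(labels, scores)
```

Comparing `perturbed_auc(...)` under `shift_roi` and `shrink_roi` against the baseline AUC reproduces the form of the 0.840-0.846 stability claim, though the submission's exact protocol is not described.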

2. Sample Size Used for the Test Set and Data Provenance

  • Test Set Sample Size:
    • MRMC Reader Study: 628 cases
    • Standalone Study: 1,139 cases (the 628 reader-study cases plus 511 additional cases).
  • Data Provenance:
    • MRMC Reader Study: 456 cases from the United States, 172 cases from Taiwan.
    • Standalone Study: 531 cases from North America, 36 cases from Europe, 572 cases from Taiwan.
  • Retrospective or Prospective: The study is clearly stated as a retrospective study ("fully crossed multi-reader multi-case receiver operating characteristic (MRMC-ROC) retrospective study").

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications

The document does not explicitly state the number of experts used to establish the ground truth for the test set. However, it mentions an "expert panel" in the context of defining ground truth ROIs for the robustness experiments.

Qualifications of Experts (Readers in MRMC study, likely similar for GT):

  • 16 Readers participated in the MRMC study.
  • Specialties: 14 Radiologists, 2 Breast Surgeons.
  • Experience: Ranged from 1 year to >30 years of experience (as a radiologist/breast surgeon).
  • Certifications/Training: Most radiologists (13/14) were MQSA certified. 4/14 radiologists had received Breast Image Fellowship training.

4. Adjudication Method for the Test Set

The document does not explicitly detail an adjudication method for the test set (e.g., how malignancy or BI-RADS category was finalized when initial assessments disagreed). It does, however, list "pathology-proven benign," "two-year follow-up benign," and specific malignant pathology types (DCIS, IDC, ILC, other cancer types) in the dataset demographics, indicating that pathology and clinical follow-up, rather than a reader adjudication process, served as the ground truth for malignancy.

For the reader study itself, a "fully crossed" MRMC-ROC design was used: each reader evaluated every case independently, both unaided and aided. The performance metrics (AUC, sensitivity, specificity, etc.) were derived by comparing each reader's interpretations against the established ground truth, as sketched below.
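As a concrete illustration of that comparison, here is a small Python helper that scores binary reader calls against the pathology/follow-up ground truth and returns the operating-point metrics quoted throughout this summary. The boolean-call convention (e.g., positive = BI-RADS 4A or higher) is an assumption, since the summary does not state the cutoff used for these tables:

```python
# Sketch: compute sensitivity, specificity, PPV, and NPV from per-case reader
# calls and ground-truth labels. Assumes every denominator is nonzero.
def operating_point_metrics(calls, truth):
    """calls, truth: iterables of booleans (True = malignant)."""
    tp = sum(c and t for c, t in zip(calls, truth))
    tn = sum((not c) and (not t) for c, t in zip(calls, truth))
    fp = sum(c and (not t) for c, t in zip(calls, truth))
    fn = sum((not c) and t for c, t in zip(calls, truth))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```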

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, What Was the Effect Size of Reader Improvement With vs. Without AI Assistance?

  • Yes, an MRMC comparative effectiveness study was done.
  • Effect Size of Improvement:
    • Primary Objective (AUC_LROC Shift): The mean AUC_LROC shift was +0.0374 (unaided AUC 0.7786, aided AUC 0.8160). This improvement was statistically significant (p-value = 0.0001); a simplified sketch of how such a shift can be estimated appears after this list.
    • Comparison to Predicates: This shift (+0.0374) was stated to be "similar" to Koios DS for Breast (+0.037) and Transpara™ (+0.02).
    • Other Metrics (Aided vs Unaided):
      • Sensitivity: Increased from 0.9225 to 0.9353.
      • Specificity: Increased from 0.3165 to 0.3611.
      • NPV (unadjusted): Increased from 0.8623 to 0.8945.
      • PPV (unadjusted): Increased from 0.4876 to 0.5056.
      • Reduction in False Positives: A net benefit of +267 events (FP to TN) indicating reduction of false positives across all readers for benign cases.
      • Interpretation Time: Decreased by approximately 40%.
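For intuition about the headline +0.0374 figure, here is a deliberately simplified sketch of estimating an aided-minus-unaided AUC shift with a paired bootstrap over cases. The submission would have used a formal MRMC analysis (e.g., Obuchowski-Rockette) on AUC_LROC, which also credits correct localization; this sketch substitutes plain ROC AUC and is illustrative only:

```python
# Paired bootstrap over cases for the mean-reader AUC shift. Inputs: `unaided`
# and `aided` are (n_readers x n_cases) score arrays; `truth` is a boolean
# per-case label array. Not the MRMC method used in the submission.
import numpy as np
from sklearn.metrics import roc_auc_score

def mean_auc(scores, truth):
    """Mean ROC AUC across readers."""
    return np.mean([roc_auc_score(truth, s) for s in scores])

def bootstrap_auc_shift(unaided, aided, truth, n_boot=2000, seed=0):
    """Point estimate and 95% CI for the aided-minus-unaided mean AUC."""
    rng = np.random.default_rng(seed)
    n_cases = truth.shape[0]
    shifts = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_cases, n_cases)   # resample cases with replacement
        if np.unique(truth[idx]).size < 2:        # need both classes present
            continue
        shifts.append(mean_auc(aided[:, idx], truth[idx])
                      - mean_auc(unaided[:, idx], truth[idx]))
    lo, hi = np.percentile(shifts, [2.5, 97.5])
    return float(np.mean(shifts)), (float(lo), float(hi))
```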

6. If a Standalone Study (i.e., Algorithm Only, Without Human-in-the-Loop Performance) Was Done

  • Yes, a standalone study was done.
  • Standalone Performance (AUC_LROC):
    • On the 628 reader study cases: 0.7987 (95% CI: 0.7626, 0.8348)
    • On the larger 1,139 standalone study cases: 0.8203 (95% CI: 0.7947, 0.8458)
  • Standalone Sensitivity & Specificity (using BI-RADS 4A as cutoff):
    • Sensitivity: 88.33% (439/497)
    • Specificity: 57.94% (372/642)
  • Lesion Identification Module (CADe) Accuracy: 93.24% (1062/1139) detected and localized correctly (the localization criterion is sketched below).
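The CADe accuracy figure rests on the localization rule quoted earlier: the automated ROI counts as a hit when its center lies inside the ground-truth ROI and the two boxes overlap by at least 50%. Below is a sketch of that check on axis-aligned rectangles; the summary does not say whether "overlap" means IoU or the fraction of the ground-truth box covered, so IoU is assumed here:

```python
# Sketch of the CADe localization criterion: center-inside test plus a minimum
# box overlap (IoU assumed). Rectangles are (x, y, w, h) in pixel coordinates.
def roi_hit(auto, gt, min_overlap=0.5):
    ax, ay, aw, ah = auto
    gx, gy, gw, gh = gt
    # Center of the automated ROI must fall inside the ground-truth ROI.
    cx, cy = ax + aw / 2, ay + ah / 2
    center_inside = gx <= cx <= gx + gw and gy <= cy <= gy + gh
    # Intersection-over-union of the two rectangles.
    ix = max(0.0, min(ax + aw, gx + gw) - max(ax, gx))  # intersection width
    iy = max(0.0, min(ay + ah, gy + gh) - max(ay, gy))  # intersection height
    inter = ix * iy
    union = aw * ah + gw * gh - inter
    return center_inside and (inter / union) >= min_overlap
```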

7. The Type of Ground Truth Used (Expert Consensus, Pathology, Outcomes Data, etc.)

The ground truth for the test set (both reader study and standalone study cases) was primarily established by:

  • Pathology Proof: For malignant cases (ductal carcinoma in situ (DCIS), invasive ductal carcinoma (IDC), invasive lobular carcinoma (ILC), and other cancer types) and some benign cases.
  • Two-year Follow-up: For other benign cases.

This indicates a strong, objective ground truth based on definitive clinical outcomes.

8. The Sample Size for the Training Set

The document does not provide the sample size for the training set. It only states that the "testing dataset was not used for training of BU-CAD algorithms."

9. How the Ground Truth for the Training Set Was Established

The document does not provide information on how the ground truth for the training set was established. Since the test set ground truth was largely based on pathology and follow-up, it is highly probable that the training data followed similar rigorous ground truth establishment methods, but this is not explicitly stated.

§ 892.2090 Radiological computer-assisted detection and diagnosis software.

(a) Identification. A radiological computer-assisted detection and diagnostic software is an image processing device intended to aid in the detection, localization, and characterization of fracture, lesions, or other disease-specific findings on acquired medical images (e.g., radiography, magnetic resonance, computed tomography). The device detects, identifies, and characterizes findings based on features or information extracted from images, and provides information about the presence, location, and characteristics of the findings to the user. The analysis is intended to inform the primary diagnostic and patient management decisions that are made by the clinical user. The device is not intended as a replacement for a complete clinician's review or their clinical judgment that takes into account other relevant information from the image or patient history.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the image analysis algorithm, including a description of the algorithm inputs and outputs, each major component or block, how the algorithm and output affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide improved assisted-read detection and diagnostic performance as intended in the indicated user population(s), and to characterize the standalone device performance for labeling. Performance testing includes standalone test(s), side-by-side comparison(s), and/or a reader study, as applicable.

(iii) Results from standalone performance testing used to characterize the independent performance of the device separate from aided user performance. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Devices with localization output must include localization accuracy testing as a component of standalone testing. The test dataset must be representative of the typical patient population with enrichment made only to ensure that the test dataset contains a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant disease, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.

(iv) Results from performance testing that demonstrate that the device provides improved assisted-read detection and/or diagnostic performance as intended in the indicated user population(s) when used in accordance with the instructions for use. The reader population must be comprised of the intended user population in terms of clinical training, certification, and years of experience. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Test datasets must meet the requirements described in paragraph (b)(1)(iii) of this section.

(v) Appropriate software documentation, including device hazard analysis, software requirements specification document, software design specification document, traceability analysis, system level test protocol, pass/fail criteria, testing results, and cybersecurity measures.

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use.

(ii) A detailed description of the device instructions for use, including the intended reading protocol and how the user should interpret the device output.

(iii) A detailed description of the intended user, and any user training materials or programs that address appropriate reading protocols for the device, to ensure that the end user is fully aware of how to interpret and apply the device output.

(iv) A detailed description of the device inputs and outputs.

(v) A detailed description of compatible imaging hardware and imaging protocols.

(vi) Warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.

(vii) A detailed summary of the performance testing, including test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as anatomical characteristics, patient demographics and medical history, user experience, and imaging equipment.