K Number
K193229
Device Name
Transpara
Date Cleared
2020-03-05

(104 days)

Product Code
Regulation Number
892.2090
Panel
RA
Reference & Predicate Devices
Intended Use

Transpara™ software is intended for use as a concurrent reading aid for physicians interpreting screening full-field digital mammography exams and digital breast tomosynthesis exams from compatible FFDM and DBT systems, to identify regions suspicious for breast cancer and assess their likelihood of malignancy. Output of the device includes locations of calcification groups and soft-tissue regions, with scores indicating the likelihood that cancer is present, and an exam score indicating the likelihood that cancer is present in the exam. Patient management decisions should not be made solely on the basis of analysis by Transpara™.

Device Description

Transpara™ is a software-only application designed to be used by physicians to improve interpretation of digital mammography and digital breast tomosynthesis. The system is intended to be used as a concurrent reading aid to help readers detect and characterize potential abnormalities suspicious for breast cancer and to improve workflow. Deep learning algorithms are applied to FFDM images and DBT slices to recognize suspicious calcifications and soft-tissue lesions (including densities, masses, architectural distortions, and asymmetries). The algorithms are trained with a large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue.

Transpara™ offers the following functions which may be used at any time during reading (concurrent use):

  • a) Computer-aided detection (CAD) marks that highlight locations where the device detected suspicious calcifications or soft-tissue lesions.
  • b) Decision support in the form of region scores on a scale of 0 to 100, with higher scores indicating a higher level of suspicion.
  • c) Links between corresponding regions in different views of the breast, which may be used to enhance user interfaces and workflow.
  • d) An exam score that categorizes exams on a scale of 1 to 10 with increasing likelihood of cancer. The score is calibrated so that approximately 10 percent of mammograms in a cancer-free population fall in each category (a sketch of this calibration follows the list).
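
The calibration described in (d) amounts to fixing nine quantile cut points on exam scores from a cancer-free population, so each of the ten categories captures roughly a decile. Below is a minimal sketch of that idea in Python; the function names, raw-score range, and synthetic score distribution are illustrative assumptions, not details from the submission:

```python
import numpy as np

def fit_decile_thresholds(normal_scores):
    """Nine cut points so ~10% of cancer-free exams fall in each of 10 bins."""
    return np.quantile(normal_scores, np.linspace(0.1, 0.9, 9))

def exam_category(raw_score, thresholds):
    """Map a raw exam-level score to a 1-10 category (10 = most suspicious)."""
    return int(np.searchsorted(thresholds, raw_score)) + 1

# Hypothetical calibration on raw scores from cancer-free screening exams
rng = np.random.default_rng(0)
normal_scores = rng.beta(2, 5, size=10_000)  # placeholder score distribution
thresholds = fit_decile_thresholds(normal_scores)
print(exam_category(0.85, thresholds))  # a high raw score lands in category 10
```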

Results of Transpara™ are computed in a processing server which accepts mammograms or DBT exams in DICOM format as input, processes them, and sends the processing output to a destination using the DICOM protocol in a standardized mammography CAD DICOM format. Use of the device is supported for images from the following modality manufacturers: FFDM (Hologic, Siemens, General Electric, Philips, Fujifilm) and DBT (Hologic, Siemens). Common destinations are medical workstations, PACS, and RIS. Transpara™ is offered as a virtual machine and runs on pre-selected standard PC hardware as well as on a dedicated virtual-machine cluster. The system can be configured using a service interface. Implementation of a user interface for end users in a medical workstation is to be provided by third parties.
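
As a rough illustration of the intake side of such a processing server, the sketch below screens an incoming DICOM object against the supported modality manufacturers before queueing it for analysis. It assumes the pydicom library; the vendor matching is deliberately simplified (real Manufacturer strings vary, e.g. "GE MEDICAL SYSTEMS"), and accept_exam is a hypothetical helper, not part of the product:

```python
import pydicom

# Modality manufacturers listed as supported in the device description
SUPPORTED_FFDM = {"HOLOGIC", "SIEMENS", "GE", "PHILIPS", "FUJIFILM"}
SUPPORTED_DBT = {"HOLOGIC", "SIEMENS"}

# Breast Tomosynthesis Image Storage SOP Class UID
DBT_SOP_CLASS = "1.2.840.10008.5.1.4.1.1.13.1.3"

def accept_exam(path):
    """Return True if the DICOM object comes from a supported FFDM/DBT system."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # header only
    vendor = str(getattr(ds, "Manufacturer", "")).upper()
    is_dbt = str(getattr(ds, "SOPClassUID", "")) == DBT_SOP_CLASS
    supported = SUPPORTED_DBT if is_dbt else SUPPORTED_FFDM
    return any(name in vendor for name in supported)
```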

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:

1. A table of acceptance criteria and the reported device performance

The document doesn't explicitly present a formal "acceptance criteria" table with numerical cutoffs for specific metrics. Instead, it describes its objectives in terms of "superior" or "non-inferior" performance compared to a baseline (unaided human reading or a previous device version).

| Acceptance Criteria (Stated Objective) | Reported Device Performance |
| --- | --- |
| Pivotal Reader Study (DBT) | |
| Superior breast-level area under the receiver operating characteristic curve (AUC) between reading conditions | Average AUC increased from 0.833 to 0.863 (P = 0.0025) with Transpara™ assistance, a statistically significant improvement. |
| Reading time reduction | Reading time was significantly reduced with Transpara™ assistance (specific reduction not quantified in the text). |
| Non-inferior or higher sensitivity | Superior sensitivity was obtained with Transpara™ assistance (specific values not quantified in the text). |
| Non-inferior or higher specificity | Specificity is not reported beyond the "non-inferior or higher" objective, though the AUC improvement implies a balanced performance gain. |
| Reading time reduction on normal exams | This secondary objective was met (specific reduction not quantified in the text). |
| Standalone AUC non-inferior to the average AUC of the readers | Statistical analysis showed all pre-specified endpoints were met, implying non-inferiority was achieved (specific standalone AUC not stated). |
| Standalone Performance Testing (FFDM) | |
| Non-inferior or better detection accuracy compared to Transpara 1.3.0 | Validation testing confirmed that algorithm performance is non-inferior or better than Transpara 1.3.0 for the four manufacturers cleared for the predicate device. |
| Non-inferior performance for Fujifilm FFDM systems | Validation testing confirmed that performance on Fujifilm images was non-inferior to performance on the pooled test data of systems cleared for use with the predicate device. |

2. Sample sizes used for the test set and the data provenance

  • Test Set (Pivotal Reader Study for DBT):

    • Sample Size: 240 Siemens Mammomat DBT exams. This included 65 exams with breast cancer, 65 exams with benign abnormalities, and 110 normal exams.
    • Data Provenance: The text states the data were "acquired from multiple centers" and consisted of Siemens Mammomat DBT exams; the country of origin is not explicitly stated. The study was retrospective.
  • Test Set (Standalone Performance Testing for FFDM):

    • Sample Size: "Independent multi-vendor test-set of mammography and DBT exams." Specific number not provided, but it included exams from five manufacturers: Hologic, GE, Philips, Siemens, and Fujifilm.
    • Data Provenance: The data were "acquired from multiple centers." The study was retrospective.
  • Training Set:

    • Sample Size: "Algorithms are trained with a large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue." No specific number provided.
    • Data Provenance: Not specified beyond the description of a "large database."

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

The document doesn't explicitly detail the process or number of experts used to establish the ground truth for the test set before the reader study. It mentions the reader study itself involved 18 radiologists, but these radiologists were participating in the evaluation of the device, not necessarily establishing an independent ground truth for the test cases prior to the study.

However, the training data used "biopsy-proven examples," which implies ground truth confirmation by pathology. For the reader study, the cases were "enriched," meaning they had known outcomes (cancer, benign, normal). The underlying ground truth for these clinical cases would typically be established by clinical diagnosis, pathology reports from biopsies, and follow-up.

4. Adjudication method for the test set

The document does not explicitly describe an adjudication method (like 2+1 or 3+1 consensus) for establishing the ground truth of the test set cases. The term "enriched sample" suggests that cases with known outcomes (cancer, benign, normal) were selected.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improved with AI versus without AI assistance

  • Yes, a multi-reader multi-case (MRMC) comparative effectiveness study was done.

    • Design: "fully-crossed, multi-reader multi-case retrospective study."
    • Participants: 18 MQSA qualified radiologists.
  • Effect Size of Human Reader Improvement (with AI vs. without AI assistance):

    • Average AUC: Increased from 0.833 (unaided) to 0.863 (with Transpara™ assistance).
    • P-value: P = 0.0025, indicating statistical significance.
    • Sensitivity: "Superior sensitivity was obtained with Transpara™." (Specific values not provided).
    • Reading Time: "reading time was significantly reduced." (Specific reduction not provided).
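
The reported effect size is the difference in AUC averaged over readers between the two reading conditions (0.863 aided vs. 0.833 unaided). A minimal sketch of that summary statistic is shown below, assuming per-reader, per-case suspicion scores are available; a full MRMC analysis (e.g., Obuchowski-Rockette) would additionally model reader and case variability to obtain the reported P-value:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def average_auc_delta(y_true, unaided_scores, aided_scores):
    """Mean per-reader AUC improvement (aided minus unaided).

    y_true: 0/1 cancer status per case.
    unaided_scores / aided_scores: one score array per reader, per condition.
    """
    unaided = [roc_auc_score(y_true, s) for s in unaided_scores]
    aided = [roc_auc_score(y_true, s) for s in aided_scores]
    return float(np.mean(aided)) - float(np.mean(unaided))
```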

6. If standalone (i.e., algorithm-only, without human-in-the-loop) performance testing was done

  • Yes, standalone performance testing was done.
    • Type of Testing: "determining stand-alone performance of the algorithms in Transpara 1.6.0."
    • Context for FFDM: Focused on non-inferiority compared to the predicate device (Transpara 1.3.0) and for new manufacturers (Fujifilm).
    • Context for DBT: It was a secondary objective of the pivotal reader study to "test if standalone AUC performance of Transpara™ was non-inferior to the average AUC performance of the readers." The study results indicated this objective was met. (Specific standalone AUC not provided in the text).
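
A common way to assess a non-inferiority endpoint like this is to bootstrap a confidence interval for the difference between the standalone AUC and the average reader AUC, then check the lower bound against a pre-specified margin. The sketch below shows that pattern; the margin, resampling scheme, and score inputs are illustrative assumptions, since the submission's actual statistical analysis plan is not detailed in the text:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_diff_ci(y_true, algo_scores, reader_scores, n_boot=2000, seed=0):
    """Percentile bootstrap CI for AUC(algorithm) minus mean AUC(readers)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y_true)
    algo = np.asarray(algo_scores)
    readers = [np.asarray(s) for s in reader_scores]
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))  # resample cases with replacement
        if y[idx].min() == y[idx].max():
            continue  # skip resamples missing one of the two classes
        a = roc_auc_score(y[idx], algo[idx])
        r = np.mean([roc_auc_score(y[idx], s[idx]) for s in readers])
        diffs.append(a - r)
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return lo, hi  # non-inferior at margin m if lo > -m
```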

7. The type of ground truth used

  • For training data: "biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue." This implies pathology and clinical follow-up for normality.
  • For the pivotal reader study test set: The "enriched sample" of exams (65 with cancer, 65 benign, 110 normal) suggests ground truth was based on clinical diagnosis, pathology results, and follow-up exams. While not explicitly stated as "expert consensus," these are considered robust forms of ground truth for breast imaging studies.

8. The sample size for the training set

  • The training set was described as a "large database." No specific numerical sample size was provided in the document.

9. How the ground truth for the training set was established

  • The algorithms were "trained with a large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue." This indicates the ground truth was established through histopathological confirmation (biopsy results) for cancerous and benign cases, and likely clinical follow-up for normal cases to ensure no underlying malignancy was missed.

§ 892.2090 Radiological computer-assisted detection and diagnosis software.

(a) Identification. A radiological computer-assisted detection and diagnostic software is an image processing device intended to aid in the detection, localization, and characterization of fracture, lesions, or other disease-specific findings on acquired medical images (e.g., radiography, magnetic resonance, computed tomography). The device detects, identifies, and characterizes findings based on features or information extracted from images, and provides information about the presence, location, and characteristics of the findings to the user. The analysis is intended to inform the primary diagnostic and patient management decisions that are made by the clinical user. The device is not intended as a replacement for a complete clinician's review or their clinical judgment that takes into account other relevant information from the image or patient history.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the image analysis algorithm, including a description of the algorithm inputs and outputs, each major component or block, how the algorithm and output affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide improved assisted-read detection and diagnostic performance as intended in the indicated user population(s), and to characterize the standalone device performance for labeling. Performance testing includes standalone test(s), side-by-side comparison(s), and/or a reader study, as applicable.

(iii) Results from standalone performance testing used to characterize the independent performance of the device separate from aided user performance. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Devices with localization output must include localization accuracy testing as a component of standalone testing. The test dataset must be representative of the typical patient population with enrichment made only to ensure that the test dataset contains a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant disease, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.

(iv) Results from performance testing that demonstrate that the device provides improved assisted-read detection and/or diagnostic performance as intended in the indicated user population(s) when used in accordance with the instructions for use. The reader population must be comprised of the intended user population in terms of clinical training, certification, and years of experience. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Test datasets must meet the requirements described in paragraph (b)(1)(iii) of this section.

(v) Appropriate software documentation, including device hazard analysis, software requirements specification document, software design specification document, traceability analysis, system level test protocol, pass/fail criteria, testing results, and cybersecurity measures.

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use.

(ii) A detailed description of the device instructions for use, including the intended reading protocol and how the user should interpret the device output.

(iii) A detailed description of the intended user, and any user training materials or programs that address appropriate reading protocols for the device, to ensure that the end user is fully aware of how to interpret and apply the device output.

(iv) A detailed description of the device inputs and outputs.

(v) A detailed description of compatible imaging hardware and imaging protocols.

(vi) Warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.

(vii) A detailed summary of the performance testing, including test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as anatomical characteristics, patient demographics and medical history, user experience, and imaging equipment.