K Number
K240417
Manufacturer
iCAD
Date Cleared
2024-11-08 (269 days)
Product Code
Regulation Number
892.2090
Panel
RA (Radiology)
Reference & Predicate Devices
ProFound AI V3.0 (predicate)

Intended Use

ProFound Detection V4.0 is a computer-assisted detection and diagnosis (CAD) software device intended to be used concurrently by interpreting physicians while reading digital breast tomosynthesis (DBT) exams from compatible DBT systems. It detects soft tissue densities (masses, architectural distortions, and asymmetries) and calcifications in the 3D DBT slices. The detections, together with Certainty of Finding and Case Scores, assist interpreting physicians in identifying soft tissue densities and calcifications, which may be confirmed or dismissed by the interpreting physician.

Device Description

ProFound Detection V4.0 is a computer-assisted detection and diagnosis (CAD) software device that detects malignant soft-tissue densities and calcifications in digital breast tomosynthesis (DBT) images. The software allows an interpreting physician to quickly identify suspicious soft tissue densities and calcifications by marking the detected areas in the tomosynthesis images; when displayed, the marks appear as overlays on those images. Each detected finding is also assigned a score, the Certainty of Finding, corresponding to the ProFound Detection V4.0 algorithm's confidence that the finding is a cancer. Certainty of Finding scores are percentages in the range of 0% to 100% indicating the CAD's confidence that the finding is malignant. ProFound Detection V4.0 also assigns each case a Case Score, likewise a percentage in the range of 0% to 100%, indicating the CAD's confidence that the case has malignant findings. The higher the Certainty of Finding or Case Score, the higher the confidence that the detected finding is a cancer or that the case has malignant findings.
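
The scoring model described above maps naturally onto a small data structure. The sketch below is illustrative only: the class names and fields are assumptions, and the max-based aggregation of finding scores into a Case Score is a hypothetical rule, since the summary does not disclose how ProFound Detection actually combines them.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    """One CAD detection overlaid on a DBT slice (illustrative fields)."""
    kind: str                    # "mass", "architectural_distortion", "asymmetry", "calcification"
    slice_index: int             # DBT slice on which the mark is drawn
    certainty_of_finding: float  # 0.0-1.0: CAD confidence that the finding is malignant

@dataclass
class Case:
    """All CAD output for one DBT exam (illustrative)."""
    findings: list[Finding] = field(default_factory=list)

    def case_score(self) -> float:
        # Hypothetical aggregation; taking the maximum finding
        # confidence is one plausible choice, not the vendor's rule.
        return max((f.certainty_of_finding for f in self.findings), default=0.0)

exam = Case([Finding("mass", 23, 0.82), Finding("calcification", 41, 0.37)])
print(f"Case Score: {exam.case_score():.0%}")  # Case Score: 82%
```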

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

Acceptance Criteria and Device Performance

The core acceptance criterion is non-inferiority to the predicate device (ProFound AI V3.0) on key performance metrics.

Table of Acceptance Criteria and Reported Device Performance

| Metric | Acceptance Criteria (Non-inferior to Predicate) | V4.0 Performance (with priors) | V4.0 Performance (without priors) | Predicate Performance (ProFound AI V3.0) |
|---|---|---|---|---|
| Sensitivity | Not inferior to 0.8725 | 0.9004 (0.8633-0.9374) | 0.9004 (0.8633-0.9374) | 0.8725 (0.8312-0.9138) |
| Specificity | Not inferior to 0.5278 | 0.6205 (0.5846-0.6565) | 0.5863 (0.5498-0.6228) | 0.5278 (0.4909-0.5648) |
| AUC | Not inferior to 0.8230 | 0.8753 (0.8475-0.9032) | 0.8714 (0.8423-0.9007) | 0.8230 (0.7878-0.8570) |

Summary of Performance vs. Criteria:
The study demonstrated that ProFound Detection V4.0, particularly when using prior images, met the non-inferiority acceptance criteria on all three metrics (sensitivity, specificity, and AUC), with point estimates exceeding the predicate's in every case, and additionally demonstrated superiority in specificity.
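
For context, non-inferiority claims of this kind are commonly evaluated with confidence intervals on the paired difference between the two algorithms on the same cases. The sketch below shows one such approach, a case-level bootstrap run on synthetic stand-in data; the margin `delta`, the data, and the use of a bootstrap are all assumptions, as the summary does not state the actual statistical plan.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def paired_bootstrap_diff(metric, y_true, score_new, score_old, n_boot=2000):
    """95% CI for metric(new) - metric(old) via a case-level paired bootstrap."""
    n = len(y_true)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample cases with replacement
        diffs[b] = (metric(y_true[idx], score_new[idx])
                    - metric(y_true[idx], score_old[idx]))
    return np.percentile(diffs, [2.5, 97.5])

# Synthetic stand-in data; the real case scores are not public.
y = rng.integers(0, 2, 952)              # 952 cases, mirroring the test-set size
s_old = 0.4 * y + 0.6 * rng.random(952)  # weaker "predicate" scores
s_new = 0.5 * y + 0.5 * rng.random(952)  # stronger "new" scores

lo, hi = paired_bootstrap_diff(roc_auc_score, y, s_new, s_old)
delta = 0.05  # hypothetical non-inferiority margin, not from the filing
print(f"AUC difference 95% CI: [{lo:.3f}, {hi:.3f}]; non-inferior: {lo > -delta}")
```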

Study Details

2. Sample size used for the test set and the data provenance:

  • Sample Size: 952 cases
    • 251 biopsy-proven cancer cases (with 256 malignant lesions)
    • 701 non-cancer cases
  • Data Provenance:
    • Country of Origin: U.S. image acquisition sites
    • Retrospective or Prospective: Retrospectively collected
    • Independence: The data was collected from sites independent of those included in the training and development sets. iCAD ensured this independence by sequestering the data.
    • Imaging Equipment: 100% of exams acquired on Hologic DBT systems.
    • Exam Dates: 2018-2022.
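
As a rough cross-check (these counts are back-of-the-envelope arithmetic, not figures from the filing), the reported operating-point rates can be converted into approximate case counts for this test set:

```python
cancers, non_cancers = 251, 701     # test-set composition from the summary

sens_v4, spec_v4 = 0.9004, 0.6205   # ProFound Detection V4.0 (with priors)
spec_v3 = 0.5278                    # predicate ProFound AI V3.0

tp_v4 = round(sens_v4 * cancers)       # ~226 of 251 cancer cases flagged
tn_v4 = round(spec_v4 * non_cancers)   # ~435 non-cancer cases correctly passed
tn_v3 = round(spec_v3 * non_cancers)   # ~370 for the predicate

print(f"V4.0 detects ~{tp_v4}/{cancers} cancers and produces "
      f"~{tn_v4 - tn_v3} fewer false-positive cases than V3.0")
```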

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

  • Number of Experts: The text states that each cancer case was "biopsy proven positive, truthed by an expert breast imaging radiologist". Although it refers to a radiologist in the singular, it does not specify how many unique expert breast imaging radiologists truthed the full dataset, nor their years of experience.

4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:

  • The text does not specify a formal adjudication method (like 2+1 or 3+1) for establishing ground truth from multiple readers. Ground truth was established based on clinical data including radiology report, follow-up biopsy, and pathology data, and then truthed by an expert radiologist.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, if so, what was the effect size of how much human readers improve with AI vs. without AI assistance:

  • No, an MRMC comparative effectiveness study was NOT done. The study described is a standalone performance assessment of the AI algorithm itself, comparing it to a predicate AI algorithm. It does not evaluate the performance of human readers, either with or without AI assistance.

6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:

  • Yes, a standalone study was done. The text explicitly states: "A standalone study was conducted, which evaluated the performance of ProFound Detection version 4.0 without an interpreting physician." This study directly compared the algorithm's performance (V4.0) against the predicate (V3.0) on an independent test set.
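
In code, a standalone evaluation of this kind amounts to scoring every case with the algorithm and comparing the output against the reference standard, with no reader in the loop. A minimal sketch follows; the threshold and the demo arrays are hypothetical, since the actual operating point and case scores are not public.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def standalone_metrics(y_true, case_scores, threshold):
    """Operating-point sensitivity/specificity plus threshold-free AUC."""
    y_pred = (case_scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": roc_auc_score(y_true, case_scores),
    }

# Tiny hypothetical demo: 1 = biopsy-proven cancer case, 0 = non-cancer case.
y_demo = np.array([1, 1, 0, 0, 0])
scores_demo = np.array([0.92, 0.40, 0.10, 0.55, 0.07])
print(standalone_metrics(y_demo, scores_demo, threshold=0.5))
```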

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

  • The ground truth was a combination of biopsy-proven pathology data and clinical data, including radiology reports and follow-up data. Specifically, "These reference standards were derived from clinical data including radiology report, follow-up biopsy and pathology data. Each cancer case was a biopsy proven positive, truthed by an expert breast imaging radiologist who outlined the location and extent of cancer lesions in the case."

8. The sample size for the training set:

  • The sample size for the training set is not provided. The text only refers to the test set being "independent of those included in the training and development" and that iCAD "ensures the independence of this dataset by sequestering the data and keeping it separate from the test and development datasets."

9. How the ground truth for the training set was established:

  • How the ground truth for the training set was established is not explicitly detailed. The text mentions that the test set's ground truth was established by "biopsy proven cancer cases" and "truthed by an expert breast imaging radiologist." While it implies a similar process would likely be used for training data, the specific method for the training set's ground truth establishment is not provided in the submitted document.

§ 892.2090 Radiological computer-assisted detection and diagnosis software.

(a) Identification. A radiological computer-assisted detection and diagnostic software is an image processing device intended to aid in the detection, localization, and characterization of fracture, lesions, or other disease-specific findings on acquired medical images (e.g., radiography, magnetic resonance, computed tomography). The device detects, identifies, and characterizes findings based on features or information extracted from images, and provides information about the presence, location, and characteristics of the findings to the user. The analysis is intended to inform the primary diagnostic and patient management decisions that are made by the clinical user. The device is not intended as a replacement for a complete clinician's review or their clinical judgment that takes into account other relevant information from the image or patient history.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the image analysis algorithm, including a description of the algorithm inputs and outputs, each major component or block, how the algorithm and output affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide improved assisted-read detection and diagnostic performance as intended in the indicated user population(s), and to characterize the standalone device performance for labeling. Performance testing includes standalone test(s), side-by-side comparison(s), and/or a reader study, as applicable.

(iii) Results from standalone performance testing used to characterize the independent performance of the device separate from aided user performance. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Devices with localization output must include localization accuracy testing as a component of standalone testing. The test dataset must be representative of the typical patient population with enrichment made only to ensure that the test dataset contains a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant disease, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.

(iv) Results from performance testing that demonstrate that the device provides improved assisted-read detection and/or diagnostic performance as intended in the indicated user population(s) when used in accordance with the instructions for use. The reader population must be comprised of the intended user population in terms of clinical training, certification, and years of experience. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Test datasets must meet the requirements described in paragraph (b)(1)(iii) of this section.

(v) Appropriate software documentation, including device hazard analysis, software requirements specification document, software design specification document, traceability analysis, system level test protocol, pass/fail criteria, testing results, and cybersecurity measures.
(2) Labeling must include the following:
(i) A detailed description of the patient population for which the device is indicated for use.
(ii) A detailed description of the device instructions for use, including the intended reading protocol and how the user should interpret the device output.
(iii) A detailed description of the intended user, and any user training materials or programs that address appropriate reading protocols for the device, to ensure that the end user is fully aware of how to interpret and apply the device output.
(iv) A detailed description of the device inputs and outputs.
(v) A detailed description of compatible imaging hardware and imaging protocols.
(vi) Warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.

(vii) A detailed summary of the performance testing, including test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as anatomical characteristics, patient demographics and medical history, user experience, and imaging equipment.