K Number
K193417
Date Cleared
2020-07-30 (234 days)

Product Code
Regulation Number
892.2090
Panel
RA (Radiology)
Reference & Predicate Devices
N/A
Intended Use

FractureDetect (FX) is a computer-assisted detection and diagnosis (CAD) software device to assist clinicians in detecting fractures during the review of radiographs of the musculoskeletal system. FX is indicated for adults only.

FX is indicated for radiographs of the following industry-standard radiographic views and study types.

| Study Type (Anatomic Area of Interest⁺) | Radiographic View(s) Supported* |
|------------------------------------------|---------------------------------|
| Ankle | Frontal, Lateral, Oblique |
| Clavicle | Frontal |
| Elbow | Frontal, Lateral |
| Femur | Frontal, Lateral |
| Forearm | Frontal, Lateral |
| Hip | Frontal, Frog Leg Lateral |
| Humerus | Frontal, Lateral |
| Knee | Frontal, Lateral |
| Pelvis | Frontal |
| Shoulder | Frontal, Lateral, Axillary |
| Tibia / Fibula | Frontal, Lateral |
| Wrist | Frontal, Lateral, Oblique |

*For the purposes of this table, "Frontal" is considered inclusive of both posteroanterior (PA) and anteroposterior (AP) views.

⁺Definitions of anatomic area of interest and radiographic views are consistent with the American College of Radiology (ACR) standards and guidelines.

Device Description

FractureDetect (FX) is a computer-assisted detection and diagnosis (CAD) software device designed to assist clinicians in detecting fractures during the review of commonly acquired adult radiographs. FX does this by analyzing radiographs and providing relevant annotations, assisting clinicians in the detection of fractures within their diagnostic process at the point of care. FX was developed using robust scientific principles and industry-standard deep learning algorithms for computer vision.

FX creates, as its output, a DICOM overlay with annotations indicating the presence or absence of fractures. If any fracture is detected by FX, the output overlay is composed to include the text annotation "Fracture: DETECTED" and to include one or more bounding boxes surrounding any fracture site(s). If no fracture is detected by FX, the output overlay is composed to include the text annotation "Fracture: NOT DETECTED" and no bounding box is included. Whether or not a fracture is detected, the overlay includes a text annotation identifying the radiograph as analyzed by FX and instructions for users to access labeling. The FX overlay can be toggled on or off by the clinicians within their PACS viewer, allowing for uninhibited concurrent review of the original radiograph.
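
The summary does not specify how the overlay is encoded, but a standard way to attach toggleable annotations to a radiograph is a DICOM overlay plane (group 60xx). Below is a minimal sketch using the pydicom library (`pack_bits` lives in `pydicom.pixel_data_handlers` in pydicom 2.x); the file names and bounding-box coordinates are hypothetical placeholders, and the "Fracture: DETECTED" text would in practice be rasterized into the same bitmap.

```python
# Minimal sketch: attach a 1-bit overlay plane (DICOM group 0x6000) that
# outlines a detected-fracture bounding box. File paths and coordinates
# are illustrative placeholders, not values from the 510(k) summary.
import numpy as np
import pydicom
from pydicom.pixel_data_handlers.numpy_handler import pack_bits

ds = pydicom.dcmread("radiograph.dcm")          # original radiograph
rows, cols = ds.Rows, ds.Columns

mask = np.zeros((rows, cols), dtype=np.uint8)   # 1 = overlay pixel "on"

def draw_box(mask, r0, c0, r1, c1, thickness=3):
    """Rasterize a rectangular outline into the overlay bitmap."""
    mask[r0:r0 + thickness, c0:c1] = 1          # top edge
    mask[r1 - thickness:r1, c0:c1] = 1          # bottom edge
    mask[r0:r1, c0:c0 + thickness] = 1          # left edge
    mask[r0:r1, c1 - thickness:c1] = 1          # right edge

draw_box(mask, 120, 200, 360, 440)              # hypothetical fracture site

# Standard overlay-plane elements for group 0x6000.
ds.add_new(0x60000010, "US", rows)              # Overlay Rows
ds.add_new(0x60000011, "US", cols)              # Overlay Columns
ds.add_new(0x60000040, "CS", "G")               # Overlay Type: graphics
ds.add_new(0x60000050, "SS", [1, 1])            # Overlay Origin (top-left)
ds.add_new(0x60000100, "US", 1)                 # Overlay Bits Allocated
ds.add_new(0x60000102, "US", 0)                 # Overlay Bit Position
ds.add_new(0x60003000, "OW", pack_bits(mask))   # Overlay Data (bit-packed)

ds.save_as("radiograph_with_overlay.dcm")
```

Because the annotations live in a separate overlay plane rather than being burned into the pixel data, a PACS viewer can toggle them on and off, which matches the uninhibited concurrent-review behavior described above.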

AI/ML Overview

Below is a detailed breakdown of the acceptance criteria and the studies demonstrating that the device meets them, based on the 510(k) summary:

1. Acceptance Criteria and Device Performance

| Acceptance Criteria | Reported Device Performance |
|---------------------|-----------------------------|
| **Standalone Performance** | |
| Overall Sensitivity | 0.951 (95% Wilson's CI: 0.940, 0.960) |
| Overall Specificity | 0.893 (95% Wilson's CI: 0.886, 0.898) |
| Overall Area Under the Curve (AUC) | 0.982 (95% Bootstrap CI: 0.9790, 0.9850) |
| AUC per Study Type: Ankle | 0.983 (0.972, 0.991) |
| AUC per Study Type: Clavicle | 0.962 (0.948, 0.975) |
| AUC per Study Type: Elbow | 0.964 (0.940, 0.982) |
| AUC per Study Type: Femur | 0.989 (0.983, 0.994) |
| AUC per Study Type: Forearm | 0.987 (0.977, 0.995) |
| AUC per Study Type: Hip | 0.982 (0.962, 0.995) |
| AUC per Study Type: Humerus | 0.983 (0.974, 0.991) |
| AUC per Study Type: Knee | 0.996 (0.993, 0.998) |
| AUC per Study Type: Pelvis | 0.982 (0.973, 0.989) |
| AUC per Study Type: Shoulder | 0.962 (0.938, 0.982) |
| AUC per Study Type: Tibia / Fibula | 0.994 (0.991, 0.997) |
| AUC per Study Type: Wrist | 0.992 (0.988, 0.996) |
| **MRMC Comparative Effectiveness (Reader Performance with AI vs. without AI)** | |
| Reader AUC (FX-Aided vs. FX-Unaided) | Improved from 0.912 to 0.952, a difference of 0.0406 (95% CI: 0.0127, 0.0685; p = .0043) |
| Reader Sensitivity (FX-Aided vs. FX-Unaided) | Improved from 0.819 (95% Wilson's CI: 0.794, 0.842) to 0.900 (95% Wilson's CI: 0.880, 0.917) |
| Reader Specificity (FX-Aided vs. FX-Unaided) | Improved from 0.890 (95% Wilson's CI: 0.879, 0.900) to 0.918 (95% Wilson's CI: 0.908, 0.927) |
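
As a quick way to sanity-check intervals like the sensitivity and specificity CIs above: the Wilson score interval for a proportion k/n can be computed directly from its closed form. The sketch below is a minimal, self-contained Python implementation; the counts are hypothetical, chosen only to land near the overall sensitivity row, and are not disclosed in the summary.

```python
# Minimal sketch of the Wilson score interval used for the sensitivity
# and specificity CIs above. The counts are hypothetical placeholders.
from math import sqrt

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for k successes out of n trials."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# e.g., ~0.951 sensitivity from a hypothetical 3777/3972 positive reads
lo, hi = wilson_interval(3777, 3972)
print(f"sensitivity CI: ({lo:.3f}, {hi:.3f})")
```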

Study Details

2. Sample Size Used for the Test Set and Data Provenance

  • Test Set Sample Size:
    • Standalone Study: 11,970 radiographs.
    • MRMC Reader Study: 175 cases.
  • Data Provenance: Not explicitly stated. The experts establishing ground truth are described as U.S. board-certified, suggesting the data are likely U.S.-sourced. The document also does not indicate whether the data were retrospective or prospective, although retrospective collections are typical for submissions of this kind.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

  • Number of Experts: A panel of three experts was used for the MRMC study's ground truth.
  • Qualifications: "U.S. board-certified orthopedic surgeons or U.S. board-certified radiologists." Specific years of experience are not mentioned.

4. Adjudication Method for the Test Set

  • Adjudication Method: A panel of three experts assigned a binary ground-truth label (presence or absence of fracture) to each case. The exact rule is not stated (e.g., 2-out-of-3 majority, or further adjudication on disagreement), but the phrasing suggests a consensus process in which the majority opinion establishes the ground truth; a minimal sketch of such a rule follows below.
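
If the panel did use a simple majority rule, it reduces to the following; this is an illustration of 2-out-of-3 adjudication, not a confirmed description of the sponsor's actual process.

```python
# Hypothetical 2-out-of-3 majority adjudication: each expert votes
# True (fracture present) or False (fracture absent).
def adjudicate(votes: list[bool]) -> bool:
    """Ground-truth label is the majority opinion of the panel."""
    assert len(votes) == 3, "expects exactly three expert reads"
    return sum(votes) >= 2

print(adjudicate([True, True, False]))   # True: fracture present
print(adjudicate([False, True, False]))  # False: fracture absent
```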

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

  • Was an MRMC study done? Yes.
  • Effect Size (Improvement with AI vs. without AI assistance):
    • Readers' AUC significantly improved by 0.0406 (from 0.912 to 0.952).
    • Readers' sensitivity improved by 0.081 (from 0.819 to 0.900).
    • Readers' specificity improved by 0.028 (from 0.890 to 0.918).

6. Standalone (Algorithm Only) Performance Study

  • Was a standalone study done? Yes.
  • Performance:
    • Sensitivity: 0.951
    • Specificity: 0.893
    • Overall AUC: 0.982
    • High accuracy was maintained across study types and potential confounders (image brightness, X-ray manufacturer); a sketch of this kind of stratified analysis follows below.
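
The per-study-type AUCs and bootstrap CIs in the table above correspond to a straightforward stratified analysis. A minimal sketch under assumed inputs: the DataFrame, its column names, and the synthetic scores are all hypothetical stand-ins for the sponsor's test set.

```python
# Minimal sketch of per-study-type AUC with a bootstrap CI, as reported
# in the standalone analysis. Data and column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "study_type": rng.choice(["Ankle", "Knee", "Wrist"], size=3000),
    "label": rng.integers(0, 2, size=3000),   # 1 = fracture present
})
# Fake model scores that correlate with the label, for demonstration only.
df["score"] = df["label"] * 0.6 + rng.random(3000) * 0.8

for study, grp in df.groupby("study_type"):
    auc = roc_auc_score(grp["label"], grp["score"])
    boots = []
    for i in range(200):  # nonparametric bootstrap within the stratum
        s = grp.sample(len(grp), replace=True, random_state=i)
        boots.append(roc_auc_score(s["label"], s["score"]))
    lo, hi = np.percentile(boots, [2.5, 97.5])
    print(f"{study}: AUC {auc:.3f} (95% bootstrap CI: {lo:.3f}, {hi:.3f})")
```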

7. Type of Ground Truth Used

  • Standalone Study: The ground truth for the standalone study is not explicitly detailed, but given the MRMC setup it most likely also relied on expert consensus labels for the presence or absence of fracture.
  • MRMC Study: Expert Consensus by a panel of three U.S. board-certified orthopedic surgeons or U.S. board-certified radiologists.

8. Sample Size for the Training Set

  • The document does not explicitly state the sample size for the training set. It only mentions "robust scientific principles and industry-standard deep learning algorithms for computer vision" were used for development.

9. How the Ground Truth for the Training Set Was Established

  • The document does not explicitly describe how the ground truth for the training set was established. It only mentions "Supervised Deep Learning" as the methodology, which implies labeled data were used for training, but the labeling process is not detailed; a generic sketch of what supervised training for this task looks like follows below.
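
For context only: "supervised deep learning for computer vision" in this setting typically means training a convolutional network on radiographs paired with expert-assigned fracture labels. The following is a minimal, generic PyTorch sketch under that assumption; the architecture, data pipeline, and hyperparameters are illustrative, not Imagen's actual design.

```python
# Generic supervised training loop for a binary fracture classifier.
# Illustrative only: architecture, data, and hyperparameters are
# assumptions, not details disclosed in the 510(k) summary.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Stand-in for preprocessed radiographs with expert labels:
# 1-channel 224x224 images, label 1 = fracture present.
images = torch.randn(64, 1, 224, 224)
labels = torch.randint(0, 2, (64, 1)).float()
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

model = resnet18(weights=None)
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, 1)   # single fracture logit

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()              # binary cross-entropy on logits

model.train()
for epoch in range(2):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```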

§ 892.2090 Radiological computer-assisted detection and diagnosis software.

(a) Identification. A radiological computer-assisted detection and diagnostic software is an image processing device intended to aid in the detection, localization, and characterization of fracture, lesions, or other disease-specific findings on acquired medical images (e.g., radiography, magnetic resonance, computed tomography). The device detects, identifies, and characterizes findings based on features or information extracted from images, and provides information about the presence, location, and characteristics of the findings to the user. The analysis is intended to inform the primary diagnostic and patient management decisions that are made by the clinical user. The device is not intended as a replacement for a complete clinician's review or their clinical judgment that takes into account other relevant information from the image or patient history.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the image analysis algorithm, including a description of the algorithm inputs and outputs, each major component or block, how the algorithm and output affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide improved assisted-read detection and diagnostic performance as intended in the indicated user population(s), and to characterize the standalone device performance for labeling. Performance testing includes standalone test(s), side-by-side comparison(s), and/or a reader study, as applicable.

(iii) Results from standalone performance testing used to characterize the independent performance of the device separate from aided user performance. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Devices with localization output must include localization accuracy testing as a component of standalone testing. The test dataset must be representative of the typical patient population with enrichment made only to ensure that the test dataset contains a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant disease, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.

(iv) Results from performance testing that demonstrate that the device provides improved assisted-read detection and/or diagnostic performance as intended in the indicated user population(s) when used in accordance with the instructions for use. The reader population must be comprised of the intended user population in terms of clinical training, certification, and years of experience. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, positive and negative predictive values, and diagnostic likelihood ratio). Test datasets must meet the requirements described in paragraph (b)(1)(iii) of this section.

(v) Appropriate software documentation, including device hazard analysis, software requirements specification document, software design specification document, traceability analysis, system level test protocol, pass/fail criteria, testing results, and cybersecurity measures.
(2) Labeling must include the following:
(i) A detailed description of the patient population for which the device is indicated for use.
(ii) A detailed description of the device instructions for use, including the intended reading protocol and how the user should interpret the device output.
(iii) A detailed description of the intended user, and any user training materials or programs that address appropriate reading protocols for the device, to ensure that the end user is fully aware of how to interpret and apply the device output.
(iv) A detailed description of the device inputs and outputs.
(v) A detailed description of compatible imaging hardware and imaging protocols.
(vi) Warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.

(vii) A detailed summary of the performance testing, including test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as anatomical characteristics, patient demographics and medical history, user experience, and imaging equipment.