K Number
K221241
Date Cleared
2022-09-01

(125 days)

Product Code
QFM
Regulation Number
892.2080
Reference & Predicate Devices
Predicate For
N/A
Intended Use

The DrAid™ for Radiology v1 is radiological computer-assisted triage and notification software intended to aid the clinical assessment of adult chest X-ray cases with features suggestive of pneumothorax in the medical care environment. DrAid™ analyzes cases using an artificial intelligence algorithm to identify features suggestive of suspected findings. It makes case-level output available to a PACS for worklist prioritization or triage.

As a passive-notification, prioritization-only software tool that operates alongside the standard of care workflow, DrAid™ does not send a proactive alert directly to appropriately trained medical specialists. DrAid™ is not intended to direct attention to specific portions or anomalies of an image. Its results are not intended to be used on a stand-alone basis for clinical decision-making, nor is it intended to rule out pneumothorax or otherwise preclude clinical assessment of X-ray cases.

Device Description

DrAid™ for Radiology v1 (hereafter DrAid™ or DrAid) is a radiological computer-assisted triage and notification software product that automatically identifies suspected pneumothorax on frontal chest X-ray images and notifies the PACS of the presence of pneumothorax in the scan. This notification enables prioritized review by appropriately trained medical specialists who are qualified to interpret chest radiographs. The software does not alter the order of the reading queue or remove cases from it. The device is intended to aid in the prioritization and triage of radiological medical images only.

Chest radiographs are automatically received from the user's image storage system (e.g., a Picture Archiving and Communication System (PACS)) or other radiological imaging equipment (e.g., X-ray systems) and processed by DrAid™ for analysis. Following receipt of the chest radiographs, the software de-identifies a copy of each radiograph in DICOM format (.dcm) and automatically analyzes each image to identify features suggestive of pneumothorax. Based on the analysis result, the software notifies the PACS/workstation of the presence of pneumothorax by indicating either "flag" or "(blank)". This allows the appropriately trained medical specialists to group suspicious exams together that may potentially benefit from prioritization. Chest radiographs without an identified anomaly are placed in the worklist for routine review, which is the current standard of care.
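
The ingest-to-notification flow described above can be sketched as follows. This is a minimal illustration, assuming pydicom as a generic DICOM library; the function names and the specific de-identified tags are hypothetical and do not represent the vendor's implementation.

```python
# Illustrative sketch: receive a DICOM chest radiograph, de-identify a copy,
# run a (placeholder) pneumothorax classifier, and return the case-level
# worklist value ("flag" or blank). Names here are assumptions, not DrAid's API.
import pydicom

PHI_TAGS = ["PatientName", "PatientID", "PatientBirthDate"]  # illustrative subset

def de_identify(ds: pydicom.Dataset) -> pydicom.Dataset:
    """Blank common PHI fields on the working copy before analysis."""
    for keyword in PHI_TAGS:
        if keyword in ds:
            setattr(ds, keyword, "")
    return ds

def classify_pneumothorax(ds: pydicom.Dataset) -> bool:
    """Placeholder for the proprietary AI model; True if features suggestive
    of pneumothorax are detected."""
    raise NotImplementedError("model inference not shown")

def triage_label(path: str) -> str:
    """Return the case-level output sent to PACS: 'flag' or '' (blank)."""
    ds = de_identify(pydicom.dcmread(path))
    return "flag" if classify_pneumothorax(ds) else ""
```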

The DrAid™ device works in parallel to and in conjunction with the standard of care workflow. After a chest X-ray has been performed, a copy of the study is automatically retrieved and processed by the DrAid™ device; the analysis result can also be provided in the form of DICOM files containing information on the presence of suspected pneumothorax. In parallel, the algorithms produce an on-device notification indicating which cases were prioritized by DrAid™ in the PACS. The on-device notification does not provide any diagnostic information, and it is not intended to inform any clinical decision, prioritization, or action by those who are qualified to interpret chest radiographs. It is meant as a tool to assist in improving workload prioritization of critical cases. The final diagnosis is provided by the radiologist after reviewing the scan itself.

The following modules compose DrAid™ (a minimal orchestration sketch follows the list):

Data input and validation: Following retrieval of a study, the validation feature assesses the input data (e.g., age, modality, view) to ensure compatibility for processing by the algorithm.

AI algorithm: Once a study has been validated, the AI algorithm analyzes the frontal chest x-ray for detection of suspected pneumothorax.

API cognitive service: The results of a successful study analysis are provided through an API service and then sent to the PACS for triage and notification.

Error codes feature: If a study fails during data validation or during analysis by the algorithm, an error is provided to the system.
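
The module chain above can be illustrated with a short orchestration sketch. All names (Study, validate_input, run_algorithm, post_result, ERROR_CODES) and the validation rules are hypothetical placeholders used only to show the described flow, not the actual product code.

```python
# Hypothetical orchestration of the four modules: input validation -> AI
# algorithm -> API service -> error codes. Purely illustrative.
from dataclasses import dataclass
from typing import Optional

ERROR_CODES = {"INVALID_INPUT": 100, "ANALYSIS_FAILED": 200}  # made-up values

@dataclass
class Study:
    age: int
    modality: str   # e.g. "CR" or "DX"
    view: str       # e.g. "AP" or "PA"
    pixel_data: bytes

def validate_input(study: Study) -> bool:
    """Data input and validation: adult frontal chest X-ray only (assumed rules)."""
    return study.age >= 18 and study.modality in ("CR", "DX") and study.view in ("AP", "PA")

def run_algorithm(study: Study) -> bool:
    """AI algorithm: stand-in for pneumothorax detection on the validated study."""
    return False  # replace with model inference

def post_result(flagged: bool) -> None:
    """API cognitive service: forward the case-level result for PACS triage."""
    print("flag" if flagged else "")

def process(study: Study) -> Optional[int]:
    """Run the pipeline; return an error code on failure, None on success."""
    if not validate_input(study):
        return ERROR_CODES["INVALID_INPUT"]
    try:
        post_result(run_algorithm(study))
    except Exception:
        return ERROR_CODES["ANALYSIS_FAILED"]
    return None
```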

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving DrAid for Radiology v1 meets them, based on the provided text:

1. Acceptance Criteria and Reported Device Performance

The document does not explicitly state "acceptance criteria" as a separate, pre-defined set of thresholds that the device must meet. Instead, it compares the performance of the DrAid™ for Radiology v1 device to its predicate device (HealthPNX, K190362) to demonstrate substantial equivalence. The predicate device's performance metrics effectively serve as the implicit "acceptance criteria" for demonstrating comparable safety and effectiveness.

Here's a table comparing the performance of DrAid™ for Radiology v1 (aggregate results) against its predicate:

Metrics                | DrAid™ for Radiology v1 Performance (Mean) | DrAid™ for Radiology v1 (95% CI) | Predicate HealthPNX Performance (Mean) | Predicate HealthPNX (95% CI)
Sensitivity            | 0.9461 (94.61%)                            | [0.9216, 0.9676]                 | 93.15%                                 | [87.76%, 96.67%]
Specificity            | 0.9758 (97.58%)                            | [0.9636, 0.9865]                 | 92.99%                                 | [90.19%, 95.19%]
AUC                    | 0.9610 (96.10%)                            | [0.9473, 0.9730]                 | 98.3%                                  | [97.40%, 99.02%]
Timing of Notification | 3.83 minutes                               | N/A                              | 22.1 seconds                           | N/A

The document concludes that the performance of DrAid™ for Radiology v1 is "substantially equivalent" to that of the predicate and satisfies the requirements for product code QFM. It notes that the timing performance, despite being longer, is also considered "substantially equivalent."
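
As a side note on how such confidence intervals can be obtained: the submission does not state which interval method was used, so the sketch below simply applies the standard Wilson score interval to an assumed pooled positive count (179 + 175 = 354 positives across the two test sets, with roughly 335 detected to match the reported mean sensitivity); it yields an interval in the same neighborhood as, but not identical to, the reported one.

```python
# Hedged illustration of a 95% CI for a binomial proportion (e.g. sensitivity)
# using the Wilson score method; counts and method are assumptions, not taken
# from the submission.
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

print(wilson_ci(335, 354))  # roughly (0.918, 0.965)
```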

2. Sample Sizes Used for the Test Set and Data Provenance

The test set was composed of two separate datasets:

  • NIH Data Set:

    • Sample Size: 565 radiographs (386 negative, 179 positive pneumothorax cases).
    • Data Provenance: National Institutes of Health (NIH), implicitly US. This dataset was used to demonstrate "generalizability of the device to the demographics of the US population."
    • Retrospective/Prospective: Not explicitly stated, but typically large public datasets like NIH are retrospective.
  • Vietnamese Data Set:

    • Sample Size: 285 radiographs (110 negative, 175 positive pneumothorax cases).
    • Data Provenance: Four Vietnamese hospitals (University Medical Center Hospital, Nam Dinh Lung Hospital, Hai Phong Lung Hospital, and Vinmec Hospital).
    • Retrospective/Prospective: Not explicitly stated, but likely retrospective from existing hospital archives.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications

  • Number of Experts: 3
  • Qualifications of Experts: US board-certified radiologists.

4. Adjudication Method for the Test Set

The adjudication method is not explicitly described beyond stating that the datasets were "truthed by a panel of 3 US board certified radiologists." This implies a consensus-based approach, but the specific dynamics (e.g., majority vote, discussion to achieve consensus, a designated tie-breaker) are not detailed. It is often referred to as an "expert consensus" ground truth.
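
Although the panel dynamics are not described, a simple two-out-of-three majority vote is one plausible reading of "truthed by a panel of 3 US board certified radiologists." The sketch below shows only that hypothetical rule, not the adjudication actually used.

```python
# Hypothetical 2-of-3 majority-vote adjudication among three reader labels.
from collections import Counter

def majority_label(reads: list[str]) -> str:
    """Return the label assigned by at least two of the three readers."""
    label, count = Counter(reads).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority; a discussion or tie-breaker would be needed")
    return label

print(majority_label(["pneumothorax", "pneumothorax", "normal"]))  # -> pneumothorax
```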

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No Multi-Reader Multi-Case (MRMC) comparative effectiveness study was mentioned. The study focused on the standalone performance of the AI algorithm. Therefore, there is no reported effect size of how much human readers improve with AI vs without AI assistance from this document.

6. Standalone Performance Study

Yes, a standalone performance study was done. The performance metrics (Sensitivity, Specificity, AUC) reported in the tables are for the algorithm only, without human-in-the-loop performance. This is further clarified by the section title "Performance Testing - Stand-Alone."
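
For context on what a standalone (algorithm-only) evaluation computes, the snippet below derives AUC and a sensitivity/specificity pair at a fixed threshold from model scores. It uses scikit-learn's roc_auc_score; the scores and the 0.5 threshold are invented for illustration and are unrelated to DrAid's actual operating point.

```python
# Toy standalone evaluation: AUC from scores plus sensitivity/specificity at a
# fixed, assumed threshold. Data here are made up.
from sklearn.metrics import roc_auc_score

y_true  = [1, 1, 1, 0, 0, 0, 0]                        # 1 = pneumothorax per reference standard
y_score = [0.91, 0.78, 0.40, 0.45, 0.20, 0.15, 0.05]   # model probabilities

auc = roc_auc_score(y_true, y_score)

threshold = 0.5
y_pred = [int(s >= threshold) for s in y_score]
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
sensitivity = tp / sum(y_true)
specificity = tn / (len(y_true) - sum(y_true))
print(auc, sensitivity, specificity)
```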

7. Type of Ground Truth Used

The ground truth used for the test sets (both NIH and Vietnamese) was expert consensus from a panel of 3 US board-certified radiologists.

8. Sample Size for the Training Set

The document mentions that the training data came from "a hospital system in Vietnam and the publicly available CheXpert data set." However, the specific sample size for the training set is not provided in the text.

9. How the Ground Truth for the Training Set Was Established

  • Hospital system in Vietnam: The method for establishing ground truth for this dataset is not explicitly stated.
  • Publicly available CheXpert data set: The CheXpert dataset typically derives its labels from automated natural language processing (NLP) of radiology reports, which are then often reviewed and further annotated, though the precise methodology for this specific training use is not detailed here.
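
To make the CheXpert-style labeling approach concrete, the toy rule below pattern-matches a report for pneumothorax mentions and negations. It is not the CheXpert labeler's actual rule set, only a sketch of the general NLP-derived-label idea.

```python
# Toy rule-based report labeler (illustrative only; not the CheXpert labeler).
import re

def label_report(report: str) -> str:
    """Return 'positive', 'negative', or 'uncertain' for pneumothorax."""
    text = report.lower()
    if re.search(r"\bno (evidence of )?pneumothorax\b", text):
        return "negative"
    if "pneumothorax" in text:
        return "positive"
    return "uncertain"

print(label_report("Small right apical pneumothorax."))      # -> positive
print(label_report("No pneumothorax or pleural effusion."))  # -> negative
```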

The document emphasizes that there was no overlap between training and validation data sets, with data from different hospitals and no patient overlap confirmed.

§ 892.2080 Radiological computer aided triage and notification software.

(a) Identification. Radiological computer aided triage and notification software is an image processing prescription device intended to aid in prioritization and triage of radiological medical images. The device notifies a designated list of clinicians of the availability of time sensitive radiological medical images for review based on computer aided image analysis of those images performed by the device. The device does not mark, highlight, or direct users' attention to a specific location in the original image. The device does not remove cases from a reading queue. The device operates in parallel with the standard of care, which remains the default option for all cases.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the notification and triage algorithms and all underlying image analysis algorithms including, but not limited to, a detailed description of the algorithm inputs and outputs, each major component or block, how the algorithm affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide effective triage (e.g., improved time to review of prioritized images for pre-specified clinicians).

(iii) Results from performance testing that demonstrate that the device will provide effective triage. The performance assessment must be based on an appropriate measure to estimate the clinical effectiveness. The test dataset must contain sufficient numbers of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, associated diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals for these individual subsets can be characterized with the device for the intended use population and imaging equipment.

(iv) Stand-alone performance testing protocols and results of the device.

(v) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results).

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use;

(ii) A detailed description of the intended user and user training that addresses appropriate use protocols for the device;

(iii) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality for certain subpopulations), as applicable;

(iv) A detailed description of compatible imaging hardware, imaging protocols, and requirements for input images;

(v) Device operating instructions; and

(vi) A detailed summary of the performance testing, including: test methods, dataset characteristics, triage effectiveness (e.g., improved time to review of prioritized images for pre-specified clinicians), diagnostic accuracy of algorithms informing triage decision, and results with associated statistical uncertainty (e.g., confidence intervals), including a summary of subanalyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.