NinesAI is a parallel workflow tool indicated for use by hospital networks and trained clinicians to identify images of specific patients for a radiologist, independent of the standard-of-care workflow, to aid in prioritizing and performing the radiological review. NinesAI uses artificial intelligence algorithms to analyze head CT images for findings suggestive of a pre-specified emergent clinical condition.
The software automatically analyzes Digital Imaging and Communications in Medicine (DICOM) images as they arrive in the Picture Archive and Communication System (PACS) using machine learning algorithms. Identification of suspected findings is not for diagnostic use beyond notification. Specifically, the software analyzes head CT images of the brain to assess the suspected presence of intracranial hemorrhage and/or mass effect and identifies images with potential emergent findings in a radiologist's worklist.
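To make the parallel-workflow behavior concrete, here is a minimal Python sketch of how such a pipeline could process studies as they arrive in PACS. It assumes the real pydicom library for reading DICOM files; the `detect_findings` wrapper, the `triage_incoming_study` function, and the threshold are hypothetical stand-ins, since the document does not describe NinesAI's internals.

```python
from pathlib import Path

from pydicom import dcmread


def detect_findings(datasets) -> dict:
    """Hypothetical stand-in for the device's two ML detectors
    (one for intracranial hemorrhage, one for mass effect)."""
    # A real implementation would run the trained models over the CT volume.
    return {"intracranial_hemorrhage": 0.0, "mass_effect": 0.0}


def triage_incoming_study(study_dir: Path, threshold: float = 0.5) -> bool:
    """Analyze a head CT study as it lands in PACS and flag suspected findings."""
    datasets = [dcmread(str(p)) for p in sorted(study_dir.glob("*.dcm"))]
    # Only CT studies are in scope for this kind of triage device.
    if not datasets or datasets[0].Modality != "CT":
        return False
    scores = detect_findings(datasets)
    if any(score >= threshold for score in scores.values()):
        # In the real workflow this prioritizes the study in the radiologist's
        # worklist; the standard-of-care review itself is unchanged.
        print(f"Flagging study {datasets[0].StudyInstanceUID} for priority review")
        return True
    return False
```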
NinesAI is intended to be used as a triage tool limited to analysis of imaging data; it should not be used in lieu of a full patient evaluation or relied upon to make or confirm a diagnosis. Additionally, preview images displayed to the radiologist outside of the DICOM viewer are of non-diagnostic quality and should be used for informational purposes only.
NinesAI notifies a radiologist of the presence of a suspected critical abnormality in a radiological image. The software system is a complete package comprising image analysis software and a workstation module used to alert the radiologist. The image analysis can also be configured to send HL7 messages and DICOM secondary series.
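Since the analysis can emit HL7 messages, the sketch below shows what an HL7 v2-style notification string could look like. The segment layout and field contents are invented placeholders for illustration; the document does not specify the device's actual message format.

```python
from datetime import datetime


def build_hl7_notification(study_uid: str, finding: str) -> str:
    """Build a placeholder HL7 v2 ORU-style notification message."""
    ts = datetime.now().strftime("%Y%m%d%H%M%S")
    segments = [
        # MSH: sending/receiving applications here are made-up names.
        f"MSH|^~\\&|TRIAGE|RADIOLOGY|PACS|HOSPITAL|{ts}||ORU^R01|{study_uid}|P|2.3",
        f"OBR|1|||CT^HEAD CT WITHOUT CONTRAST|||{ts}",
        f"OBX|1|TX|TRIAGE^SUSPECTED FINDING||{finding}: flagged for priority review||||||F",
    ]
    return "\r".join(segments)  # HL7 v2 separates segments with carriage returns
```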
The image analysis uses artificial intelligence (AI) technology to analyze non-contrast head CT scans for the presence of intracranial hemorrhage and/or mass effect. More specifically, the device uses two machine learning (ML) algorithms, one to detect each finding.
NinesAI is a software device and does not come into contact with patients. All radiological studies are still reviewed by trained radiologists; NinesAI is intended only as an aid for case prioritization.
Below is a summary of the acceptance criteria and study details for the NinesAI device, based on the provided text.
Acceptance Criteria and Device Performance
The acceptance criteria are derived from the observed performance of the predicate device (Aidoc's BriefCase) and a baseline of 0.80 for both sensitivity and specificity for general emergent findings.
| Finding | Acceptance Criteria (Sensitivity) | Reported Sensitivity [95% CI] | Acceptance Criteria (Specificity) | Reported Specificity [95% CI] |
| --- | --- | --- | --- | --- |
| Intracranial Hemorrhage | >= 0.80 | 0.899 [0.837, 0.940] | >= 0.80 | 0.974 [0.974, 0.992] |
| Mass Effect | >= 0.80 | 0.964 [0.916, 0.987] | >= 0.80 | 0.911 [0.856, 0.948] |
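For reference, this is how such point estimates and 95% intervals could be reproduced from raw counts. The document does not state which interval method was used, so a Wilson score interval is assumed here, and the counts in the usage example are made up.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Point estimates with Wilson 95% CIs from a 2x2 confusion table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return (sens, wilson_interval(tp, tp + fn)), (spec, wilson_interval(tn, tn + fp))


# Toy counts only; the filing's raw counts are not given in the text.
print(sensitivity_specificity(tp=45, fn=5, tn=190, fp=10))
```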
Time Benefit Analysis:
| Finding | NinesAI Time-to-Notification: Mean [95% CI] / Median (minutes) | Standard-of-Care Time-to-Open-Dictation: Mean [95% CI] / Median (minutes) |
| --- | --- | --- |
| Intracranial Hemorrhage | 0.23 [0.23, 0.24] / 0.24 | 159.4 [67.07, 251.7] / 6.0 |
| Mass Effect | 0.23 [0.23, 0.24] / 0.24 | 28.5 [14.1, 42.8] / 7.5 |
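The gap between the standard-of-care mean (159.4 min) and median (6.0 min) for intracranial hemorrhage indicates a heavily right-skewed distribution, which is why both statistics are reported. A minimal sketch of such a summary, assuming plain means/medians and a normal-approximation CI for the mean (the document does not state the method actually used), with invented sample values:

```python
from math import sqrt
from statistics import mean, median, stdev


def summarize_minutes(times: list[float]) -> str:
    """Mean with normal-approximation 95% CI, plus median, in minutes."""
    m = mean(times)
    half = 1.96 * stdev(times) / sqrt(len(times))
    return f"mean {m:.2f} [{m - half:.2f}, {m + half:.2f}] / median {median(times):.2f} min"


notification_times = [0.22, 0.23, 0.24, 0.25]  # hypothetical values
dictation_times = [3.0, 6.0, 9.5, 620.0]       # hypothetical, heavily skewed
print("Device:", summarize_minutes(notification_times))
print("Standard of care:", summarize_minutes(dictation_times))
```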
Study Details
- Sample Size and Data Provenance (Test Set):
  - Sample Size: Not explicitly stated as a single number, but the text mentions that "Head CT studies included in each of the test datasets were obtained from over 20 clinical sites."
  - Data Provenance: Retrospective. The studies were obtained from "over 20 clinical sites" and included "a minimum of 3 scanner manufacturers and over 20 scanner models, and also reflected broad patient demographics," suggesting a diverse dataset. The country of origin for the data is not specified.
- Number of Experts and Qualifications (Ground Truth for Test Set):
  - Number of Experts: Not explicitly stated. The text mentions the "agreement rate between labelers who determined ground truth for the test dataset studies," which implies multiple experts were involved in establishing the ground truth.
  - Qualifications of Experts: Not explicitly stated, but "labelers" in this context typically refers to trained medical professionals qualified to interpret medical images, such as radiologists.
- Adjudication Method (Test Set):
  - Not explicitly stated. The mention of an "agreement rate between labelers who determined ground truth" suggests some form of consensus or agreement process, but the specific method (e.g., 2+1, 3+1) is not detailed; a hypothetical 2+1 scheme is sketched after this list.
- Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
  - No. A specific MRMC comparative effectiveness study is not mentioned. The study focuses on the standalone performance of the AI algorithm and a time benefit analysis comparing AI notification time to standard-of-care time-to-open-dictation, rather than comparing human reader performance with and without AI assistance.
- Standalone Performance Study:
  - Yes. A standalone (algorithm-only) performance study was conducted. Each algorithm was evaluated independently, with sensitivity and specificity measured as primary endpoints.
- Type of Ground Truth Used (Test Set):
  - Expert consensus. The text's reference to the "agreement rate between labelers who determined ground truth for the test dataset studies" indicates that human expert consensus was used to establish the ground truth.
- Sample Size for Training Set:
  - Not explicitly stated in the provided text. The text mentions that "The algorithms are trained using a database of radiological images" but does not give a specific training set size.
- How Ground Truth for Training Set was Established:
  - Not explicitly stated in the provided text. It is plausible that similar expert labeling methods were used for the training data, but the document does not detail this.
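Because the adjudication scheme is not detailed anywhere in the text, the following is a purely hypothetical sketch of one common scheme (2+1: two primary readers plus a tie-breaking third read), shown only to illustrate what the referenced agreement process might look like.

```python
from typing import Optional


def two_plus_one(label_a: bool, label_b: bool, tiebreaker: Optional[bool] = None) -> bool:
    """Adjudicate one study's ground-truth label under a hypothetical 2+1 scheme."""
    if label_a == label_b:
        return label_a  # primary readers agree; no adjudication needed
    if tiebreaker is None:
        raise ValueError("Discordant reads require a third (tie-breaking) read")
    return tiebreaker   # third reader resolves the disagreement


# Example: readers disagree on an ICH label, so the third read decides.
print(two_plus_one(True, False, tiebreaker=True))  # -> True
```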
§ 892.2080 Radiological computer aided triage and notification software.
(a) Identification. Radiological computer aided triage and notification software is an image processing prescription device intended to aid in prioritization and triage of radiological medical images. The device notifies a designated list of clinicians of the availability of time sensitive radiological medical images for review based on computer aided image analysis of those images performed by the device. The device does not mark, highlight, or direct users' attention to a specific location in the original image. The device does not remove cases from a reading queue. The device operates in parallel with the standard of care, which remains the default option for all cases.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the notification and triage algorithms and all underlying image analysis algorithms including, but not limited to, a detailed description of the algorithm inputs and outputs, each major component or block, how the algorithm affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide effective triage (e.g., improved time to review of prioritized images for pre-specified clinicians).

(iii) Results from performance testing that demonstrate that the device will provide effective triage. The performance assessment must be based on an appropriate measure to estimate the clinical effectiveness. The test dataset must contain sufficient numbers of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, associated diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals for these individual subsets can be characterized with the device for the intended use population and imaging equipment.

(iv) Stand-alone performance testing protocols and results of the device.

(v) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results).

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use;

(ii) A detailed description of the intended user and user training that addresses appropriate use protocols for the device;

(iii) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality for certain subpopulations), as applicable;

(iv) A detailed description of compatible imaging hardware, imaging protocols, and requirements for input images;

(v) Device operating instructions; and

(vi) A detailed summary of the performance testing, including: test methods, dataset characteristics, triage effectiveness (e.g., improved time to review of prioritized images for pre-specified clinicians), diagnostic accuracy of algorithms informing triage decision, and results with associated statistical uncertainty (e.g., confidence intervals), including a summary of subanalyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.
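Special control (b)(1)(iii) effectively requires subgroup performance characterization. As a rough illustration only, the sketch below computes per-cohort sensitivity with Wilson 95% intervals, stratifying by scanner manufacturer; the stratification variable and the data are assumed toy values, not content from the filing.

```python
from collections import defaultdict
from math import sqrt


def wilson(successes: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half


# Each case: (scanner manufacturer, ground-truth positive?, device flagged?)
cases = [("GE", True, True), ("GE", True, False), ("Siemens", True, True),
         ("Siemens", True, True), ("Philips", True, True)]  # toy data

by_group = defaultdict(lambda: [0, 0])  # vendor -> [true positives, positives]
for vendor, truth, flagged in cases:
    if truth:
        by_group[vendor][1] += 1
        by_group[vendor][0] += int(flagged)

for vendor, (tp, pos) in by_group.items():
    lo, hi = wilson(tp, pos)
    print(f"{vendor}: sensitivity {tp / pos:.2f} [{lo:.2f}, {hi:.2f}] (n={pos})")
```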