JLK-ICH is a radiological computer-aided triage and notification software indicated for use in the analysis of non-contrast CT images. JLK-ICH is a notification-only, parallel workflow tool that is intended to assist hospital networks and trained clinicians to identify and communicate images of specific patients to specialists, independent of the standard of care workflow.
JLK-ICH uses an artificial intelligence algorithm to analyze images for findings suggestive of pre-specified clinical conditions and promptly notifies the appropriate medical specialists of these findings in parallel with the standard of care image interpretation. Identification of suspected findings is not for diagnostic use beyond notification. Specifically, the device analyzes non-contrast CT images of the head to detect intracranial hemorrhage (ICH). The system sends a notification to a clinician that a suspected ICH has been identified and recommends a review of those images. Images can be previewed and compressed through a mobile application.
Notified clinicians are responsible for viewing non-compressed images on a diagnostic viewer and engaging in appropriate patient evaluation and relevant discussion with a treating clinician before making care-related decisions or requests.
JLK-ICH is limited to the analysis of imaging data and should not be used in lieu of full patient evaluation or relied upon to make or confirm the diagnosis.
JLK-ICH is a radiological computer-assisted triage and notification (CADt) software that adheres to the DICOM standard. The device functions as a Non-Contrast Computed Tomography (NCCT) processing module, providing triage and notification for suspected hemispheric intracranial hemorrhage (ICH). The software acts as a notification-only, parallel workflow tool for hospital networks and trained clinicians, enabling the identification and communication of suspected patient images to relevant specialists, independent of the standard of care workflow. The system uses artificial intelligence to automatically analyze NCCT scans for indicators of ICH and promptly notify appropriate medical specialists of potential cases.
JLK-ICH comprises an image analysis algorithm hosted on JLK servers and a mobile application for notification management. The AI/ML-based algorithm is designed to analyze NCCT scans of the head forwarded from CT scanners to the JLK servers. The mobile software module enables users to receive and toggle notifications for suspected ICH cases identified by the JLK-ICH Image Analysis Algorithm. Users can view a patient list and non-diagnostic CT scans through the mobile application. Image viewing through the mobile application interface is for non-diagnostic purposes only.
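As a rough illustration of this parallel, notification-only flow (this is a sketch, not JLK's actual implementation; `Study`, `analyze_for_ich`, `run_model_stub`, and `notify_specialists` are hypothetical names), the pipeline can be expressed in a few lines of Python:

```python
from dataclasses import dataclass

# Hypothetical sketch of the notification-only, parallel workflow described above.
# The real device forwards NCCT series from CT scanners to JLK servers for analysis
# and pushes notifications to a mobile app; all names here are illustrative.

@dataclass
class Study:
    patient_id: str
    series_uid: str
    suspected_ich: bool = False

def run_model_stub(study: Study) -> bool:
    # Stub: the actual algorithm output is a suspicion flag, not a diagnosis.
    return False

def analyze_for_ich(study: Study) -> Study:
    """Placeholder for the server-side AI/ML analysis of a forwarded NCCT series."""
    study.suspected_ich = run_model_stub(study)
    return study

def notify_specialists(study: Study) -> None:
    """Push a notification to the mobile app; images there are non-diagnostic previews."""
    if study.suspected_ich:
        print(f"Suspected ICH flagged for patient {study.patient_id}; "
              "review non-compressed images on a diagnostic viewer.")

# Parallel workflow: the standard-of-care reading queue is untouched;
# the triage path only adds a notification.
if __name__ == "__main__":
    incoming = Study(patient_id="ANON-001", series_uid="1.2.3.4")
    notify_specialists(analyze_for_ich(incoming))
```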
Here's a breakdown of the acceptance criteria and the study that demonstrates the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
1. Table of Acceptance Criteria and Reported Device Performance
The document doesn't explicitly list "acceptance criteria" for the standalone performance in a table, but it states that the primary endpoints, sensitivity and specificity, both exceeded 80%. This implies that sensitivity and specificity of 80% or greater were the target acceptance criteria; a time-to-notification target is also implied by comparison to the predicate device.
Here's a table summarizing the implicit acceptance criteria and the reported performance; a worked confidence-interval example follows the table:
| Acceptance Criterion (Implicit) | Reported Device Performance |
|---|---|
| Sensitivity $\ge$ 80% | 97.3% (95% CI: 94.8% to 99.5%) |
| Specificity $\ge$ 80% | 97.9% (95% CI: 95.5% to 99.5%) |
| Time-to-Notification $\le$ 0.49 ± 0.15 minutes (based on predicate) | 0.19 ± 0.04 minutes (95% CI: 0.186 to 0.197) |
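To show how such point estimates and intervals are computed, the sketch below back-calculates counts consistent with the reported rates (approximately 183/188 true positives and 184/188 true negatives on the 188 + 188 test cases) and applies a Clopper-Pearson exact interval. The submission does not state which interval method was used, so these counts and intervals are illustrative and need not match the reported CIs exactly.

```python
from scipy.stats import beta

def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for a binomial proportion."""
    lo = 0.0 if successes == 0 else beta.ppf(alpha / 2, successes, trials - successes + 1)
    hi = 1.0 if successes == trials else beta.ppf(1 - alpha / 2, successes + 1, trials - successes)
    return lo, hi

# Counts assumed from the reported rates: 183/188 ~ 97.3% sensitivity, 184/188 ~ 97.9% specificity.
tp, n_pos = 183, 188   # true positives among ICH-positive cases
tn, n_neg = 184, 188   # true negatives among ICH-negative cases

sens_lo, sens_hi = clopper_pearson(tp, n_pos)
spec_lo, spec_hi = clopper_pearson(tn, n_neg)
print(f"Sensitivity {tp / n_pos:.1%} (95% CI {sens_lo:.1%} to {sens_hi:.1%})")
print(f"Specificity {tn / n_neg:.1%} (95% CI {spec_lo:.1%} to {spec_hi:.1%})")
```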
2. Sample Size and Data Provenance for the Test Set
- Sample Size: 376 Non-Contrast CT (NCCT) scans. This included 188 ICH-positive and 188 ICH-negative cases.
- Data Provenance: Retrospective study. The scans were obtained from various regions in the U.S.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Two primary ground truthers, with a third used for adjudication in cases of disagreement.
- Qualifications of Experts: All truthers were U.S. board-certified neuroradiologists (specifically, American Board of Radiology (ABR)-certified neuroradiologists).
4. Adjudication Method for the Test Set
The adjudication method used was a 2+1 scheme (see the sketch after the list below).
- Ground truth was initially determined by two ABR-certified neuroradiologists.
- In cases of disagreement between the first two truthers, a third neuroradiologist intervened to reach a consensus (act as a tie-breaker).
- The document notes that 30 cases were sent to the third truther due to disagreements.
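A minimal sketch of a 2+1 adjudication rule as described above (the function name and boolean labels are illustrative, not taken from the submission):

```python
from typing import Optional

def adjudicate_2plus1(reader1: bool, reader2: bool,
                      tiebreaker: Optional[bool] = None) -> bool:
    """2+1 ground-truthing: two primary reads; a third read only on disagreement."""
    if reader1 == reader2:
        return reader1
    if tiebreaker is None:
        raise ValueError("Readers disagree; a third (adjudicating) read is required.")
    return tiebreaker

# Example: the two primary neuroradiologists disagree, so the case goes to the third reader.
label = adjudicate_2plus1(reader1=True, reader2=False, tiebreaker=True)
```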
5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study
No, a multi-reader multi-case (MRMC) comparative effectiveness study was not conducted. The study focused on standalone performance. The document explicitly states: "The documentation was provided as recommended by FDA's Guidance for Industry and FDA staff, "Content of Premarket Submissions for Device Software Functions," June 14, 2023. In addition to the software verification and validation testing described in the sections above, JLK, Inc. performed a standalone performance in accordance with the §892.2080 special controls to demonstrate adequate clinical performance of the JLK-ICH module."
Therefore, there is no effect size reported for human readers improving with AI vs. without AI assistance.
6. Standalone Performance (Algorithm Only)
Yes, a standalone performance evaluation (algorithm only, without human-in-the-loop performance) was performed.
- "The algorithm's performance was validated through a standalone performance evaluation using an independent dataset, distinct from the one for algorithm training data."
- "JLK, Inc. performed a standalone performance in accordance with the §892.2080 special controls to demonstrate adequate clinical performance of the JLK-ICH module."
7. Type of Ground Truth Used
The ground truth used was expert consensus. It was established by U.S. board-certified neuroradiologists following a 2+1 adjudication scheme.
8. Sample Size for the Training Set
The training dataset was substantial and diverse:
- Total ICH cases: 14,462
- US-based datasets: 14,998 cases (7,499 ICH cases, 7,499 normal cases)
- Out-of-US datasets: 13,926 cases (6,963 normal cases, 6,963 ICH cases)
- Total Training Cases: 14,998 (US) + 13,926 (Out-of-US) = 28,924 cases. (The 14,462 figure counts ICH cases only: 7,499 US ICH cases plus 6,963 Out-of-US ICH cases equals 14,462; adding the normal cases gives 28,924 training images in total.) A quick arithmetic check appears after this list.
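The totals above are internally consistent; as a quick sanity check (plain arithmetic, no external data; numbers taken from the bullets above):

```python
# Quick arithmetic check of the training-set totals reported above.
us_ich, us_normal = 7_499, 7_499
ous_ich, ous_normal = 6_963, 6_963   # Out-of-US

total_ich = us_ich + ous_ich                              # ICH cases only
total_cases = us_ich + us_normal + ous_ich + ous_normal   # all training cases

assert total_ich == 14_462
assert total_cases == 28_924
print(total_ich, total_cases)
```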
9. How Ground Truth for the Training Set Was Established
The document states: "The JLK-ICH AI model was trained using a dataset that includes 14,462 ICH cases from various institutions, divided between US-based and Out-of-US sources... All cases are carefully separated from the clinical performance datasets."
While it mentions the dataset composition, it does not explicitly detail how the ground truth for the training set was established. It only specifies the ground truth establishment method (2+1 neuroradiologist consensus) for the test set. It is common for training data to have ground truth established through similar expert review processes, but this document does not provide those specifics for the training set itself.
§ 892.2080 Radiological computer aided triage and notification software.
(a) Identification. Radiological computer aided triage and notification software is an image processing prescription device intended to aid in prioritization and triage of radiological medical images. The device notifies a designated list of clinicians of the availability of time sensitive radiological medical images for review based on computer aided image analysis of those images performed by the device. The device does not mark, highlight, or direct users' attention to a specific location in the original image. The device does not remove cases from a reading queue. The device operates in parallel with the standard of care, which remains the default option for all cases.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Design verification and validation must include:
(i) A detailed description of the notification and triage algorithms and all underlying image analysis algorithms including, but not limited to, a detailed description of the algorithm inputs and outputs, each major component or block, how the algorithm affects or relates to clinical practice or patient care, and any algorithm limitations.
(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide effective triage (e.g., improved time to review of prioritized images for pre-specified clinicians).
(iii) Results from performance testing that demonstrate that the device will provide effective triage. The performance assessment must be based on an appropriate measure to estimate the clinical effectiveness. The test dataset must contain sufficient numbers of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, associated diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals for these individual subsets can be characterized with the device for the intended use population and imaging equipment.
(iv) Stand-alone performance testing protocols and results of the device.
(v) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results).
(2) Labeling must include the following:
(i) A detailed description of the patient population for which the device is indicated for use;
(ii) A detailed description of the intended user and user training that addresses appropriate use protocols for the device;
(iii) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality for certain subpopulations), as applicable;
(iv) A detailed description of compatible imaging hardware, imaging protocols, and requirements for input images;
(v) Device operating instructions; and
(vi) A detailed summary of the performance testing, including: test methods, dataset characteristics, triage effectiveness (e.g., improved time to review of prioritized images for pre-specified clinicians), diagnostic accuracy of algorithms informing triage decision, and results with associated statistical uncertainty (e.g., confidence intervals), including a summary of subanalyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.