The MedCognetics (CogNet) QmTRIAGE™ software is a passive notification-only, parallel-workflow software tool used by MQSA qualified interpreting physicians to prioritize patients with suspicious findings in the medical care environment. QmTRIAGE™ utilizes an artificial intelligence algorithm to analyze 2D FFDM screening mammograms and flags those suggestive of the presence of at least one suspicious finding at the exam level. QmTRIAGE™ produces an exam-level output to a PACS/workstation, flagging the suspicious study and enabling worklist prioritization.
MQSA qualified interpreting physicians are responsible for reviewing each exam on a display approved for use in mammography, according to the current standard of care. The QmTRIAGE™ device is limited to the categorization of exams, does not provide any diagnostic information beyond triage and prioritization, does not remove images from the interpreting physician's worklist, and should not be used in lieu of full patient evaluation or relied upon to make or confirm a diagnosis.
The QmTRIAGE™ device is intended for use with complete 2D FFDM mammography exams acquired using validated FFDM systems only.
The MedCognetics (CogNet) QmTRIAGE is a non-invasive computer-assisted triage and notification software as a medical device (SaMD) that analyzes 2D FFDM screening mammograms using a machine learning algorithm and notifies a PACS/workstation of the presence of findings suspicious of cancer in a study. Passive notification enables radiologists to prioritize their worklist and assists them in viewing prioritized studies using the standard PACS or workstation viewing software. The device's aim is to aid in the prioritization and triage of radiological medical images only. It is a software tool for MQSA interpreting physicians reading mammograms and does not replace complete evaluation according to the standard of care.
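To make the notification-only, parallel-workflow behavior concrete, here is a minimal Python sketch of the flow described above. The model interface, the max-over-images exam-level aggregation, and the 0.5 operating threshold are illustrative assumptions; none of these details are disclosed in the summary.

```python
from dataclasses import dataclass

# Hypothetical operating threshold; the actual operating point is not disclosed.
SUSPICION_THRESHOLD = 0.5

@dataclass
class ExamResult:
    exam_id: str
    suspicion_score: float  # exam-level score from the AI model
    flagged: bool           # True -> notify PACS/workstation for prioritization

def triage_exam(exam_id, images, model):
    """Score a complete 2D FFDM exam and decide whether to flag it.

    The device is notification-only: it never removes exams from the
    worklist and never marks locations within images; it emits only an
    exam-level flag that the PACS can use to reorder the worklist.
    """
    per_image_scores = [model.predict(img) for img in images]
    score = max(per_image_scores)  # exam-level aggregation (assumed, not documented)
    return ExamResult(exam_id, score, flagged=score >= SUSPICION_THRESHOLD)

def notify_pacs(result):
    """Stand-in for the PACS/workstation notification; real integration is vendor-specific."""
    if result.flagged:
        print(f"PRIORITIZE exam {result.exam_id} (score={result.suspicion_score:.3f})")
```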
Here's a breakdown of the acceptance criteria and the study details for the MedCognetics CogNet QmTRIAGE device, based on the provided text:
Acceptance Criteria and Reported Device Performance
| Criteria | Acceptance Criteria (Implied) | Reported Device Performance |
|---|---|---|
| AUROC | High AUROC value for overall performance. | 0.9569 (95% CI: 0.9364-0.9738) |
| Sensitivity | Exceed the standard of care (e.g., BCSC study reported values). | 87% |
| Specificity | Exceed the standard of care (e.g., BCSC study reported values). | 89% |
| Triage Accuracy | Demonstrated accuracy across various cohorts (lesion type, breast density, age, race). | Measured and validated; specific values not provided in this summary, though confounding factors were considered. |
Note: The document explicitly states the reported Sensitivity (87%) and Specificity (89%) "exceeded the standard of care as reported in the Breast Cancer Surveillance Consortium (BCSC) study," implying this was the acceptance benchmark.
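The summary does not state how the 95% confidence interval on the AUROC was computed; a percentile bootstrap is one common approach. The following Python sketch, using synthetic scores rather than the device's data, shows how AUROC, sensitivity, and specificity with such a CI could be derived:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def sens_spec(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity at a given operating threshold."""
    pred = y_score >= threshold
    sens = (pred & (y_true == 1)).sum() / (y_true == 1).sum()
    spec = (~pred & (y_true == 0)).sum() / (y_true == 0).sum()
    return sens, spec

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap CI for AUROC (one common way to obtain a 95% CI)."""
    n, aucs = len(y_true), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:  # a resample needs both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), (lo, hi)

# Synthetic data shaped like the test set (399 positives, 401 negatives);
# these random scores will NOT reproduce the reported 0.9569 AUROC.
y_true = np.concatenate([np.ones(399), np.zeros(401)]).astype(int)
y_score = np.clip(rng.normal(loc=y_true * 0.6 + 0.2, scale=0.2), 0, 1)
auc, (lo, hi) = bootstrap_auc_ci(y_true, y_score)
print(f"AUROC {auc:.4f} (95% CI {lo:.4f}-{hi:.4f}); sens/spec {sens_spec(y_true, y_score)}")
```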
Study Details
- Sample size used for the test set and data provenance:
  - Sample size: 800 anonymized 2D FFDM mammograms.
  - Data provenance: retrospective cohort; the test set is specified as coming from the USA and Germany. (Training data was obtained from various sites worldwide, including North America, South America, Europe, Africa, and Southeast Asia.)
- Number of experts used to establish ground truth for the test set, and their qualifications:
  - The document does not specify the number of experts used to establish ground truth for the test set.
  - Qualifications: ground truth for cancer cases was based on biopsy confirmation; for negative cases (BI-RADS 1 and 2), it was established by two-year follow-up confirming a negative diagnosis. MQSA qualified interpreting physicians are mentioned in the intended use for reviewing exams, but their direct role in establishing test-set ground truth is not explicitly detailed.
- Adjudication method for the test set:
  - The document does not explicitly state an adjudication method (e.g., 2+1, 3+1). The ground truth relies on objective medical outcomes: biopsy confirmation for positive cases and two-year follow-up for negative cases.
- Multi-reader multi-case (MRMC) comparative effectiveness study, and the effect size of reader improvement with AI assistance:
  - No MRMC comparative effectiveness study involving human readers with and without AI assistance was described. The study focuses on the standalone performance of the AI algorithm for triage. The device is described as a "passive notification-only, parallel-workflow software tool" to "prioritize patients," not as an aid that directly affects a reader's diagnostic performance measured against a baseline.
- Standalone (i.e., algorithm-only, without human-in-the-loop) performance:
  - Yes, a standalone performance study was done. The reported AUROC, sensitivity, and specificity values reflect the algorithm's performance independent of human-in-the-loop interaction. The device's role is to let radiologists "prioritize their worklist" and assist "in viewing prioritized studies," implying standalone functionality in flagging.
- Type of ground truth used (see the labeling sketch after this list):
  - Pathology/biopsy: for positive cases (399 cases positive for cancer), ground truth was established by biopsy confirmation.
  - Outcomes data/clinical follow-up: for negative cases (401 cases negative for breast cancer, BI-RADS 1 and 2), ground truth was established by two-year follow-up confirming a negative diagnosis.
- Sample size for the training set:
  - The exact sample size for the training set is not provided in the summary. It states only that "Data sets used for training the algorithm were independent of the testing datasets and were obtained from various sites worldwide."
- How ground truth for the training set was established:
  - The document does not explicitly detail how ground truth for the training set was established; it mentions only that the training data was obtained from various sites worldwide and was independent of the testing dataset. Given the nature of breast cancer screening AI, similar methods (biopsy, follow-up, or expert review) can be inferred, but this is not stated.
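To make the ground-truthing rules from the "type of ground truth" item above concrete, here is a minimal labeling sketch. The record fields and helper names are hypothetical; the summary describes the rules but no data schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CaseRecord:
    exam_id: str
    biopsy_confirmed_cancer: bool            # pathology result, where a biopsy was done
    birads: Optional[int]                    # BI-RADS assessment at screening
    negative_followup_months: Optional[int]  # months of cancer-free follow-up

def ground_truth_label(case: CaseRecord) -> Optional[int]:
    """Return 1 (cancer), 0 (negative), or None (ineligible for the test set)."""
    if case.biopsy_confirmed_cancer:
        return 1  # positive: biopsy-confirmed cancer
    if (case.birads in (1, 2)
            and case.negative_followup_months is not None
            and case.negative_followup_months >= 24):
        return 0  # negative: BI-RADS 1 or 2 plus two-year negative follow-up
    return None   # anything else would be excluded under these rules
```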
§ 892.2080 Radiological computer aided triage and notification software.
(a) Identification. Radiological computer aided triage and notification software is an image processing prescription device intended to aid in prioritization and triage of radiological medical images. The device notifies a designated list of clinicians of the availability of time sensitive radiological medical images for review based on computer aided image analysis of those images performed by the device. The device does not mark, highlight, or direct users' attention to a specific location in the original image. The device does not remove cases from a reading queue. The device operates in parallel with the standard of care, which remains the default option for all cases.

(b) Classification. Class II (special controls). The special controls for this device are:

(1) Design verification and validation must include:

(i) A detailed description of the notification and triage algorithms and all underlying image analysis algorithms including, but not limited to, a detailed description of the algorithm inputs and outputs, each major component or block, how the algorithm affects or relates to clinical practice or patient care, and any algorithm limitations.

(ii) A detailed description of pre-specified performance testing protocols and dataset(s) used to assess whether the device will provide effective triage (e.g., improved time to review of prioritized images for pre-specified clinicians).

(iii) Results from performance testing that demonstrate that the device will provide effective triage. The performance assessment must be based on an appropriate measure to estimate the clinical effectiveness. The test dataset must contain sufficient numbers of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, associated diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals for these individual subsets can be characterized with the device for the intended use population and imaging equipment.

(iv) Stand-alone performance testing protocols and results of the device.

(v) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results).

(2) Labeling must include the following:

(i) A detailed description of the patient population for which the device is indicated for use;

(ii) A detailed description of the intended user and user training that addresses appropriate use protocols for the device;

(iii) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality for certain subpopulations), as applicable;

(iv) A detailed description of compatible imaging hardware, imaging protocols, and requirements for input images;

(v) Device operating instructions; and

(vi) A detailed summary of the performance testing, including: test methods, dataset characteristics, triage effectiveness (e.g., improved time to review of prioritized images for pre-specified clinicians), diagnostic accuracy of algorithms informing triage decision, and results with associated statistical uncertainty (e.g., confidence intervals), including a summary of subanalyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.
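Special control (b)(1)(iii) requires performance estimates and confidence intervals for clinically relevant cohorts. As an illustration only (this code is not from the regulation or the 510(k) summary), per-cohort sensitivity with a Wilson score interval could be computed like this:

```python
import math
from collections import defaultdict

def wilson_ci(successes, n, z=1.96):
    """Wilson score 95% CI for a binomial proportion (e.g., per-cohort sensitivity)."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - half, center + half)

def per_cohort_sensitivity(cases):
    """cases: iterable of (cohort, truth, flagged); returns {cohort: (sens, ci, n_pos)}."""
    tp, pos = defaultdict(int), defaultdict(int)
    for cohort, truth, flagged in cases:
        if truth == 1:
            pos[cohort] += 1
            tp[cohort] += int(flagged)
    return {c: (tp[c] / pos[c], wilson_ci(tp[c], pos[c]), pos[c]) for c in pos}

# Illustrative only: cohorts could be breast density, age band, race, or lesion type.
demo = [("dense", 1, True), ("dense", 1, False), ("fatty", 1, True), ("fatty", 0, False)]
print(per_cohort_sensitivity(demo))
```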