Libby™ Echo:Prio is software used to process previously acquired DICOM-compliant cardiac ultrasound images and to make measurements on these images in order to provide automated estimation of several cardiac parameters. The data produced by this software are intended to support qualified cardiologists, sonographers, or other licensed professional healthcare practitioners in clinical decision-making. Libby™ Echo:Prio is indicated for use in adult patients.
Echo:Prio is an image post-processing analysis software device used for viewing and quantifying cardiovascular ultrasound images. The device is intended to aid diagnostic review and analysis of echocardiographic data, patient record management, and reporting. The software provides an interface for a skilled sonographer to perform the necessary markup on the echocardiographic image prior to review by the prescribing physician. The markup includes the cardiac segments captured; measurements of distance, time, and area; quantitative analysis of cardiac function; and a summary report. The software allows the sonographer to enter markup manually and/or manually correct automatically generated results. It also provides automated markup and analysis, which the sonographer may accept outright, accept partially and modify, or reject and ignore. Machine-learning-based view classification and border segmentation form the basis for this automated analysis. Additionally, the software has features for organizing and displaying the quantitative data from cardiovascular images acquired from ultrasound scanners, and for comparing those data against reference guidelines.
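Since the device operates on previously acquired DICOM studies, the following is a minimal, hypothetical Python sketch (using pydicom, with a placeholder file name; not the vendor's actual pipeline) of loading an echo cine clip prior to any automated markup:

```python
# Minimal, hypothetical sketch (not the vendor's pipeline) of loading a
# previously acquired DICOM echo cine with pydicom before any analysis.
# The file name is a placeholder.
import pydicom

ds = pydicom.dcmread("echo_study.dcm")    # hypothetical study file
frames = ds.pixel_array                   # (n_frames, rows, cols[, 3])
print(ds.Modality, getattr(ds, "NumberOfFrames", 1), frames.shape)
```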
The provided text describes the Libby™ Echo:Prio software, its intended use, and performance data from its premarket notification. Here's a breakdown of the acceptance criteria and the study proving the device meets them:
1. A table of acceptance criteria and the reported device performance
The document does not explicitly state "acceptance criteria" in a tabular format with defined thresholds. However, it presents performance metrics from the validation study which serve as the evidence of the device's capability. We can infer the implicit "acceptance criteria" from these reported performance metrics, which are presented as achieved targets.
| Metric (Implied Acceptance Criteria) | Reported Device Performance |
|---|---|
| View classification accuracy | 97% |
| View classification F1 score | > 96.6% (average) |
| View classification sensitivity (Sn) | 96.8% (average) |
| View classification specificity (Sp) | 98.5% (average) |
| Heart rate (HR) estimation bias | Regression slope of 0.98 (95% CI) vs. 12-lead ECG ground truth |
| Ejection fraction (EF) prediction | Bivariate linear regression slope of 0.79 (95% CI: 0.52, 0.98) vs. human expert annotations |
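For readers unfamiliar with these summary statistics, the sketch below shows how multi-class accuracy, sensitivity, specificity, and F1 are conventionally derived from a confusion matrix. The view labels and counts are illustrative placeholders, not data from the submission:

```python
# Illustrative sketch: conventional per-class metrics from a multi-class
# confusion matrix. The matrix below is a toy example, not 510(k) data.
import numpy as np

def per_class_metrics(cm: np.ndarray):
    """cm[i, j] = number of clips with true view i predicted as view j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp          # missed members of the class
    tn = cm.sum() - (tp + fp + fn)
    sensitivity = tp / (tp + fn)      # a.k.a. recall, Sn
    specificity = tn / (tn + fp)      # Sp
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1

# Toy 3-view confusion matrix (e.g., A4C, A2C, PLAX) -- illustrative only.
cm = np.array([[97, 2, 1],
               [3, 95, 2],
               [1, 1, 98]])
sn, sp, f1 = per_class_metrics(cm)
accuracy = np.diag(cm).sum() / cm.sum()
print(f"accuracy={accuracy:.3f}, mean Sn={sn.mean():.3f}, "
      f"mean Sp={sp.mean():.3f}, mean F1={f1.mean():.3f}")
```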
2. Sample size used for the test set and the data provenance
- Test Set Sample Size: The document states that performance testing was done "retrospectively on a diverse clinical dataset." However, the exact sample size for this test set is not specified in the provided text.
- Data Provenance:
- Country of Origin: Not specified.
- Retrospective or Prospective: Retrospective.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of Experts: For the Ejection Fraction (EF) prediction, ground truth was established by "four human experts."
- Qualifications of Experts: The specific qualifications (e.g., years of experience, subspecialty) of these four human experts are not explicitly stated. They are referred to as "human experts" or "clinicians."
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
The adjudication method for constructing the ground truth from the four human experts for EF prediction is not explicitly stated. It is only mentioned that EF prediction was compared with "annotations by four human experts."
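For context, the summary leaves the adjudication open; one common convention, shown purely as a hypothetical sketch below, is to average the four readers' per-case annotations into a single reference value:

```python
# Hypothetical only: the source text does not state how the four expert
# EF annotations were combined. Per-case averaging is one common convention.
import numpy as np

# rows = cases, columns = the four hypothetical expert EF reads (%)
expert_ef = np.array([[55.0, 57.0, 54.0, 56.0],
                      [38.0, 40.0, 39.0, 37.0],
                      [62.0, 60.0, 61.0, 63.0]])
consensus_ef = expert_ef.mean(axis=1)   # one reference value per case
print(consensus_ef)                     # [55.5 38.5 61.5]
```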
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
A Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not explicitly described in the provided text. The performance data focuses on the software's standalone accuracy in comparison to a ground truth, rather than measuring the improvement of human readers assisted by the AI.
6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done
Yes, a standalone performance evaluation of the algorithm's predictions (view classification, HR, EF) against established ground truth was performed. The data presented ("view classification accuracy of 97%", "HR output estimate is with minimal bias", "prediction of the EF output... had a slope of 0.79") reflects the algorithm's direct performance.
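The reported slope and interval are consistent with an ordinary least-squares fit of algorithm output against the expert reference. The sketch below, using simulated placeholder data rather than anything from the submission, shows how such a slope and its 95% CI are typically computed:

```python
# Illustrative standalone check: regress algorithm EF output on an expert
# reference and report the slope with a 95% CI. All values are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
ef_reference = rng.uniform(20, 70, size=100)                   # expert EF (%)
ef_predicted = 0.8 * ef_reference + 8 + rng.normal(0, 4, 100)  # algorithm EF (%)

res = stats.linregress(ef_reference, ef_predicted)
t = stats.t.ppf(0.975, df=len(ef_reference) - 2)               # two-sided 95%
ci = (res.slope - t * res.stderr, res.slope + t * res.stderr)
print(f"slope={res.slope:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```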
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
The types of ground truth used are:
- 12-lead ECG: For Heart Rate (HR) estimation (see the sketch after this list).
- Human Expert Annotations/Consensus (implied): For Left Ventricular Ejection Fraction (EF) prediction, derived from the "annotations by four human experts." For view classification, the "accuracy" and "F1 value" imply a comparison to a set ground truth, likely also established by experts.
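For the HR comparison, a Bland-Altman style bias analysis is the conventional companion to the regression slope reported above; the sketch below uses simulated values and is illustrative only:

```python
# Illustrative only: quantifying HR estimation bias against a 12-lead ECG
# reference via mean difference and 95% limits of agreement (simulated data).
import numpy as np

rng = np.random.default_rng(1)
hr_ecg = rng.uniform(50, 110, size=80)        # reference HR from ECG (bpm)
hr_algo = hr_ecg + rng.normal(0, 2, size=80)  # software HR estimate (bpm)

diff = hr_algo - hr_ecg
bias = diff.mean()
spread = 1.96 * diff.std(ddof=1)
print(f"mean bias={bias:.2f} bpm, 95% LoA=({bias - spread:.2f}, {bias + spread:.2f})")
```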
8. The sample size for the training set
The sample size for the training set used to develop the machine learning model is not specified in the provided text.
9. How the ground truth for the training set was established
The method for establishing ground truth for the training set is not specified in the provided text. It only vaguely mentions "Machine learning based view classification and border segmentation form the basis for this automated analysis."
§ 892.2050 Medical image management and processing system.
(a) Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.
(b) Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).