(237 days)
LARALAB enables visualization, assessment and measurement of cardiovascular structures for:
- Preprocedural planning and sizing for cardiovascular interventions and surgery
- Postprocedural image review
To facilitate the above, LARALAB provides general functionality such as:
- Automatic segmentation of cardiovascular structures and other objects of interest (calcifications)
- Automatic measurements
- Manual measurement and adjustment tools
- Visualization and image reconstruction techniques: Multiplanar Reconstruction (MPR), Surface rendering
- Reporting tools
LARALAB is a stand-alone software developed to enable cardiologists, radiologists, heart surgeons and healthcare professionals ("Users") to import, view and process Medical Images. In particular, the software generates pre-calculated automatic segmentations and measurements based on deterministic Deep Learning Algorithms. Based on the output of the Deep Learning Algorithms, the User is able to further visualize, assess and measure ("Case Planning") various anatomical structures of the heart in the context of cardiovascular procedures (e.g., TAVR) such as heart valves, heart chambers, cardiac tissue and vessels, as well as such vessels and tissue relevant as access routes.
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) Clearance Letter for LARALAB:
1. Table of Acceptance Criteria and Reported Device Performance
Acceptance Criteria Category | Specific Metric | Acceptance Criterion | Reported Device Performance |
---|---|---|---|
Segmentation Accuracy | Dice score for primary cardiovascular structures (LA, LV, RV, RA) | Met predefined acceptance criteria | Ranged from 0.89 to 0.98 |
Segmentation Accuracy | Dice score for secondary and tertiary structures | Met predefined acceptance criteria | Met predefined acceptance criteria |
Segmentation Accuracy | Mean Surface Distance (MSD) | Not explicitly stated, implied by "met predefined acceptance criteria" | Not explicitly stated, implied by met criteria |
Segmentation Accuracy | 95th percentile Hausdorff distance (95% HD) | Not explicitly stated, implied by "met predefined acceptance criteria" | Not explicitly stated, implied by met criteria |
Measurement Accuracy | Bland-Altman analysis: Mean bias and 95% Limits of Agreement for all assessed parameters | Within predefined acceptance criteria | Within predefined acceptance criteria |
Measurement Consistency (Ground Truth) | Intraclass Correlation Coefficient (ICC) for clinical experts' manual measurements | ICC > 0.75 | Above 0.75 for all measurements |
Cybersecurity | Identify medium or high-risk vulnerabilities | No medium or high-risk vulnerabilities identified | No medium or high-risk vulnerabilities identified |
Cybersecurity | Overall security posture | Strong overall security posture with no critical issues | Strong overall security posture with no critical issues |
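For readers unfamiliar with the Dice score reported in the table, the following is a minimal sketch of how it is conventionally computed for two binary segmentation masks. It is not taken from the 510(k) document: the function name `dice_score` and the example masks are hypothetical, and the document does not describe how LARALAB's evaluation was implemented.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as the structure in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total labelled voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks (1 = voxel belongs to the structure).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 0, 1, 1]
example_dice = dice_score(predicted, reference)
```

A score of 1.0 means the two masks overlap completely and 0.0 means they do not overlap at all, which is why the reported range of 0.89 to 0.98 indicates close agreement with the reference segmentations.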
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size: 60 patient datasets
- Data Provenance: Multi-centric observational cohort study. The document does not state the country of origin, but it describes data diversity across different CT manufacturers and imaging parameters (slice thickness, contrast enhancement). It also states that "No datasets were included that were used for training the deep learning models," which establishes that the test set was independent of the training data. Whether the data were collected retrospectively or prospectively is not explicitly stated; a retrospective design is plausible but cannot be inferred from that statement alone.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Number of Experts: Not stated as a specific number. "Clinical experts" are described as generating the ground truth using the predicate device, and the reported ICC values (above 0.75) imply that more than one expert contributed measurements and that their agreement was good.
- Qualifications: "Expert clinicians" (implied to be cardiologists, radiologists, heart surgeons, or other healthcare professionals as per the device's intended users and the "Comparison" section referencing these specialists). No specific years of experience are provided, but their status as "experts" and their use of the predicate device for ground truth generation supports their qualification.
4. Adjudication Method for the Test Set
The document does not state a specific adjudication method such as 2+1 or 3+1. The Intraclass Correlation Coefficient (ICC) was calculated to assess the consistency between the clinical experts' manual measurements, which implies that multiple experts measured independently and that their agreement was quantified. This suggests the ICC served to confirm consistency rather than as part of a formal process for resolving disagreements.
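As an illustration of how agreement between experts can be quantified, here is a minimal sketch of one common ICC variant, ICC(2,1) (two-way random effects, absolute agreement, single rater). The document does not say which ICC form was used, so this choice is an assumption, and the function name `icc_2_1` and the example measurements are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per case and one column per rater.
    The ICC form is an assumption; the source does not specify it.
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 cases and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical diameters in mm: five cases, each measured by three experts.
example_ratings = [
    [23.1, 23.4, 23.0],
    [26.0, 25.7, 26.2],
    [21.5, 21.9, 21.6],
    [28.3, 28.0, 28.4],
    [24.8, 25.1, 24.7],
]
example_icc = icc_2_1(example_ratings)
meets_threshold = example_icc > 0.75  # threshold quoted in the table above
```

The sketch divides by zero if every rating is identical, so a production implementation would need to handle that degenerate case explicitly.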
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
- No, not in the formal sense. The document does not describe an MRMC comparative effectiveness study of human readers with versus without AI assistance. Instead, expert clinicians generated ground truth using the predicate device, and LARALAB's automatic outputs were compared against that baseline.
- Effect Size of Human Reader Improvement with AI vs. without AI Assistance: Not provided. The study compares LARALAB's automatic segmentations and measurements to manual ground-truth measurements obtained by clinicians using the predicate device. The document states, "The study concluded that LARALAB's automatic pre-calculated segmentations and measurements are as accurate and reliable as those obtained using the predicate device." This implies that the automated measurements are on par with manual measurements by human experts and may reduce the manual measurement burden. No numerical effect size for reader improvement is reported, because the study validated the AI's standalone performance against human-derived ground truth.
6. If a Standalone Performance (Algorithm Only Without Human-in-the-Loop Performance) Was Done
- Yes. The study directly evaluates the "automatic pre-calculated segmentations and measurements" generated by LARALAB's "deterministic Deep Learning Algorithms." These automatic outputs are then compared against the ground truth. This is a standalone performance evaluation of the algorithm. The device then allows the user to "further visualize, assess and measure" and "review/adjust/approve" the pre-calculated outputs, indicating that the algorithm's initial output is standalone.
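The comparison of automatic outputs against ground truth is summarized in the table as a Bland-Altman analysis (mean bias and 95% limits of agreement). The sketch below shows the standard calculation, assuming paired measurements and limits defined as bias ± 1.96 times the sample standard deviation of the differences. The function name `bland_altman` and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(automatic, manual):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(automatic) != len(manual) or len(automatic) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-case difference: automatic minus manual reference.
    diffs = [a - m for a, m in zip(automatic, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    # 95% limits of agreement under an approximate normality assumption.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired measurements in mm (not from the 510(k) document).
automatic_mm = [10.0, 12.0, 14.0, 16.0]
manual_mm = [9.0, 12.0, 15.0, 15.0]
bias, lower, upper = bland_altman(automatic_mm, manual_mm)
```

An acceptance check would then compare `bias`, `lower`, and `upper` with predefined thresholds; the document does not disclose those thresholds, so none are encoded here.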
7. The Type of Ground Truth Used
- Expert Consensus/Manual Measurements using a Predicate Device. The ground truth was established by "expert clinicians with the predicate device." Specifically, manual measurements generated by these experts using the predicate device served as the reference. The ICC was used to confirm the consistency of these expert measurements.
8. The Sample Size for the Training Set
- Not provided. The document states only that "No datasets were included that were used for training the deep learning models," which refers to the composition of the test set. The sample size of the training set itself is not given.
9. How the Ground Truth for the Training Set Was Established
- The document does not explicitly describe how the ground truth for the training set was established. It only mentions that the deep learning algorithms were used to generate "pre-calculated automatic segmentations and measurements." Without further information, one would infer similar methods (e.g., expert annotation) were likely used, but this is not confirmed in the text.
§ 892.2050 Medical image management and processing system.
(a)
Identification. A medical image management and processing system is a device that provides one or more capabilities relating to the review and digital processing of medical images for the purposes of interpretation by a trained practitioner of disease detection, diagnosis, or patient management. The software components may provide advanced or complex image processing functions for image manipulation, enhancement, or quantification that are intended for use in the interpretation and analysis of medical images. Advanced image manipulation functions may include image segmentation, multimodality image registration, or 3D visualization. Complex quantitative functions may include semi-automated measurements or time-series measurements.
(b)
Classification. Class II (special controls; voluntary standards—Digital Imaging and Communications in Medicine (DICOM) Std., Joint Photographic Experts Group (JPEG) Std., Society of Motion Picture and Television Engineers (SMPTE) Test Pattern).