Search Results
Found 2 results
510(k) Data Aggregation
(117 days)
EchoConfidence is Software as a Medical Device (SaMD) that displays images from a Transthoracic Echocardiogram, and assists the user in reviewing the images, making measurements and writing a report.
The intended medical indication is for patients requiring review or analysis of their echocardiographic images acquired for assessment of their cardiac anatomy, structure and function. This includes automatic view classification; segmentation of cardiac structures including the left and right ventricle, chamber walls, left and right atria and great vessels; measures of cardiac function; and Doppler assessments.
The intended patient population is both healthy individuals and patients in whom an underlying cardiac disease is known or suspected; the intended age range covers adults (>= 22 years old) and adolescents aged 18–21 years.
Here's an analysis of the provided FDA 510(k) clearance letter for EchoConfidence (USA), incorporating all the requested information:
Acceptance Criteria and Device Performance Study for EchoConfidence (USA)
The EchoConfidence (USA) device, a Software as a Medical Device (SaMD) for reviewing, measuring, and reporting on Transthoracic Echocardiogram images, underwent a clinical evaluation to demonstrate its performance against predefined acceptance criteria.
1. Acceptance Criteria and Reported Device Performance
The primary acceptance criteria for EchoConfidence were based on the "mean absolute error" (MAE) of the AI's measurements compared to three human experts. The reported performance details indicate that the device met these criteria.
| Acceptance Criteria Category | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Primary Criteria (AI vs. Human Expert MAE) | The upper bound of the 95% confidence interval of the difference between the MAE of the AI (against the 3 human experts) and the MAE of the 3 human experts (against each other) must be less than +25%. | In the majority of cases, the point estimate of the difference between the AI MAE and the inter-expert MAE was substantially below 0%, indicating the AI agreed with the human experts more closely than they agreed with each other. The upper bound of the 95% confidence interval was consistently <0%, well below the +25% criterion. |
| Subgroup Analysis (Consistency) | The performance criteria should be met across demographic and technical subgroups to ensure robust, generalizable performance. | Across 20 subgroups (age, gender, ethnicity, cardiac pathology, ultrasound equipment vendor/model, year of scan, and qualitative image quality), the finding was consistent: the point estimate showed the AI agreed with the human experts better than the experts agreed with each other, and the upper bound of the 95% confidence interval was <0%, well below the +25% criterion. |
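For readers who want to see how a criterion like this can be operationalized, the sketch below (Python, toy data only) illustrates one plausible way to compute the AI-vs-expert MAE, the inter-expert MAE, and a bootstrap upper confidence bound on their relative difference. The pairing scheme, the percentage-difference definition, and the percentile bootstrap are assumptions made for illustration; the 510(k) summary does not disclose the exact statistical implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mae(a, b):
    """Mean absolute error between two measurement vectors."""
    return np.mean(np.abs(a - b))

def relative_mae_difference(ai, experts):
    """Percentage difference between the AI-vs-expert MAE and the inter-expert MAE.

    ai      : (n_cases,) AI measurements for one echo parameter
    experts : (n_experts, n_cases) the experts' measurements for the same parameter
    """
    ai_mae = np.mean([mae(ai, e) for e in experts])
    pairs = [(i, j) for i in range(len(experts)) for j in range(i + 1, len(experts))]
    inter_expert_mae = np.mean([mae(experts[i], experts[j]) for i, j in pairs])
    return 100.0 * (ai_mae - inter_expert_mae) / inter_expert_mae

def upper_ci_bound(ai, experts, n_boot=2000):
    """Upper bound of a 95% percentile-bootstrap CI, resampling cases with replacement."""
    n = len(ai)
    stats = [relative_mae_difference(ai[idx], experts[:, idx])
             for idx in (rng.integers(0, n, size=n) for _ in range(n_boot))]
    return np.percentile(stats, 97.5)

# Toy stand-in for 200 cases of a single measurement (e.g., an EF-like value)
truth = rng.normal(55, 8, size=200)
experts = truth + rng.normal(0, 3, size=(3, 200))   # three independent expert reads
ai = truth + rng.normal(0, 2, size=200)             # AI read

print(f"point estimate: {relative_mae_difference(ai, experts):+.1f}%")
print(f"upper 95% CI bound: {upper_ci_bound(ai, experts):+.1f}%  (criterion: < +25%)")
```

Under this reading, the criterion is met when the upper bound stays below +25%, and a bound below 0% means the AI agrees with the experts more closely than the experts agree with each other; the same computation would simply be repeated within each of the 20 subgroups for the consistency analysis.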
2. Sample Size and Data Provenance
- Test Set Sample Size: 200 echocardiographic cases from 200 different patients.
- Data Provenance: All cases were delivered via a US Echocardiography CoreLab. The data used for validation was derived from non-public, US-based sources and was kept on servers controlled by the CoreLab, specifically to prevent it from entering the training dataset. The study was retrospective.
3. Number and Qualifications of Experts for Ground Truth
- Number of Experts: Three (3) human experts.
- Qualifications of Experts: The experts were US accredited and US-based, employed by the US CoreLab that supplied the data. While specific years of experience are not mentioned, their accreditation and employment by a CoreLab imply significant expertise in echocardiography and clinical measurements.
4. Adjudication Method for the Test Set
The ground truth was established by having each of the three human experts independently perform the measurements for each echocardiogram, as if for clinical use. A physician then reviewed and, if needed, adjusted approximately 10% of the measurements. This could be interpreted as a form of 3-expert reading with a final physician review/adjudication for a subset of cases. The primary analysis method, however, preserved the individual measurements of each expert rather than averaging them: the AI's MAE against each expert's measurements was compared with the MAE between the experts themselves.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The provided text does not explicitly describe an MRMC comparative effectiveness study in which human readers' performance with AI assistance is compared to their performance without AI assistance to measure improvement (effect size). Instead, the study compares the AI's performance to the human experts directly and to inter-expert variability. The device is described as assisting the user in reviewing images, making measurements, and writing reports, suggesting a human-in-the-loop application, but a specific MRMC study measuring reader improvement with AI assistance is not detailed.
6. Standalone (Algorithm Only) Performance
Yes, a standalone performance study was done. The primary acceptance criteria directly evaluate the "mean absolute error" (MAE) of the AI against the 3 human expert reads. This directly assesses the algorithm's performance in generating measurements without human intervention during the measurement process, assuming the output measurements are directly from the AI. The comparison with inter-expert variability helps contextualize this standalone AI performance.
7. Type of Ground Truth Used
The ground truth used was expert consensus / expert measurements. The process involved three human experts independently performing measurements, with a physician reviewing and potentially adjusting ~10% of these measurements. This establishes a "clinical expert gold standard" based on their interpretation and measurement.
8. Sample Size for the Training Set
The sample size for the training set is not explicitly stated in the provided document. It only mentions that the dataset used for development and internal testing was derived from a separate source and was not from the US-based CoreLab that provided the validation data.
9. How Ground Truth for the Training Set Was Established
The method for establishing ground truth for the training set is not explicitly described in the provided document. It only states that the development dataset was separate from the validation dataset and that within the development dataset, source patients were specifically tagged as being used for either training or internal testing.
(123 days)
1CMR Pro is software that displays, analyses and transfers DICOM cardiovascular images acquired in Cardiovascular Magnetic Resonance (CMR) scanners, specifically structure, function and flow in the heart and major vessels using multi-slice, multi-parametric and velocity encoded CMR images. It is compatible with 1.5T and 3T CMR acquisitions.
The intended patient population is both healthy individuals and patients in whom an underlying cardiac disease is known or suspected. The standard viewing tools are indicated for all patients. The AI analysis components are not intended for use in patients with a known congenital cardiac abnormality, children (age < 18), or individuals with pacemakers (even if MRI compatible).
Here's a breakdown of the acceptance criteria and the study details for the 1CMR Pro device, based on the provided FDA 510(k) summary:
Acceptance Criteria and Reported Device Performance
The document doesn't explicitly state 'acceptance criteria' in a formal table with pass/fail thresholds. Instead, it describes performance benchmarks which the device met or exceeded. The performance for 1CMR Pro was assessed against two primary benchmarks: human 'truthers' and other FDA-cleared software.
| Metric / Assessment Type | Acceptance Criteria (Implied) | Reported Device Performance (1CMR Pro) |
|---|---|---|
| DICE Scores (LV short axis contours) | Expected to be comparable to or superior to human truthers. | Overall DICE scores averaged 0.90, superior to the truther average of 0.89. |
| Accuracy (14 variables: volumes, function, LV mass) | Expected to pass all assessments and demonstrate accuracy comparable to or exceeding human truthers. | Passed all assessments. Accuracy exceeded that of the truthers. |
| Precision (LV variables - LVEF, LVmass, LVEDV, LVESV) | Expected to be superior to human clinicians and prior FDA cleared software (lower Coefficient of Variation). | Superior to humans for all measurements. |
| LVEF Coefficient of Variation (CoV) | Lower CoV than Clinician CoV. | 4.3±0.3% vs 7.0±0.6% (Clinician), p<0.001. |
| LVmass Coefficient of Variation (CoV) | Lower CoV than Clinician CoV. | 3.8±0.3% vs 4.6±0.3% (Clinician), p<0.001. |
| LVEDV Coefficient of Variation (CoV) | Lower CoV than Clinician CoV. | 4.9±0.4% vs 6.2±0.5% (Clinician), p<0.001. |
| LVESV Coefficient of Variation (CoV) | Lower CoV than Clinician CoV. | 5.4% (4.3-6.4%) vs 11.4% (6.5-15.6%) (Clinician), p=0.008. |
| Comparison to other cleared software (Circle CVI v 5.13) (LVEF, LV mass, LVEDV CoV) | Expected to exceed performance of other cleared software. | Exceeded performance in every measured variable: LVEF 4.2% (95% CI: 3.5-5.0%) vs 10.4% (95% CI: 6.8-14.0%); LV mass 4.2% (95% CI: 3.5-5.0%) vs 10.4% (95% CI: 6.8-14.0%); LVEDV 5.4% (95% CI: 4.3-6.4%) vs 11.4% (95% CI: 6.5-15.6%). |
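To make the two headline metrics in this table concrete, the sketch below (Python, toy data only) shows a standard Dice similarity coefficient for binary segmentation masks and one common way to compute a test-retest coefficient of variation (CoV) from paired repeat scans. The per-subject CoV definition, the array representations, and the toy values are assumptions for illustration; the submission does not describe its exact formulas.

```python
import numpy as np

def dice_score(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks: 2*|A ∩ B| / (|A| + |B|)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

def test_retest_cov(scan1: np.ndarray, scan2: np.ndarray) -> float:
    """Test-retest CoV (%) for one variable, from paired measurements of the same
    subjects: per-subject SD of the two values over the per-subject mean, averaged."""
    paired = np.stack([scan1, scan2])
    return 100.0 * np.mean(np.std(paired, axis=0, ddof=1) / np.mean(paired, axis=0))

# Toy example for the Dice metric: AI contour vs. one truther's contour on a small grid
ai_mask = np.zeros((8, 8), dtype=bool)
ai_mask[2:6, 2:6] = True
truther_mask = np.zeros((8, 8), dtype=bool)
truther_mask[2:6, 3:7] = True
print(f"Dice = {dice_score(ai_mask, truther_mask):.2f}")      # 0.75 for this toy overlap

# Toy example for CoV: subjects scanned twice, LVEF-like values
rng = np.random.default_rng(0)
lvef_scan1 = rng.normal(58, 7, size=110)
lvef_scan2 = lvef_scan1 + rng.normal(0, 2.5, size=110)        # repeat-scan variability
print(f"LVEF CoV = {test_retest_cov(lvef_scan1, lvef_scan2):.1f}%")
```

With definitions along these lines, a lower CoV for the same variable means the measurement is more reproducible across repeat scans, which is how the table's AI-vs-clinician and AI-vs-other-software comparisons should be read.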
Study Details
- Sample size used for the test set and the data provenance:
  - Accuracy Test Set: 64 adults. Data provenance is not explicitly stated beyond "images collected on either 1.5T or 3T scanners (range of Siemens, Philips, and GE)." It is implied to be retrospective, as part of a validation dataset.
  - Precision Test Set: 110 adults, scanned twice. Data provenance similar to the accuracy test set; implied retrospective.
  - Comparison to other cleared software Test Set: No specific sample size is provided for this comparison, but it likely used the same data as, or a subset of, the precision test set, as the variables measured are similar.
- Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
  - Number of Experts: 3 independent US-based 'truthers'.
  - Qualifications: All had ">5 years experience." (The specific medical specialty, e.g. radiologist, cardiologist, or CMR specialist, is not stated, but is implied to be relevant to cardiac imaging.)
- Adjudication method for the test set: Not explicitly stated. The document mentions "3 independent US based truthers," indicating multiple expert readings, but does not specify how discrepancies were resolved or whether a consensus method (e.g., 2+1, 3+1) was used to establish the final ground truth. It appears that each truther's reading contributed to a truther DICE score average and a clinician coefficient of variation, rather than to a single adjudicated ground truth for direct comparison.
- If a multi-reader multi-case (MRMC) comparative effectiveness study was done: Yes, a form of MRMC study was implicitly conducted for the precision and DICE score comparisons.
  - Effect size of how much human readers improve with AI vs without AI assistance: The study focused on the AI's standalone performance compared to human readers, rather than on human readers with AI assistance. The results show 1CMR Pro's performance (DICE scores, accuracy, precision/CoV) was superior to that of the human truthers/clinicians, suggesting that if AI assistance yields performance similar to 1CMR Pro standalone, it would represent an improvement over unassisted human performance. For example, for LVEF CoV, the AI achieved 4.3% while clinicians achieved 7.0%.
- If a standalone (i.e. algorithm only, without human-in-the-loop) performance study was done: Yes, the entire validation described for accuracy and precision was for the 1CMR Pro algorithm in a standalone capacity ("no human editing" for the comparison with other cleared software). The "AI versus Clinician" comparison also reflects standalone AI performance against human performance.
- The type of ground truth used (expert consensus, pathology, outcomes data, etc.): The ground truth was established by "3 independent US based truthers" with >5 years experience. This indicates an expert consensus or expert-generated truth, specifically based on their interpretation and measurements of the CMR images. It is not based on pathology or clinical outcomes data.
- The sample size for the training set: Not provided in the document. The document only discusses "independent validation datasets."
- How the ground truth for the training set was established: Not provided in the document, as the training set details are omitted.