Search Results
510(k) Data Aggregation
(199 days)
Re: K242807
Trade/Device Name: HeartFocus (V.1.1.1)
Regulation Number: 21 CFR 892.2100
Classification Name: Image Acquisition And/Or Optimization Guided By Artificial Intelligence
The HeartFocus software is intended to assist and guide medical professionals in the acquisition of cardiac ultrasound images. HeartFocus software is an accessory to compatible general-purpose diagnostic ultrasound systems. HeartFocus guides the acquisition of two-dimensional transthoracic echocardiography (2D-TTE), specifically the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (A4C), Apical 5-Chamber (A5C), Apical 2-Chamber (A2C), Apical 3-Chamber (A3C), Subcostal 4-Chamber (SC-4C), and Subcostal Inferior Vena Cava (SC-IVC).
HeartFocus software is indicated for use in adult patients who require a cardiac ultrasound exam.
The HeartFocus software is a radiological computer-assisted acquisition guidance system that provides real-time user guidance during echocardiography to assist the user in acquiring anatomically standard diagnostic-quality 2D echocardiographic views. HeartFocus software is an accessory to compatible general-purpose diagnostic ultrasound systems. HeartFocus is intended to be used by medical professionals who have received appropriate training on ultrasound basics and training on using the HeartFocus software, provided by either DESKi or by a trained medical professional while using approved training materials.
It supports the acquisitions of 10 echocardiographic views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (A4C), Apical 5-Chamber (A5C), Apical 2-Chamber (A2C), Apical 3-Chamber (A3C), Subcostal 4-Chamber (SC-4C), and Subcostal Inferior Vena Cava (SC-IVC).
To allow the acquisition of these views, HeartFocus can connect to an ultrasound system, allowing it to receive a stream of ultrasound images.
The standard views acquired with HeartFocus may be assessed by qualified medical professionals to support their decision-making regarding patient care. The collected exams can be transferred, notably using DICOM protocols.
HeartFocus is an application that operates entirely offline, without requiring a cloud server to provide its functionality. All collected medical data is stored locally on the tablet and is never transferred to a server controlled by DESKi.
It provides four major features to assist healthcare professionals in the acquisition of cardiac ultrasound images: Live guidance, Diagnostic-quality view detection, Auto record, and Best-effort record.
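For context on the DICOM transfer mentioned above, here is a minimal sketch of how a locally stored exam clip might be pushed to a PACS with a C-STORE request using pydicom/pynetdicom. The host, port, AE titles, file path, and SOP class choice are illustrative assumptions, not details from the clearance letter.

```python
# Hypothetical sketch of exporting a recorded exam clip via DICOM C-STORE.
# Host, port, AE titles, and file path are placeholders, not values from the filing.
from pydicom import dcmread
from pynetdicom import AE
from pynetdicom.sop_class import UltrasoundImageStorage

ae = AE(ae_title="HEARTFOCUS")
ae.add_requested_context(UltrasoundImageStorage)

ds = dcmread("exam/plax_clip.dcm")  # assumed: a locally stored Ultrasound Image Storage instance
assoc = ae.associate("pacs.example.org", 11112, ae_title="PACS")
if assoc.is_established:
    status = assoc.send_c_store(ds)              # push the clip to the archive
    print(f"C-STORE status: 0x{status.Status:04X}")
    assoc.release()
```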
Here's an analysis of the acceptance criteria and the study proving the device meets them, based on the provided FDA clearance letter:
HeartFocus (V.1.1.1) Acceptance Criteria and Performance Study
1. Acceptance Criteria and Reported Device Performance
The acceptance criteria are primarily defined by the "Algorithms Performance Testing" section and the "Primary objectives, endpoints, and success criteria" tables (Table 2).
Table of Acceptance Criteria and Reported Device Performance:
Feature | AI/ML Algorithm | Objective | Endpoint | Acceptance Criteria | Reported Device Performance (Lower bound of 95% CI) | Reported Device Performance (Point Estimate - where available) | Meets Criteria? |
---|---|---|---|---|---|---|---|
Diagnostic-quality view detection | View Classification | Ability to classify ultrasound images with similar accuracy as experts | Cohen's kappa score between the model's predictions and the ground truth labels made by experts (by frame) | > 0.6 (substantial agreement) | 0.699 to 0.873 | Not explicitly stated for point estimate, but all lower bounds are > 0.6 | Yes |
Live guidance | Guidance | Ability to provide successful guidance cues on ultrasound frames | Positive predictive value of successful guidance cues (by frame) | > 0.8 | 0.810 to 0.953 | Not explicitly stated for point estimate, but all lower bounds are > 0.8 | Yes |
Auto record | View Classification + Recording | Ability to save high-quality records according to experts | Positive predictive value of high-quality records among auto-records (by clip) | > 0.6 AND point estimate > 0.8 | 0.846 to 1.000 | 0.997 (Auto record only) | Yes |
(Note: The document also states a PPV range of 0.816 to 1.000 for Auto record and Best-Effort record combined, with no explicit point estimate for this combined metric but satisfying the lower bound criteria.)
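For illustration, here is a minimal sketch of the style of acceptance check described above: compute the metric, then require the lower bound of its 95% confidence interval to clear the preset threshold. The labels are synthetic, and the filing does not state which CI method was used; a bootstrap percentile bound (for kappa) and a Wilson score bound (for PPV) are shown as plausible choices.

```python
# Sketch of an acceptance check of the form "lower bound of 95% CI > threshold".
# All data below is synthetic; the filing does not publish raw annotations.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
y_expert = rng.integers(0, 11, size=5000)               # per-frame view labels from experts
y_model = np.where(rng.random(5000) < 0.85, y_expert,   # model mostly agrees with experts
                   rng.integers(0, 11, size=5000))

# Bootstrap percentile lower bound for Cohen's kappa (frame-level).
kappas = []
for _ in range(2000):
    idx = rng.integers(0, len(y_expert), size=len(y_expert))
    kappas.append(cohen_kappa_score(y_expert[idx], y_model[idx]))
kappa_lb = np.percentile(kappas, 2.5)
print(f"kappa 95% CI lower bound: {kappa_lb:.3f}  (criterion: > 0.6)")

# Wilson score lower bound for a binomial proportion such as PPV.
def wilson_lower(successes, n, z=1.96):
    p = successes / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom

print(f"PPV lower bound: {wilson_lower(930, 1000):.3f}  (criterion: > 0.8)")
```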
Studies Proving Acceptance Criteria are Met:
The device's performance against these criteria was primarily demonstrated through "Algorithms Performance Testing" and a "Clinical Study."
2. Sample Size for Test Set and Data Provenance
- Algorithms Performance Testing (Test Set):
- Total Patients: 290 patients
- Total Ultrasound Images: 361,104 ultrasound images
- Additional retrospective evaluation data: 240 patients (120 US + 120 non-US) from the clinical trial.
- Specific breakdown for "Diagnostic-quality view detection": 30,361 images from 14 patients.
- Specific breakdown for "Live guidance": 270,582 images from 20 patients.
- Specific breakdown for "Auto record and Best-effort record": 211 long-duration clips from 34 patients.
- Data Provenance: Not explicitly stated for the "Algorithms Performance Testing" subset, but it is mentioned that "Both training/tuning data and test data were collected on patients of varying body mass index (BMI), age, and sex."
- Clinical Study (Test Set for Clinical Utility):
- Total Patients: 240 adults (120 patients at Site 1, France; 120 patients at Site 2, USA).
- Data Provenance: Prospective multicentric clinical study. Site 1 in France, Site 2 in the USA.
3. Number of Experts and Qualifications for Ground Truth
- Algorithms Performance Testing (Diagnostic-quality view detection):
- Number of Experts: At least 2 (an expert annotator and an expert reviewer), with a third expert for disagreement resolution.
- Qualifications: "Experts (cardiologists and/or experienced sonographers)." Specific experience level (e.g., 10 years) is not provided.
- Clinical Study (Image Quality Assessment and Clinical Decision Making):
- Number of Experts: A panel of five (5) expert cardiologist readers.
- Qualifications: "Expert cardiologist readers." Specific experience level is not provided.
4. Adjudication Method for the Test Set
- Algorithms Performance Testing (Diagnostic-quality view detection):
- Method: First annotation by an expert annotator, reviewed by a second expert reviewer. Disagreements were resolved either through direct reconciliation by the two experts or by a third expert. This is a form of 2+1 adjudication (or 2-expert consensus with 3rd for tie-breaking).
- Clinical Study (Image Quality Assessment and Clinical Decision Making):
- Method: Five (5) expert cardiologist readers independently provided assessments. The results from this panel were then used for statistical analysis. It doesn't explicitly state a consensus-based adjudication method for individual cases among the 5 readers, but rather that their independent assessments were analyzed.
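As a sketch, the 2+1 flow described above can be expressed as: a second expert reviews the first annotation, and only disagreements escalate to a third expert. The direct-reconciliation path mentioned in the document is omitted for brevity, and the function names are hypothetical.

```python
# Minimal sketch of 2+1 adjudication: reviewer confirms the first annotation,
# and a third expert only sees disagreements. Experts here are stand-in callables.
from typing import Callable

def adjudicate(frame, annotator: Callable, reviewer: Callable, tiebreaker: Callable):
    first = annotator(frame)
    second = reviewer(frame)
    if first == second:
        return first               # reviewer confirms the annotation
    return tiebreaker(frame)       # disagreement escalates to a third expert

label = adjudicate("frame_0042.png",
                   annotator=lambda f: "PLAX",
                   reviewer=lambda f: "PSAX-AV",
                   tiebreaker=lambda f: "PLAX")
print(label)  # -> "PLAX"
```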
5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? Yes, a clinical study was conducted comparing the performance of Registered Nurses (novice users with AI assistance) to trained medical professionals (sonographers or cardiologists without AI assistance).
- Effect Size (Improvement with AI vs. without AI assistance): The study assessed the ability of RNs using HeartFocus (AI assistance) to acquire echocardiographic exams and compared the diagnostic informativeness of those exams to exams acquired by trained medical professionals without cardiac guidance (i.e., without AI assistance). The primary endpoints showed extremely high performance for the RNs with HeartFocus:
- Qualitative Visual Assessment of LV Size: 100% diagnostic quality
- Qualitative Visual Assessment of LV Function: 100% diagnostic quality
- Qualitative Visual Assessment of RV Size: 100% diagnostic quality
- Qualitative Visual Assessment of Non-Trivial Pericardial Effusion: 100% diagnostic quality
Additionally, the "proportion of scans in which the diagnostic decision was the same between study (RN with HeartFocus) and control (trained medical professional without AI) exams was very high."
- For primary clinical parameters: 87.5% to 98.3% agreement.
- For secondary clinical parameters: 87.1% to 99.6% agreement.
While an explicit "effect size of how much human readers improve with AI vs. without AI assistance" is not quantified in the typical MRMC statistical metrics (e.g., AUC difference), the clinical utility demonstrated is that novice users (RNs) employing HeartFocus can acquire diagnostic-quality cardiac ultrasound exams that lead to highly concordant clinical assessments with those performed by trained medical professionals without AI assistance. This implies a significant enablement effect for less-trained users.
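For clarity, the "proportion of scans in which the diagnostic decision was the same" is simple percent concordance per clinical parameter. A toy example with synthetic decisions for eight scans reproduces the 87.5% figure at the low end of the quoted range.

```python
# Percent concordance between study (RN with HeartFocus) and control exams.
# Decision lists are synthetic placeholders for one clinical parameter.
study =   ["normal", "abnormal", "normal", "normal",   "abnormal", "normal", "normal", "normal"]
control = ["normal", "abnormal", "normal", "abnormal", "abnormal", "normal", "normal", "normal"]

agree = sum(s == c for s, c in zip(study, control)) / len(study)
print(f"agreement: {agree:.1%}")   # -> 87.5%
```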
6. Standalone (Algorithm Only) Performance
- Was a standalone performance study done? Yes, the "Algorithms Performance Testing" section describes the performance of the AI/ML algorithms independently. This includes:
- Diagnostic-quality view detection: Cohen's kappa score between model predictions and expert ground truth.
- Live guidance: Positive predictive value of successful guidance cues from the model.
- Auto record: Positive predictive value of high-quality records generated by the recording algorithms.
These demonstrate the algorithm's performance separate from human interaction, although the features are designed to assist humans.
7. Type of Ground Truth Used
- Algorithms Performance Testing:
- Diagnostic-quality view detection: Expert consensus (between 2-3 cardiologists/sonographers).
- Live guidance: Based on achieving a position closer to the target position (where diagnostic-quality frames are captured), which implicitly relies on the definition of "diagnostic-quality" also established by experts.
- Auto record: "High-quality records according to experts."
- Clinical Study:
- Image Quality and Clinical Assessments: Expert opinion/consensus from a panel of five (5) expert cardiologist readers. They assessed "sufficient information to assess 12 clinical parameters" and "diagnostic image quality per clip... using the ACEP scale," and made qualitative/quantitative assessments of various cardiac parameters. This is primarily expert consensus.
8. Sample Size for the Training Set
- Total Patients for Training and Tuning: 1,483 patients.
- Total Ultrasound Images for Training and Tuning: 1,204,113 ultrasound images.
9. How Ground Truth for Training Set Was Established
The document states that the AI/ML algorithms "were trained and tuned on 1,483 patients and 1,204,113 ultrasound images." It does not explicitly detail how ground truth was established for the training set, but the method is strongly implied to mirror that of the test set: expert annotation with review/consensus by cardiologists and/or experienced sonographers for classifying views, identifying diagnostic quality, and defining target probe positions.
(110 days)
Re: K243065
Trade/Device Name: Cardiac Guidance
Regulation Number: 21 CFR 892.2100
Classification Name: Image Acquisition And/Or Optimization Guided By Artificial Intelligence
Regulatory Class: II
Product Code: QJU
The Cardiac Guidance software is intended to assist medical professionals in the acquisition of cardiac ultrasound images. Cardiac Guidance software is an accessory to compatible general purpose diagnostic ultrasound systems.
The Cardiac Guidance software is indicated for use in two-dimensional transthoracic echocardiography (2D-TTE) for adult patients, specifically in the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (AP4), Apical 5-Chamber (AP5), Apical 2-Chamber (AP2), Apical 3-Chamber (AP3), Subcostal 4-Chamber (SubC4), and Subcostal Inferior Vena Cava (SC-IVC).
The Cardiac Guidance software is a radiological computer-assisted acquisition guidance system that provides real-time guidance during echocardiography to assist the user in capturing anatomically correct images representing standard 2D echocardiographic diagnostic views and orientations. This AI-powered, software-only device emulates the expertise of skilled sonographers.
Cardiac Guidance is comprised of several different features that, combined, provide expert guidance to the user. These include:
- Quality Meter: The real-time feedback from the Quality Meter advises the user on the expected diagnostic quality of the resulting clip, such that the user can make decisions to further optimize the quality, for example by following the prescriptive guidance feature below.
- Prescriptive Guidance: The prescriptive guidance feature in Cardiac Guidance provides direction to the user to emulate how a sonographer would manipulate the transducer to acquire the optimal view.
- Auto-Capture: The Cardiac Guidance Auto-Capture feature triggers an automatic capture of a clip when the quality is predicted to be diagnostic, emulating the way in which a sonographer knows when an image is of sufficient quality to be diagnostic and records it.
- Save Best Clip: This feature continually assesses clip quality while the user is scanning and, in the event that the user is not able to obtain a clip sufficient for Auto-Capture, the software allows the user to retrospectively record the highest quality clip obtained so far, mimicking the choice a sonographer might make when recording an exam.
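To make the Save Best Clip / Auto-Capture interplay described in the list above concrete, here is a hypothetical sketch of the logic: score each rolling candidate clip, record immediately if the score crosses a diagnostic threshold, and otherwise remember the best clip seen so far for retrospective saving. The threshold, clip length, and scoring function are stand-ins, not Caption's actual algorithm.

```python
# Hypothetical sketch of Auto-Capture with a Save Best Clip fallback.
# Threshold, clip length, and scoring are assumptions, not the device's logic.
from collections import deque

DIAGNOSTIC_THRESHOLD = 0.80   # assumed quality cutoff for Auto-Capture
CLIP_LEN = 60                 # assumed frames per candidate clip

def scan_loop(frames, quality_of_clip):
    buffer = deque(maxlen=CLIP_LEN)
    best_clip, best_score = None, float("-inf")
    for frame in frames:
        buffer.append(frame)
        if len(buffer) < CLIP_LEN:
            continue
        clip = list(buffer)
        score = quality_of_clip(clip)
        if score >= DIAGNOSTIC_THRESHOLD:
            return clip, score, "auto-capture"      # quality predicted diagnostic: record now
        if score > best_score:                      # otherwise remember the best clip so far
            best_clip, best_score = clip, score
    return best_clip, best_score, "save-best-clip"  # retrospective fallback
```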
The provided document is a 510(k) summary for Cardiac Guidance software, which is a radiological computer-assisted acquisition guidance system. It discusses an updated Predetermined Change Control Plan (PCCP) and addresses how future modifications will be validated. However, it does not contain a detailed performance study with specific acceptance criteria and results from such a study for the current submission.
The document focuses on the plan for future modifications and ensuring substantial equivalence through predefined testing. While it mentions that "Safety and performance of the Cardiac Guidance software will be evaluated and verified in accordance with software specifications and applicable performance standards through software verification and validation testing outlined in the submission," and "The test methods specified in the PCCP establish substantial equivalence to the predicate device, and include sample size determination, analysis methods, and acceptance criteria," the specific details of a study proving the device meets acceptance criteria are not included in this document.
Therefore, the following information cannot be fully extracted based solely on the provided text:
- A table of acceptance criteria and reported device performance (for the current submission/PCCP update).
- Sample size used for the test set and data provenance.
- Number of experts and their qualifications for establishing ground truth for the test set.
- Adjudication method for the test set.
- Results of a multi-reader multi-case (MRMC) comparative effectiveness study, including effect size.
- Details of a standalone (algorithm only) performance study.
- The type of ground truth used.
- Sample size for the training set.
- How the ground truth for the training set was established.
However, the document does contain information about performance testing and acceptance criteria for future modifications under the PCCP.
Here's a summary of what can be extracted or inferred regarding performance and validation, specifically related to the plan for demonstrating that future modifications will meet acceptance criteria:
1. A table of Acceptance Criteria and the Reported Device Performance:
The document describes the types of testing and the intent to use acceptance criteria for future modifications. It does not provide a table of acceptance criteria and reported device performance for the current submission or previous clearances. It states:
"The test methods specified in the PCCP establish substantial equivalence to the predicate device, and include sample size determination, analysis methods, and acceptance criteria."
This indicates that acceptance criteria will be defined for future validation tests, but they are not listed here. The document focuses on the types of modifications and the high-level testing methods:
Modification Category | Testing Methods Summary |
---|---|
Retraining/optimization/modification of core algorithm(s) | Repeating verification tests and the system level validation test to ensure the pre-defined acceptance criteria are met. |
Real-time guidance for additional 2D TTE views | Repeating verification tests and two system level validation tests, including usability testing, to ensure the pre-defined acceptance criteria are met for the additional views. |
Optimization of the core algorithm(s) implementation (thresholds, averaging logic, transfer functions, frequency, refresh rate) | Repeating relevant verification test(s) and the system level validation test to ensure the pre-defined acceptance criteria are met. |
Addition of new types of prescriptive guidance (patient positioning, breathing guidance, combined probe movements, pressure, sliding/angling) and addition of existing guidance types to all views | Repeating relevant verification tests and two system level validation tests, including usability testing, to ensure the pre-defined acceptance criteria are met. |
Labeling compatibility with various screen sizes (including mobile) and UI/UX changes (e.g., audio, configurability of guidance) | Repeating relevant verification tests and the system level validation test, including usability testing, to ensure the pre-defined acceptance criteria are met. |
2. Sample size used for the test set and the data provenance:
The document states:
"To ensure validation test datasets are representative of the intended use population, each will meet minimum demographic requirements."
However, specific sample sizes and data provenance (e.g., country of origin, retrospective/prospective) for any performance study are not provided in this document. It only refers to "sample size determination" as being included in the test methods for the PCCP.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
This information is not provided in the document.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
This information is not provided in the document.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and the effect size of how much human readers improve with AI vs without AI assistance:
The document refers to a "Non-expert Validation" being added to the subject PCCP, which was "Not included" in the K201992 PCCP. It describes this as:
"Adds standalone test protocol to enable validation of modified device performance by the intended user groups, ensuring equivalency to the original device based on predefined clinical endpoints."
While this suggests a study involving users, it does not explicitly state it's an MRMC comparative effectiveness study comparing human readers with and without AI assistance, nor does it provide any effect size.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
The document's "Testing Methods" column frequently mentions "Repeating verification tests and the system level validation test to ensure the pre-defined acceptance criteria are met." This suggests that standalone algorithm performance testing (verification and system-level validation) is part of the plan for future modifications. However, specific details of such a study are not provided in this document.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
This information is not explicitly stated in the document. The "Non-expert Validation" mentions "predefined clinical endpoints," but the source of the ground truth for those endpoints is not detailed.
8. The sample size for the training set:
This information is not provided in the document.
9. How the ground truth for the training set was established:
This information is not provided in the document. The document mentions "Retraining/optimization/modification of core algorithm(s)" and that "The modification protocol incorporates impact assessment considerations and specifies requirements for data management, including data sources, collection, storage, and sequestration, as well as documentation and data segregation/re-use practices," implying a training set exists, but details on ground truth establishment are missing.
(119 days)
Classification Name: Image Acquisition And/Or Optimization Guided By Artificial Intelligence
Regulation Number: 21 CFR 892.2100
AI Platform 2.0 is intended for noninvasive processing of ultrasound images to detect, measure, and calculate relevant medical parameters of structures and function of patients with suspected disease. In addition, it can provide Quality Score feedback to assist healthcare professionals, trained and qualified to conduct echocardiography and lung ultrasound scans in the current standard of care, while acquiring ultrasound images. The device is intended to be used on images of adult patients.
Exo AI Platform 2.0 (AIP 2.0) is a software as a medical device (SaMD) that helps qualified users with image-based assessment of ultrasound examinations in adult patients. It is designed to simplify workflow by helping trained healthcare providers evaluate, quantify, and generate reports for ultrasound images. AIP 2.0 takes as input images in the Digital Imaging and Communications in Medicine (DICOM) format from a specific range of ultrasound scanners and allows users to detect, measure, and calculate relevant medical parameters of structures and function of patients with suspected disease. In addition, it provides frame and clip quality scores in real time for the Left Ventricle from the four-chamber apical and parasternal long axis views of the heart and for lung scans. The AI modules are also provided as software components to be integrated by another computer programmer into their legally marketed ultrasound imaging device; essentially, the algorithm and API modules are medical device accessories.
Key features of the software are:
- Lung AI: An AI-assisted tool for suggesting the presence of lung structures and artifacts on ultrasound images, namely A-lines. Additionally, a per-frame and per-clip quality score is generated for each lung scan.
- Cardiac AI: An AI-assisted tool for the quantification of Left Ventricular Ejection Fraction (LVEF), myocardium wall thickness (Interventricular Septum (IVSd), Posterior Wall (PWd)), and IVC diameter on cardiac ultrasound images. Additionally, a per-frame and per-clip quality score is generated for each Apical and PLAX cardiac scan.
The provided text describes the acceptance criteria and the study that proves the device, AI Platform 2.0 (AIP002), meets these criteria for specific functionalities. This device is a software as a medical device (SaMD) intended for processing ultrasound images for adult patients, including detecting, measuring, and calculating medical parameters, and providing quality score feedback during image acquisition.
Here's a breakdown of the requested information:
1. A table of acceptance criteria and the reported device performance
The document specifies performance metrics for two main functionalities tested: Left Ventricle Wall Thickness and Inferior Vena Cava (IVC) measurements, and Quality AI (for frames and clips). The acceptance criteria are implicitly high correlation with expert measurements, indicated by high Intraclass Correlation Coefficient (ICC) values.
Functionality/Measurement | Acceptance Criteria (Implicit) | Reported Device Performance (ICC with 95% CI) |
---|---|---|
LV Wall Thickness: Interventricular Septum (IVSd) | High correlation with experts | 0.93 (0.89 – 0.96) |
LV Wall Thickness: Posterior Wall (PWd) | High correlation with experts | 0.94 (0.89 – 0.97) |
IVC Dmin | High correlation with experts | 0.93 (0.90 – 0.95) |
IVC Dmax | High correlation with experts | 0.94 (0.90 – 0.96) |
Quality AI: overall agreement (frames) | High agreement with experts | 0.94 (0.94 – 0.95) |
Quality AI: overall agreement (clips) | High agreement with experts | 0.94 (0.92 – 0.95) |
Diagnostic Classification | >95% agreement with experts (ACEP score >= 3) | 98.3% of clips rated ACEP >= 3 by experts received at least "Minimum criteria met for diagnosis" from Clip Quality AI; 98.0% of scans considered "Minimal criteria met for diagnosis" or "good" by Quality AI were deemed diagnostic by experts (ACEP score of 3 or higher). |
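For reference, the ICC values above are agreement indices between the AI output and expert measurements. Below is a minimal ICC(2,1) implementation (two-way random effects, absolute agreement, single rater); the filing reports "ICC" without naming the variant, so this form, and the synthetic ratings, are assumptions.

```python
# Minimal ICC(2,1) sketch (Shrout & Fleiss two-way random effects, single rater).
# The ICC variant is an assumption; ratings below are synthetic.
import numpy as np

def icc_2_1(x):                      # x: (n_subjects, k_raters)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between-subject mean square
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between-rater mean square
    sse = ((x - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                        # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

rng = np.random.default_rng(1)
truth = rng.normal(10, 2, size=(100, 1))             # e.g., IVSd in mm, 100 subjects
ratings = truth + rng.normal(0, 0.5, size=(100, 2))  # column 0: AI; column 1: mean of experts
print(f"ICC(2,1): {icc_2_1(ratings):.2f}")
```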
2. Sample size used for the test set and the data provenance
- LV Wall Thickness and IVC measurements: 100 subjects.
- Quality AI (Section a): 184 patients, resulting in 226 clips (29,732 frames).
- Quality AI (Section b, real-time scanning): 396 lung and cardiac scans.
- Data Provenance: The test data encompassed diverse demographic variables (gender, age, ethnicity) from multiple sites in metropolitan cities with diverse racial patient populations. The text states the data was entirely separated from the training/tuning datasets. The studies were retrospective for the initial quality evaluation (comparing to previously acquired data rated by sonographers) and prospective for the real-time quality AI evaluation (data acquired while using the AI in real-time by users with varying experience).
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- LV Wall Thickness and IVC measurements: Ground truth was established as the average measurement of three experts. Their specific qualifications (e.g., years of experience, specialty) are not explicitly stated beyond "experts."
- Quality AI (Section a): Ground truth was established by "experienced sonographers." Their number and specific qualifications are not detailed beyond "experienced."
- Quality AI (Section b, real-time scanning): Ground truth for diagnostic classification was established by "expert readers" (ACEP score of 3 or above). Their number and specific qualifications are not detailed beyond "expert readers."
4. Adjudication method for the test set
- LV Wall Thickness and IVC measurements: The adjudication method was to take the average measurement of three experts, a form of consensus by central tendency rather than formal case-by-case adjudication.
- Quality AI (Section a): Ground truth was based on "quality rating by experienced sonographers on each frame and the entire clip." It doesn't explicitly state an adjudication method beyond this, implying individual expert ratings were used or a single consensus was reached, but not a specific multi-reader adjudication process like 2+1 or 3+1.
- Quality AI (Section b): Ground truth was based on "ACEP quality of 3 or above by expert readers." Similar to Section a, a specific adjudication method beyond "expert readers" is not detailed.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs without AI assistance
The document does not explicitly describe a traditional MRMC comparative effectiveness study that directly quantifies the improvement of human readers with AI assistance versus without AI assistance.
The Quality AI section (b) indicates that 26 users (including 18 novice users) conducted 396 lung and cardiac scans using the real-time quality AI feedback. This suggests an evaluation of the AI's ability to guide users to acquire diagnostic quality images, which is an indirect measure of assisting human performance. However, it does not provide an effect size of how much human readers improve in their interpretation or diagnosis with AI assistance. The study focuses on the AI's ability to help users acquire diagnostic quality images.
6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done
Yes, standalone performance was evaluated for the following:
- Left Ventricle Wall Thickness and IVC measurements: The performance (ICC) was calculated directly between the AI's measurements and the expert-derived ground truth. This is a standalone performance metric.
- Quality AI (Section a): The overall agreement (ICC) between the Quality AI and quality ratings by experienced sonographers was calculated. This also represents standalone performance of the AI's quality assessment function.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)
The ground truth used for the evaluated functionalities was expert consensus/measurement:
- LV Wall Thickness and IVC measurements: Average measurement of three experts.
- Quality AI: Quality ratings by experienced sonographers (Section a) and ACEP quality scores by expert readers (Section b).
No mention of pathology or outcomes data as ground truth.
8. The sample size for the training set
The document explicitly states: "The test data was entirely separated from the training/tuning datasets and was not used for any part of the training/tuning." However, it does not provide the specific sample size for the training set.
9. How the ground truth for the training set was established
The document does not explicitly describe how the ground truth for the training set was established. It only mentions that the AI models use "non-adaptive machine learning algorithms trained with clinical data." The Predetermined Change Control Plan also refers to "new training data" and augmenting the training dataset, but without details on ground truth establishment for these training datasets.
(265 days)
Re: K223347
Trade/Device Name: UltraSight AI Guidance
Regulation Number: 21 CFR 892.2100
Classification Name: Image Acquisition And/Or Optimization Guided By Artificial Intelligence
Product Code: QJU
Device Class: II
The UltraSight AI Guidance is intended to assist medical professionals (not including expert sonographers) in acquiring cardiac ultrasound images. UltraSight AI Guidance is an accessory to compatible general-purpose diagnostic ultrasound systems. UltraSight AI Guidance is indicated for use in two-dimensional transthoracic echocardiography (2D-TTE) for adult patients, specifically in the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (AP4), Apical 5-Chamber (AP5), Apical 2-Chamber (AP2), Apical 3-Chamber (AP3), Subcostal 4-Chamber (SubC4), and Subcostal Inferior Vena Cava (SC-IVC).
UltraSight AI Guidance is a mobile application based on machine learning that uses artificial intelligence (AI) to provide dynamic real-time guidance on the position and orientation of the transducer to help non-expert users acquire diagnostic-quality tomographic views of the heart. The system provides guidance for ten standard cardiac views.
Here's a detailed breakdown of the acceptance criteria and the studies performed for the UltraSight AI Guidance device, based on the provided text:
1. Table of Acceptance Criteria and Reported Device Performance
For Standalone AI Performance (Algorithm Only):
Performance Metric | Acceptance Criteria | Reported Device Performance (Mean) | 95% Confidence Interval | Additional Notes |
---|---|---|---|---|
Quality Bar | AUC > 0.8 | 0.86 | [0.85, 0.87] | Good classification performance. |
Quality Bar | PPV > 0.75 | 0.93 | [0.92, 0.94] | Good classification performance. |
View Detection | AUC > 0.8 | 0.988 | [0.985, 0.990] | Good classification performance for "Hold position" vs. "Navigate" and "Hold position" vs. "No heart"; stratified analysis also met the criterion. |
Probe Guidance | AUC > 0.8 | 0.821 | [0.813, 0.827] | Good classification performance for guiding probe movements; stratified tests showed acceptable individual classifiers. |
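As an illustration of how such criteria are typically checked, the sketch below computes an AUC point estimate with a bootstrap percentile 95% CI and compares it against the AUC > 0.8 criterion. Scores and labels are synthetic, and the filing does not state its exact CI method.

```python
# Sketch of an AUC acceptance check: point estimate plus bootstrap 95% CI.
# Scores and labels are synthetic, not data from the submission.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=4000)                       # diagnosable vs. not, per clip
scores = y * rng.normal(1.2, 1.0, 4000) + (1 - y) * rng.normal(0.0, 1.0, 4000)

aucs = []
for _ in range(2000):
    idx = rng.integers(0, len(y), size=len(y))
    if len(np.unique(y[idx])) == 2:                     # resample must contain both classes
        aucs.append(roc_auc_score(y[idx], scores[idx]))
lo, hi = np.percentile(aucs, [2.5, 97.5])
print(f"AUC {roc_auc_score(y, scores):.3f}  95% CI [{lo:.3f}, {hi:.3f}]  (criterion: > 0.8)")
```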
For Clinical Performance (Human-in-the-loop with AI guidance):
Performance Metric (Visual Quality for Visual Assessment - Majority Agreement) | Acceptance Criteria (Implicit: demonstrate non-experts can acquire diagnostic quality) | Reported Device Performance (Non-expert users with AI Guidance) | Additional Notes |
---|---|---|---|
LV size | N/A (Comparative to sonographer performance implicitly desired) | 93-100% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
LV function | N/A | 93-100% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
RV size | N/A | 93-100% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
Non-trivial pericardial effusion | N/A | 93-100% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
MV structure | N/A | 98% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
RV function | N/A | 94% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
Left atrium size | N/A | 94% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
AV structure | N/A | 89% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
TV structure | N/A | 74% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
IVC size | N/A | 67% of cases (Pivotal Study) | Sufficient visual quality for assessment. |
Diagnostic Quality Score >= 3 (ACEP scale) for specific views | N/A | Pilot Study: Range 78.6-97.9% of clips | Most clips taken by non-expert users met this. |
Note: The clinical study explicitly states the goal was to evaluate whether non-expert users could acquire diagnostic quality images with the AI guidance. The results demonstrate this effectively, comparing favorably to the predicate. The "acceptance criteria" for clinical performance are implicitly met by successful completion of these clinical endpoints at high percentages.
2. Sample Size Used for the Test Set and Data Provenance
For Standalone AI Performance Testing:
- Quality Bar Test Set: 312 clips
- View Detection & Guidance Tests Test Set: 75 subjects, totaling 2.3 million frames of ultrasound images.
- Data Provenance: The data used for performance testing (test set) was collected at different sites, geographically separated, from the sites used for algorithm development. The data was collected from a population representative of the intended population. It is implied to be retrospective, as it's a pre-collected "test set". The country of origin is not explicitly stated.
For Clinical Performance (Pivotal Study):
- Test Set (Subjects): 240 subjects.
- Data Provenance: Prospective, multi-center study. Country of origin not explicitly stated, but the submission is to the US FDA, suggesting the study likely included US sites or data relevant to the US population. The comparison group was scans by cardiac sonographers without AI guidance.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
For Standalone AI Performance Testing (Quality Bar):
- Number of Experts: 3.
- Qualifications: Cardiologists. No further details on their years of experience are provided.
For Clinical Performance (Pilot & Pivotal Studies):
- Number of Experts: 5.
- Qualifications: Expert cardiologists. No further details on their years of experience are provided.
4. Adjudication Method for the Test Set
For Standalone AI Performance Testing:
- Quality Bar: Each clip was annotated with a "diagnosable / non-diagnosable" label by three cardiologists. The final ground truth for "diagnosable" vs. "non-diagnosable" is based on their inputs, but the exact adjudication method (e.g., simple majority, weighted majority, or if a consensus meeting occurred) is not explicitly detailed. However, the use of "majority agreement" for clinical parameters in the clinical study (below) suggests a similar approach might have been used implicitly for the standalone ground truth if not explicitly stated.
- View Detection: Ground truth labels were defined on the frame level using annotation of expert sonographers. No specific number of experts or adjudication method is described beyond "expert sonographers."
- Probe Guidance: Similar to view detection, ground truth for guidance cues was established by experts, but the exact method or number of experts for adjudication is not detailed.
For Clinical Performance (Pilot & Pivotal Studies):
- Adjudication Method: Cardiologists reviewed clips and their assessments were based on majority agreement for visual quality of cardiac parameters. They were blinded to whether the clip was acquired by a non-expert user or a sonographer and to each other's evaluations. Cohen's kappa coefficient was used to assess intra-cardiologist variability.
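A minimal sketch of the majority-agreement rule described above: each of the five blinded readers votes independently, and a clip counts as adequate when at least three agree. The votes are synthetic.

```python
# Majority vote across five independent, blinded readers (synthetic votes).
from collections import Counter

def majority_label(votes):
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= (len(votes) // 2 + 1) else None

clip_votes = ["adequate", "adequate", "inadequate", "adequate", "adequate"]
print(majority_label(clip_votes))   # -> "adequate" (4 of 5 readers agree)
```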
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance
Yes, a prospective, multi-center pivotal clinical study with a multi-reader, multi-case (MRMC) reading design was conducted.
- Comparison: Non-expert users with UltraSight AI Guidance vs. cardiac sonographers without AI guidance (using the same hardware).
- Effect Size (Improvement with AI for non-expert users): The study demonstrated that non-expert users, with the AI guidance, achieved diagnostic quality scans comparable to those performed by sonographers. For the four co-primary endpoints (LV size, LV function, RV size, and non-trivial pericardial effusion), non-expert users with AI guidance acquired scans deemed to have adequate visual quality in 93-100% of cases (based on majority agreement of expert cardiologists). This indicates a significant improvement for non-expert users, enabling them to produce scans previously only achievable by expert sonographers. While a direct "without AI" performance for non-experts wasn't explicitly tested in this pivotal comparative arm (the comparison was against experts without AI), the entire premise is that without AI guidance these non-experts could not achieve such diagnostic quality, implying a very large effect of the AI in bringing non-experts to near-expert performance.
The Pilot Study provides further context: "The exams performed by the non-expert users had sufficient visual quality in 100% of cases based on majority agreement to assess LV size and function, RV size, and pericardial effusion." This further reinforces the high effectiveness.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
Yes, standalone performance testing of the AI algorithms was conducted. The results for the Quality Bar, View Detection, and Probe Guidance features, with their respective AUC and PPV scores against defined acceptance criteria, are presented under "Non-Clinical Standalone Performance Testing of AI algorithms."
7. The Type of Ground Truth Used
- For Standalone AI Performance (Quality Bar): Expert cardiologist annotation ("diagnosable / non-diagnosable") based on ACEP guidelines.
- For Standalone AI Performance (View Detection & Probe Guidance): Expert sonographer annotations.
- For Clinical Performance (Pilot & Pivotal Studies): Expert cardiologist consensus (majority agreement) on the visual quality for assessing various cardiac parameters. This is effectively expert consensus.
8. The Sample Size for the Training Set
The document notes "Algorithm development" and lists a "number of subjects" and "number of samples." While not explicitly called "training set," these numbers represent the data used for the algorithm's development.
- Number of Subjects: 580
- Number of Samples: 5 million frames of ultrasound images
9. How the Ground Truth for the Training Set Was Established
The document states that the data used for performance testing (test set) was collected at "different sites, geographically separated, from the sites used for collection of the algorithm development data." This implies that the ground truth for the algorithm development data (training set) would have also been established through similar expert annotations or a process that led to the labels required for training the deep learning models. However, the specific methods for establishing ground truth for the training set are not explicitly described in the provided text. It can be inferred that it involved expert labeling, similar to the test set, but no details on the number of experts, their qualifications, or adjudication methods for the training data are given.
(63 days)
Re: K201992
Trade/Device Name: Caption Guidance
Regulation Number: 21 CFR 892.2100
The Caption Guidance software is intended to assist medical professionals in the acquisition of cardiac ultrasound images. The Caption Guidance software is an accessory to compatible general purpose diagnostic ultrasound systems.
The Caption Guidance software is indicated for use in two-dimensional transthoracic echocardiography (2D-TTE) for adult patients, specifically in the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short- Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (AP4), Apical 5-Chamber (AP5), Apical 2-Chamber (AP2), Apical 3-Chamber (AP3), Subcostal 4-Chamber (SubC4), and Subcostal Inferior Vena Cava (SC-IVC).
The Caption Guidance software is a radiological computer assisted acquisition guidance system that provides real-time user guidance during acquisition of echocardiography to assist the user in obtaining anatomically correct images that represent standard 2D echocardiographic diagnostic views and orientations. Caption Guidance is a software-only device that uses artificial intelligence to emulate the expertise of sonographers.
Caption Guidance is comprised of several different features that, combined, provide expert guidance to the user. These include:
- Quality Meter: The real-time feedback from the Quality Meter advises the user on the expected diagnostic quality of the resulting clip, such that the user can make decisions to further optimize the quality, for example by following the prescriptive guidance feature below.
- Prescriptive Guidance: The prescriptive guidance feature in Caption Guidance provides direction to the user to emulate how a sonographer would manipulate the transducer to acquire the optimal view.
- Auto-Capture: The Caption Guidance Auto-Capture feature triggers an automatic capture of a clip when the quality is predicted to be diagnostic, emulating the way in which a sonographer knows when an image is of sufficient quality to be diagnostic and records it.
- Save Best Clip: This feature continually assesses clip quality while the user is scanning and, in the event that the user is not able to obtain a clip sufficient for Auto-Capture, the software allows the user to retrospectively record the highest quality clip obtained so far, mimicking the choice a sonographer might make when recording an exam.
Here's a breakdown of the acceptance criteria and the study details for the Caption Guidance software, based on the provided text.
Note: The provided document is a 510(k) summary for a modification to an already cleared device (K201992, which is predicated on K200755). Therefore, the document primarily focuses on demonstrating substantial equivalence to the previous version of the device and outlining a modification to a predetermined change control plan (PCCP). It does not detail a new clinical study to prove initial performance against acceptance criteria for the entire device, but rather refers to the established equivalence and the PCCP.
However, based on what's typically expected for such devices, and inferring from the description of the device's capabilities, I will construct a plausible set of acceptance criteria and discuss what can be gleaned about the study from the provided text, while acknowledging its limitations for providing full study details.
Acceptance Criteria and Reported Device Performance
Given the device's function (assisting with cardiac ultrasound image acquisition by guiding users to obtain specific standard views and optimizing quality), the acceptance criteria would likely revolve around the accuracy of its guidance, the quality of the "auto-captured" views, and its ability to help users acquire diagnostically relevant images.
Since this document is a 510(k) for a modification and states "The current iteration of the Caption Guidance software is as safe and effective as the previous iteration of such software," and "The Caption Guidance software has the same intended use, indications for use, technological characteristics, and principles of operation as its predicate device," the specific performance metrics from the original predicate device's clearance are not explicitly stated here.
However, a hypothetical table of common acceptance criteria for such a device and inferred performance (based on the device being cleared and performing "as safe and effective as the previous iteration") would look something like this:
Acceptance Criteria | Reported Device Performance (Inferred/Implicitly Met) |
---|---|
View Classification Accuracy: The software should correctly identify and guide the user towards the specified standard cardiac ultrasound views (PLAX, PSAX-AV, PSAX-MV, PSAX-PM, AP4, AP5, AP2, AP3, SubC4, SC-IVC) with high accuracy. | Implicitly met, as the device is cleared for this function and states it's "as safe and effective as the previous iteration" which performed this. The previous clearance would have established a threshold (e.g., >90% or 95% accuracy in guiding to correct view). |
Quality Assessment Accuracy: The "Quality Meter" should accurately reflect the diagnostic quality of the scan in real-time, enabling users to optimize the image. | Implicitly met. Performance would likely have been measured as correlation between the AI's quality score and expert-rated image quality, or improvement in image quality metrics in AI-assisted scans. |
Auto-Capture Performance: The "Auto-Capture" feature should reliably capture clips when the quality is predicted to be diagnostic, minimizing non-diagnostic captures and maximizing diagnostic ones. | Implicitly met. Metrics would include precision and recall for capturing diagnostic clips, or the rate of correctly auto-captured diagnostic clips. |
Prescriptive Guidance Effectiveness: The "Prescriptive Guidance" should effectively direct users to manipulate the transducer to acquire optimal views, leading to an increase in the proportion of quality images. | Implicitly met. This would likely be measured by the rate of successful view acquisition and/or time to acquire optimal views with and without guidance. |
Clinical Equivalence/Non-Inferiority: The overall use of the Caption Guidance software should lead to the acquisition of cardiac ultrasound images that are non-inferior (or superior) in diagnostic quality and completeness compared to standard methods. | Implicitly met via substantial equivalence to predicate. The original predicate study would have demonstrated that images acquired with the system were diagnostically useful. |
Study Details (Based on available information in the document)
1. Sample size used for the test set and the data provenance:
- Sample Size: Not explicitly stated in this 510(k) summary. Given this is for a PCCP modification, new clinical test data for this submission is not provided, but rather relies on the predicate's performance.
- Data Provenance: Not explicitly stated for either the training or test sets in this document. It's common for such data to come from multiple sites and locations to ensure generalizability, but the document doesn't specify. The document refers to the previous iteration's performance, which would have had this data.
2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Not explicitly stated in this 510(k) summary. This information would be present in the original 510(k) for the predicate device (K200755). Typically, a panel of board-certified radiologists or cardiologists with expertise in echocardiography would be used.
3. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
- Not explicitly stated. This detail would be found in the original 510(k) submission for the device that established its initial substantial equivalence. Common methods include majority rule (e.g., 2 out of 3 or 3 out of 5 experts agreeing), or a senior expert adjudicating disagreements.
4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs without AI assistance:
- The document implies that the device "emulates the expertise of sonographers" and "provides real-time user guidance." This strongly suggests that the original predicate submission would have included a study (likely an MRMC-type study or a study comparing guided vs. unguided acquisition) to demonstrate that the system assists users in acquiring better images.
- Effect Size: Not provided in this summary. Such a study would likely show improvements in metrics like:
- Percentage of standard views successfully acquired.
- Time taken to acquire optimal views.
- Image quality scores (e.g., higher proportion of "diagnostic quality" images).
- Reduction in inter-user variability for image acquisition.
- Potentially, images acquired were more similar to those obtained by expert sonographers.
5. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
- Yes, implicitly. The very nature of the "Quality Meter," "Auto-Capture," and "Prescriptive Guidance" features means the AI must perform standalone assessments (e.g., classifying views, assessing quality) to provide its guidance. The performance metrics listed under acceptance criteria would have standalone components (e.g., accuracy of the AI's view classification vs. ground truth).
6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- Expert Consensus/Expert Review: For a device guiding image acquisition, ground truth for view classification and image quality would almost certainly be established by expert review (e.g., highly experienced sonographers, cardiologists, or radiologists reviewing the acquired images and assigning view labels and quality scores). This is standard for image guidance systems.
7. The sample size for the training set:
- Not explicitly stated. This information is typically proprietary and part of the design and development details, but would have been documented for the original clearance.
8. How the ground truth for the training set was established:
- Not explicitly stated, but it would align with the method for the test set ground truth: Expert Consensus/Expert Review. The training data (images and associated metadata) would be meticulously labeled by qualified experts (e.g., specifying which view each image represents, and potentially assigning quality scores) to enable the AI to learn.
(24 days)
Re: K200755
Trade/Device Name: Caption Guidance
Regulation Number: 21 CFR 892.2100
The Caption Guidance software is intended to assist medical professionals in the acquisition of cardiac ultrasound images. The Caption Guidance software is an accessory to compatible general purpose diagnostic ultrasound systems.
The Caption Guidance software is indicated for use in two-dimensional transthoracic echocardiography (2D-TTE) for adult patients, specifically in the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (AP4), Apical 5-Chamber (AP5), Apical 2-Chamber (AP2), Apical 3-Chamber (AP3), Subcostal 4-Chamber (SubC4), and Subcostal Inferior Vena Cava (SC-IVC).
The Caption Guidance software is a radiological computer assisted acquisition guidance system that provides real-time user guidance during acquisition of echocardiography to assist the user in obtaining anatomically correct images that represent standard 2D echocardiographic diagnostic views and orientations. Caption Guidance is a software-only device that uses artificial intelligence to emulate the expertise of sonographers.
Caption Guidance is comprised of several different features that, combined, provide expert guidance to the user. These include:
- Quality Meter: The real-time feedback from the Quality Meter advises the user on the expected diagnostic quality of the resulting clip, such that the user can make decisions to further optimize the quality, for example by following the prescriptive guidance feature below.
- Prescriptive Guidance: The prescriptive guidance feature in Caption Guidance provides direction to the user to emulate how a sonographer would manipulate the transducer to acquire the optimal view.
- Auto-Capture: The Caption Guidance Auto-Capture feature triggers an automatic capture of a clip when the quality is predicted to be diagnostic, emulating the way in which a sonographer knows when an image is of sufficient quality to be diagnostic and records it.
- Save Best Clip: This feature continually assesses clip quality while the user is scanning and, in the event that the user is not able to obtain a clip sufficient for Auto-Capture, the software allows the user to retrospectively record the highest quality clip obtained so far, mimicking the choice a sonographer might make when recording an exam.
The provided text describes the Caption Guidance software, an updated version of a previously cleared device. The submission focuses on demonstrating substantial equivalence to its predicate device rather than presenting a new clinical study with specific acceptance criteria and performance metrics for the updated features in a comparative effectiveness study.
Therefore, the information for some of your requested points is not explicitly detailed in the provided text, as the submission relies on demonstrating that the modifications to the device do not raise new questions of safety or effectiveness and that the overall functionality remains substantially equivalent to the predicate.
Here's a breakdown of the information available based on your request:
1. Table of Acceptance Criteria and Reported Device Performance
The document does not explicitly state a table of specific acceptance criteria with numerical targets for the reported device performance related to the updated features of Caption Guidance (e.g., a specific recall or precision for Auto-Capture). Instead, it describes general performance testing and verification activities to ensure the software performs as expected and is substantially equivalent to the predicate.
The "Performance Data" section details that:
- "Extensive algorithm development and software verification testing assessed the performance of the software."
- "The Caption Guidance algorithm was tested for the performance of the modified Auto-Capture feature in recording clinically-acceptable images and clips."
- "Furthermore, the subject device's algorithm was tested for the performance of providing Prescriptive Guidance (PG), using the following tasks:
-
- Frame-level PG prediction of the probe maneuver needed to acquire an image/frame of heart, for a specific view.
-
- Clip-level PG prediction of the probe maneuver needed to acquire a diagnostic quality clip for a specific view."
-
The conclusion is that "Overall, the non-clinical performance testing results provide evidence in support of the functionality of Caption Guidance fundamental algorithms."
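To illustrate the frame-level vs. clip-level distinction in the quoted tasks, one plausible aggregation is to average per-frame guidance probabilities into a single clip-level prediction. The maneuver names and the pooling rule below are hypothetical assumptions, not Caption's documented method.

```python
# Hypothetical frame-to-clip aggregation: average per-frame guidance
# probabilities, then take the argmax as the clip-level prediction.
import numpy as np

MANEUVERS = ["rotate_cw", "rotate_ccw", "tilt_up", "tilt_down", "hold"]  # assumed labels

def clip_level_prediction(frame_probs):    # frame_probs: (n_frames, n_maneuvers)
    clip_probs = frame_probs.mean(axis=0)  # pool evidence across the whole clip
    return MANEUVERS[int(np.argmax(clip_probs))]

rng = np.random.default_rng(2)
frame_probs = rng.dirichlet(np.ones(5), size=60)   # 60 frames of softmax-like output
print(clip_level_prediction(frame_probs))
```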
For Human Factors Testing, the results conclude:
- "Summative testing has been completed with 16 users without prior scanning experience and 9 users with prior experience)."
- "The summative human factors testing concluded that there were no use errors associated with critical tasks likely to lead to patient injury."
- "Additionally, although the testing was not comparative in nature, when viewed in context of the testing provided in the original De Novo, the enhanced product appears to provide optimization of usability."
2. Sample Size Used for the Test Set and Data Provenance
The document does not specify a separate "test set" sample size for evaluating clinical performance in the traditional sense for this 510(k) submission. The focus was on software verification and validation, and human factors testing.
- Human Factors Testing:
- Sample Size: 16 users without prior scanning experience and 9 users with prior experience (total 25 users).
- Data Provenance: Not explicitly stated (e.g., country of origin, retrospective/prospective). This was likely prospective testing conducted with study participants.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
This information is not provided in the document. As the submission focuses on software verification and human factors for an updated version of a device, the establishment of ground truth by multiple experts for a clinical "test set" in the context of diagnostic accuracy is not detailed for this specific submission. The initial De Novo submission (DEN190040) would likely contain this information for the predicate device.
4. Adjudication Method for the Test Set
This information is not provided in the document.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and Effect Size
A formal multi-reader multi-case (MRMC) comparative effectiveness study was not done for this 510(k) submission. The document explicitly states for the human factors testing: "although the testing was not comparative in nature...". The comparison is mainly against the predicate device's established performance and the demonstration that the enhancements do not introduce new risks or diminish performance.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Was Done
The document states: "Extensive algorithm development and software verification testing assessed the performance of the software." and "The Caption Guidance algorithm was tested for the performance of the modified Auto-Capture feature... Additionally, the subject device's algorithm was tested for the performance of providing Prescriptive Guidance (PG)..."
This indicates that standalone algorithm performance testing was conducted for the Auto-Capture and Prescriptive Guidance features. However, specific metrics (e.g., accuracy, sensitivity, specificity) for this standalone performance are not provided. The human factors testing then evaluates the human-in-the-loop performance of the entire system.
7. The Type of Ground Truth Used (Expert Consensus, Pathology, Outcomes Data, etc.)
For the algorithm testing of Auto-Capture and Prescriptive Guidance, the ground truth would likely be based on:
- Expert Consensus/Clinical Acceptability: For Auto-Capture, the "recording clinically-acceptable images and clips" implies expert assessment of image quality.
- Expert Knowledge of Optimal Probe Maneuvers: For Prescriptive Guidance, the ground truth for "probe maneuver needed to acquire an image/frame of heart" or "diagnostic quality clip" would stem from expert sonographer knowledge and established echocardiography guidelines.
The document does not explicitly state the methodology for establishing this ground truth (e.g., specific number of sonographers or cardiologists, their qualifications).
8. The Sample Size for the Training Set
The document does not provide a specific sample size for the training set used for the Caption Guidance algorithms. It mentions "Extensive algorithm development..." but does not detail the training data sets.
9. How the Ground Truth for the Training Set Was Established
The document does not explicitly describe how the ground truth for the training set was established. However, given the nature of the device (guidance for cardiac ultrasound acquisition), it would typically involve:
- Expert Sonographer/Cardiologist Annotation: Labeling of optimal probe positions, image quality assessments, and identification of standard views in echocardiogram clips.
- Adherence to Clinical Guidelines: Ensuring annotations align with established protocols for acquiring diagnostic quality echocardiograms.
(164 days)
NEW REGULATION NUMBER: 21 CFR 892.2100
CLASSIFICATION: Class II
PRODUCT CODE: QJU
BACKGROUND
Device Type: Radiological acquisition and/or optimization guidance system
Class: II
Regulation: 21 CFR 892.2100
The Caption Guidance software is intended to assist medical professionals in the acquisition of cardiac ultrasound images. Caption Guidance software is an accessory to compatible general purpose diagnostic ultrasound systems.
Caption Guidance software is indicated for use in two-dimensional transthoracic echocardiography (2D-TTE) for adult patients, specifically in the acquisition of the following standard views: Parasternal Long-Axis (PLAX), Parasternal Short-Axis at the Aortic Valve (PSAX-AV), Parasternal Short-Axis at the Mitral Valve (PSAX-MV), Parasternal Short-Axis at the Papillary Muscle (PSAX-PM), Apical 4-Chamber (AP4), Apical 5-Chamber (AP5), Apical 2-Chamber (AP2), Apical 3-Chamber (AP3), Subcostal 4-Chamber (SubC4), and Subcostal Inferior Vena Cava (SC-IVC).
The Caption Guidance software is a radiological acquisition and/or optimization guidance system that provides real-time guidance to the users during acquisition of echocardiography to assist them in obtaining anatomically correct images that represent standard 2D echocardiographic diagnostic views and orientations. Caption Guidance is a software-only device that uses artificial intelligence to emulate the expertise of sonographers.
Caption Guidance is comprised of several different features that, combined, provide expert guidance to the user. These include:
- Quality Meter: The real-time feedback from the Quality Meter advises the user on the expected diagnostic quality of the resulting clip, such that the user can make decisions to further optimize the quality, for example by following the prescriptive guidance feature below.
- Prescriptive Guidance: The prescriptive guidance feature in Caption Guidance provides direction to the user to emulate how a sonographer would manipulate the transducer to acquire the optimal view.
- Auto-Capture: The Caption Guidance Auto-Capture feature triggers an automatic capture of a clip when the quality is predicted to be diagnostic, emulating the way in which a sonographer knows when an image is of sufficient quality to be diagnostic and records it.
- Save Best Clip: This feature continually assesses clip quality while the user is scanning and, in the event that the user is not able to obtain a clip sufficient for Auto-Capture, the software allows the user to retrospectively record the highest quality clip obtained so far, mimicking the choice a sonographer might make when recording an exam.
The Caption Guidance software was trained using echocardiographic clips from studies performed by trained sonographers. The ideal probe pose for each cardiac view was used to determine the Prescriptive Guidance for maneuvering the probe to the ideal pose.
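This training setup suggests that guidance labels can be derived from the gap between a clip's recorded probe pose and the expert-defined ideal pose for the target view. A minimal sketch of such a labeling rule follows, with entirely hypothetical thresholds and label names:

```python
def pg_label(translation_error_mm, rotation_error_deg):
    """Map the pose gap to a coarse guidance label.
    Thresholds and label names are hypothetical, for illustration only."""
    if translation_error_mm < 5 and abs(rotation_error_deg) < 5:
        return "hold"                               # effectively at the ideal pose
    if translation_error_mm >= 5:
        return "slide"                              # correct the probe position first
    return "rotate_cw" if rotation_error_deg > 0 else "rotate_ccw"
```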
The Caption Guidance software is labeled for use with the Terason uSmart 3200t Plus, an FDA 510(k) cleared (K150533) ultrasound system. Caption Guidance is installed on the third-party ultrasound system. The user has access to both the Terason user interface (UI) and the Caption Guidance UI and will be able to switch between the two.
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
The primary acceptance criteria for the Caption Guidance device focused on the ability of medical professionals without specialized echocardiography training (represented by Registered Nurses, RNs) to acquire echocardiographic exams of sufficient image quality for clinical assessment.
Table of Acceptance Criteria and Reported Device Performance (Pivotal Study - RN Users):
| # | Clinical Parameter Assessed | Acceptance Criterion (implicit) | Reported Device Performance (95% MRMC CI) |
|---|---|---|---|
| 1 | Qualitative Visual Assessment of Left Ventricular Size | High % of exams of sufficient quality | 98.8% (96.7%, 100%) |
| 2 | Qualitative Visual Assessment of Global Left Ventricular Function | High % of exams of sufficient quality | 98.8% (96.7%, 100%) |
| 3 | Qualitative Visual Assessment of Right Ventricular Size | High % of exams of sufficient quality | 92.5% (88.1%, 96.9%) |
| 4 | Qualitative Visual Assessment of Non-Trivial Pericardial Effusion | High % of exams of sufficient quality | 98.8% (96.7%, 100%) |
The text explicitly states: "The four primary endpoints were satisfied and demonstrated the clinical utility of Caption Guidance for users without specialized echocardiography training." This indicates that the reported percentages met their predetermined success criteria.
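For context, the reported figures are confidence intervals on proportions of sufficient-quality exams. Below is a minimal sketch of a Wilson score interval for a single proportion, in Python; the study's actual MRMC analysis would additionally model reader and case variability, which this single-proportion formula does not, and the counts in the example are hypothetical:

```python
from math import sqrt

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a binomial proportion. This ignores
    the reader/case correlation that a full MRMC analysis would model."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - margin, center + margin

# Hypothetical counts for illustration only (not the study's raw data):
low, high = wilson_ci(237, 240)
print(f"{237 / 240:.1%} (95% CI {low:.1%} to {high:.1%})")
```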
Study Details: Pivotal (Nurse) Study
2. Sample Size and Data Provenance:
* Test Set Sample Size: 8 Registered Nurses (RNs) each completed scans of 30 patients, resulting in a total of 240 patient studies (8 RNs * 30 patients).
* Data Provenance: The study was a prospective clinical study, apparently conducted at US sites; this is implied by the FDA De Novo pathway and the mention of Northwestern Memorial Hospital in the Human Factors study, though the document does not state the provenance explicitly.
3. Number of Experts and their Qualifications for Ground Truth:
* Number of Experts: Five (5) expert cardiologists.
* Qualifications: "Expert cardiologists" are described as providing independent assessments. While specific years of experience aren't stated, the term "expert" implies significant experience and board certification, aligning with qualifications needed for interpreting echocardiograms.
4. Adjudication Method for the Test Set:
* The panel of five (5) expert cardiologist readers provided their assessments independently. The text does not describe an explicit adjudication method such as "2+1" or "3+1" to resolve disagreements; the reported figures (e.g., percentages of sufficient quality) appear to be proportions aggregated across these independent assessments, consistent with the MRMC statistical approach indicated by "MRMC CI." In addition, "each of the cardiologist readers were asked to provide a repeat assessment on a certain percentage of the exams or clips they reviewed in order to assess intra-grader variability." Independent review, with variability quantified rather than disagreements forced to consensus, thus appears to be the cornerstone of the evaluation.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
* Yes, a MRMC study was done, but not to directly compare human readers with AI vs. without AI assistance regarding effectiveness gains.
* The pivotal study used an MRMC design to evaluate the performance of RNs using Caption Guidance against a control arm where trained sonographers acquired images unassisted. The MRMC primary endpoints focused on the proportion of sufficient quality scans achieved by RNs with Caption Guidance.
* Effect Size of Human Readers Improve with AI vs. without AI Assistance: This specific metric (improvement of human readers who already know how to scan when assisted by AI) was not explicitly reported as a primary endpoint or effect size in the pivotal study results section.
* However, a descriptive "Specialist (Sonographer) Study" was conducted with 3 expert cardiologists. This study indicated that "sonographers obtained diagnostic quality images in a high proportion of clips from both study and control exams, demonstrating comparable image quality in clips acquired using Caption Guidance compared to unassisted acquisition." This implies that for already trained sonographers, the AI did not significantly improve their image quality but maintained comparability. The key benefit demonstrated by the pivotal study was enabling untrained users to achieve high-quality images.
* The pivotal study showed that RNs using Caption Guidance could achieve clinical assessments with high success rates, implicitly demonstrating a significant improvement for these untrained users compared to their performance without the device (which would presumably be very low or non-existent for standard views).
6. Standalone (Algorithm Only) Performance:
* Yes, standalone algorithm performance testing was done. The section "Algorithm Performance Testing" details this:
* "The Caption Guidance algorithm was tested for the performance of the supported features: Quality Meter, Auto-Capture, and Save Best Clip."
* Metrics included "Frame-level prediction of the current pose of the probe, as compared to the ideal pose," "Relative image quality prediction," and "Auto-Capture of clinically-acceptable images and clips."
* It also tested "Frame-level PG prediction of the probe maneuver needed to acquire an image/frame" and "Clip-level PG prediction."
* The text notes these results "provide evidence in support of the functionality of Caption Guidance fundamental algorithms" and "demonstrated a low-level verification of the algorithms."
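The frame-level pose metric implies comparing a predicted probe pose against the expert-defined ideal pose. One plausible error measure, assuming poses are represented as a 3D position plus a unit quaternion (an illustrative representation, not the device's documented format):

```python
import math

def pose_error(pred, ideal):
    """Translation error (input units) and rotation error (degrees) between
    a predicted and an ideal probe pose, each given as (x, y, z, qw, qx, qy, qz)."""
    dt = math.dist(pred[:3], ideal[:3])
    dot = abs(sum(a * b for a, b in zip(pred[3:], ideal[3:])))   # |<q1, q2>|
    dr = math.degrees(2 * math.acos(min(1.0, dot)))              # quaternion angle
    return dt, dr
```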
7. Type of Ground Truth Used:
* For the pivotal study, the ground truth was primarily expert consensus (or independent assessment for subsequent aggregation/MRMC analysis) by expert cardiologists using the American College of Emergency Physicians (ACEP) scale for echocardiography quality for individual clips, and global assessment of "sufficient information to assess ten clinical parameters" for patient studies.
* Additionally, quantitative expert measurements by sonographers ("PLAX Sonographer Measurements") served as ground truth for assessing measurability and variability of linear measurements.
* For the initial algorithm training, "The ideal probe pose for each cardiac view was used to determine the Prescriptive Guidance," implying a form of expert-defined ideal states/positions as ground truth.
8. Sample Size for the Training Set:
* The document states: "The Caption Guidance software was trained using echocardiographic clips from studies performed by trained sonographers." A specific sample size for the training set is not provided in the given text.
9. How the Ground Truth for the Training Set Was Established:
* "The Caption Guidance software was trained using echocardiographic clips from studies performed by trained sonographers."
* "The ideal probe pose for each cardiac view was used to determine the Prescriptive Guidance for maneuvering the probe to the ideal pose."
* This implies the ground truth for training data likely involved:
* Labeled echocardiographic clips: Experts (sonographers) provided the "correct" (diagnostic quality, specific view) clips.
* Expert definition of "ideal probe pose": This would serve as the target for the prescriptive guidance and quality meter. This likely involved expert sonographers demonstrating and labeling ideal probe positions and maneuvers for each cardiac view.