The Butterfly Gestational Age Tool is indicated to provide an output of gestational age (GA) of a singleton intrauterine pregnancy presumed to be between 16-37 weeks gestation. It is for use by qualified and trained healthcare professionals in environments where healthcare is provided. This adjunctive information is not intended to be used for prenatal management and/or delivery planning. The Butterfly Gestational Age Tool is to be used with Butterfly's ultrasound systems (iQ+ or iQ3).

Device Description

The Butterfly Gestational Age Tool (GA Tool) is a software application that guides trained healthcare professionals through measuring fundal height and obtaining ultrasound cines of a patient's gravid abdomen, using a Butterfly ultrasound probe (iQ+ or iQ3) connected to a tablet. Users launch it as a calculation tool within the iQ App's OB scan presets (OB 1/GYN or OB 2/3). Users first measure the fundal height in centimeters, which determines the number of ultrasound videos, or "sweeps", needed. These sweeps are short cines captured by moving the probe across the abdomen in specific orientations. Instead of relying on a live B-mode scan, the system presents an animation for each sweep that communicates the intended path and probe orientation.

When performing the sweeps, the ultrasound probe makes direct contact with the patient's skin through a coupling medium such as ultrasound gel. The collected sweeps are input into a deep-learning model within the GA Tool, which then outputs an estimated gestational age.

Once complete, users can delete the measurement or add documentation, such as patient details or notes, before uploading the results securely to Butterfly's cloud for storage and access by medical professionals. The GA Tool aims to standardize and simplify the process of estimating gestational age using ultrasound technology.

The subject device contains the same hardware technology as the previously cleared predicate device, and no accessories are required to use the GA Tool. The GA Tool is compatible with both the Butterfly iQ3 (primary predicate) and Butterfly iQ/iQ+ Ultrasound Systems. The only change is the new GA Tool, which does not alter the intended use of the device, nor does it affect the safety and effectiveness of the device relative to the predicate.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details based on the provided FDA 510(k) clearance letter for the Butterfly Gestational Age Tool:

1. Table of Acceptance Criteria and Reported Device Performance

Acceptance Criteria (Set by FDA/Guidance)	Reported Device Performance (as tested against LMP)	Pass/Fail
GA Window 1: Week 16 to 21 6/7; maximum error: +/- 10 days	iQ+: LOA -6.92 to 12.20 days (Lower CI -10.05, Upper CI 15.33); iQ3: LOA -10.68 to 9.68 days (Lower CI -14.02, Upper CI 13.02)	Pass
GA Window 2: Week 22 to 27 6/7; maximum error: +/- 14 days	iQ+: LOA -8.20 to 12.74 days (Lower CI -11.14, Upper CI 15.68); iQ3: LOA -8.84 to 10.05 days (Lower CI -11.49, Upper CI 12.70)	Pass
GA Window 3: Week 28 to 37 6/7; maximum error: +/- 30 days	iQ+: LOA -18.16 to 19.18 days (Lower CI -22.98, Upper CI 24.00); iQ3: LOA -20.03 to 13.85 days (Lower CI -24.40, Upper CI 18.22)	Pass
Consistency with Biometry Measurements (no appreciable performance difference in subgroup analyses compared to Biometry)	Subgroup analyses show no appreciable performance difference compared to Biometry for various covariates (GA window, sites, BMI, HCP type).	Pass

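Note: The acceptance criteria themselves (e.g., "+/- 10 days") are explicitly mentioned in the document as the "pre-defined established clinical acceptable error of ultrasound measured gestational age." The table records a Pass for every window. Read against the limits of agreement (LOA) alone, the Window 2 and Window 3 intervals for both iQ+ and iQ3 fall within the stated maximum error, whereas in Window 1 the iQ+ upper LOA (12.20 days) and the iQ3 lower LOA (-10.68 days) extend beyond +/- 10 days. The summary as presented here does not explain how the Pass determination for Window 1 was reached.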
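
The comparison between the reported LOA and the stated maximum errors can be reproduced with a few lines of Python. This is a minimal sketch, assuming the criterion is read as "both LOA bounds lie within +/- the maximum error"; the summary does not spell out the exact pass rule, and the names `REPORTED_LOA`, `loa_within`, and `check_all` are illustrative only. The numbers are transcribed from the table above.

```python
# Reported limits of agreement (days), transcribed from the table above.
# The dictionary layout and function names are illustrative assumptions,
# not part of the 510(k) summary.
REPORTED_LOA = {
    "GA Window 1": {"max_error": 10, "iQ+": (-6.92, 12.20), "iQ3": (-10.68, 9.68)},
    "GA Window 2": {"max_error": 14, "iQ+": (-8.20, 12.74), "iQ3": (-8.84, 10.05)},
    "GA Window 3": {"max_error": 30, "iQ+": (-18.16, 19.18), "iQ3": (-20.03, 13.85)},
}


def loa_within(loa, max_error):
    """Return True if both LOA bounds lie inside [-max_error, +max_error]."""
    lower, upper = loa
    return -max_error <= lower and upper <= max_error


def check_all(table=REPORTED_LOA):
    """Map (window, probe) to whether the reported LOA fits the stated bound."""
    results = {}
    for window, row in table.items():
        for probe in ("iQ+", "iQ3"):
            results[(window, probe)] = loa_within(row[probe], row["max_error"])
    return results


if __name__ == "__main__":
    for (window, probe), ok in check_all().items():
        status = "within" if ok else "outside"
        print(f"{window} {probe}: {status} +/- {REPORTED_LOA[window]['max_error']} days")
```

The check deliberately ignores the confidence intervals around each LOA bound; whether those intervals enter the acceptance decision is not stated in the summary.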

2. Sample size used for the test set and the data provenance

Sample Size for Clinical Performance Evaluation (Test Set): 111 unique subjects (110 for iQ+ data analysis due to one exclusion).
Data Provenance:
- Country of Origin: United States.
- Locations: 4 sites within the USA: Butterfly offices in Burlington, MA; Thomas Jefferson University in Philadelphia, PA; Remedy Direct Primary Care in Flagstaff, AZ; and Butterfly offices in NYC, NY.
- Retrospective/Prospective: Prospective study, conducted from March 2025 to February 2026. The dataset is described as "totally independent from that of the Butterfly GA tool development."

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

Number of Healthcare Practitioners for Data Collection: 13 trained healthcare practitioners.
- 7 Physicians
- 6 Sonographers
Ground Truth Establishment for Clinical Performance Evaluation:
- Biometry was performed by the 6 sonographers.
- Gestational Age from the subject reported Last Menstrual Period (LMP) was recorded.
- The primary ground truth for evaluating the device was the GA calculated from LMP; the sonographer biometry served as a comparison.
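
For readers unfamiliar with LMP dating, the arithmetic is simply the number of days between the first day of the last menstrual period and the exam date, expressed as completed weeks plus days. The sketch below illustrates that calculation and the three GA windows used in the acceptance table. The function names and the example dates are hypothetical, and the summary does not describe how the study itself recorded or computed these values.

```python
from datetime import date
from typing import Optional, Tuple


def ga_from_lmp(lmp: date, exam: date) -> Tuple[int, int]:
    """Gestational age at the exam date as (completed weeks, extra days)."""
    days = (exam - lmp).days
    if days < 0:
        raise ValueError("exam date precedes LMP")
    return divmod(days, 7)


def ga_window(weeks: int) -> Optional[str]:
    """Name of the acceptance-table window for a GA in completed weeks.

    Windows follow the table: 16 to 21 6/7, 22 to 27 6/7, 28 to 37 6/7.
    Returns None outside 16 to 37 6/7 weeks.
    """
    if 16 <= weeks <= 21:
        return "GA Window 1"
    if 22 <= weeks <= 27:
        return "GA Window 2"
    if 28 <= weeks <= 37:
        return "GA Window 3"
    return None


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    weeks, days = ga_from_lmp(date(2025, 1, 1), date(2025, 6, 4))
    print(weeks, days, ga_window(weeks))  # 22 0 GA Window 2
```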

4. Adjudication method for the test set

The document does not explicitly describe an adjudication method (e.g., 2+1, 3+1) for establishing the ground truth from LMP or biometry for the clinical performance test set. The LMP was "subject reported," and biometry was "performed by the 6 sonographers." This implies that LMP was taken as reported and that the sonographers' biometry measurements were used directly for comparison.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance

A formal MRMC comparative effectiveness study comparing human readers with AI assistance versus without AI assistance is not explicitly reported in this document.
The clinical performance study compares the device's standalone performance (Butterfly GA Tool) against LMP and against biometry performed by sonographers. It also compares biometry against LMP. It does not evaluate how human performance changes when using the AI tool as an assistant.

6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance evaluation was done

Yes, a standalone performance evaluation was done. The clinical performance evaluation directly assessed the "Butterfly GA Tool error in reference to the Gestational Age calculation from Last Menstrual Period (LMP)" and also compared the Butterfly GA Tool against biometry measurements. The device is described as "outputting an estimated gestational age" from collected sweeps, indicating a standalone algorithmic output.
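
The error figures in the acceptance table are reported as limits of agreement (LOA). The summary does not state how they were computed; the conventional Bland-Altman definition is the mean paired difference plus or minus 1.96 sample standard deviations. The sketch below assumes that definition, and the function name and the sample values are hypothetical.

```python
import statistics


def limits_of_agreement(estimates, references, z=1.96):
    """Bland-Altman limits of agreement for paired values in the same units.

    Returns (lower, bias, upper), where bias is the mean of
    (estimate - reference) and the limits are bias -/+ z * sample SD.
    """
    if len(estimates) != len(references) or len(estimates) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias - z * sd, bias, bias + z * sd


if __name__ == "__main__":
    # Hypothetical GA values in days: tool output vs. LMP-based GA.
    tool_days = [150, 160, 170, 180]
    lmp_days = [148, 161, 167, 182]
    lower, bias, upper = limits_of_agreement(tool_days, lmp_days)
    print(f"bias {bias:.2f} days, LOA {lower:.2f} to {upper:.2f} days")
```

A positive bias in this convention means the tool estimates a later gestational age than the LMP-based reference.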

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)

For the Clinical Performance Evaluation (Test Set):
- Primary Ground Truth: Gestational Age calculation from Last Menstrual Period (LMP).
- Secondary Reference: Biometry performed by sonographers.
For the Training Set:
- Previously established gestational age, against which the model's performance was assessed.
- Standard fetal biometry performed by sonographers from the gathered cine loops.

8. The sample size for the training set

Training Set Sample Size: Over 100,000 cine loops comprising millions of image frames from thousands of patients.

9. How the ground truth for the training set was established

The training data (cine loops) were obtained by POCUS users.
Ground Truth for Training: This data was accompanied by "standard fetal biometry performed by sonographers" and assessed "against previously established gestational age" (likely from LMP or other clinical methods) as described in the FAMLI protocol. The model aimed to "estimate gestational age from the sweeps" based on this ground truth.
