(149 days)
The device is a general-purpose ultrasound system intended for use by qualified and trained healthcare professionals. Specific clinical applications remain the same as previously cleared: Fetal/OB; Abdominal (including GYN, pelvic and infertility monitoring/follicle development); Pediatric; Small Organ (breast, testes, thyroid, etc.); Neonatal and Adult Cephalic; Cardiac (adult and pediatric); Musculo-skeletal Conventional and Superficial; Vascular; Transvaginal (including GYN); Transrectal.
Modes of operation include: B, M, PW Doppler, CW Doppler, Color Doppler, Color M Doppler, Power Doppler, Harmonic Imaging, Coded Pulse, 3D/4D Imaging mode, Elastography, Shear Wave Elastography and Combined modes: B/M, B/Color, B/PWD, B/Color/PWD, B/Power/PWD, B/Elastography. The Voluson™ Expert 18, Voluson™ Expert 20, and Voluson™ Expert 22 are intended to be used in a hospital or medical clinic.
The systems are full-featured Track 3 ultrasound systems, primarily for general radiology use and specialized for OB/GYN with particular features for real-time 3D/4D acquisition. They consist of a mobile console with a keyboard control panel, a color LCD/TFT touch panel, a color video display, and optional image storage and printing devices. They provide high performance ultrasound imaging and analysis and have comprehensive networking and DICOM capability. They utilize a variety of linear, curved linear, and matrix phased array transducers, including mechanical and electronic scanning transducers, which provide highly accurate real-time three-dimensional imaging supporting all standard acquisition modes.
The provided document describes the predicate devices as the Voluson Expert 18, Voluson Expert 20, Voluson Expert 22. The K-number for the primary predicate device is K231965. The document does NOT describe the acceptance criteria or study that proves the device meets the acceptance criteria for those predicate devices. Instead, it details the testing and acceptance criteria for new or updated AI software features introduced with the new Voluson Expert Series devices (K242168).
Here's a breakdown of the requested information based on the AI testing summaries provided for the new/updated features: Sono Pelvic Floor 3.0 (MHD and Anal Sphincter), SonoAVC Follicle 2.0, and 1st/2nd Trimester SonoLyst/SonoLystlive.
Acceptance Criteria and Device Performance for New/Updated AI Features
1. Table of Acceptance Criteria and Reported Device Performance
| AI Feature | Acceptance Criteria | Reported Device Performance |
|---|---|---|
| Sono Pelvic Floor 3.0 (MHD) | MHD Tracking, Minimum MHD Frame Detection, Maximum MHD Frame Detection: on datasets marked as "Good Image Quality", success rate should be 70% or higher; on datasets marked as "Challenging Image Quality", success rate should be 60% or higher. Overall MHD: on "Good IQ" datasets, 70% or higher; on "Challenging Quality" datasets, 60% or higher. | MHD Tracking: Good Image Quality 89.3%; Challenging Image Quality 77.7%. Minimum MHD Frame Detection: Good Image Quality 89.3%; Challenging Image Quality 83.3%. Maximum MHD Frame Detection: Good Image Quality 90.66%; Challenging Image Quality 77.7%. Overall MHD: Good IQ datasets 81.9%; Challenging quality datasets 60.9%. |
| Sono Pelvic Floor 3.0 (Anal Sphincter) | On datasets marked as "Good Image Quality": success rate should be 70% or higher. On datasets marked as "Challenging Image Quality": success rate should be 60% or higher. | The document states "Verification results on actual verification data is as follows" but then the table structure is missing the actual performance metrics for Anal Sphincter. It only lists "On Good IQ datasets: 81.9%" and "On Challenging quality datasets: 60.9%" under the MHD section, implying those might be overall success rates for the entire Sono Pelvic Floor 3.0 feature across both components, but it's not explicitly clear. Therefore, the specific reported device performance for "Anal Sphincter" is not clearly presented in the provided text. |
| SonoAVC Follicle 2.0 | The success rate for the AI feature should be 70% or higher. (This appears to be an overall accuracy criterion.) | Accuracy: on test data acquired together with train cohort 94.73%; on test data acquired consecutively post model development 92.8%; overall accuracy 93.6%. Dice Coefficient by Size Range: 3-5 mm 0.937619; 5-10 mm 0.946289; 10-15 mm 0.962315; >15 mm 0.93206. |
| 1st Trimester SonoLyst/SonoLystLive | The average success rate of SonoLyst 1st Trimester IR, X and SonoBiometry CRL and overall traffic light accuracy is 80% or higher. | The document states "The average success rate...is 80% or higher" as the acceptance criteria and then mentions "Data used for both training and validation has been collected across multiple geographical sites..." but it does not explicitly provide the numerically reported device performance value that met or exceeded the 80% criterion. |
| 2nd Trimester SonoLyst/SonoLystLive | Acceptance criteria are met for both subgroups (variety of ultrasound systems/data formats vs. target platform). (The specific numerical criteria for acceptance are not explicitly stated, but rather that the performance met them for demonstration of generalization.) | The document states "For both subgroups the acceptance criteria are met." but does not explicitly provide the numerically reported device performance values. |
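The comparison between each numeric criterion and its reported value can be checked mechanically. The following is a minimal sketch, not part of the submission: the thresholds and results are transcribed from the table above, while the row labels, the `margins` helper, and the variable names are assumptions made for illustration. Features without a reported numeric result (Anal Sphincter, 1st and 2nd Trimester SonoLyst) are left out.

```python
# (feature, metric, threshold %, reported %), transcribed from the table above.
REPORTED = [
    ("Sono Pelvic Floor 3.0 (MHD)", "MHD Tracking, Good IQ", 70.0, 89.3),
    ("Sono Pelvic Floor 3.0 (MHD)", "MHD Tracking, Challenging IQ", 60.0, 77.7),
    ("Sono Pelvic Floor 3.0 (MHD)", "Minimum MHD Frame Detection, Good IQ", 70.0, 89.3),
    ("Sono Pelvic Floor 3.0 (MHD)", "Minimum MHD Frame Detection, Challenging IQ", 60.0, 83.3),
    ("Sono Pelvic Floor 3.0 (MHD)", "Maximum MHD Frame Detection, Good IQ", 70.0, 90.66),
    ("Sono Pelvic Floor 3.0 (MHD)", "Maximum MHD Frame Detection, Challenging IQ", 60.0, 77.7),
    ("Sono Pelvic Floor 3.0 (MHD)", "Overall MHD, Good IQ", 70.0, 81.9),
    ("Sono Pelvic Floor 3.0 (MHD)", "Overall MHD, Challenging IQ", 60.0, 60.9),
    ("SonoAVC Follicle 2.0", "Accuracy, test data from train cohort", 70.0, 94.73),
    ("SonoAVC Follicle 2.0", "Accuracy, consecutive post-development data", 70.0, 92.8),
    ("SonoAVC Follicle 2.0", "Overall accuracy", 70.0, 93.6),
]


def margins(rows):
    """Return (feature, metric, margin in percentage points, passed) per row."""
    return [
        (feature, metric, reported - threshold, reported >= threshold)
        for feature, metric, threshold, reported in rows
    ]


results = margins(REPORTED)
all_pass = all(passed for _, _, _, passed in results)
# The row with the smallest margin shows which criterion was met most narrowly.
tightest = min(results, key=lambda row: row[2])

for feature, metric, margin, passed in results:
    status = "PASS" if passed else "FAIL"
    print(f"{status}  {feature} | {metric}: margin {margin:+.2f} pp")
```

Under these transcribed numbers every listed criterion is met, and the narrowest margin is Overall MHD on challenging-quality data, which clears its 60% threshold by less than one percentage point.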
2. Sample Sizes Used for the Test Set and Data Provenance
- Sono Pelvic Floor 3.0 (MHD & Anal Sphincter):
  - Test Set Sample Size: 93 volumes for MHD, 106 volumes for Anal Sphincter.
  - Data Provenance: Data is provided by external clinical partners who de-identified the data. Original data collected in 4D volume Cines (*.vol5 or *.4dv6) or 4D/3D volume acquisitions (*.vol2 or *.4dv3).
  - Countries: A diverse range of countries contributed to the test data including Italy, U.S.A, Australia, Germany, Czech Republic, France, India (for MHD); and Italy, U.S.A, France, Germany, India (for Anal Sphincter).
  - Retrospective/Prospective: The data collection method ("re-process data to our needs retrospectively during scan conversion") suggests a retrospective approach to assembling the dataset, although a "standardized data collection protocol was followed for all acquisitions." New data was also acquired post-model development from previously unseen sites to test robustness.
- SonoAVC Follicle 2.0:
  - Test Set Sample Size: 138 datasets, with a total follicle count of 2708 across all volumes.
  - Data Provenance: External clinical partners provided de-identified data in 3D volumes (*.vol or *.4dv).
  - Countries: Germany, India, Spain, United Kingdom, USA.
  - Retrospective/Prospective: The data was split into train/validation/test at the start of model development (suggesting retrospective). Additionally, consecutive data was acquired post-model development from previously unseen systems and probes to test robustness (suggesting some prospective element for this later test set).
- 2nd Trimester SonoLyst/SonoLystLive:
  - Test Set Sample Size: "Total number of images: 2.2M", "Total number of cine loops: 3595". It's not explicitly stated how much of this was test data vs. training data, but it implies a large dataset for evaluation.
  - Data Provenance: Systems used for data collection included GEHC Voluson V730, E6, E8, E10, Siemens S2000, and Hitachi Aloka. Formats included DICOM & JPEG for still images and RAW data for cine loops.
  - Countries: UK, Austria, India, and USA.
  - Retrospective/Prospective: Not explicitly stated, but "All training data is independent from the test data at a patient level" implies a pre-existing dataset split rather than newly acquired prospective data solely for testing.
- 1st Trimester SonoLyst/SonoLystLive:
  - Test Set Sample Size: SonoLyst 1st Trim IR: 5271 images, SonoLyst 1st Trim X: 2400 images, SonoLyst 1st Trim Live: 6000 images, SonoBiometry CRL: 110 images.
  - Data Provenance: Systems included GE Voluson V730, P8, S6/S8, E6, E8, E10, Expert 22, Philips Epiq 7G. Formats included DICOM & JPEG for still images and RAW data for cine loops.
  - Countries: UK, Austria, India, and USA.
  - Retrospective/Prospective: "All training data is independent from the test data at a patient level." "A statistically significant subset of the test data is independent from the training data at a site level, with no test data collected at the site being used in training." This indicates a retrospective collection with careful splitting, and some test data from unseen sites.
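The patient-level and site-level independence quoted above can be illustrated with a short sketch. This is not the manufacturer's procedure, which the document does not describe; it is one common way to obtain such a split, assuming each record carries hypothetical `patient_id` and `site` fields. Whole patients are assigned to one side by hashing their ID, so no patient can appear in both training and test data.

```python
import hashlib


def assign_split(patient_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically map a patient to 'train' or 'test' (hypothetical scheme)."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "test" if bucket < test_fraction else "train"


def split_by_patient(records):
    """Group records so that all images of one patient land in the same split."""
    splits = {"train": [], "test": []}
    for record in records:
        splits[assign_split(record["patient_id"])].append(record)
    return splits


def overlapping_patients(splits):
    """Patients present in both splits; an empty set means patient-level independence."""
    train_ids = {r["patient_id"] for r in splits["train"]}
    test_ids = {r["patient_id"] for r in splits["test"]}
    return train_ids & test_ids


def sites_unseen_in_training(splits):
    """Test sites that contributed no training data (site-level independence)."""
    train_sites = {r["site"] for r in splits["train"]}
    return {r["site"] for r in splits["test"]} - train_sites


# Made-up records for illustration only: 50 patients, 3 images each, 3 sites.
records = [
    {"patient_id": f"P{p:03d}", "site": f"site-{p % 3}", "image": f"img-{p}-{i}"}
    for p in range(50)
    for i in range(3)
]
splits = split_by_patient(records)
print(len(splits["train"]), len(splits["test"]), overlapping_patients(splits))
```

Hashing the patient ID rather than shuffling images keeps the assignment stable when new images of an existing patient arrive later. Site-level independence is not guaranteed by this scheme; `sites_unseen_in_training` only reports which test sites happen to be absent from training.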
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- Sono Pelvic Floor 3.0 (MHD & Anal Sphincter), SonoAVC Follicle 2.0, 2nd Trimester SonoLyst/SonoLystLive, 1st Trimester SonoLyst/SonoLystLive:
  - Number of Experts: Three independent reviewers.
  - Qualifications: "at least two being US Certified sonographers, with extensive clinical experience."
- Additional for 2nd Trimester SonoLyst/SonoLystLive & 1st Trimester SonoLyst/SonoLystLive:
  - For sorting/grading accuracy review, a "5-sonographer review panel" was used. Qualifications are not specified beyond being sonographers.
4. Adjudication Method for the Test Set
- Sono Pelvic Floor 3.0 (MHD & Anal Sphincter), SonoAVC Follicle 2.0, 2nd Trimester SonoLyst/SonoLystLive, 1st Trimester SonoLyst/SonoLystLive:
- The evaluation was "based on interpretation of the AI output by reviewing clinicians." The evaluation was "conducted by three independent reviewers."
- For 2nd and 1st Trimester SonoLyst/SonoLystLive, where sorting/grading accuracy was determined, if initial sorting/grading differed from the ground truth (established by a single sonographer then refined), a 5-sonographer review panel was used, and reclassification was based upon the "majority view of the panel." This implies a form of majority vote adjudication for these specific sub-tasks.
- The general approach for the three reviewers, especially when evaluating AI output, implies an independent review, and while not explicitly stated, differences would likely lead to discussion or a form of consensus/adjudication. However, a strict 'X+Y' model (like 2+1 or 3+1) is not explicitly detailed.
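The panel step described for the SonoLyst features amounts to a majority vote. A minimal sketch follows, assuming hypothetical string labels and a strict-majority rule (more than half of the votes). The document does not say how a panel without a majority would be handled, so this sketch returns `None` in that case rather than guessing.

```python
from collections import Counter


def panel_majority(votes):
    """Return the label chosen by more than half of the panel, or None if there is none."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count * 2 > len(votes) else None


def resolve_label(initial, reference, panel_votes):
    """Keep the initial label when it matches the reference; otherwise defer to the panel."""
    if initial == reference:
        return initial
    return panel_majority(panel_votes)


# Hypothetical grades from a 5-sonographer panel.
print(resolve_label("standard", "non-standard",
                    ["non-standard", "non-standard", "standard", "non-standard", "standard"]))
```

With five voters and two possible labels a strict majority always exists. The `None` branch only matters when three or more labels are in play.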
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, and Effect Size
- The document does not describe a multi-reader multi-case (MRMC) comparative effectiveness study designed to measure how human readers improve with AI vs. without AI assistance. The studies described are primarily aimed at assessing the standalone performance or workflow utility of the AI features.
6. If a Standalone (i.e., Algorithm Only Without Human-in-the-Loop Performance) Was Done
- Yes, standalone performance was assessed for all described AI features. The "Summary test Statistics" and "Verification Results" sections for each feature (Sono Pelvic Floor 3.0, SonoAVC Follicle 2.0, 1st/2nd Trimester SonoLyst/SonoLystLive) report the algorithm's direct performance (e.g., success rates, accuracy, Dice coefficient) against the established ground truth, indicating standalone evaluation. The "interpretation of the AI output by reviewing clinicians" method primarily focuses on validating the AI's direct result rather than a comparative human performance study.
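For reference, the Dice coefficient reported for SonoAVC Follicle 2.0 measures the overlap between a predicted segmentation and the ground-truth segmentation: twice the size of their intersection divided by the sum of their sizes. The sketch below uses the standard definition on sets of voxel coordinates. The document does not describe the exact computation used in the submission, and returning 1.0 when both masks are empty is an assumed convention.

```python
def dice_coefficient(predicted, truth):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two collections of voxel coordinates."""
    predicted_set, truth_set = set(predicted), set(truth)
    total = len(predicted_set) + len(truth_set)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(predicted_set & truth_set) / total


# Toy 2D example: each mask has 3 pixels, 2 of which are shared.
predicted_mask = {(0, 0), (0, 1), (1, 1)}
truth_mask = {(0, 1), (1, 1), (1, 2)}
print(dice_coefficient(predicted_mask, truth_mask))  # 2 * 2 / (3 + 3)
```

A value of 1.0 means the two masks coincide exactly and 0.0 means they share no voxels, so the reported values of roughly 0.93 to 0.96 indicate close agreement across the follicle size ranges.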
7. The Type of Ground Truth Used
- Expert Consensus/Annotation:
- Sono Pelvic Floor 3.0 (MHD): Ground truth was established through a "two-stage curation process." Curators identified the MHD plane and marked anatomical structures. These curated datasets were then "reviewed by expert arbitrators."
- Sono Pelvic Floor 3.0 (Anal Sphincter): Ground truth involved "3D segmentation of the Anal Canal using VOCAL tool in the 4D View5 Software." Each volume was "reviewed by a skilled arbitrator for correctness."
- SonoAVC Follicle 2.0: The "Truthing process for training dataset" indicates a "detailed curation protocol (developed by clinical experts)" and a "two-step approach" with an arbitrator reviewing all datasets for clinical accuracy.
- 2nd Trimester SonoLyst/SonoLystLive & 1st Trimester SonoLyst/SonoLystLive: Ground truth for sorting/grading was initially done by a single sonographer, then reviewed by a "5-sonographer review panel" for accuracy, with reclassification based on majority view if needed.
8. The Sample Size for the Training Set
- Sono Pelvic Floor 3.0 (MHD): Total Volumes: 983
- Sono Pelvic Floor 3.0 (Anal Sphincter): Total Volumes: 828
- SonoAVC Follicle 2.0: Total Volumes: 249
- 2nd Trimester SonoLyst/SonoLystLive: "Total number of images: 2.2M", "Total number of cine loops: 3595". (The precise breakdown of training vs. test from this total isn't given for this feature, but it's a large overall dataset).
- 1st Trimester SonoLyst/SonoLystLive: 122,711 labelled source images from 35,861 patients.
9. How the Ground Truth for the Training Set Was Established
- Sono Pelvic Floor 3.0 (MHD): A two-stage curation process. First, curators identify the MHD plane and then mark anatomical structures. These curated datasets are then reviewed by expert arbitrators and "changes/edits made if necessary to maintain correctness and consistency in curations."
- Sono Pelvic Floor 3.0 (Anal Sphincter): "3D segmentation of the Anal Canal using VOCAL tool in the 4D View5 Software." Curation protocol involved aligning the volume and segmenting the Anal Canal. Each volume was "reviewed by a skilled arbitrator for correctness."
- SonoAVC Follicle 2.0: A "two-step approach" was followed. First, curators were trained on a "detailed curation protocol (developed by clinical experts)." Second, an automated quality control step confirmed mask/marking availability, and an arbitrator reviewed all datasets from each curator's completed data pool for clinical accuracy, with inconsistencies discussed by the curation team.
- 2nd Trimester SonoLyst/SonoLystLive & 1st Trimester SonoLyst/SonoLystLive: The images were initially "curated (sorted and graded) by a single sonographer." If these differed from the ground truth (which implies a higher standard or previous ground truth for comparison), a "5-sonographer review panel" reviewed them and reclassified based on majority view to achieve the final ground truth.
§ 892.1550 Ultrasonic pulsed doppler imaging system.
(a) Identification. An ultrasonic pulsed doppler imaging system is a device that combines the features of continuous wave doppler-effect technology with pulsed-echo effect technology and is intended to determine stationary body tissue characteristics, such as depth or location of tissue interfaces or dynamic tissue characteristics such as velocity of blood or tissue motion. This generic type of device may include signal analysis and display equipment, patient and equipment supports, component parts, and accessories.
(b) Classification. Class II.