RevealAI-Lung Software is a computer-aided diagnostic (CADx) software application intended for the characterization of incidentally-detected lung nodules on computed tomography (CT) scans. When a nodule is identified, the Software automatically compares the nodule characteristics with a clinically established database of lung nodules and provides a similarity score to assist clinicians' assessment of patients' cancer risk.
The mSI score is indicated for the evaluation of incidentally-detected pulmonary nodules of diameter 6-15mm in patients aged 18 years or above. In cases where multiple abnormalities are present, the mSI score can be used to assess each abnormality independently. Risk should be interpreted on an individual patient level and mSI is a relative risk score, not a percentage cancer risk.
Note that mSI is not indicated for lung cancer screening. The validation data excluded CT images with missing slices.
The RevealAI-Lung device is a post-processing software program that analyzes patient lung computed tomography (CT) images and is designed to provide computer-aided diagnostic (CADx) information about lung nodules to radiologists.
The user opens the patient's lung CT image from a third-party acquisition device in an existing medical device viewing system and scrolls through the image slices as in their normal workflow. The user identifies a lung nodule on the CT image and evaluates that nodule for cancer risk and the potential need for follow-up using existing known risk factors, clinical management guidelines, and the RevealAI-Lung-provided mSI score. In cases where multiple nodules are present, RevealAI-Lung can be used to assess each nodule independently.
Here's a breakdown of the acceptance criteria and the study proving RevealAI-Lung meets them, based on the provided FDA 510(k) Clearance Letter:
1. Table of Acceptance Criteria and Reported Device Performance
| Acceptance Criterion | Reported Device Performance |
|---|---|
| Primary Endpoint (Multi-Reader Multi-Case (MRMC) Study): Improvement in radiologists' ability to discriminate between malignant and benign pulmonary nodules from CT images with and without the aid of the mSI. Measured as the difference in Area Under the Receiver Operating Characteristic Curve (AUC). | Average AUC improvement: 0.181 (from 0.538 unassisted to 0.719 with RevealAI-Lung assistance). This difference was statistically significant (p < 0.0001). |
| Consistency of Performance Across Readers: Every radiologist must improve their performance when using RevealAI-Lung. | Achieved: Every radiologist (10/10) improved their performance when using RevealAI-Lung. Individual AUC improvements ranged from 0.106 to 0.258. |
| Sensitivity Improvement (at 5% malignancy likelihood threshold): Increase in sensitivity when using RevealAI-Lung. | Increased sensitivity by 14 points (from 0.68 ± 0.039 to 0.82 ± 0.036). |
| Specificity Improvement (at 5% malignancy likelihood threshold): Increase in specificity when using RevealAI-Lung. | Increased specificity by 12 points (from 0.344 ± 0.041 to 0.467 ± 0.043). |
| Standalone Performance: Ability of RevealAI-Lung to discriminate between benign and malignant nodules. | Achieved: Standalone testing of RevealAI-Lung demonstrated it performed as expected in discriminating between benign and malignant nodules. (Specific quantitative metrics for standalone AUC are not explicitly provided, but "performed as expected" is stated.) |
| Validation on External Populations: Consistent device performance across additional incidental nodule populations. | Achieved: Tested on three additional populations (US, Canada, UK). Each study produced performance with an AUC > 0.8, and demonstrated follow-up decisions would be improved compared to clinical guidelines. |
| Consistency Across Subgroups: Performance improvements consistent across patient, nodule, and technical parameters. | Achieved: Results were independent of radiologist experience, patient demographics (age, sex, race/ethnicity), scan characteristics (contrast, scan date, manufacturer), and nodule parameters (size, lobe, opacity). Range of improvement in subgroups: 0.12 - 0.30. |
| Software Quality System Compliance: Adherence to FDA guidance for software in medical devices, 21 CFR §892.2060 special controls, human factors, usability, and cybersecurity. | Achieved: Design, validation, and verification were planned, executed, and documented according to FDA guidance. Assessed as Moderate Level of Concern. Usability evaluations confirmed safety and effectiveness. Cybersecurity activities and risk management were performed. |
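The sensitivity and specificity rows above are evaluated at a fixed 5% malignancy-likelihood operating point. As a minimal sketch of how such operating-point metrics are computed (the scores and labels below are illustrative, not data from the submission):

```python
import numpy as np

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity of `scores` against binary `labels`
    at a fixed operating threshold (score >= threshold -> positive call)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    calls = scores >= threshold
    sensitivity = np.mean(calls[labels])    # TP / (TP + FN), over malignant cases
    specificity = np.mean(~calls[~labels])  # TN / (TN + FP), over benign cases
    return sensitivity, specificity

# Illustrative malignancy-likelihood scores (0-1) and ground-truth labels
scores = [0.02, 0.10, 0.40, 0.90, 0.03, 0.07]
labels = [0,    1,    1,    1,    0,    0]
sens, spec = sens_spec(scores, labels, threshold=0.05)
```

Assisted and unassisted reads would each be scored this way at the same threshold, and the reported "points" of improvement are the difference in these proportions.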
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size for Clinical Performance Testing (MRMC Study): 108 cases (patients) with incidental lung nodules. The cases included size-matched benign and malignant nodules.
- Sample Size for Validation Testing on External Populations: 675 patients with incidental lung nodules (276 with cancer).
- Data Provenance:
- MRMC Study: Sourced from 3 US sites and 1 in Canada.
- External Validation Studies: One each from the US, Canada, and the UK.
- Retrospective or Prospective: Both the MRMC study and the external validation studies appear to be based on retrospective data, as they used "CT series... from patients in routine practice where lung nodules had been noted incidentally on the original radiology report" and involved "following the patients for at least 5 years" for ground truth (where pathology was not available).
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
The document specifies that the ground truth for the test sets (both MRMC and external validation) was established with a "strict requirement for diagnostic certainty (either pathologic confirmation or two-year radiologic monitoring to confirm benign nodules)."
While it does not explicitly state the number of experts who established the ground truth, the reliance on "pathologic confirmation" or "two-year radiologic monitoring" implies standard clinical practice involving pathologists and/or radiologists in the diagnostic process. The MRMC study itself involved 10 radiologists reading the cases; while they assessed malignancy likelihood, the ground truth for those cases was pre-established using the methods described.
4. Adjudication Method for the Test Set
The adjudication method for establishing the ground truth (pathologic confirmation or two-year radiological monitoring) is not explicitly detailed in terms of expert consensus (e.g., 2+1, 3+1). However, the "strict requirement for diagnostic certainty" implies a high standard of clinical diagnosis.
For the MRMC study's reader evaluations, there was no direct adjudication of reader disagreement against each other. Instead, each reader's interpretation (with and without AI) was compared against the pre-established ground truth for each case. Each case was read twice by each reader, separated by a 28-day washout period, with AI use randomized for the second read.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, What Was the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance
Yes, an MRMC comparative effectiveness study was done.
- Reader Improvement: Radiologists improved their accuracy for the diagnosis of pulmonary nodules by an average of 18 points (0.181 AUC).
- Average AUC without the device: 0.538
- Average AUC with the device: 0.719
- Statistical Significance: This difference was statistically significant (p < 0.0001; Dorfman-Berbaum-Metz ANOVA random-reader random-case (RRRC) with jackknife (Wilcoxon)).
- Consistent Improvement: Every radiologist (10 out of 10) improved their performance when using RevealAI-Lung, with individual improvements ranging from 0.11 to 0.26 AUC points.
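The per-reader AUC values behind these figures can be estimated nonparametrically with the Mann–Whitney rank statistic, which is consistent with the "(Wilcoxon)" note in the analysis description above. A minimal sketch with illustrative scores (not data from the submission; the actual significance testing used DBM RRRC ANOVA):

```python
import numpy as np

def auc_mann_whitney(scores, labels):
    """AUC as the probability that a randomly chosen malignant case is
    scored higher than a randomly chosen benign case (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Illustrative unassisted vs. AI-assisted scores for one reader
labels     = [1, 1, 1, 0, 0, 0]
unassisted = [0.6, 0.4, 0.5, 0.5, 0.3, 0.7]
assisted   = [0.8, 0.6, 0.7, 0.4, 0.2, 0.5]
delta = auc_mann_whitney(assisted, labels) - auc_mann_whitney(unassisted, labels)
```

The reported effect size (0.181) is the average of such per-reader deltas across the 10 readers and 108 cases.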
6. If a Standalone (i.e., Algorithm Only Without Human-in-the-Loop Performance) Was Done
Yes, standalone testing was done.
- Performance: "Standalone testing of RevealAI-Lung demonstrated that it performed as expected in discriminating between benign and malignant nodules."
- Additional Validation: "Validation of RevealAI-Lung was performed to determine device performance against the ground truth using pre-established acceptance criteria. The device was subsequently tested on incidental nodules from three additional populations (one each US, Canada, and the UK). Each of these studies produced performance with an AUC > 0.8, and demonstrated follow-up decisions would be improved compared to clinical guidelines." This indicates strong standalone performance on external datasets.
7. The Type of Ground Truth Used (Expert Consensus, Pathology, Outcomes Data, etc.)
The ground truth for both training and validation sets was established with "strict requirement for diagnostic certainty":
- Pathologic Confirmation: For malignant nodules, this would typically involve biopsy results.
- Two-Year Radiologic Monitoring: For benign nodules, this means a stable appearance over two years of follow-up CT scans, indicating a non-cancerous nature.
- Outcome Data: The phrase "following the patients for at least 5 years" for confidently matched diagnoses used in training, and implied in validation, points to long-term outcomes data to confirm the definitive diagnosis.
8. The Sample Size for the Training Set
- Training Dataset: RevealAI-Lung was trained on "radiologist-identified lung nodules from 4-30mm in diameter."
- Specific Sample Size: The exact number of cases or nodules in the training set is not explicitly stated in the provided document, beyond the characteristics of the subjects (median age 63, 43% female).
9. How the Ground Truth for the Training Set Was Established
- Method: "Only nodules that were confidently matched to a definitive diagnosis were used for training, including following the patients for at least 5 years."
- This implies a combination of pathology (for malignant cases) and long-term radiologic stability/outcomes (for benign cases) to ensure diagnostic certainty, similar to the method described for the test sets. The mention of "radiologist-identified lung nodules" for the training set likely refers to how the nodules were initially marked or selected, while the "confidently matched to a definitive diagnosis" over 5 years is how their ground truth was ultimately confirmed.
Neurophet AQUA AD Plus is intended for automatic labeling, visualization, and volumetric quantification of segmentable brain structures and lesions, as well as SUVR quantification from a set of MR and PET images. Volumetric measurements may be compared to reference percentile data.
Neurophet AQUA AD Plus is a software device intended for the automatic labeling of brain structures, visualization, and volumetric quantification of segmented brain regions and lesions, as well as standardized uptake value ratio (SUVR) quantification using MR and PET images. The volumetric outcomes are compared to normative reference data to support the evaluation of neurodegeneration and cognitive impairment.
The device is designed to assist physicians in clinical evaluation by streamlining the clinical workflow from patient registration through image analysis, analysis result archiving, and report generation using software-based functionalities. The device provides percentile-based results by comparing an individual's imaging-derived quantitative analysis results to reference populations. Percentile-based results are provided for reference only and are not intended to serve as a standalone basis for diagnostic decision-making. Clinical interpretation must be performed by qualified healthcare professionals.
Here's a breakdown of the acceptance criteria and study details for the Neurophet AQUA AD Plus, based on the provided FDA 510(k) Clearance Letter:
Acceptance Criteria and Device Performance for Neurophet AQUA AD Plus
The Neurophet AQUA AD Plus employs multiple AI modules for automated segmentation and quantitative analysis of brain structures and lesions using MR and PET images. The device's performance was validated against predefined acceptance criteria for each module.
1. Table of Acceptance Criteria and Reported Device Performance
| AI Module | Performance Metric | Acceptance Criteria | Reported Device Performance |
|---|---|---|---|
| T1-SegEngine (T1-weighted structural MRI segmentation) | Accuracy (Dice Similarity Coefficient - DSC) | 95% CI of DSC: [0.750, 0.850] for major cortical brain structures 95% CI of DSC: [0.800, 0.900] for major subcortical brain structures | Cortical Regions: Mean DSC: 0.83 ± 0.04 (95% CI: 0.82–0.84) Subcortical Regions: Mean DSC: 0.87 ± 0.03 (95% CI: 0.86–0.88) |
| | Reproducibility (Average Volume Difference Percentage - AVDP) | Equivalence range: 1.0–5.0% for both subcortical and cortical regions | Subcortical Regions: Mean AVDP: 2.50 ± 0.93% (95% CI: 2.26–2.74) Cortical Regions: Mean AVDP: 1.79 ± 0.74% (95% CI: 1.60–1.98) |
| FLAIR-SegEngine (T2-FLAIR hyperintensity segmentation) | Accuracy (Dice Similarity Coefficient - DSC) | Mean DSC ≥ 0.80 | Mean DSC: 0.90 ± 0.04 (95% CI: 0.89–0.91) |
| | Reproducibility (Mean AVDP and Absolute Lesion Volume Difference) | Absolute difference < 0.25 cc Mean AVDP < 2.5% | Mean AVDP: 0.99 ± 0.66% Mean absolute lesion volume difference: 0.08 ± 0.06 cc |
| PET-Engine (SUVR and Centiloid quantification) | SUVR Accuracy (Intraclass Correlation Coefficient - ICC) | ICC ≥ 0.60 across Alzheimer's-relevant regions (compared to FDA-cleared reference product K221405) | ICC ≥ 0.993 across seven Alzheimer's-relevant regions |
| | Centiloid Classification (Kappa value for amyloid positivity) | κ ≥ 0.70 (indicating substantial agreement with consensus expert visual reads) | Kappa values met or exceeded the criterion (specific values not provided) |
| ED-SegEngine (edema-like T2-FLAIR hyperintensity segmentation) | Accuracy (Dice Similarity Coefficient - DSC) | DSC ≥ 0.70 | Mean DSC: 0.91 ± 0.09 (95% CI: 0.89–0.93) |
| HEM-SegEngine (GRE/SWI hypointense lesion segmentation) | Accuracy (F1-score / DSC) | F1-score ≥ 0.60 | Median F1-score (DSC): 0.860 (95% CI: 0.824–0.902) |
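The Dice similarity coefficient and AVDP metrics in the table can be sketched for binary segmentation masks as follows. The masks and volumes are hypothetical, and the AVDP denominator (mean of the paired volumes) is an assumption, since the letter does not spell out the exact formula:

```python
import numpy as np

def dice(a, b):
    """Sørensen–Dice similarity of two binary masks: 2|A∩B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def avdp(vol_a, vol_b):
    """Absolute volume difference as a percentage of the mean paired volume,
    as used for test-retest reproducibility (denominator is an assumption)."""
    return 100.0 * abs(vol_a - vol_b) / ((vol_a + vol_b) / 2.0)

# Hypothetical 4x4 masks: model output vs. reference label
seg = np.zeros((4, 4), bool); seg[1:3, 1:3] = True
ref = np.zeros((4, 4), bool); ref[1:3, 1:4] = True
overlap = dice(seg, ref)            # agreement of the two masks
repro = avdp(100.0, 98.0)           # e.g. 100 cc vs. 98 cc on a repeat scan
```

In practice the masks are 3D voxel arrays and volumes are voxel counts scaled by voxel size; the logic is unchanged.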
2. Sample Sizes and Data Provenance for the Test Set
- T1-SegEngine (Accuracy): 60 independent T1-weighted MRI cases. Data provenance is not explicitly stated, but the cases implicitly come from the same public repositories (e.g., ADNI, AIBL, PPMI) and institutional clinical sites mentioned for the training data, while remaining distinct from the training set.
- T1-SegEngine (Reproducibility): 60 subjects with paired T1-weighted scans (120 scans total). Data provenance not explicitly stated.
- FLAIR-SegEngine (Accuracy): 136 independent T2-FLAIR cases. Data provenance not explicitly stated, but distinct from training data.
- FLAIR-SegEngine (Reproducibility): Paired T2-FLAIR scans (number not specified). Data provenance not explicitly stated.
- PET-Engine (SUVR accuracy): 30 paired MRI–PET datasets. Data provenance not explicitly stated, but implicitly from multi-center studies including varied tracers and sites.
- PET-Engine (Centiloid classification): 176 paired T1-weighted MRI and amyloid PET scans from ADNI and AIBL. These are public repositories, likely involving diverse geographical data (e.g., USA, Australia). Data is retrospective.
- ED-SegEngine (Accuracy): 100 T2-FLAIR scans collected from U.S. and U.K. clinical sites. Data is retrospective.
- HEM-SegEngine (Accuracy): 106 GRE/SWI scans from U.S. clinical sites. Data is retrospective.
For all modules, validation datasets were fully independent from training datasets at the subject level, drawn from distinct sites and/or repositories where applicable.
The validation cohorts covered adult subjects across a broad age range (approximately 40–80+ years), with both females and males represented.
Racial/ethnic composition included White, Asian, Black, and African American subjects, depending on the underlying public and institutional datasets.
Clinical subgroups included clinically normal, mild cognitive impairment, and Alzheimer's disease for structural, FLAIR, and PET modules, and cerebrovascular/amyloid‑related pathologies for ED‑ and HEM‑SegEngines.
3. Number of Experts and Qualifications for Ground Truth
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- Number of Experts: Not explicitly stated as a specific number, but "subspecialty-trained neuroradiologists" were used.
- Qualifications: "Subspecialty-trained neuroradiologists." Specific years of experience are not mentioned.
For Centiloid classification in the PET-Engine:
- Number of Experts: "Consensus expert visual reads." The exact number is not specified, but the wording implies multiple experts.
- Qualifications: "Experts" trained in established amyloid PET reading criteria. Specific qualifications beyond "expert" and training in criteria are not detailed.
4. Adjudication Method for the Test Set
For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines):
- "Consensus/adjudication procedures and internal quality control to ensure consistency" were used for establishing reference segmentations. The specific 2+1, 3+1, or other detailed method is not provided.
For Centiloid classification in the PET-Engine:
- "Consensus expert visual interpretation" was used. The specific method details are not provided.
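Agreement between device amyloid-positivity calls and consensus visual reads, as in the κ ≥ 0.70 acceptance criterion above, is conventionally quantified with Cohen's kappa. A minimal sketch with hypothetical positive/negative calls:

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two binary raters: (p_o - p_e) / (1 - p_e),
    where p_o is observed agreement and p_e is chance agreement."""
    a, b = np.asarray(a), np.asarray(b)
    p_o = np.mean(a == b)                          # observed agreement
    p_pos = np.mean(a) * np.mean(b)                # chance both call positive
    p_neg = (1 - np.mean(a)) * (1 - np.mean(b))    # chance both call negative
    p_e = p_pos + p_neg
    return (p_o - p_e) / (1 - p_e)

# Hypothetical amyloid-positivity calls: device (Centiloid-based) vs. visual read
device = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
visual = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
```

By the usual Landis–Koch bands, κ between 0.61 and 0.80 is "substantial" agreement, which is why κ ≥ 0.70 is a plausible acceptance threshold.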
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
The provided text does not indicate that an MRMC comparative effectiveness study was done to compare human readers with AI assistance versus without AI assistance. The performance studies primarily focus on the standalone (algorithm-only) performance of the device against expert-derived ground truth or a cleared reference product.
6. Standalone (Algorithm-Only) Performance Study
Yes, a standalone (algorithm only without human-in-the-loop performance) study was done for all AI modules. The text explicitly states: "Standalone performance tests were conducted for each module using validation datasets that were completely independent from those used for model development and training." The results presented in the table above reflect this standalone performance.
7. Type of Ground Truth Used
- Expert Consensus:
- For structural and lesion segmentation modules (T1-, FLAIR-, ED-, HEM-SegEngines), reference segmentations were generated by "subspecialty-trained neuroradiologists using predefined anatomical and lesion‑labeling criteria, with consensus/adjudication procedures."
- For Centiloid classification in the PET-Engine, reference labels were derived from "consensus expert visual interpretation using established amyloid PET reading criteria."
- Comparison to Cleared Reference Product:
- For SUVR quantification in the PET-Engine, reference values were obtained from an "FDA‑cleared reference product (K221405)" (Neurophet SCALE PET).
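Agreement of SUVR values with the cleared reference product (the ICC ≥ 0.60 criterion) can be sketched with a one-way random-effects intraclass correlation. The letter does not state which ICC form was used, so ICC(1,1) and the SUVR values below are assumptions for illustration:

```python
import numpy as np

def icc_1_1(x):
    """One-way random-effects ICC(1,1) for an (n_subjects, k_measurements)
    array: (MSB - MSW) / (MSB + (k-1) * MSW)."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    subj_means = x.mean(axis=1)
    msb = k * np.sum((subj_means - grand) ** 2) / (n - 1)          # between subjects
    msw = np.sum((x - subj_means[:, None]) ** 2) / (n * (k - 1))   # within subjects
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical SUVR pairs: [device, reference product] per region/subject
suvr = np.array([[1.10, 1.12], [1.45, 1.44], [0.98, 1.01],
                 [1.80, 1.78], [1.25, 1.26]])
agreement = icc_1_1(suvr)
```

With within-pair differences this small relative to the between-subject spread, the ICC is close to 1, consistent with the reported ICC ≥ 0.993.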
8. Sample Size for the Training Set
The exact sample size for the training set is not explicitly stated as a single number. However, the document mentions:
- "The AI-based modules (T1‑SegEngine, FLAIR‑SegEngine, PET‑Engine, ED‑SegEngine, HEM‑SegEngine) were trained using multi-center MRI and PET datasets collected from public repositories (e.g., ADNI, AIBL, PPMI) and institutional clinical sites."
- "Training data covered:
- Adult subjects across a broad age range (approximately 20–80+ years), with both sexes represented and including multiple racial/ethnic groups (e.g., White, Asian, Black).
- A spectrum of clinical conditions relevant to the intended use, including clinically normal, mild cognitive impairment, and Alzheimer's disease, as well as patients with cerebrovascular and amyloid‑related pathologies for lesion-segmentation modules.
- MRI acquired on major vendor platforms (GE, Siemens, Philips) at 1.5T and 3T... and amyloid PET acquired on multiple PET systems with commonly used tracers (Amyvid, Neuraceq, Vizamyl)."
This indicates a large and diverse training set, although a precise count of subjects or images isn't provided.
9. How the Ground Truth for the Training Set Was Established
The document implies that the training data included "manual labels" as it states: "No images or manual labels from the training datasets were reused in the validation datasets." However, it does not explicitly detail the process by which these "manual labels" or ground truth for the training set were established (e.g., number of experts, qualifications, adjudication method for training data). It's reasonable to infer that similar expert-driven processes were likely used for training ground truth as for validation, but this is not explicitly confirmed in the provided text.
Surgical Reality Viewer is a medical imaging visualization software intended to assist trained healthcare professionals with preoperative and intraoperative visualizations, by displaying 2D and 3D renderings of DICOM compliant patient images and normal anatomic segmentations derived from patient images as well as functions for manipulation of segmentations and 3D models.
Surgical Reality Viewer assists the trained healthcare professional who is responsible for making all final patient management decisions.
The machine learning algorithms in use by Surgical Reality Viewer are intended for use on adult patients aged 22 years and over.
Surgical Reality Viewer is medical imaging visualization software that accepts DICOM-compliant images (e.g., CT scans or MR images) and segmentation files in various 3D object file formats (e.g., NIfTI, OBJ, MHD, STL). The device can generate preliminary segmentations of normal anatomy on demand using machine learning and computer vision algorithms, and provides tools for editing and/or creating segmentations using various built-in 2D and 3D image manipulation functions. The software generates a 3D segmented view of the loaded patient data on a supported 2D or 3D screen. Features include pre-operative (re)viewing of DICOM data overlaid with segmentations; (intra/post)operative visualization of anatomical structures; 2D viewing, volume rendering, surface rendering, and immersive, interactive 3D viewing; 2D and 3D measurement of DICOM image data; local storage; anatomic labelling, including segmentation tools; and tools for annotation, brushing, or carving of anatomical structures. Surgical Reality Viewer runs on a dedicated computer within the customer environment that meets specific hardware requirements: a Windows operating system (version 10 or higher), an Nvidia GeForce 2070 GPU, an Intel i7 CPU, 16 GB RAM, and at least 100 GB of free hard drive space.
Here's a breakdown of the acceptance criteria and study details for the Surgical Reality Viewer, based on the provided FDA 510(k) clearance letter and summary:
Acceptance Criteria and Reported Device Performance
The provided document details the performance of the machine learning algorithms for various anatomical segmentations using the Sørensen–Dice coefficient (DSC). Additionally, it describes a qualitative assessment of suitability.
Table of Acceptance Criteria (Implicit) and Reported Device Performance
| Anatomical Structure | Metric (Implicit Acceptance Criteria) | Reported Device Performance |
|---|---|---|
| Lobe segmentation | Average Sørensen–Dice coefficient (DSC) | 0.97 |
| - LUL | DSC | 0.98 |
| - LLL | DSC | 0.98 |
| - RUL | DSC | 0.98 |
| - RLL | DSC | 0.98 |
| - RML | DSC | 0.96 |
| Vessel segmentation | Average Sørensen–Dice coefficient (DSC) | 0.84 |
| - Artery | DSC | 0.84 |
| - Vein | DSC | 0.83 |
| Airway segmentation | Sørensen–Dice coefficient (DSC) | 0.96 |
| Aorta segmentation | Sørensen–Dice coefficient (DSC) | 0.96 |
| Pulmonary segmentation | Average Sørensen–Dice coefficient (DSC) | 0.85 |
| - Left segments | DSC | 0.85 |
| - Right segments | DSC | 0.85 |
| Qualitative Scores (Suitability) | (Score 1-5, higher is better) | Reported Scores: |
| Airways segmentations | Suitability score | 4.8 |
| Artery segmentations | Suitability score | 4.8 |
| Vein segmentations | Suitability score | 4.9 |
| Lobe Segmentations | Suitability score | 5.0 |
| Pulmonary lobe segments | Suitability score | 4.7 |
| Aorta segmentations | Suitability score | 5.0 |
Note on Acceptance Criteria: The document directly presents the performance metrics (DSC and qualitative scores). While explicit numerical acceptance criteria (e.g., "must be >= 0.95 DSC") are not stated, the reported high performance figures implicitly demonstrate the device meets acceptable levels for these metrics.
Study Details
1. Sample Size Used for the Test Set and Data Provenance
- Sample Size: 102 CT images (Each study belonged uniquely to a single patient subject).
- Data Provenance: 60 (n=60) scans were obtained from the United States. The remaining 42 scans' country of origin is not specified, but the document mentions "geographical location" as a subgroup for generalizability.
- Retrospective/Prospective: Not explicitly stated, but the mention of "curated datasets" and "clinical testing dataset" without ongoing patient enrollment suggests a retrospective study.
2. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts
- Number of Experts: Not explicitly stated as a specific number. The document mentions "trained professionals" who generated the initial segmentations and "thoracic surgeons with a minimum of 2 years professional working experience" who verified these segmentations. This implies at least two distinct groups of experts were involved, potentially multiple individuals within each group.
- Qualifications of Experts:
- Initial Segmentation Generation: "Trained professionals." (Specific professional background and experience level not detailed).
- Segmentation Verification: "Thoracic surgeons with a minimum of 2 years professional working experience."
3. Adjudication Method (for the Test Set)
- Adjudication Method: Not explicitly stated. The process described is "segmented by trained professionals and the segmentations were verified by thoracic surgeons." This suggests a single ground truth was established after the verification step, but the specific process for resolving discrepancies (e.g., consensus, tie-breaking by a third expert) is not detailed. It does not mention a 2+1 or 3+1 method.
4. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done
- MRMC Study: No, an MRMC comparative effectiveness study was not explicitly described. The study focuses on the standalone performance of the algorithm against ground truth, and separate qualitative scoring of the suitability of segmentations. There is no mention of comparing human readers with and without AI assistance to determine an "effect size" of improvement.
5. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done
- Standalone Study: Yes, a standalone performance study was done. The "Performance was verified by comparing segmentations generated by the machine learning models against ground truth segmentations generated by trained professionals." This directly assesses the algorithm's performance without a human in the loop for generating the primary segmentation output being evaluated for accuracy.
6. The Type of Ground Truth Used
- Type of Ground Truth: The ground truth for the quantitative analysis (DSC) was established by "expert consensus" (or at least expert-verified segmentations). Specifically, "segmentations generated by trained professionals and the segmentations were verified by thoracic surgeons." For the qualitative assessment, "medical professionals were tasked to qualitatively score the suitability of the segmentations provided through the Viewer," which is also an expert-based evaluation of the AI output.
7. The Sample Size for the Training Set
- Training Set Sample Size: Not explicitly stated. The document mentions "Each of the algorithms has been trained and tuned on curated datasets representative of the intended patient population," but does not provide a specific number for the training set. It only states that a "CT image was either part of the tuning or testing dataset and not in both," indicating that the 102 CT images used for testing were separate from the training/tuning data.
8. How the Ground Truth for the Training Set Was Established
- Training Set Ground Truth: Not explicitly stated. The document mentions "trained and tuned on curated datasets representative of the intended patient population." While not explicitly detailed, it's reasonable to infer that a similar expert-driven process (like the ground truth establishment for the test set) would have been used for creating the ground truth in the training dataset to ensure high-quality training data.
Seg Pro V3 is a software device intended to assist trained radiation oncology professionals, including, but not limited to, radiation oncologists, medical physicists, and dosimetrists, during their clinical workflows of radiation therapy treatment planning by providing initial contours of organs at risk on DICOM images. Seg Pro V3 is intended to be used on adult patients only.
The contours are generated by deep-learning algorithms and then transferred to radiation therapy treatment planning systems. Seg Pro V3 must be used in conjunction with a DICOM-compliant treatment planning system to review and edit results generated. Seg Pro V3 is not intended to be used for decision making or to detect lesions.
Seg Pro V3 is an adjunct tool and is not intended to replace a clinician's judgment and manual contouring of the normal organs on DICOM images. Clinicians must not use the software generated output alone without review as the primary interpretation.
The proposed device, Seg Pro V3, is a standalone software that is designed to be used by trained radiation oncology professionals to automatically delineate (segment/contour) organs-at-risk (OARs) on DICOM images. This auto-contouring of OARs is intended to facilitate radiation therapy workflows.
The device receives images in DICOM format as input and automatically generates the contours of OARs, which are stored in DICOM format and in RTSTRUCT modality. The device must be used in conjunction with a DICOM-compliant treatment planning system (TPS) to review and edit results. Once data is routed to Seg Pro V3, the data will be processed and no user interaction is required, nor provided.
The deployment environment is recommended to be a local network with an existing hospital-grade IT system in place. Seg Pro V3 should be installed on a specialized server supporting deep-learning processing. The following configurations are operated only by the manufacturer:
- Local network settings for input and output destinations.
- Presentation of labels and their colors.
- Processed image management and output (RTSTRUCT) file management.
Here's an analysis of the acceptance criteria and study proving the device meets those criteria, based on the provided FDA 510(k) clearance letter for Seg Pro V3 (RT-300):
1. Table of Acceptance Criteria and Reported Device Performance
| Acceptance Criteria (Metric) | Threshold (for large, medium, small volume structures) | Reported Device Performance (Mean DSC for respective sizes) |
|---|---|---|
| Dice Similarity Coefficient (DSC) | > 0.80 for large-volume structures | 0.90 |
| Dice Similarity Coefficient (DSC) | > 0.65 for medium-volume structures | 0.86 |
| Dice Similarity Coefficient (DSC) | > 0.50 for small-volume structures | 0.73 |
| Overall Mean DSC | (N/A - overall performance reported) | 0.85 |
| Overall Median 95% Hausdorff Distance (HD) | (N/A - overall performance reported) | 2.62 mm |
| Median 95% HD for large-volume structures | (N/A - specific threshold not defined) | 3.01 mm |
| Median 95% HD for medium-volume structures | (N/A - specific threshold not defined) | 2.57 mm |
| Median 95% HD for small-volume structures | (N/A - specific threshold not defined) | 2.27 mm |
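The DSC and 95% Hausdorff distance (HD95) metrics reported above are standard segmentation-agreement measures. As a minimal sketch (not the manufacturer's implementation), both can be computed for a pair of binary masks with NumPy and SciPy; the `spacing` parameter is an assumed voxel-spacing input used to express distances in millimeters:

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def surface(mask):
    """Boundary voxels of a binary mask (the mask minus its erosion)."""
    m = mask.astype(bool)
    return m & ~binary_erosion(m)

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric Hausdorff distance between mask surfaces (mm)."""
    sa, sb = surface(a), surface(b)
    # distance_transform_edt assigns each nonzero voxel its distance to the
    # nearest zero voxel, so invert the target surface to get distances to it.
    d_to_b = distance_transform_edt(~sb, sampling=spacing)
    d_to_a = distance_transform_edt(~sa, sampling=spacing)
    dists = np.concatenate([d_to_b[sa], d_to_a[sb]])
    return float(np.percentile(dists, 95))
```

DSC rewards volumetric overlap and is relatively insensitive to small boundary excursions, which is why a complementary boundary metric such as HD95 is typically reported alongside it.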
Study Details Proving Device Meets Acceptance Criteria
2. Sample size used for the test set and the data provenance:
- Sample Size: 175 cases.
- Data Provenance: Consecutively collected from the Cancer Imaging Archive (TCIA) datasets. The data was acquired independently from product development training and internal testing. Race and ethnic distribution within the study data patient population was unavailable.
- Geographic Origin (inferred): TCIA is primarily a US-based resource, so data is likely from the United States or a diverse international collection.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Three.
- Qualifications of Experts: Board-certified radiation oncologists.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:
- Adjudication Method: Not explicitly stated. The document says "Each OAR contour used as ground truth (GT) was independently generated by three board-certified radiation oncologists," which suggests a three-way consensus among the experts rather than a formal adjudication scheme such as 2+1.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, what was the effect size of how much human readers improve with AI versus without AI assistance:
- MRMC Study: No. The study primarily evaluated the standalone performance of the AI algorithm. The clinical validation mentions that Seg Pro V3 "operates as intended within a clinical workflow and supports its intended use as an adjunct tool," but it does not present data from an MRMC study comparing human reader performance with and without AI assistance.
6. If a standalone performance evaluation (i.e., algorithm only, without human-in-the-loop) was done:
- Standalone Performance: Yes. "a standalone performance evaluation was conducted to assess the Organ-at-Risk (OAR) contouring capabilities of Seg Pro V3. The observed results indicated that Seg Pro V3 by itself, in the absence of any interaction with a clinician, can contour developed OARs with satisfactory results." The reported DSC and HD metrics are from this standalone evaluation.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- Ground Truth Type: Expert consensus. The ground truth (GT) for each OAR contour was "independently generated by three board-certified radiation oncologists."
8. The sample size for the training set:
- The document explicitly states that the 175 cases used for the standalone performance evaluation were "acquired independently from product development training and internal testing." However, the document does not specify the sample size of the training set used to develop the deep learning models.
9. How the ground truth for the training set was established:
- The document does not specify how the ground truth for the training set was established. It only describes the ground truth establishment for the test set.
(285 days)
BriefCase-Triage is a radiological computer aided triage and notification software indicated for use in the analysis of contrast-enhanced CT images that include the brain, in adults or transitional adolescents aged 18 and older. The device is intended to assist hospital networks and appropriately trained medical specialists in workflow triage by flagging and communication of suspected positive cases of Brain Aneurysm (BA) findings that are 3.0 mm or larger.
BriefCase-Triage uses an artificial intelligence algorithm to analyze images and flag suspect cases in parallel to the ongoing standard of care image interpretation. The user is presented with notifications for suspect cases. Notifications include compressed preview images that are meant for informational purposes only and not intended for diagnostic use beyond notification. The device does not alter the original medical image and is not intended to be used as a diagnostic device.
The results of BriefCase-Triage are intended to be used in conjunction with other patient information and based on professional judgment, to assist with triage/prioritization of medical images. Notified clinicians are responsible for viewing full images per the standard of care.
BriefCase-Triage is a radiological computer-assisted triage and notification software device.
The software is based on an algorithmic component and is intended to run on a Linux-based server in a cloud environment.
The BriefCase-Triage receives filtered DICOM images and processes them chronologically by running the algorithms on each series to detect suspected cases. Following the AI processing, the output of the algorithm analysis is transferred to an image review software (desktop application). When a suspected case is detected, the user receives a pop-up notification and is presented with a compressed, low-quality, grayscale preview image captioned "not for diagnostic use, for prioritization only." This preview is meant for informational purposes only, does not contain any marking of the findings, and is not intended for primary diagnosis beyond notification.
Here's a breakdown of the acceptance criteria and study details for the BriefCase-Triage device, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Reported Device Performance
| Metric | Acceptance Criteria (Performance Goal) | Reported Device Performance |
|---|---|---|
| Primary Endpoints | | |
| Sensitivity | 80% | 87.8% (95% CI: 83.1%-91.6%) |
| Specificity | 80% | 91.6% (95% CI: 87.9%-94.5%) |
| Secondary Endpoints | | |
| Time-to-Notification (mean) | Comparable to predicate device | 44.8 seconds (95% CI: 41.4-48.2) |
| Negative Predictive Value (NPV) | N/A | 98.9% (95% CI: 98.4%-99.2%) |
| Positive Predictive Value (PPV) | N/A | 47.6% (95% CI: 38.4%-57.1%) |
| Positive Likelihood Ratio (PLR) | N/A | 10.5 (95% CI: 7.2-15.3) |
| Negative Likelihood Ratio (NLR) | N/A | 0.13 (95% CI: 0.1-0.19) |
Note on Additional Operating Points (AOPs): The device also met performance goals (80% sensitivity and specificity) for three additional operating points (AOP1, AOP2, AOP3) with slightly varying sensitivity/specificity trade-offs (e.g., AOP3: Sensitivity 86.2%, Specificity 93.6%).
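All of the endpoint metrics above derive from a single 2x2 confusion matrix. A minimal sketch of how they relate (the counts below are illustrative only, not the study's actual data):

```python
def triage_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 screening metrics from confusion-matrix counts."""
    sens = tp / (tp + fn)          # sensitivity (true-positive rate)
    spec = tn / (tn + fp)          # specificity (true-negative rate)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "plr": sens / (1 - spec),  # positive likelihood ratio
        "nlr": (1 - sens) / spec,  # negative likelihood ratio
    }

# Illustrative counts only -- not the BriefCase-Triage confusion matrix.
m = triage_metrics(tp=180, fp=35, fn=25, tn=304)
```

Note that PPV and NPV depend on disease prevalence in the test population, which is why a device can meet high sensitivity/specificity goals while reporting a PPV below 50%, as here.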
Study Details
1. Sample size used for the test set and the data provenance:
- Sample Size: 544 cases
- Data Provenance: Retrospective, blinded, multicenter study from 6 US-based clinical sites. The cases were distinct in time or center from those used for algorithm training.
2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: Three (3) senior board-certified radiologists.
- Qualifications: "Senior board-certified radiologists." (Specific number of years of experience not detailed in the provided text).
3. Adjudication method (e.g., 2+1, 3+1, none) for the test set:
- The text states the ground truth was "determined by three senior board-certified radiologists." It doesn't explicitly describe an adjudication method like "2+1" or "3+1." This implies a consensus approach where all three radiologists agreed, or a majority rule, but the exact mechanism for resolving discrepancies (if any) is not specified.
4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, what was the effect size of how much human readers improve with AI versus without AI assistance:
- No, an MRMC comparative effectiveness study was NOT done. The study's primary objective was to evaluate the standalone performance of the BriefCase-Triage software. The secondary endpoint compared the device's time-to-notification to that of the predicate device, but not its impact on human reader performance.
5. If a standalone performance study (i.e., algorithm only, without human-in-the-loop) was done:
- Yes, a standalone performance study was done. The primary endpoints (sensitivity and specificity) measure the algorithm's performance in identifying Brain Aneurysm (BA) findings.
6. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- Expert Consensus: The ground truth was "determined by three senior board-certified radiologists."
7. The sample size for the training set:
- Not explicitly stated. The document mentions the algorithm was "trained during software development on images of the pathology" and that "critical findings were tagged in all CTs in the training data set." However, the specific sample size for this training data is not provided.
8. How the ground truth for the training set was established:
- Manually labeled ("tagged") images: The text states, "As is customary in the field of machine learning, deep learning algorithm development consisted of training on manually labeled ('tagged') images. In that process, critical findings were tagged in all CTs in the training data set." It does not specify who performed the tagging or their qualifications, nor the method of consensus if multiple taggers were involved.
(268 days)
It is used by radiation oncology departments to segment CT images and to generate information needed for treatment planning, treatment evaluation, and treatment adaptation.
The proposed device, AccuContour 4.0 Family, is a standalone software application with the following variants: AccuContour and AccuContour-Lite. The functions of AccuContour-Lite are a subset of those of AccuContour.
AccuContour:
It is used by oncology departments to register multi-modality images and segment (non-contrast) CT images, to generate information needed for treatment planning, treatment evaluation, and treatment adaptation.
The product has three image-processing functions:
- Deep learning contouring: automatic contouring of organs-at-risk in the head and neck, thorax, abdomen, and pelvis (for both male and female patients);
- Automatic registration: rigid and deformable registration; and
- Manual contouring.
It also has the following general functions:
- Receive, add/edit/delete, transmit, input/export, medical images and DICOM data;
- Patient management;
- Review tool of processed images;
- Extension tool;
- Plan evaluation and plan comparison;
- Dose analysis.
AccuContour-Lite:
It is used by oncology departments to segment (non-contrast) CT images, to generate information needed for treatment planning, treatment evaluation, and treatment adaptation.
The product has one image-processing function:
- Deep learning contouring: automatic contouring of organs-at-risk in the head and neck, thorax, abdomen, and pelvis (for both male and female patients).
It also has the following general functions:
- Receive, add/edit/delete, transmit, input/export, medical images and DICOM data;
- Patient management;
- Review tool of processed images.
Here's an analysis of the acceptance criteria and study details for the AccuContour 4.0, extracted and organized from the provided FDA 510(k) clearance letter.
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria are derived from the "Pass Criteria" columns in Tables 1, 2, 3, and 4, which specify minimum DSC and maximum HD95 values. The reported device performance is represented by the "Lower Bound 95% CI" for both DSC and HD95, and the "Average Rating" for clinical applicability.
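The tables report the "Lower Bound 95% CI" of each metric rather than the raw mean. The letter does not specify the exact construction; a common choice, sketched here as an assumption, is the lower endpoint of a two-sided 95% t-interval over per-case scores:

```python
import numpy as np
from scipy import stats

def lower_bound_95ci(scores) -> float:
    """Lower endpoint of a two-sided 95% t-interval for the mean.

    One common way to compute a "lower bound 95% CI" over per-case DSC
    scores; the clearance letter does not state which construction was used.
    """
    x = np.asarray(scores, dtype=float)
    n = x.size
    se = x.std(ddof=1) / np.sqrt(n)          # standard error of the mean
    t = stats.t.ppf(0.975, df=n - 1)         # two-sided 95% critical value
    return float(x.mean() - t * se)
```

Because the lower bound shrinks toward the mean as the sample grows, comparing it (rather than the mean) against the pass criterion is the more conservative test.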
Table A: Performance for Synthetic CT (sCT) Contouring Function (Derived from MR Images)
| Organ & Structure | Size | DSC Pass Criteria | HD95 Pass Criteria (mm) | Reported DSC (Lower Bound 95% CI) | Reported HD95 (Lower Bound 95% CI, mm) | Average Rating (1-5) | Meet Criteria? (DSC) | Meet Criteria? (HD95) |
|---|---|---|---|---|---|---|---|---|
| TemporalLobe_L | Medium | 0.65 | N/A | 0.886 | 4.319 (N/A criteria) | 4.5 | Yes | N/A |
| TemporalLobe_R | Medium | 0.65 | N/A | 0.878 | 4.382 (N/A criteria) | 4.6 | Yes | N/A |
| Brain | Large | 0.8 | N/A | 0.986 | 1.877 (N/A criteria) | 4.7 | Yes | N/A |
| BrainStem | Medium | 0.65 | N/A | 0.843 | 4.999 (N/A criteria) | 4.5 | Yes | N/A |
| SpinalCord | Medium | 0.65 | N/A | 0.867 | 3.030 (N/A criteria) | 4.8 | Yes | N/A |
| OpticChiasm | Small | 0.5 | N/A | 0.804 | 4.771 (N/A criteria) | 4.1 | Yes | N/A |
| OpticNerve_L | Small | 0.5 | N/A | 0.822 | 2.235 (N/A criteria) | 4.1 | Yes | N/A |
| OpticNerve_R | Small | 0.5 | N/A | 0.794 | 2.422 (N/A criteria) | 4.2 | Yes | N/A |
| InnerEar_L | Small | 0.5 | N/A | 0.843 | 2.164 (N/A criteria) | 4.2 | Yes | N/A |
| InnerEar_R | Small | 0.5 | N/A | 0.806 | 2.102 (N/A criteria) | 4.4 | Yes | N/A |
| MiddleEar_L | Small | 0.5 | N/A | 0.824 | 3.580 (N/A criteria) | 4.5 | Yes | N/A |
| MiddleEar_R | Small | 0.5 | N/A | 0.792 | 3.700 (N/A criteria) | 4.4 | Yes | N/A |
| Eye_L | Small | 0.5 | N/A | 0.906 | 1.659 (N/A criteria) | 4.8 | Yes | N/A |
| Eye_R | Small | 0.5 | N/A | 0.897 | 1.584 (N/A criteria) | 4.9 | Yes | N/A |
| Lens_L | Small | 0.5 | N/A | 0.836 | 3.368 (N/A criteria) | 4.5 | Yes | N/A |
| Lens_R | Small | 0.5 | N/A | 0.841 | 3.379 (N/A criteria) | 4.2 | Yes | N/A |
| Pituitary | Small | 0.5 | N/A | 0.801 | 2.267 (N/A criteria) | 4.4 | Yes | N/A |
| Mandible | Small | 0.5 | N/A | 0.913 | 1.844 (N/A criteria) | 4.3 | Yes | N/A |
| TMJ_L | Small | 0.5 | N/A | 0.830 | 2.819 (N/A criteria) | 4.4 | Yes | N/A |
| TMJ_R | Small | 0.5 | N/A | 0.817 | 2.722 (N/A criteria) | 4.5 | Yes | N/A |
| OralCavity | Medium | 0.65 | N/A | 0.916 | 3.677 (N/A criteria) | 4.7 | Yes | N/A |
| Larynx | Medium | 0.65 | N/A | 0.795 | 2.196 (N/A criteria) | 4.4 | Yes | N/A |
| Trachea | Medium | 0.65 | N/A | 0.870 | 2.452 (N/A criteria) | 4.5 | Yes | N/A |
| Esophagus | Medium | 0.65 | N/A | 0.800 | 2.680 (N/A criteria) | 4.7 | Yes | N/A |
| Parotid_L | Medium | 0.65 | N/A | 0.851 | 2.386 (N/A criteria) | 4.6 | Yes | N/A |
| Parotid_R | Medium | 0.65 | N/A | 0.868 | 2.328 (N/A criteria) | 4.6 | Yes | N/A |
| Submandibular_L | Medium | 0.65 | N/A | 0.833 | 4.920 (N/A criteria) | 4.5 | Yes | N/A |
| Submandibular_R | Medium | 0.65 | N/A | 0.783 | 2.348 (N/A criteria) | 4.3 | Yes | N/A |
| Thyroid | Medium | 0.65 | N/A | 0.803 | 1.911 (N/A criteria) | 4.8 | Yes | N/A |
| BrachialPlexus_L | Medium | 0.65 | N/A | 0.828 | 5.347 (N/A criteria) | 4.4 | Yes | N/A |
| BrachialPlexus_R | Medium | 0.65 | N/A | 0.800 | 5.062 (N/A criteria) | 4.3 | Yes | N/A |
| Lung_L | Large | 0.8 | N/A | 0.968 | 1.635 (N/A criteria) | 4.5 | Yes | N/A |
| Lung_R | Large | 0.8 | N/A | 0.976 | 1.516 (N/A criteria) | 4.7 | Yes | N/A |
| Heart | Large | 0.8 | N/A | 0.959 | 2.496 (N/A criteria) | 4.5 | Yes | N/A |
| Liver | Large | 0.8 | N/A | 0.941 | 2.439 (N/A criteria) | 4.0 | Yes | N/A |
| Kidney_L | Large | 0.8 | N/A | 0.892 | 2.748 (N/A criteria) | 4.7 | Yes | N/A |
| Kidney_R | Large | 0.8 | N/A | 0.895 | 2.797 (N/A criteria) | 4.5 | Yes | N/A |
| Stomach | Large | 0.8 | N/A | 0.782 | 4.754 (N/A criteria) | 4.1 | No* | N/A |
| Pancreas | Medium | 0.65 | N/A | 0.827 | 6.271 (N/A criteria) | 4.0 | Yes | N/A |
| Duodenum | Medium | 0.65 | N/A | 0.815 | 6.447 (N/A criteria) | 4.1 | Yes | N/A |
| Rectum | Medium | 0.65 | N/A | 0.796 | 2.047 (N/A criteria) | 3.9 | Yes | N/A |
| BowelBag | Large | 0.8 | N/A | 0.808 | 7.380 (N/A criteria) | 4.0 | Yes | N/A |
| Bladder | Large | 0.8 | N/A | 0.943 | 2.082 (N/A criteria) | 4.5 | Yes | N/A |
| Marrow | Large | 0.8 | N/A | 0.889 | 1.842 (N/A criteria) | 4.6 | Yes | N/A |
| FemurHead_L | Medium | 0.65 | N/A | 0.950 | 2.261 (N/A criteria) | 4.5 | Yes | N/A |
| FemurHead_R | Medium | 0.65 | N/A | 0.941 | 2.466 (N/A criteria) | 4.6 | Yes | N/A |
*Note: For Stomach, the reported DSC (0.782) is below the pass criteria (0.8). However, the document states, "The results indicate that the auto-segmentation performance of the AccuContour system for sCT images derived from both CBCT and MR modalities meets the requirements for geometric accuracy." This suggests there might be an overall or combined assessment, or other factors led to acceptance despite this single instance. The average clinical rating is 4.1, which is above the threshold of 3.
Table B: Performance for Synthetic CT (sCT) Contouring Function (Derived from CBCT Images)
| Organ & Structure | Size | DSC Pass Criteria | HD95 Pass Criteria (mm) | Reported DSC (Lower Bound 95% CI) | Reported HD95 (Lower Bound 95% CI, mm) | Average Rating (1-5) | Meet Criteria? (DSC) | Meet Criteria? (HD95) |
|---|---|---|---|---|---|---|---|---|
| TemporalLobe_L | Medium | 0.65 | N/A | 0.854 | 3.451 (N/A criteria) | 4.8 | Yes | N/A |
| TemporalLobe_R | Medium | 0.65 | N/A | 0.859 | 3.258 (N/A criteria) | 4.6 | Yes | N/A |
| Brain | Large | 0.8 | N/A | 0.986 | 1.804 (N/A criteria) | 4.7 | Yes | N/A |
| BrainStem | Medium | 0.65 | N/A | 0.903 | 4.678 (N/A criteria) | 4.5 | Yes | N/A |
| SpinalCord | Medium | 0.65 | N/A | 0.869 | 2.088 (N/A criteria) | 4.8 | Yes | N/A |
| OpticChiasm | Small | 0.5 | N/A | 0.795 | 5.252 (N/A criteria) | 4.4 | Yes | N/A |
| OpticNerve_L | Small | 0.5 | N/A | 0.815 | 2.373 (N/A criteria) | 4.2 | Yes | N/A |
| OpticNerve_R | Small | 0.5 | N/A | 0.816 | 2.210 (N/A criteria) | 4.1 | Yes | N/A |
| InnerEar_L | Small | 0.5 | N/A | 0.800 | 2.144 (N/A criteria) | 4.5 | Yes | N/A |
| InnerEar_R | Small | 0.5 | N/A | 0.794 | 2.171 (N/A criteria) | 4.2 | Yes | N/A |
| MiddleEar_L | Small | 0.5 | N/A | 0.800 | 3.301 (N/A criteria) | 4.5 | Yes | N/A |
| MiddleEar_R | Small | 0.5 | N/A | 0.797 | 3.888 (N/A criteria) | 4.5 | Yes | N/A |
| Eye_L | Small | 0.5 | N/A | 0.944 | 1.553 (N/A criteria) | 4.8 | Yes | N/A |
| Eye_R | Small | 0.5 | N/A | 0.941 | 1.678 (N/A criteria) | 4.9 | Yes | N/A |
| Lens_L | Small | 0.5 | N/A | 0.820 | 3.532 (N/A criteria) | 4.5 | Yes | N/A |
| Lens_R | Small | 0.5 | N/A | 0.821 | 3.370 (N/A criteria) | 4.7 | Yes | N/A |
| Pituitary | Small | 0.5 | N/A | 0.802 | 2.496 (N/A criteria) | 4.4 | Yes | N/A |
| Mandible | Small | 0.5 | N/A | 0.870 | 2.227 (N/A criteria) | 4.3 | Yes | N/A |
| TMJ_L | Small | 0.5 | N/A | 0.774 | 2.775 (N/A criteria) | 4.3 | Yes | N/A |
| TMJ_R | Small | 0.5 | N/A | 0.800 | 2.791 (N/A criteria) | 4.5 | Yes | N/A |
| OralCavity | Medium | 0.65 | N/A | 0.885 | 3.794 (N/A criteria) | 4.8 | Yes | N/A |
| Larynx | Medium | 0.65 | N/A | 0.793 | 2.827 (N/A criteria) | 4.8 | Yes | N/A |
| Trachea | Medium | 0.65 | N/A | 0.873 | 2.545 (N/A criteria) | 4.5 | Yes | N/A |
| Esophagus | Medium | 0.65 | N/A | 0.800 | 2.811 (N/A criteria) | 4.5 | Yes | N/A |
| Parotid_L | Medium | 0.65 | N/A | 0.891 | 2.415 (N/A criteria) | 4.6 | Yes | N/A |
| Parotid_R | Medium | 0.65 | N/A | 0.894 | 2.525 (N/A criteria) | 4.6 | Yes | N/A |
| Submandibular_L | Medium | 0.65 | N/A | 0.745 | 5.026 (N/A criteria) | 4.8 | Yes | N/A |
| Submandibular_R | Medium | 0.65 | N/A | 0.797 | 2.192 (N/A criteria) | 4.7 | Yes | N/A |
| Thyroid | Medium | 0.65 | N/A | 0.823 | 2.182 (N/A criteria) | 4.8 | Yes | N/A |
| BrachialPlexus_L | Medium | 0.65 | N/A | 0.805 | 3.922 (N/A criteria) | 4.4 | Yes | N/A |
| BrachialPlexus_R | Medium | 0.65 | N/A | 0.823 | 3.529 (N/A criteria) | 4.2 | Yes | N/A |
| Lung_L | Large | 0.8 | N/A | 0.947 | 1.587 (N/A criteria) | 4.5 | Yes | N/A |
| Lung_R | Large | 0.8 | N/A | 0.971 | 1.635 (N/A criteria) | 4.3 | Yes | N/A |
| Heart | Large | 0.8 | N/A | 0.896 | 1.823 (N/A criteria) | 4.5 | Yes | N/A |
| Liver | Large | 0.8 | N/A | 0.914 | 2.595 (N/A criteria) | 4.6 | Yes | N/A |
| Kidney_L | Large | 0.8 | N/A | 0.922 | 2.645 (N/A criteria) | 4.7 | Yes | N/A |
| Kidney_R | Large | 0.8 | N/A | 0.906 | 2.611 (N/A criteria) | 4.5 | Yes | N/A |
| Stomach | Large | 0.8 | N/A | 0.858 | 4.681 (N/A criteria) | 4.2 | Yes | N/A |
| Pancreas | Medium | 0.65 | N/A | 0.822 | 5.548 (N/A criteria) | 4.4 | Yes | N/A |
| Duodenum | Medium | 0.65 | N/A | 0.818 | 5.252 (N/A criteria) | 4.1 | Yes | N/A |
| Rectum | Medium | 0.65 | N/A | 0.797 | 4.253 (N/A criteria) | 4.3 | Yes | N/A |
| BowelBag | Large | 0.8 | N/A | 0.850 | 5.028 (N/A criteria) | 4.0 | Yes | N/A |
| Bladder | Large | 0.8 | N/A | 0.926 | 3.322 (N/A criteria) | 4.7 | Yes | N/A |
| Marrow | Large | 0.8 | N/A | 0.837 | 2.148 (N/A criteria) | 4.7 | Yes | N/A |
| FemurHead_L | Medium | 0.65 | N/A | 0.893 | 1.639 (N/A criteria) | 4.8 | Yes | N/A |
| FemurHead_R | Medium | 0.65 | N/A | 0.927 | 1.807 (N/A criteria) | 4.9 | Yes | N/A |
Table C: Performance for 4DCT Registration Function (Rigid Registration)
| Organ & Structure | Size | DSC Pass Criteria | Reported DSC (Lower Bound 95% CI) | Average Rating (1-5) | Meet Criteria? |
|---|---|---|---|---|---|
| Trachea | Medium | 0.65 | 0.888 | 4.5 | Yes |
| Esophagus | Medium | 0.65 | 0.836 | 4.5 | Yes |
| Lung_L | Large | 0.8 | 0.932 | 4.7 | Yes |
| Lung_R | Large | 0.8 | 0.929 | 4.8 | Yes |
| Lung_All | Large | 0.8 | 0.930 | 4.8 | Yes |
| Heart | Large | 0.8 | 0.917 | 4.6 | Yes |
| SpinalCord | Medium | 0.65 | 0.943 | 4.6 | Yes |
| Liver | Large | 0.8 | 0.888 | 4.6 | Yes |
| Stomach | Large | 0.8 | 0.791 | 4.5 | No* |
| A_Aorta | Large | 0.8 | 0.917 | 4.4 | Yes |
| Spleen | Large | 0.8 | 0.786 | 4.5 | No* |
| Body | Large | 0.8 | 0.995 | 4.9 | Yes |
*Note: For Stomach (0.791) and Spleen (0.786), the reported DSC is below the pass criteria (0.8). However, the document states, "According to the results, the accuracy of 4DCT image registration images meets the requirements and all structure models demonstrating that only minor edits would be required in order to make the structure models acceptable for clinical use." The average clinical rating for both is 4.5, above the threshold of 3.
Table D: Performance for 4DCT Registration Function (Deformable Registration)
| Organ & Structure | Size | DSC Pass Criteria | Reported DSC (Lower Bound 95% CI) | Average Rating (1-5) | Meet Criteria? |
|---|---|---|---|---|---|
| Trachea | Medium | 0.65 | 0.940 | 4.7 | Yes |
| Esophagus | Medium | 0.65 | 0.866 | 4.6 | Yes |
| Lung_L | Large | 0.8 | 0.966 | 4.7 | Yes |
| Lung_R | Large | 0.8 | 0.949 | 4.5 | Yes |
| Lung_All | Large | 0.8 | 0.954 | 4.8 | Yes |
| Heart | Large | 0.8 | 0.931 | 4.6 | Yes |
| SpinalCord | Medium | 0.65 | 0.920 | 4.6 | Yes |
| Liver | Large | 0.8 | 0.936 | 4.5 | Yes |
| Stomach | Large | 0.8 | 0.889 | 4.5 | Yes |
| A_Aorta | Large | 0.8 | 0.947 | 4.6 | Yes |
| Spleen | Large | 0.8 | 0.913 | 4.8 | Yes |
| Body | Large | 0.8 | 0.997 | 4.9 | Yes |
2. Sample Size Used for the Test Set and Data Provenance
- Synthetic CT (sCT) Contouring Function:
- Sample Size: 247 synthetic CT images (116 generated from MR, 131 generated from CBCT).
- Data Provenance:
- Demographic Distribution: 57% male, 43% female. Age distribution: 13% (21-40), 44.1% (41-60), 36.8% (61-80), 6.1% (81-100). Race: 78% White, 12% Black or African American, 10% Others.
- Imaging Equipment: MR images from GE (21.6%), Philips (56.9%), Siemens (21.6%). CBCT images from Varian (58.8%), Elekta (41.2%).
- Retrospective/Prospective: Not explicitly stated, but the description of demographic and equipment distribution from a "sample" indicates retrospective data collection from existing patient records.
- Country of Origin: The racial distribution explicitly mentions "U.S. clinical radiotherapy practice," suggesting the data is primarily from the United States.
- 4DCT Registration Function:
- Sample Size: 30 4DCT image sets.
- Data Provenance:
- Imaging Equipment: Siemens (90.0%), Philips (10.0%) scanners.
- Demographic Distribution: 17 males (56.7%), 13 females (43.3%). Age: 33-82 years, with majority in 51-65 (40.0%) and 66-80 (43.3%) year brackets.
- Image Characteristics: Uniform 3mm slice thickness (100%).
- Sourcing Location: Most images (90.0%) from Drexel Town Square Health Center/Community Memorial Hospital, remainder from Froedtert Hospital.
- Retrospective/Prospective: Not explicitly stated, but implies retrospective data from patient archives of the mentioned hospitals.
- Country of Origin: Based on the hospital names (Drexel Town Square Health Center, Community Memorial Hospital, Froedtert Hospital), the data is from the United States.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications
- Number of Experts: Not explicitly stated. The text mentions "clinical experts evaluate the clinical applicability" and "RTStruct contoured by the professional physician as the gold standard." This implies at least one, and likely multiple, qualified medical professionals.
- Qualifications of Experts: The experts are described as "clinical experts" and "professional physician(s)." Their specific qualifications (e.g., "radiologist with 10 years of experience") are not provided. They are implied to be clinically qualified radiotherapy personnel.
4. Adjudication Method for the Test Set
- Adjudication Method: Not explicitly stated. The ground truth for segmentation is stated to be "RTStruct contoured by the professional physician". For clinical applicability, "clinical experts evaluate the clinical applicability" and assign a 1-5 scale score. This suggests a single expert (or group consensus without specific adjudication rules like 2+1) established the ground truth segmentation, and separate clinical experts evaluated the results. There is no mention of a formal adjudication process for disagreements in ground truth labeling if multiple experts were involved in its creation.
5. Multi Reader Multi Case (MRMC) Comparative Effectiveness Study
- Was an MRMC study done? No.
- Effect Size of Human Improvement (if applicable): Not applicable, as no MRMC study comparing human readers with and without AI assistance was reported. The testing focused solely on the algorithm's performance against expert-generated ground truth and expert evaluation of the algorithm's output.
6. Standalone Performance
- Was a standalone performance study done? Yes. The entire report details the "Performance Test Report on Synthetic CT (sCT) Contouring Function" and "Performance Test Report on 4DCT Registration Function," measuring the algorithm's performance (DSC, HD95) against gold standard contours and qualitative evaluation by clinical experts. This reflects the algorithm's performance independent of human interaction during the contouring process.
7. Type of Ground Truth Used
- Ground Truth: For the synthetic CT contouring and 4DCT registration functions, the ground truth was "RTStruct contoured by the professional physician" (i.e., expert consensus or expert-generated contours).
8. Sample Size for the Training Set
- Training Set Sample Size: Not provided in the document.
9. How the Ground Truth for the Training Set was Established
- Training Set Ground Truth Establishment: Not provided in the document. The document only details the ground truth used for the validation/test set.
(59 days)
AV Vascular is indicated to assist users in the visualization, assessment and quantification of vascular anatomy on CTA and/or MRA datasets, in order to assess patients with suspected or diagnosed vascular pathology and to assist with pre-procedural planning of endovascular interventions.
AV Vascular is a post-processing software application intended for visualization, assessment, and quantification of vessels in computed tomography angiography (CTA) and magnetic resonance angiography (MRA) data with a unified workflow for both modalities.
AV Vascular includes the following functions:
- Advanced visualization: the application provides all relevant views and interactions for CTA and MRA image review: 2D slices, MIP, MPR, curved MPR (cMPR), stretched MPR (sMPR), path-aligned views (cross-sectional and longitudinal MPRs), 3D volume rendering (VR).
- Vessel segmentation: automatic bone removal and vessel segmentation for head/neck and body CTA data, automatic vessel centerline, lumen and outer wall extraction and labeling for the main branches of the vascular anatomy in head/neck and body CTA data, semi-automatic and manual creation of vessel centerline and lumen for CTA and MRA data, interactive two-point vessel centerline extraction and single-point centerline extension.
- Vessel inspection: enable inspection of an entire vessel using the cMPR or sMPR views as well as inspection of a vessel locally using vessel-aligned views (cross-sectional and longitudinal MPRs) by selecting a position along a vessel of interest.
- Measurements: ability to create and save measurements of vessel and lumen inner and outer diameters and area, as well as vessel length and angle measurements.
- Measurements and tools that specifically support pre-procedural planning: manual and automatic ring marker placement for specific anatomical locations, length measurements of the longest and shortest curve along the aortic lumen contour, angle measurements of aortic branches in clock-position style, saving viewing angles in C-arm notation, and configurable templates.
- Saving and export: saving and export of batch series and customizable reports.
This summarization is based on the provided 510(k) clearance letter for Philips Medical Systems' AV Vascular device.
Acceptance Criteria and Device Performance for Aorto-iliac Outer Wall Segmentation
| Metrics | Acceptance Criteria | Reported Device Performance (Mean with 98.75% confidence intervals) |
|---|---|---|
| 3D Dice Similarity Coefficient (DSC) | > 0.9 | 0.96 (0.96, 0.97) |
| 2D Dice Similarity Coefficient (DSC) | > 0.9 | 0.96 (0.95, 0.96) |
| Mean Surface Distance (MSD) | < 1.0 mm | 0.57 mm (0.485, 0.68) |
| Hausdorff Distance (HD) | < 3.0 mm | 1.68 mm (1.23, 2.08) |
| ∆Dmin (difference in minimum diameter) | > 95% of measurements with \|∆Dmin\| < 5 mm | 98.8% (98.3%-99.2%) |
| ∆Dmax (difference in maximum diameter) | > 95% of measurements with \|∆Dmax\| < 5 mm | 98.5% (97.9%-98.9%) |
The reported device performance for all primary and secondary metrics meets the predefined acceptance criteria.
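The ∆Dmin/∆Dmax criteria are framed as the fraction of paired diameter measurements falling within a 5 mm tolerance of the reference standard. A minimal sketch of that acceptance computation (function name and values are illustrative, not from the submission):

```python
import numpy as np

def pct_within_tolerance(device_mm, reference_mm, tol_mm=5.0):
    """Percentage of paired measurements with |device - reference| < tol_mm."""
    diff = np.abs(np.asarray(device_mm, dtype=float)
                  - np.asarray(reference_mm, dtype=float))
    return 100.0 * float(np.mean(diff < tol_mm))

# Illustrative values only: minimum-diameter measurements in mm.
pct = pct_within_tolerance([22.0, 18.5, 30.1], [21.0, 25.0, 30.4])
```

The acceptance test then checks whether the resulting percentage (here, the lower bound of its confidence interval) exceeds the 95% goal.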
Study Details for Aorto-iliac Outer Wall Segmentation Validation
1. Sample Size used for the Test Set and Data Provenance:
- Sample Size: 80 patients
- Data Provenance: Retrospectively collected from 7 clinical sites in the US, 3 European hospitals, and one hospital in Asia.
- Independence from Training Data: All performance testing datasets were acquired from clinical sites distinct from those which provided the algorithm training data. The algorithm developers had no access to the testing data, ensuring complete independence.
- Patient Characteristics: At least 80% of patients had thoracic and/or abdominal aortic diseases and/or iliac artery diseases (e.g., thoracic/abdominal aortic aneurysm, ectasia, dissection, and stenosis). At least 20% had been treated with stents.
- Demographics:
- Geographics: North America: 58 (72.5%), Europe: 3 (3.75%), Asia: 19 (23.75%)
- Sex: Male: 59 (73.75%), Female: 21 (26.25%)
- Age (years): 21-50: 2 (2.50%), 51-70: 31 (38.75%), >71: 45 (56.25%), Not available: 2 (2.5%)
2. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications:
- Number of Experts: Three
- Qualifications: US-board certified radiologists.
3. Adjudication Method for the Test Set:
- The three US-board certified radiologists independently performed manual contouring of the outer wall along the aorta and iliac arteries on cross-sectional planes for each CT angiographic image.
- After quality control, these three aortic and iliac arterial outer wall contours were averaged to serve as the reference standard contour. This can be considered a form of consensus/averaging after independent readings.
4. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- The provided document does not indicate that a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done to measure human reader improvement with AI assistance. The study focused on the standalone performance of the AI algorithm compared to an expert-derived ground truth.
5. Standalone (Algorithm Only Without Human-in-the-Loop Performance):
- Yes, the performance data provided specifically describes the standalone performance of the AI-based algorithm for aorto-iliac outer wall segmentation. The algorithm's output was compared directly against the reference standard without human intervention in the segmentation process.
Type of Ground Truth Used:
- Expert Consensus/Averaging: The ground truth was established by averaging the independent manual contouring performed by three US-board certified radiologists.
Sample Size for the Training Set:
- The document states that the testing data were independent of the training data and that developers had no access to the testing data. However, the exact sample size for the training set is not specified in the provided text.
How the Ground Truth for the Training Set Was Established:
- The document implies that training data were used, but it does not describe how the ground truth for the training set was established. It only ensures that the testing data did not come from the same clinical sites as the training data and that algorithm developers had no access to the testing data.
(122 days)
AI-Rad Companion Brain MR is a post-processing image analysis software that assists clinicians in viewing, analyzing, and evaluating MR brain images.
AI-Rad Companion Brain MR provides the following functionalities:
• Automated segmentation and quantitative analysis of individual brain structures and white matter hyperintensities
• Quantitative comparison of each brain structure with normative data from a healthy population
• Presentation of results for reporting that includes all numerical values as well as visualization of these results
AI-Rad Companion Brain MR runs two distinct, independent algorithms: one for brain morphometry analysis and one for White Matter Hyperintensities (WMH) segmentation. Overall, the device comprises four main algorithmic features:
• Brain Morphometry
• Brain Morphometry follow-up
• White Matter Hyperintensities (WMH)
• White Matter Hyperintensities (WMH) follow-up
The Brain Morphometry feature has been available since the first version of the device (VA2x), segmentation of White Matter Hyperintensities was added in VA4x, and the follow-up analysis for both has been available since VA5x. The brain morphometry and brain morphometry follow-up features have not been modified and remain identical to the previous VA5x mainline version.
AI-Rad Companion Brain MR VA60 is an enhancement to the predicate, AI-Rad Companion Brain MR VA50 (K232305). Just as in the predicate, the brain morphometry feature of AI-Rad Companion Brain MR addresses the automatic quantification and visual assessment of the volumetric properties of various brain structures based on T1 MPRAGE datasets. From a predefined list of brain structures (e.g. Hippocampus, Caudate, Left Frontal Gray Matter, etc.) volumetric properties are calculated as absolute and normalized volumes with respect to the total intracranial volume. The normalized values are compared against age-matched mean and standard deviations obtained from a population of healthy reference subjects. The deviation from this reference population can be visualized as 3D overlay map or out-of-range flag next to the quantitative values.
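The normalization-and-comparison logic described above can be sketched as follows. The 2-SD out-of-range cutoff and the reference numbers are illustrative assumptions, not values from the clearance letter:

```python
def morphometry_flag(volume_ml, tiv_ml, ref_mean_pct, ref_sd_pct, z_cut=2.0):
    """Normalize a brain-structure volume by total intracranial volume (TIV),
    compare against an age-matched reference mean/SD (in % of TIV), and
    return an out-of-range flag. The 2-SD cutoff is an illustrative
    assumption, not the device's documented threshold."""
    norm_pct = 100.0 * volume_ml / tiv_ml          # volume as % of TIV
    z = (norm_pct - ref_mean_pct) / ref_sd_pct     # deviation from reference
    return norm_pct, z, abs(z) > z_cut

# e.g. hippocampus 3.2 mL, TIV 1400 mL, reference 0.26% +/- 0.02% (made-up numbers)
norm, z, flagged = morphometry_flag(3.2, 1400.0, 0.26, 0.02)
```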
Additionally, identical to the predicate, the white matter hyperintensities feature addresses the automatic quantification and visual assessment of white matter hyperintensities on the basis of T1 MPRAGE and T2 weighted FLAIR datasets. The detected WMH can be visualized as a 3D overlay map and the quantification in count and volume as per 4 brain regions in the report.
Here's a structured overview of the acceptance criteria and study details for the AI-Rad Companion Brain MR, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Reported Device Performance
| Acceptance Criteria | Reported Device Performance (AI-Rad Companion Brain MR WMH Feature) | Reported Device Performance (AI-Rad Companion Brain MR WMH Follow-up Feature) |
|---|---|---|
| WMH Segmentation Accuracy | Pearson correlation coefficient between WMH volumes and ground truth annotation: 0.96; interclass correlation coefficient between WMH volumes and ground truth annotation: 0.94; Dice score: 0.60; F1-score: 0.67. Dice scores for WMH segmentation: mean 0.60, median 0.62, STD 0.14, 95% CI [0.57, 0.63]. ASSD scores for WMH segmentation: mean 0.05, median 0.00, STD 0.15, 95% CI [0.02, 0.08] | |
| New or Enlarged WMH Segmentation Accuracy (Follow-up) | | Pearson correlation coefficient between new or enlarged WMH volumes and ground truth annotation: 0.76; average Dice score: 0.59; average F1-score: 0.71. Dice scores for new/enlarged WMH segmentation by vendor: Siemens mean 0.64, median 0.67, STD 0.15, 95% CI [0.60, 0.69]; GE mean 0.56, median 0.60, STD 0.14, 95% CI [0.51, 0.61]; Philips mean 0.55, median 0.59, STD 0.16, 95% CI [0.50, 0.61]. ASSD scores by vendor: Siemens mean 0.02, median 0.00, STD 0.06, 95% CI [0.00, 0.04]; GE mean 0.09, median 0.01, STD 0.23, 95% CI [0.03, 0.19]; Philips mean 0.04, median 0.00, STD 0.11, 95% CI [0.00, 0.08] |
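As a rough sketch of the two headline metric types, per-case Dice measures voxelwise overlap between the device's WMH mask and the annotated mask, while Pearson correlation compares per-case WMH volumes across the cohort. This is illustrative, not the submission's evaluation code:

```python
def dice(pred, truth):
    """Dice overlap for two binary masks given as flat 0/1 sequences."""
    inter = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0

def pearson(xs, ys):
    """Pearson correlation of paired per-case volume measurements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

d = dice([0, 1, 1, 1, 0, 0], [0, 1, 1, 0, 1, 0])  # 2*2 / (3+3)
r = pearson([1.0, 2.0, 3.0], [2.1, 3.9, 6.0])     # near-linear toy volumes
```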
Study Details
Sample Size Used for the Test Set and Data Provenance:
- White Matter Hyperintensities (WMH) Feature: 100 subjects (Multiple Sclerosis patients (MS), Alzheimer's patients (AD), cognitive impaired (CI), and healthy controls (HC)).
- White Matter Hyperintensities (WMH) Follow-up Feature: 165 subjects (Multiple Sclerosis patients (MS) and Alzheimer's patients (AD)).
- Data Provenance: Data acquired from Siemens, GE, and Philips scanners. Testing data had a balanced distribution with respect to patient gender and age according to the target patient population, and field strength (1.5T and 3T). This indicates a retrospective, multi-vendor dataset; the countries of origin are not stated.
Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications:
- Number of Experts: Three radiologists.
- Qualifications: Not explicitly stated beyond "radiologists." It is not specified if they are board-certified, or their years of experience.
Adjudication Method for the Test Set:
- For each dataset, three sets of ground truth annotations were created manually.
- Each set was annotated by a disjoint group consisting of an annotator, a reviewer, and a clinical expert.
- The clinical expert was randomly assigned per case to minimize annotation bias.
- The clinical expert reviewed and corrected the initial annotation of the changed WMH areas according to a specified annotation protocol. Significant corrections led to re-communication with the annotator and re-review.
- This suggests an annotate-review-correct workflow: each of the three annotation sets was produced by an annotator, checked by a reviewer, and corrected by a clinical expert, rather than a simple numeric (e.g., 2+1) adjudication scheme.
If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study Was Done:
- No, an MRMC comparative effectiveness study comparing human readers with and without AI assistance was not done. The study focuses on the standalone performance of the AI algorithm against expert ground truth.
If a Standalone (i.e. algorithm only without human-in-the loop performance) Was Done:
- Yes, a standalone performance study was done. The "Accuracy was validated by comparing the results of the device to manual annotated ground truth from three radiologists." This evaluates the algorithm's performance directly.
The Type of Ground Truth Used:
- Expert Consensus / Manual Annotation: The ground truth for both WMH and WMH follow-up features was established through "manual annotated ground truth from three radiologists" and involved a "standard annotation process" with annotators, reviewers, and clinical experts.
The Sample Size for the Training Set:
- The document states that the "training data used for the fine tuning the hyper parameters of WMH follow-up algorithm is independent of the data used to test the white matter hyperintensity algorithm follow up algorithm." However, the specific sample size for the training set is not provided in the given text.
How the Ground Truth for the Training Set Was Established:
- The document implies that the WMH follow-up algorithm "does not include any machine learning/ deep learning component," suggesting a rule-based or conventional image processing algorithm. Therefore, "training" might refer to parameter tuning rather than machine learning model training.
- For the "fine-tuning the hyper parameters of WMH follow-up algorithm," the ground truth establishment method for this training data is not explicitly detailed in the provided text. It only states that this data was "independent of the data used to test" the algorithm.
(206 days)
qXR-Detect is a computer-assisted detection (CADe) software device that analyzes chest radiographs and highlights suspicious regions of interest (ROIs). The device is intended to identify, highlight, and categorize suspicious regions of interest (ROI). Any suspicious ROI is highlighted by qXR-Detect and categorized into one of six categories (lung, pleura, bone, Mediastinum & Hila & Heart, hardware and other). The device is intended for use as a concurrent reading aid. qXR-Detect is indicated for adults only.
qXR-Detect is an adjunct tool and is not intended to replace a clinician's review of the radiograph or his/her clinical judgment. Users must not use the qXR-Detect generated output as the primary interpretation.
The qXR-Detect device is intended to generate a secondary digital radiographic image that facilitates the confirmation of the presence of suspicious region of interest within the categories on a chest X-Ray.
The software works with DICOM chest X-ray images and can be deployed on a secure cloud server. De-identified chest X-rays are sent to qXR-Detect via HTTPS from the client's software. Results are fetched by the client's software and then forwarded to their PACS or any other systems including but not limited to specified radiology software database once the processing is complete or to the console of the digital radiographic processing system.
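The "de-identified chest X-rays are sent ... via HTTPS" step implies stripping patient-identifying attributes before a study leaves the client's network. The sketch below is a hypothetical simplification: it filters a plain dict of DICOM keywords, whereas a real deployment would apply a full DICOM de-identification profile to actual DICOM datasets.

```python
# Hypothetical subset of patient-identifying DICOM attribute keywords;
# a production system would use a complete de-identification profile.
PHI_KEYWORDS = {
    "PatientName", "PatientID", "PatientBirthDate",
    "PatientAddress", "ReferringPhysicianName", "InstitutionName",
}

def deidentify(header: dict) -> dict:
    """Return a copy of a DICOM-header dict with PHI attributes removed."""
    return {k: v for k, v in header.items() if k not in PHI_KEYWORDS}

header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "CR",
    "BodyPartExamined": "CHEST",
}
clean = deidentify(header)  # safe to transmit over HTTPS
```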
Here's a breakdown of the acceptance criteria and the study details for the qXR-Detect device, based on the provided FDA clearance letter:
Acceptance Criteria and Reported Device Performance
Device Name: qXR-Detect
Type: Computer-assisted detection (CADe) software device
| Category | Acceptance Criteria (Standalone Performance) - AUC (95% CI) | Reported Device Performance (Standalone Performance) - AUC (95% CI) |
|---|---|---|
| Lung | Not explicitly stated as a minimum threshold in the provided text, but implied as satisfactory performance. | 0.893 (0.879-0.907) |
| Pleura | Not explicitly stated | 0.95 (0.94-0.96) |
| Mediastinum/Hila | Not explicitly stated | 0.891 (0.875-0.907) |
| Bone | Not explicitly stated | 0.879 (0.854-0.905) |
| Hardware | Not explicitly stated | 0.958 (0.95-0.966) |
| Other | Not explicitly stated | 0.915 (0.895-0.935) |
| Category | Acceptance Criteria (Standalone Performance) - wAFROC (95% CI) | Reported Device Performance (Standalone Performance) - wAFROC (95% CI) |
|---|---|---|
| Lung | Implied that wAFROC should be above 0.8 for most categories | 0.831 (0.816-0.846) |
| Pleura | Implied that wAFROC should be above 0.8 for most categories | 0.89 (0.875-0.905) |
| Mediastinum & Hila & Heart | Implied that wAFROC should be above 0.8 for most categories | 0.867 (0.85-0.883) |
| Bone | Implied that wAFROC should be above 0.8 for most categories | 0.821 (0.789-0.852) |
| Hardware | Implied that wAFROC should be above 0.8 for most categories | 0.771 (0.759-0.782) |
| Others | Implied that wAFROC should be above 0.8 for most categories | 0.871 (0.845-0.897) |
| Aggregate | Implied that wAFROC should be above 0.8 for most categories | 0.839 (0.824-0.854) |
| Category | Acceptance Criteria (Clinical Performance) - wAFROC Improvement | Reported Device Performance (Clinical Performance) - wAFROC Improvement |
|---|---|---|
| Overall wAFROC | Not explicitly stated as a minimum threshold, but statistical significance (P value < 0.001) and improvement was targeted. | Improved from 0.6894 (unaided) to 0.7505 (aided), an improvement of 0.0611 (P < 0.001). |
| All Categories | Improvement expected | All categories showed improvement. |
| False Positives per Image | Reduction expected | Reduced from 0.4182 (unaided) to 0.3300 (aided). |
| Category | Acceptance Criteria (Clinical Performance) - AUC Improvement | Reported Device Performance (Clinical Performance) - AUC Improvement |
|---|---|---|
| Overall AUC | Not explicitly stated as a minimum threshold, but statistical significance and improvement was targeted. | Improved from 0.8466 (unaided) to 0.8720 (aided). |
| All Categories | Improvement expected | All categories showed improvement. |
Study Details for Device Performance
1. Acceptance Criteria and Reported Device Performance (See table above)
2. Sample size used for the test set and the data provenance:
- Standalone Performance Study Test Set: The exact sample size for the standalone test set is not explicitly given, but the text states "Most of the scans for the study were obtained from across the US spanning 40 states and 5 regions in the US."
- Clinical Performance Study Test Set: 301 samples were used.
- Data Provenance (Clinical Performance Test Set): Not explicitly stated, but given the training data provenance and the testing context, it is likely that the clinical test set also included data from across the US (40 states and 5 regions). The data was retrospective, as it was described as a "multireader multicase study conducted on 301 samples."
- Data Characteristics: Well-balanced in terms of gender (approx. 50-50 male-to-female distribution). Age distribution from 22 to over 85 years.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Standalone Performance Study: 3 ground truthers annotated the chest X-ray scans. Their specific qualifications are not detailed beyond "ground truthers," but it's implied they are experts in medical image interpretation for the purpose of establishing ground truth for suspicious ROIs.
- Clinical Performance Study: The ground truth for the clinical study was established by the same process as the standalone study (3 ground truthers), though it's not explicitly re-stated in the clinical study section. The "readers" in the clinical study (who read images with and without AI assistance) were 18 professionals including radiologists, ER physicians, and family medicine practitioners. Their years of experience are not specified.
4. Adjudication method for the test set:
- Standalone Performance Study (Ground Truth Establishment): The method described implies a consensus-based approach, though not a specific numerical adjudication. "If there is at least one ground truth boundary for a particular category, the scan is considered to be positive for that category." This suggests that even a single expert identifying a boundary contributed to the ground truth, rather than requiring a majority consensus in all cases for each boundary. However, the overarching "ground truth established by 3 ground truthers" suggests collective expert input. A more precise adjudication method (e.g., 2-out-of-3 majority) is not explicitly stated for individual boundary decisions.
- Clinical Performance Study: The ground truth for judging reader performance against was the same as established for the standalone study. The comparison of reader performance (unaided vs. aided) implicitly uses this established ground truth.
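The quoted positivity rule ("at least one ground truth boundary ... positive for that category") amounts to a union over readers. A sketch with hypothetical annotation records:

```python
def scan_positive(annotations, category):
    """A scan is positive for a category if any ground truther drew at
    least one boundary of that category (sketch of the quoted rule,
    not the submission's code)."""
    return any(
        box["category"] == category
        for reader in annotations
        for box in reader
    )

# Hypothetical per-reader boundary lists for one scan
readers = [
    [{"category": "lung", "bbox": (10, 10, 40, 40)}],   # ground truther 1
    [],                                                 # ground truther 2
    [{"category": "pleura", "bbox": (5, 60, 30, 90)}],  # ground truther 3
]
```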
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance:
- Yes, an MRMC comparative effectiveness study was done.
- Effect Size of Improvement:
- Overall wAFROC: Improved by 0.0611 (from 0.6894 unaided to 0.7505 aided). This improvement was statistically significant (P < 0.001).
- False Positives per image: Reduced from 0.4182 (unaided) to 0.3300 (aided).
- Overall AUC: Improved from 0.8466 (unaided) to 0.8720 (aided).
- Individual Reader Improvement: 17 out of 18 readers showed improvement in wAFROC-FOM across all categories. All 18 readers improved in detecting and localizing suspicious lung ROIs.
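For reference, AUC figures like those quoted above are typically the empirical (Mann-Whitney) statistic: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted half. A minimal sketch:

```python
def auc(labels, scores):
    """Empirical AUC: fraction of positive/negative pairs in which the
    positive case outranks the negative one (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```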
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:
- Yes, a standalone performance study was done. The results are presented in "Table 2 Standalone Performance Testing Results for qXR-Detect" (AUC metrics) and "Table 3 Standalone Performance Testing Results for localization - qXR-Detect" (wAFROC metrics).
7. The type of ground truth used:
- The ground truth was established by expert consensus/annotation. "3 ground truthers annotated the chest X-ray scans for the presence of suspicious ROI categories."
8. The sample size for the training set:
- The underlying algorithm was trained on a large dataset of ~2.5 million Chest X-Ray scans.
9. How the ground truth for the training set was established:
- The document states that the training data "consisted of various abnormal regions of interest." While it doesn't explicitly detail the methodology for establishing ground truth for the training set, given the detailed ground truth process for the test set, it's highly probable that the training set ground truth was also established through expert radiological review and annotation, similar to the process described for the test set.
(149 days)
Imagine® Enterprise Suite (IES) is a medical diagnostic device that receives, stores, and shares the medical images from and to DICOM-compliant entities such as imaging modalities (such as X-ray Angiograms (XA), Echocardiograms (US), MRI, CT, CR, DR, IVUS, OCT, PET and SPECT), external PACS, and other diagnostic workstations. It is used in the display and quantification of medical images, after image acquisition from modalities, for post-procedure clinical decision support. It constitutes a PACS for the communication and storage of medical images and provides a worklist of stored medical images that can be used to open patient studies in one of its image viewers. It is intended to display images and related information that are interpreted by trained professionals to render findings and/or diagnosis, but it does not directly generate any diagnosis or potential findings. Not intended for primary diagnosis of mammographic images. Not intended for intra-procedural or real-time use. Not intended for diagnostic use on mobile devices.
The Imagine® Enterprise Suite (IES) has, as its backbone, the IES PACS – a DICOM stack for the communication and storage of medical images. It is based on its predecessor, the HCP DICOM Net® PACS (K023467). The IES is made up of the following modules:
IES_EntViewer: This viewer module can be launched from the IES PACS Worklist and is intended primarily for the review and manipulation of angiographic X-ray images. It also supports the review of images from other modalities in single or combination views, thereby serving as a general-purpose multi-modality viewer.
IES_EchoViewer: This viewer module can be launched from the IES Worklist and is intended for specialized viewing, manipulation, and measurements of Echocardiography images.
IES_RadViewer: This viewer module can be launched from the IES Worklist and is intended for specialized viewing, manipulation, and measurements of Radiological images. It also supports the fusion of Radiological images (such as MRI and CT) with Nuclear Medicine images (such as PET and SPECT).
IES_ZFPViewer: This viewer is intended for non-diagnostic review of medical images over a web browser. It supports an independent worklist and a viewing component that requires no installation for the end user. It works within an intranet or over the internet via user-provided VPN or static IP.
AngioQuant: This module can be launched from the IES_EntViewer to perform automatic quantification of coronary arteries. It uses, as input, the cardiac angiogram studies stored on the IES PACS. It is intended for display and quantification of X-ray angiographic images after image acquisition in the cathlab, for post-procedure clinical decision support within the cathlab workflow. It is not intended for intra-procedural or real-time use. The Imagine® Enterprise Suite (IES) is integrated with ML only for the segmentation of coronary vessels from X-ray angiographic images and uses deep learning methodology for image analysis.
Here's a breakdown of the acceptance criteria and study details for the Imagine® Enterprise Suite, specifically focusing on the AngioQuant module's machine learning component, as described in the provided 510(k) summary:
1. Table of Acceptance Criteria and Reported Device Performance
The 510(k) summary provides a narrative description of the performance evaluation rather than a direct table of acceptance criteria with corresponding performance metrics for every criterion. However, it explicitly states that the performance of the IES_AngioQuant module's machine learning-based coronary vessel segmentation function was evaluated using several metrics and compared against an FDA-cleared predicate device.
| Acceptance Criterion (Inferred from Study Design) | Reported Device Performance (IES_AngioQuant ML component) |
|---|---|
| Quantitative Performance Metrics for Coronary Vessel Segmentation | Evaluated using: |
| Jaccard Index (Intersection over Union) | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Dice Score | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Precision | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Accuracy | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Recall | Value not explicitly stated, but was among the comprehensive set of metrics used for evaluation. |
| Visual Assessment of Segmentation | Conducted in conjunction with quantitative metrics. |
| Comparative Performance to Predicate Device | Performance was compared against the FDA-cleared predicate device, CAAS Workstation (510(k) No. K232147). |
| Reproducibility/Consistency of Ground Truth (Implicit for verification) | Verification performed by two independent board-certified interventional cardiologists. |
Note: The specific numerical values for Jaccard Index, Dice Score, Precision, Accuracy, and Recall are not provided in the summary. The summary highlights that these metrics were used for evaluation.
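The five metrics named above all reduce to pixelwise confusion-matrix counts between the predicted and reference segmentation masks. A minimal sketch (not the submission's evaluation code):

```python
def seg_metrics(pred, truth):
    """Pixelwise Jaccard, Dice, precision, recall, and accuracy for two
    binary masks given as flat 0/1 sequences of equal length."""
    tp = sum(p & t for p, t in zip(pred, truth))
    fp = sum(p & (1 - t) for p, t in zip(pred, truth))
    fn = sum((1 - p) & t for p, t in zip(pred, truth))
    tn = len(pred) - tp - fp - fn
    return {
        "jaccard":   tp / (tp + fp + fn),
        "dice":      2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall":    tp / (tp + fn),
        "accuracy":  (tp + tn) / len(pred),
    }

m = seg_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```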
2. Sample Size and Data Provenance
- Test Set Sample Size: An independent external test set comprising 30 patient studies was used.
- Data Provenance: The dataset consisted of anonymized angiographic studies sourced from multiple U.S. and international clinical sites. It was a retrospective dataset. The dataset included adult patients of mixed gender and represented a range of age, body habitus, and diverse race and ethnicity. Clinically relevant variability, including lesion severity, vessel anatomy, image quality, and imaging equipment vendors, was represented.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Two independent board-certified interventional cardiologists.
- Qualifications of Experts: Each expert had more than 10 years of clinical experience.
4. Adjudication Method for the Test Set
The summary does not explicitly state a formal adjudication method like "2+1" or "3+1" for differences between the experts. However, it states that the ground truth (reference standard) was established using the FDA-cleared Medis QAngio XA (K182611) software, with verification performed by the two independent board-certified interventional cardiologists. This implies that the experts reviewed and confirmed the ground truth generated by the predicate software, rather than independently generating it and then adjudicating differences.
5. MRMC Comparative Effectiveness Study
An MRMC comparative effectiveness study was not explicitly described in the summary. The performance comparison was primarily an algorithm-only comparison against a predicate device (CAAS Workstation) for the ML component. The summary does not mention how much human readers improve with or without AI assistance.
6. Standalone (Algorithm Only) Performance
Yes, a standalone (algorithm only without human-in-the-loop performance) study was done for the IES_AngioQuant module's machine learning-based coronary vessel segmentation function. Its performance was evaluated using quantitative metrics and visual assessment, and then compared against the FDA-cleared predicate device (CAAS Workstation).
7. Type of Ground Truth Used
The ground truth was established using an FDA-cleared software (Medis QAngio XA, K182611), with its output verified by expert consensus of two independent board-certified interventional cardiologists.
8. Sample Size for the Training Set
A total of 762 anonymized angiographic studies were used for training, validation, and internal testing sets combined. The summary does not provide an exact breakdown of how many studies were specifically in the training set versus the validation and internal testing sets.
9. How the Ground Truth for the Training Set Was Established
The summary states that the ground truth ("truthing") for the dataset (which includes the training, validation, and internal testing sets) was established using the FDA-cleared Medis QAngio XA (K182611) software, with verification performed by two independent board-certified interventional cardiologists, each with more than 10 years of clinical experience. Implicitly, this same method was used for establishing ground truth for the training set.