510(k) Data Aggregation

    K Number: K222176
    Device Name: BoneView
    Manufacturer:
    Date Cleared: 2023-03-02 (223 days)
    Product Code:
    Regulation Number: 892.2090
    Reference & Predicate Devices:
    Predicate For:
    Intended Use

    BoneView 1.1-US is intended to analyze radiographs using machine learning techniques to identify and highlight fractures during the review of radiographs of: Ankle, Foot, Knee, Tibia/Fibula, Wrist, Hand, Elbow, Forearm, Humerus, Shoulder, Clavicle, Pelvis, Hip, Femur, Ribs, Thoracic Spine, Lumbosacral Spine. BoneView 1.1-US is intended for use as a concurrent reading aid during the interpretation of radiographs. BoneView 1.1-US is for prescription use only.

    Device Description

    BoneView 1.1-US is a software-only device intended to assist clinicians in the interpretation of limb radiographs of children/adolescents and of limb, pelvis, rib cage, and dorsolumbar vertebrae radiographs of adults. BoneView 1.1-US can be deployed on-premises or in the cloud and can be connected to several computing and X-ray imaging platforms, such as X-ray radiographic systems or PACS. After the radiographs are acquired from the patient and stored in the DICOM Source, they are automatically received by BoneView 1.1-US from the user's DICOM Source through an intermediate DICOM node. Once received by BoneView 1.1-US, the radiographs are automatically processed by the AI algorithm to identify regions of interest. Based on the processing result, BoneView 1.1-US generates result files in DICOM format. These result files consist of a summary table and result images (annotations on a copy of the original images, or annotations that can be toggled on/off). BoneView 1.1-US does not alter the original images, nor does it change the order of original images or delete any image from the DICOM Source. Once available, the result files are sent by BoneView 1.1-US to the DICOM Destination through the same intermediate DICOM node. The DICOM Destination can be used to visualize the result files provided by BoneView 1.1-US or to transfer them to another DICOM host for visualization. The users then use these results as a concurrent reading aid to provide their diagnosis.
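
    The DICOM routing described above (receive from a DICOM Source via an intermediate node, process, and return result files to a DICOM Destination) can be illustrated with a minimal sketch, assuming pydicom is available. The detect_fractures call, tag choices, and file paths are hypothetical placeholders, not the device's actual implementation.

```python
# Minimal sketch of the result-file step described above: read a received
# radiograph, run a (hypothetical) fracture detector, and write an annotated
# copy as a new DICOM object, leaving the original image untouched.
from copy import deepcopy

import pydicom
from pydicom.uid import generate_uid


def detect_fractures(pixel_array):
    """Placeholder for the AI algorithm; returns hypothetical bounding boxes."""
    return [{"x": 100, "y": 120, "w": 64, "h": 48, "label": "FRACT"}]


def build_result_image(src_path: str, dst_path: str) -> None:
    ds = pydicom.dcmread(src_path)            # radiograph from the DICOM Source
    boxes = detect_fractures(ds.pixel_array)  # regions of interest

    result = deepcopy(ds)                     # annotate a copy, never the original
    result.SOPInstanceUID = generate_uid()    # new instance so both objects coexist
    result.SeriesInstanceUID = generate_uid()
    result.file_meta.MediaStorageSOPInstanceUID = result.SOPInstanceUID
    result.SeriesDescription = "Fracture detection result (illustrative)"
    # A real device would render the boxes on the image or as toggleable overlays;
    # here we only record them in a free-text tag to keep the sketch short.
    result.ImageComments = "; ".join(
        f"{b['label']} at ({b['x']},{b['y']}) size {b['w']}x{b['h']}" for b in boxes
    )
    result.save_as(dst_path)                  # forwarded on to the DICOM Destination


build_result_image("incoming/radiograph.dcm", "outgoing/result.dcm")
```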

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

    1. Table of Acceptance Criteria and Reported Device Performance

    The acceptance criteria are not explicitly stated as numerical targets in a table. Instead, the study aims to demonstrate that the device performs with "high sensitivity and high specificity" and that its performance on children/adolescents is "similar" to its performance on adults. For the clinical study, the implicit acceptance criterion is that the diagnostic accuracy of readers aided by BoneView is superior to that of unaided readers.

    However, the document provides the performance metrics for both standalone testing and the clinical study.

    Standalone Performance (Children/Adolescents Clinical Performance Study Dataset)

    Operating Point | Metric | Value (95% Clopper-Pearson CI)
    High-sensitivity (DOUBT FRACT) | Sensitivity | 0.909 [0.889, 0.926]
    High-sensitivity (DOUBT FRACT) | Specificity | 0.821 [0.796, 0.844]
    High-specificity (FRACT) | Sensitivity | 0.792 [0.766, 0.817]
    High-specificity (FRACT) | Specificity | 0.965 [0.952, 0.976]

    Sensitivity is the probability that the device correctly identifies a fracture when a fracture is present; specificity is the probability that it correctly identifies the absence of a fracture when no fracture is present. The high-sensitivity (DOUBT FRACT) operating point is designed to be highly sensitive to possible fractures, potentially including subtle ones, and is indicated by a dotted bounding box; the high-specificity (FRACT) operating point provides a high degree of confidence that a detected fracture is indeed a fracture and is indicated by a solid bounding box.
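
    The values above are exact binomial (Clopper-Pearson) intervals. As a minimal sketch of how such an interval is computed, the snippet below uses scipy; the counts are hypothetical, chosen only to land near the reported 0.909 sensitivity, since the per-class exam counts are not given in the summary.

```python
# Exact (Clopper-Pearson) 95% confidence interval for a proportion such as
# sensitivity or specificity. Counts are hypothetical stand-ins.
from scipy.stats import beta


def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    # Lower/upper bounds from the beta distribution; guard the boundary cases.
    lower = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lower, upper


# e.g. 909 detected fractures out of 1,000 fracture-positive exams (illustrative)
tp, positives = 909, 1000
lo, hi = clopper_pearson(tp, positives)
print(f"sensitivity = {tp / positives:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```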

    Comparative Standalone Performance (Children/Adolescents vs. Adult)

    Operating Point | Dataset | Sensitivity (95% CI) | Specificity (95% CI) | 95% CI on difference (Sensitivity) | 95% CI on difference (Specificity)
    High-sensitivity (DOUBT FRACT) | Adult clinical performance study | 0.928 [0.919, 0.936] | 0.811 [0.800, 0.821] | -0.019 [-0.039, 0.001] | 0.010 [-0.016, 0.037]
    High-sensitivity (DOUBT FRACT) | Children/adolescents clinical performance study | 0.909 [0.889, 0.926] | 0.821 [0.796, 0.844]
    High-specificity (FRACT) | Adult clinical performance study | 0.841 [0.829, 0.853] | 0.932 [0.925, 0.939] | -0.049 [-0.079, -0.021] | 0.033 [0.019, 0.046]
    High-specificity (FRACT) | Children/adolescents clinical performance study | 0.792 [0.766, 0.817] | 0.965 [0.952, 0.976]

    The difference CIs are reported once per operating point and correspond to the children/adolescents estimate minus the adult estimate.
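
    The "95% CI on the difference" columns compare the two datasets. The summary does not state which interval method was used, so the sketch below shows a simple normal-approximation (Wald) interval for the difference of two independent proportions, with hypothetical counts chosen only to roughly reproduce the high-specificity sensitivity row.

```python
# Wald 95% CI for the difference between two independent proportions, e.g.
# children/adolescents minus adult sensitivity. The per-group case counts are
# not given in the summary, so the counts below are illustrative placeholders.
import math


def diff_proportion_ci(k1, n1, k2, n2, z=1.96):
    p1, p2 = k1 / n1, k2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # pooled standard error
    d = p1 - p2
    return d, (d - z * se, d + z * se)


# Illustrative: 792/1000 pediatric vs 3364/4000 adult fracture-positive exams
d, (lo, hi) = diff_proportion_ci(792, 1000, 3364, 4000)
print(f"difference = {d:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```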

    Clinical Study Performance (MRMC - Reader Performance with/without AI assistance)

    Metric | Unaided Performance (95% bootstrap CI) | Aided Performance (95% bootstrap CI) | Increase
    Specificity | 0.906 (0.898, 0.913) | 0.956 (0.951, 0.960) | +5.0 percentage points
    Sensitivity | 0.648 (0.640, 0.656) | 0.752 (0.745, 0.759) | +10.4 percentage points
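
    The reader-level estimates carry 95% bootstrap CIs. A minimal case-resampling sketch is shown below; the read outcomes are synthetic, and the actual study used a fully crossed MRMC analysis rather than this simple pooling across readers.

```python
# Case-resampling bootstrap CI for a pooled sensitivity, as a rough analogue of
# the 95% bootstrap CIs reported for the MRMC study.
import numpy as np

rng = np.random.default_rng(0)

# 1 = reader marked a fracture, for 480 hypothetical fracture-positive reads
reads = rng.binomial(1, 0.752, size=480)

boot = np.empty(2000)
for i in range(2000):
    sample = rng.choice(reads, size=reads.size, replace=True)  # resample cases
    boot[i] = sample.mean()

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"sensitivity = {reads.mean():.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```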

    2. Sample sizes used for the test set and data provenance:

    • Standalone Performance Test Set:
      • Children/Adolescents: 2,000 radiographs (52.8% males, age range [2 – 21]; mean 11.54 +/- 4.7). The anatomical areas of interest included all those in the Indications for Use for this population group.
      • Adults (cited from predicate device K212365): 8,918 radiographs (47.2% males, age range [21 – 113]; mean 52.5 +/- 19.8). The anatomical areas of interest included all those in the Indications for Use for this population group.
    • Clinical Study Test Set (MRMC): 480 cases (31.9% males, age range [21 – 93]; mean 59.2 +/- 16.4). These cases were from all anatomical areas of interest included in BoneView's Indications for Use.
    • Data Provenance: The document states "various manufacturers" (e.g., Canon, Fujifilm, GE Healthcare, Konica Minolta, Philips, Primax, Samsung, Siemens for standalone data; GE Healthcare, Kodak, Konica Minolta, Philips, Samsung for clinical study data). The general context implies a European or North American source for the regulatory submission (France for the manufacturer, FDA for the review). It is explicitly stated that these datasets were independent of training data. The studies are described as retrospective.

    3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

    • Clinical Study (MRMC Test Set): Ground truth was established by a panel of three U.S. board-certified radiologists. No further details on their years of experience are provided, only their certification.
    • Standalone Test Sets (Children/Adolescents & Adult): The document doesn't explicitly state the number or qualifications of experts used to establish ground truth for the standalone test sets. However, it indicates these datasets were used for "diagnostic performances," implying a definitive ground truth. Given the rigorous nature of FDA submissions, it's highly probable that board-certified radiologists or other qualified medical professionals established this ground truth.

    4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:

    • Clinical Study (MRMC Test Set): The ground truth was established by a panel of three U.S. board-certified radiologists. The method of adjudication (e.g., majority vote, discussion to consensus) is not explicitly detailed, but the document states they "assigned a ground truth label." This strongly suggests a consensus or majority-based method from the panel of three, rather than a 2+1 or 3+1 scheme with a tie-breaker (a minimal majority-vote sketch follows this list).
    • Standalone Test Sets: Not explicitly stated, though a panel or consensus method is standard for robust ground truth establishment.
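
    A minimal majority-vote sketch over hypothetical per-case panel labels, matching the consensus rule suggested above:

```python
# Majority-vote adjudication across a three-reader truthing panel.
# The case labels are hypothetical; the summary does not state the actual rule.
from collections import Counter

panel_labels = {
    "case_001": ["fracture", "fracture", "no fracture"],
    "case_002": ["no fracture", "no fracture", "no fracture"],
}

ground_truth = {
    case: Counter(labels).most_common(1)[0][0]  # label chosen by >= 2 of 3 readers
    for case, labels in panel_labels.items()
}
print(ground_truth)  # {'case_001': 'fracture', 'case_002': 'no fracture'}
```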

    5. If a multi reader multi case (MRMC) comparative effectiveness study was done, if so, what was the effect size of how much human readers improve with AI vs without AI assistance:

    • Yes, a fully-crossed multi-reader, multi-case (MRMC) retrospective reader study was conducted.
    • Effect Size of Improvement with AI Assistance:
      • Specificity: Improved by 5.0 percentage points (from 0.906 unaided to 0.956 aided).
      • Sensitivity: Improved by 10.4 percentage points (from 0.648 unaided to 0.752 aided).
      • The study found that "the diagnostic accuracy of readers in the intended use population is superior when aided by BoneView than when unaided by BoneView."
      • Subgroup analysis also found that "Sensitivity and Specificity were higher for Aided reads versus Unaided reads for all of the anatomical areas of interest."

    6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:

    • Yes, standalone performance testing was conducted for both the children/adolescent population and the adult population (the latter referencing the predicate device's data). The results are provided in the tables under section 1.

    7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

    • Expert Consensus: The ground truth for the clinical MRMC study was established by a "panel of three U.S. board-certified radiologists who assigned a ground truth label indicating the presence of a fracture and its location." For the standalone testing, although not explicitly stated, it is commonly established by expert interpretation of the radiographs, often through consensus, to determine the presence or absence of fractures.

    8. The sample size for the training set:

    • The training of BoneView was performed on a training dataset of 44,649 radiographs, representing 151,096 images. This dataset covered all anatomical areas of interest in the Indications for Use and was sourced from various manufacturers.

    9. How the ground truth for the training set was established:

    • The document implies that the "training was performed on a training dataset... for all anatomical areas of interest." While it doesn't explicitly state how ground truth was established for this massive training set, it is standard practice for medical imaging AI that ground truth for training data is established through expert annotation (e.g., radiologists, orthopedic surgeons) of the images, typically through a labor-intensive review process.