Rayvolve is a computer-assisted detection and diagnosis (CAD) software device to assist radiologists and emergency physicians in detecting fractures during the review of radiographs of the musculoskeletal system. Rayvolve is indicated for adult and pediatric populations (≥ 2 years).

Rayvolve is indicated for radiographs of the following industry-standard radiographic views and study types.

Study type (Anatomic Area of interest) / Radiographic Views* supported:
- Ankle / AP, Lateral, Oblique
- Clavicle / AP, AP Angulated View
- Elbow / AP, Lateral
- Forearm / AP, Lateral
- Hip / AP, Frog-leg lateral
- Humerus / AP, Lateral
- Knee / AP, Lateral
- Pelvis / AP
- Shoulder / AP, Lateral, Axillary
- Tibia/fibula / AP, Lateral
- Wrist / PA, Lateral, Oblique
- Hand / PA, Lateral, Oblique
- Foot / AP, Lateral, Oblique

Definitions of anatomic area of interest and radiographic views are consistent with the ACR-SPR-SSR Practice Parameter for the Performance of Radiography of the Extremities guideline.

Device Description

The medical device is called Rayvolve. It is standalone software that uses deep learning techniques to detect and localize fractures on osteoarticular X-rays. Rayvolve is intended to be used as an aided-diagnosis device and does not operate autonomously.

Rayvolve has been developed to use the current edition of the DICOM image standard. DICOM is the international standard for transmitting, storing, printing, processing, and displaying medical imaging.

Using the DICOM standard allows Rayvolve to interact with existing DICOM Node servers (e.g., PACS) and clinical-grade image viewers. The device is designed to run on-premise or on a cloud platform, connected to the radiology center's local network, and can interact with the DICOM Node server.

When remotely connected to a medical center DICOM Node server, Rayvolve directly interacts with the DICOM files to output the prediction (potential presence or absence of fracture). The initial image appears first, followed by the image processed by Rayvolve.

Rayvolve is not intended to replace medical doctors. The instructions for use are strictly and systematically transmitted to each user and used to train them on Rayvolve's use.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) summary for Rayvolve:

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are not explicitly listed in a single table with defined thresholds. However, based on the performance data presented, the implicit acceptance criteria for standalone performance appear to be:

High Sensitivity, Specificity, and AUC for fracture detection.
Non-inferiority of the retrained algorithm (including pediatric population) compared to the predicate device, specifically by ensuring the lower bound of the difference in AUCs (Retrained - Predicate) for each anatomical area is greater than -0.05.
Superior diagnostic accuracy of readers when aided by Rayvolve compared to unaided readers, as measured by AUC in an MRMC study.
Improved sensitivity and specificity for readers when aided by Rayvolve.
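
The non-inferiority criterion in the list above reduces to comparing a lower confidence bound against a fixed margin. The following is a minimal sketch, assuming the lower bound of the AUC difference (Retrained - Predicate) has already been computed for each anatomical area. The function name and the per-area values are hypothetical placeholders, since the summary reports only the conclusion and not the individual bounds.

```python
# Non-inferiority margin on the AUC difference (Retrained - Predicate),
# taken from the criterion stated in the summary.
MARGIN = 0.05


def is_non_inferior(lower_bound: float, margin: float = MARGIN) -> bool:
    """True when the lower CI bound of (retrained AUC - predicate AUC) is above -margin."""
    return lower_bound > -margin


# Hypothetical lower bounds for illustration only; NOT values from the 510(k) summary.
hypothetical_lower_bounds = {"ankle": -0.012, "wrist": -0.004, "hip": -0.031}

# Evaluate each anatomical area, then require every area to pass.
results = {area: is_non_inferior(lb) for area, lb in hypothetical_lower_bounds.items()}
all_pass = all(results.values())
print(results, all_pass)
```

The comparison is strict, so a lower bound of exactly -0.05 would not satisfy the criterion as worded ("greater than -0.05").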

Table: Acceptance Criteria (Implicit) and Reported Device Performance

Acceptance Criterion (Implicit)	Reported Device Performance (Standalone & MRMC Studies)
Standalone Performance (Pediatric Population Inclusion)
High Sensitivity for fracture detection in pediatric population (implicitly > 0.90 based on predicate).	0.9611 (95% CI: 0.9480; 0.9710)
High Specificity for fracture detection in pediatric population (implicitly > 0.80 based on predicate).	0.8597 (95% CI: 0.8434; 0.8745)
High AUC for fracture detection in pediatric population (implicitly > 0.90 based on predicate).	0.9399 (95% Bootstrap CI: 0.9330; 0.9470)
Non-inferiority of Retrained Algorithm (compared to Predicate for adult & pediatric)
Lower bound of difference in AUCs (Retrained - Predicate) > -0.05 for all anatomical areas.	"The lower bounds of the differences in AUCs for the Retrained model compared to the Predicate model are all greater than -0.05, indicating that the Retrained model's performance is not inferior to the Predicate model across all organs." (Specific values for each organ are not provided, only the conclusion that they meet the criterion.) The Total AUC for Retrained is 0.98781 (0.98247; 0.99048) compared to Predicate 0.98607 (0.98104; 0.99058). Overlapping CIs and the non-inferiority statement support this. This suggests the inclusion of pediatric data did not degrade performance in adult data.
MRMC Clinical Reader Study
Diagnostic accuracy (AUC) of readers aided by Rayvolve is superior to unaided readers.	Reader AUC improved from 0.84602 to 0.89327, a difference of 0.04725 (95% CI: 0.03376; 0.061542) (p=0.0041). This demonstrates statistically significant superiority.
Reader sensitivity is improved with Rayvolve assistance.	Reader sensitivity improved from 0.86561 (95% Wilson's CI: 0.84859, 0.88099) to 0.9554 (95% Wilson's CI: 0.94453, 0.96422).
Reader specificity is improved with Rayvolve assistance.	Reader specificity improved from 0.82645 (95% Wilson's CI: 0.81187, 0.84012) to 0.83116 (95% Wilson's CI: 0.81673, 0.84467).
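
The sensitivity and specificity intervals in the table are Wilson score intervals. As a rough illustration of how such an interval is computed from a count of correct calls, here is a minimal sketch using the standard Wilson formula. The counts in the example are hypothetical, because the summary does not report the numerators and denominators behind these proportions.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.959964 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    # The interval is centered on a value shrunk toward 0.5, not on p itself.
    center = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration only (not from the summary):
# 90 fractures correctly flagged out of 100 fracture cases.
low, high = wilson_interval(90, 100)
print(f"sensitivity 0.90, 95% Wilson CI: ({low:.4f}, {high:.4f})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and behaves sensibly for proportions near 1, which is the regime of the aided sensitivity reported here.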

2. Sample Sizes and Data Provenance

Test Set (Pediatric Standalone Study):
- Sample Size: 3016 radiographs.
- Data Provenance: Not explicitly stated regarding country of origin. The study was retrospective.
Test Set (Adult Predicate Standalone Study - for comparison):
- Sample Size: 2626 radiographs.
- Data Provenance: Not explicitly stated regarding country of origin.
Test Set (MRMC Clinical Reader Study):
- Sample Size: 186 cases.
- Data Provenance: Not explicitly stated regarding country of origin. The study was retrospective.
Training Set:
- Sample Size: 150,000 osteoarticular radiographs. (Expanded from 115,000 for the predicate device).
- Data Provenance: Not explicitly stated regarding country of origin.

3. Number of Experts and Qualifications for Ground Truth (Test Set)

Number of Experts: A panel of three (3) US board-certified MSK radiologists.
Qualifications of Experts: US board-certified MSK (Musculoskeletal) radiologists. Years of experience are not specified, but board certification implies a certain level of expertise.

4. Adjudication Method for the Test Set (Ground Truth Establishment)

Method: "Each case had been previously evaluated by a panel of three US board-certified MSK radiologists to provide ground truth binary labeling the presence or absence of fracture and the localization information for fractures." The term "panel" suggests a consensus-based ground truth, but the summary does not state how disagreements among the three radiologists were resolved. No specific adjudication rule (such as "2+1" or "3+1") is mentioned.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

Was it done?: Yes, a fully crossed multi-reader, multi-case (MRMC) retrospective reader study was done.
Effect Size of Improvement:
- AUC Improvement: Reader AUC was significantly improved from 0.84602 (unaided) to 0.89327 (aided), resulting in a difference (effect size) of 0.04725 (95% CI: 0.03376; 0.061542) (p=0.0041).
- Sensitivity Improvement: Reader sensitivity improved from 0.86561 (unaided) to 0.9554 (aided).
- Specificity Improvement: Reader specificity improved from 0.82645 (unaided) to 0.83116 (aided).
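
The effect sizes above can be checked with simple arithmetic on the reported values. The sketch below recomputes each aided-minus-unaided difference and, as a rough heuristic only, checks whether the reported Wilson intervals for the unaided and aided conditions overlap. The helper names are illustrative, and interval overlap is not the statistical test used in the MRMC analysis, which the summary does not detail for sensitivity and specificity.

```python
from typing import Tuple

Interval = Tuple[float, float]


def delta(unaided: float, aided: float) -> float:
    """Aided minus unaided, rounded to the precision used in the summary."""
    return round(aided - unaided, 5)


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Point estimates reported in the summary (unaided, aided).
auc_delta = delta(0.84602, 0.89327)
sens_delta = delta(0.86561, 0.9554)
spec_delta = delta(0.82645, 0.83116)

# Wilson 95% intervals reported in the summary.
sens_overlap = intervals_overlap((0.84859, 0.88099), (0.94453, 0.96422))
spec_overlap = intervals_overlap((0.81187, 0.84012), (0.81673, 0.84467))

print(f"AUC +{auc_delta}, sensitivity +{sens_delta}, specificity +{spec_delta}")
print(f"sensitivity CIs overlap: {sens_overlap}; specificity CIs overlap: {spec_overlap}")
```

On these figures the sensitivity gain (about 0.09) is much larger than the specificity gain (about 0.005), and only the specificity intervals overlap, so the specificity improvement is best read as small.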

6. Standalone (Algorithm Only) Performance Study

Was it done?: Yes, standalone performance assessments were conducted for both the pediatric population inclusion and the retrained algorithm.
- Pediatric Standalone Study: Sensitivity (0.9611), Specificity (0.8597), and AUC (0.9399) were reported.
- Retrained Algorithm Standalone Study: Non-inferiority was assessed by comparing AUCs against the predicate device's standalone performance, showing improvements or non-inferiority across body parts (e.g., Total AUC for retrained was 0.98781 vs. predicate 0.98607).
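
For readers unfamiliar with the standalone metrics, the sketch below shows one common way to compute them from per-image labels and algorithm scores: AUC as the probability that a fracture image scores higher than a non-fracture image (the Mann-Whitney formulation), and sensitivity and specificity at a fixed decision threshold. The toy labels, scores, and the 0.5 threshold are assumptions for illustration; the summary does not describe the device's score scale or operating point.

```python
from typing import Sequence, Tuple


def auc_mann_whitney(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called 'fracture'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only: 1 = fracture present, 0 = no fracture.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = auc_mann_whitney(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.4f}, sensitivity={sens:.4f}, specificity={spec:.4f}")
```

AUC is threshold-independent, while sensitivity and specificity depend on the chosen operating point, which is why the summary reports all three.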

7. Type of Ground Truth Used

For Test Sets (Standalone & MRMC): Expert consensus by a panel of three US board-certified MSK radiologists. They provided binary labeling (presence/absence of fracture) and localization information (bounding boxes) for fractures.

8. Sample Size for the Training Set

Sample Size: 150,000 osteoarticular radiographs.

9. How Ground Truth for the Training Set was Established

The document states that the "training dataset for the subject device was expanded to include 150,000 osteoarticular radiographs". While it confirms the size and composition (mixed adult/pediatric, osteoarticular radiographs), it does not explicitly describe how the ground truth for this training set was established. It mentions that the "previous truthed predicate test dataset was strictly walled off and not included in the new training dataset," implying that the training data was "truthed," but the method (e.g., expert review, automated labeling, etc.) is not detailed. Given the large training set size, it is common for such datasets to be curated through a combination of established clinical reports, expert review, or semi-automated processes, but the specific methodology is not provided in this summary.
