Rayvolve LN is a computer-aided detection software device that identifies and marks regions related to suspected pulmonary nodules from 6 to 30 mm in size. It is designed to aid radiologists, as a second reader, in reviewing frontal (AP/PA) chest radiographs of patients 18 years of age or older acquired on digital radiographic systems, and to be used with any DICOM Node server. Rayvolve LN provides adjunctive information only and is not a substitute for the original chest radiographic image.
The medical device is called Rayvolve LN, one of the verticals of the Rayvolve product line. It is standalone software that uses deep learning techniques to detect and localize pulmonary nodules on chest X-rays. Rayvolve LN is intended to be used as a diagnostic aid and does not operate autonomously.
Rayvolve LN has been developed to use the current edition of the DICOM image standard. DICOM is the international standard for transmitting, storing, retrieving, printing, processing, and displaying medical imaging information.
Using the DICOM standard allows Rayvolve LN to interact with existing DICOM Node servers (e.g., PACS) and clinical-grade image viewers. The device is designed to run on premises or on a cloud platform connected to the radiology center's local network, where it can interact with the DICOM Node server.
When remotely connected to a medical center's DICOM Node server, Rayvolve LN interacts directly with the DICOM files to output its prediction (the potential presence of pulmonary nodules); the original image appears first, followed by the image processed by Rayvolve LN.
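The submission does not detail the integration mechanics beyond the use of DICOM. Purely as an illustration of this kind of workflow, the sketch below shows how a CADe service might push the original and processed images back to a DICOM node using pydicom/pynetdicom; the host, port, AE titles, and file names are hypothetical placeholders, not values from the Rayvolve LN documentation.

```python
# Minimal sketch of returning images to a DICOM node (e.g., a PACS).
# All addresses, AE titles, and file names below are hypothetical stand-ins.
from pydicom import dcmread
from pynetdicom import AE, StoragePresentationContexts

ae = AE(ae_title="CADE_SCU")                     # our application entity
ae.requested_contexts = StoragePresentationContexts

# Original radiograph first, then the CADe-annotated copy, mirroring the
# display order described above (unprocessed image before the marked one).
datasets = [dcmread("original.dcm"), dcmread("processed.dcm")]

assoc = ae.associate("pacs.example.local", 104, ae_title="PACS_SCP")
if assoc.is_established:
    for ds in datasets:
        status = assoc.send_c_store(ds)          # DICOM C-STORE request
        if status:
            print(f"C-STORE status: 0x{status.Status:04X}")
    assoc.release()
```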
Rayvolve LN is not intended to replace medical doctors. The instructions for use are systematically provided to each user and are used to train them in the use of Rayvolve LN.
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
The document does not explicitly list a table of acceptance criteria in the sense of predefined thresholds for performance metrics. Instead, it describes a comparative study whose acceptance criteria are superiority to unaided readers and comparability to a predicate device. The reported device performance is presented as the outcomes of these studies.
However, we can infer the performance metrics used for evaluation.
Inferred Acceptance Criteria & Reported Device Performance:
| Performance Metric | Acceptance Criteria (Implied) | Reader Performance (Unaided) | Reader Performance (Aided) | Standalone Rayvolve LN Performance |
| --- | --- | --- | --- | --- |
| Reader AUC (Diagnostic Accuracy) | Superior to unaided reader performance; comparable to predicate | 0.8071 | 0.8583 | Not applicable |
| Reader Sensitivity (per image) | Significantly improved over unaided reading | 0.7975 | 0.8935 | Not applicable |
| Reader Specificity (per image) | Improved over unaided reading | 0.8235 | 0.8510 | Not applicable |
| Standalone Sensitivity | Demonstrates accurate nodule detection | Not applicable | Not applicable | 0.8847 |
| Standalone Specificity | Demonstrates accurate nodule detection | Not applicable | Not applicable | 0.8294 |
| Standalone AUC (ROC) | Demonstrates accurate nodule detection | Not applicable | Not applicable | 0.8408 |
Note: The direct "acceptance criteria" are implied by the study's primary and secondary objectives (i.e., improvement over unaided reading and comparability to a predicate device). The table above synthesizes the key performance metrics reported.
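For orientation, the per-image metrics in the table (sensitivity, specificity, AUC) are standard quantities computable from binary ground-truth labels and model scores. The sketch below uses scikit-learn with toy arrays, not the study data, and the 0.5 operating threshold is an arbitrary assumption.

```python
# Sketch: per-image sensitivity, specificity, and AUC from labels and scores.
# The arrays below are toy values for illustration, not study data.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])       # 1 = nodule present
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])
y_pred = (y_score >= 0.5).astype(int)             # assumed operating threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                      # true positive rate
specificity = tn / (tn + fp)                      # true negative rate
auc = roc_auc_score(y_true, y_score)              # threshold-free accuracy
print(f"sens={sensitivity:.4f} spec={specificity:.4f} auc={auc:.4f}")
```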
Study Details:
1. Sample Sizes and Data Provenance:
- Test Set (Standalone Performance): 2181 radiographs. The data provenance is not explicitly stated, either in terms of country of origin or whether collection was retrospective or prospective. The dataset is described as covering "all the study types and views in the indication for use."
- Test Set (Clinical Data - MRMC Study): 400 cases. These cases were "randomly sampled from the validation dataset used for the standalone performance study," implying they are a subset of the 2181 radiographs mentioned above.
- Training Set: The sample size for the training set is not provided in the document.
2. Number of Experts for Ground Truth & Qualifications:
- Number of Experts: The document does not explicitly state the number of experts used to establish the ground truth for the test set. It mentions "ground truth binary labeling indicating the presence or absence of pulmonary nodules" for the MRMC study but doesn't detail how this ground truth was derived.
- Qualifications of Experts: Not specified.
3. Adjudication Method for the Test Set:
- The adjudication method for establishing ground truth is not explicitly detailed. It merely states "ground truth binary labeling indicating the presence or absence of pulmonary nodules."
4. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- Yes, an MRMC study was done.
- Effect Size of Improvement:
- Reader AUC: Improved from 0.8071 (unaided) to 0.8583 (aided), a difference of 0.0511 (95% CI: 0.0501; 0.0518).
- Reader Sensitivity (per image): Improved from 0.7975 (unaided) to 0.8935 (aided), a difference of 0.096.
- Reader Specificity (per image): Improved from 0.8235 (unaided) to 0.8510 (aided), a difference of 0.0275.
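The document does not state which MRMC method (e.g., Dorfman-Berbaum-Metz or Obuchowski-Rousseau variance components) produced these intervals. As a loose illustration only, the reader-averaged AUC improvement behind such an effect size can be sketched with synthetic scores; every array, size, and parameter below is invented for the example and is not the study's analysis.

```python
# Simplified sketch of a reader-averaged AUC improvement (NOT the actual
# MRMC analysis, which typically uses DBM/OR-style variance components).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_cases, n_readers = 400, 12                      # hypothetical sizes
y_true = rng.integers(0, 2, n_cases)              # toy case truth

def reader_scores(bias):
    # Toy score model: case signal plus reader noise; 'bias' mimics the
    # lift from aided reading. Purely illustrative.
    return y_true * (0.5 + bias) + rng.normal(0, 0.4, (n_readers, n_cases))

auc_unaided = [roc_auc_score(y_true, s) for s in reader_scores(0.0)]
auc_aided = [roc_auc_score(y_true, s) for s in reader_scores(0.2)]
delta = np.mean(auc_aided) - np.mean(auc_unaided)
print(f"mean reader AUC improvement: {delta:.4f}")
```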
5. Standalone Performance (Algorithm Only):
- Yes, a standalone performance assessment was done.
- Reported Metrics:
- Sensitivity: 0.8847 (95% CI: 0.8638; 0.9028)
- Specificity: 0.8294 (95% CI: 0.8066; 0.9028)
- AUC: 0.8408 (95% Bootstrap CI: 0.8272; 0.8548)
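The AUC interval above is reported as a bootstrap CI. A minimal sketch of a case-level percentile bootstrap follows; the synthetic data, the score model, and the 2,000-resample count are assumptions for illustration, not the study's protocol.

```python
# Sketch: percentile bootstrap CI for AUC over cases (toy data only).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 2181)                 # size mirrors the 2181-image set
y_score = y_true * 0.6 + rng.normal(0, 0.5, 2181)

aucs = []
for _ in range(2000):                             # arbitrary resample count
    idx = rng.integers(0, len(y_true), len(y_true))   # resample cases with replacement
    if y_true[idx].min() == y_true[idx].max():
        continue                                  # skip degenerate one-class resamples
    aucs.append(roc_auc_score(y_true[idx], y_score[idx]))

lo, hi = np.percentile(aucs, [2.5, 97.5])
print(f"AUC={roc_auc_score(y_true, y_score):.4f}, 95% bootstrap CI=({lo:.4f}, {hi:.4f})")
```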
6. Type of Ground Truth Used:
- The ground truth for both the standalone and MRMC studies was described as "ground truth binary labeling indicating the presence or absence of pulmonary nodules." The document does not specify whether this was expert consensus, pathology, or outcomes data. However, the context of detecting nodules on chest radiographs for radiologists implies expert consensus as the most probable method.
7. Sample Size for the Training Set:
- Not provided in the document.
8. How Ground Truth for the Training Set was Established:
- Not provided in the document. The document only mentions that the device uses "deep learning techniques" and "supervised Deep learning," which implies labeled training data was used, but details on its establishment are absent.
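Since the filing only says that supervised deep learning with labeled data was used, any concrete pipeline is speculative. As a generic illustration of supervised training for binary nodule classification, a minimal PyTorch skeleton might look like the following; the architecture, image sizes, and labels are stand-ins, not Rayvolve LN's model.

```python
# Generic supervised-learning skeleton for binary nodule classification.
# Everything here (architecture, shapes, labels) is a hypothetical stand-in;
# the actual Rayvolve LN model and training data are not described in the filing.
import torch
import torch.nn as nn

model = nn.Sequential(                     # toy CNN, not the device's network
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 1),
)
loss_fn = nn.BCEWithLogitsLoss()           # binary label: nodule present / absent
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Synthetic batch standing in for labeled chest radiographs.
images = torch.randn(8, 1, 224, 224)
labels = torch.randint(0, 2, (8, 1)).float()

for step in range(5):                      # abbreviated training loop
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss={loss.item():.4f}")
```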
§ 892.2070 Medical image analyzer.
(a) Identification. Medical image analyzers, including computer-assisted/aided detection (CADe) devices for mammography breast cancer, ultrasound breast lesions, radiograph lung nodules, and radiograph dental caries detection, is a prescription device that is intended to identify, mark, highlight, or in any other manner direct the clinicians' attention to portions of a radiology image that may reveal abnormalities during interpretation of patient radiology images by the clinicians. This device incorporates pattern recognition and data analysis capabilities and operates on previously acquired medical images. This device is not intended to replace the review by a qualified radiologist, and is not intended to be used for triage, or to recommend diagnosis.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Design verification and validation must include:
(i) A detailed description of the image analysis algorithms including a description of the algorithm inputs and outputs, each major component or block, and algorithm limitations.
(ii) A detailed description of pre-specified performance testing methods and dataset(s) used to assess whether the device will improve reader performance as intended and to characterize the standalone device performance. Performance testing includes one or more standalone tests, side-by-side comparisons, or a reader study, as applicable.
(iii) Results from performance testing that demonstrate that the device improves reader performance in the intended use population when used in accordance with the instructions for use. The performance assessment must be based on appropriate diagnostic accuracy measures (e.g., receiver operator characteristic plot, sensitivity, specificity, predictive value, and diagnostic likelihood ratio). The test dataset must contain a sufficient number of cases from important cohorts (e.g., subsets defined by clinically relevant confounders, effect modifiers, concomitant diseases, and subsets defined by image acquisition characteristics) such that the performance estimates and confidence intervals of the device for these individual subsets can be characterized for the intended use population and imaging equipment.
(iv) Appropriate software documentation (e.g., device hazard analysis; software requirements specification document; software design specification document; traceability analysis; description of verification and validation activities including system level test protocol, pass/fail criteria, and results; and cybersecurity).
(2) Labeling must include the following:
(i) A detailed description of the patient population for which the device is indicated for use.
(ii) A detailed description of the intended reading protocol.
(iii) A detailed description of the intended user and user training that addresses appropriate reading protocols for the device.
(iv) A detailed description of the device inputs and outputs.
(v) A detailed description of compatible imaging hardware and imaging protocols.
(vi) Discussion of warnings, precautions, and limitations must include situations in which the device may fail or may not operate at its expected performance level (e.g., poor image quality or for certain subpopulations), as applicable.
(vii) Device operating instructions.
(viii) A detailed summary of the performance testing, including: test methods, dataset characteristics, results, and a summary of sub-analyses on case distributions stratified by relevant confounders, such as lesion and organ characteristics, disease stages, and imaging equipment.