510(k) Data Aggregation (189 days)
MR Contour DL generates a Radiotherapy Structure Set (RTSS) DICOM with segmented organs at risk which can be used by trained medical professionals. It is intended to aid in radiation therapy planning by generating initial contours to accelerate workflow for radiation therapy planning. It is the responsibility of the user to verify the processed output contours and user-defined labels for each organ at risk and correct the contours/labels as needed. MR Contour DL is intended to be used with images acquired on MR scanners, in adult patients.
MR Contour DL is a post-processing application intended to assist a clinician by generating contours of organs at risk (OARs) from MR images in the form of a DICOM Radiotherapy Structure Set (RTSS) series. MR Contour DL is designed to automatically contour organs in the head/neck and in the pelvis for Radiation Therapy (RT) planning of adult cases. Its output is intended to be used by RT practitioners after reviewing the contours, editing them if necessary, and confirming their accuracy for use in radiation therapy planning.
MR Contour DL uses customizable input parameters that define the RTSS description, RTSS labeling, and organ naming and coloring. MR Contour DL does not have a user interface of its own and can be integrated with other software and hardware platforms. MR Contour DL can transfer the input and output series to the desired DICOM destination(s) for review.
MR Contour DL uses deep learning segmentation algorithms that have been designed and trained specifically for the task of generating organ at risk contours from MR images. MR Contour DL is designed to contour 37 different organs or structures using the deep learning algorithms in the application processing workflow.
The input of the application is MR DICOM images of adult patients acquired on compatible MR scanners. Through a user-configured profile, the user can choose both the anatomy covered by the input scan and the specific organs to segment. The proposed device has been tested on GE HealthCare MR data.
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) clearance letter for MR Contour DL:
1. Table of Acceptance Criteria and Reported Device Performance
Device: MR Contour DL
| Metric | Organ Category (examples) | Acceptance Criterion | Reported Performance (Mean) | Outcome |
|---|---|---|---|---|
| DICE Similarity Coefficient (DSC) | Small organs (e.g., chiasm, inner ear) | ≥ 50% | 67.4% - 98.8% | Met |
| DICE Similarity Coefficient (DSC) | Medium organs (e.g., brainstem, eye) | ≥ 65% | 79.6% - 95.5% | Met |
| DICE Similarity Coefficient (DSC) | Large organs (e.g., bladder, head-body) | ≥ 80% | 90.3% - 99.3% | Met |
| 95th-percentile Hausdorff Distance (HD95) | All organs | Improved or equivalent vs. predicate device | Improved or equivalent in 24/28 organs analyzed; average HD95 of 4.7 mm (below predicate average) | Met |
| Likert score (reader study) | All organs | Mean ≥ 3.0 (3 = good, some correction needed) | 3.0 - 4.5 | Met |
Note: The HD95 values for specific organs are provided in Table 4 of the document, showing individual comparisons (Improved, Not-Improved, Equivalent, N/A). The overall performance for HD95 is summarized as met based on the text "improved or equivalent HD95 value in 24/28 of the organs analyzed and an average HD95 performance of 4.7 mm, which is smaller than the average corresponding HD95 values of the predicate device."
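The two geometric metrics in the table above have standard definitions, which can be sketched for binary segmentation masks as follows (a minimal illustration, not the manufacturer's implementation; the brute-force HD95 is only practical for small point sets):

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """DICE Similarity Coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hd95(a_pts: np.ndarray, b_pts: np.ndarray) -> float:
    """95th-percentile symmetric Hausdorff distance between two
    (N x D) arrays of boundary coordinates, computed brute force."""
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    d_ab = d.min(axis=1)  # each point in a -> nearest point in b
    d_ba = d.min(axis=0)  # each point in b -> nearest point in a
    return float(np.percentile(np.concatenate([d_ab, d_ba]), 95))
```

Unlike DSC, which rewards volumetric overlap, HD95 penalizes boundary outliers while discarding the worst 5% of surface distances, which is why the two metrics are reported together.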
2. Sample Sizes and Data Provenance
- Test Set (Non-Clinical/Bench Testing):
- Total Cases: 105 retrospectively collected exams.
- Head/Neck: 50 cases (23 from independently collected cohorts, 27 separated from development data)
- Pelvis: 55 cases (32 from independently collected cohorts, 23 separated from development data)
- Data Provenance:
- Country of Origin: USA (72% of Head/Neck, 58% of Pelvis cases) and Europe (Netherlands: 28% of Head/Neck; UK: 42% of Pelvis cases)
- Retrospective/Prospective: Retrospectively collected
- Test Set (Clinical/Reader Study):
- Total Cases: 70 cases (a subset of the non-clinical test data).
- Head/Neck: 30 cases
- Pelvis: 40 cases
- Data Provenance: Same as non-clinical testing, as it was a subset.
- Training Set: Not explicitly stated. The document mentions "separated from the development data cohorts before the models were trained," implying a training set existed but its size is not given.
3. Number of Experts and Qualifications for Ground Truth (Test Set)
- Number of Experts: Three (3) board-certified radiation oncologists.
- Qualifications: All board-certified radiation oncologists; two (2) from the USA and one (1) from Europe. Experience level (e.g., years in practice) is not specified beyond board certification.
4. Adjudication Method (Test Set)
- Non-Clinical/Bench Testing (Ground Truth Generation):
- Manual contours delineated by GEHC operators trained using international guidelines (DAHANCA, RTOG).
- Manual contours were revised (corrected and approved) by the three board-certified radiation oncologists.
- All three independently validated ground-truth contours were incorporated into the performance evaluation. This suggests a consensus or voting mechanism, but the exact adjudication method (e.g., 2+1, averaging) is not detailed; the final ground truth appears to have been derived from the combination of all three expert reviews.
- Clinical/Reader Study:
- Automated contours were scored by the three certified radiation oncologists.
- Readers completed their assessments independently and were blinded to the results of other readers' assessments.
- All three readers' independently provided Likert scores were incorporated into the performance evaluation. As with ground-truth generation, the exact method of combining scores is not specified, but the final reported value is a mean Likert score.
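Averaging independent reader scores against the acceptance threshold can be sketched as follows (the score values are hypothetical, invented for illustration; only the three-reader design and the mean ≥ 3.0 criterion come from the document):

```python
import numpy as np

# Hypothetical Likert scores: rows = the 3 readers, cols = organs.
# Scale: 1 (unusable) ... 5 (no correction needed); 3 = good, some correction.
scores = np.array([
    [3, 4, 5],
    [3, 3, 4],
    [4, 4, 5],
])

mean_per_organ = scores.mean(axis=0)   # average across the three readers
passed = mean_per_organ >= 3.0         # acceptance criterion: mean >= 3.0
```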
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done? Yes, a multi-reader study was conducted to assess the adequacy of the contours. The readers were board-certified radiation oncologists providing an assessment (Likert score) of the AI-generated contours.
- Effect Size of Human Readers' Improvement with AI vs. without AI Assistance: This study was structured to evaluate the adequacy of AI-generated contours for use in RT planning, with human readers providing an assessment of these pre-generated AI contours. It was not a comparative effectiveness study designed to measure the improvement in human reader performance when assisted by AI versus unassisted human performance (e.g., human-only contouring vs. AI-assisted human contouring). The study aimed to show that the AI's output is acceptable for human review and correction, not how much faster or more accurate humans become with the AI.
6. Standalone (Algorithm Only Without Human-in-the-Loop Performance)
- Was it done? Yes, the "Non-Clinical Testing" or "Bench Testing" section directly assesses the algorithm's standalone performance using DSC and HD95 metrics, comparing its output to expert-generated ground truth. The algorithm generates the initial contours, which are then evaluated for accuracy against the established ground truth.
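The standalone pass/fail logic, with its size-dependent DSC thresholds from the acceptance-criteria table, can be sketched like this (a simplified illustration using flat binary lists; the category assignments and helper names are assumptions, not from the document):

```python
# DSC acceptance thresholds by organ-size category (from the criteria table).
DSC_THRESHOLDS = {"small": 0.50, "medium": 0.65, "large": 0.80}

def dice(pred, gt):
    """DICE coefficient for two equal-length binary (0/1) sequences."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    return 2.0 * inter / (sum(pred) + sum(gt))

def passes(pred_mask, gt_mask, size_category):
    """Standalone check: does the algorithm's mask meet the DSC
    criterion for its organ-size category?"""
    return dice(pred_mask, gt_mask) >= DSC_THRESHOLDS[size_category]
```

The same predicted mask can pass for a small-organ criterion yet fail the large-organ one, which is why the criteria are stratified by organ size.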
7. Type of Ground Truth Used
- Non-Clinical/Bench Testing: Expert consensus (manual contours by trained operators, revised and approved by three board-certified radiation oncologists).
- Clinical/Reader Study: Expert opinion/assessment (Likert scores provided by three board-certified radiation oncologists on the adequacy of the AI-generated contours).
8. Sample Size for the Training Set
- The sample size for the training set is not explicitly provided in the document. It only states that the test data cases (27 head/neck, 23 pelvis) were "separated from the development data cohorts before the models were trained."
9. How the Ground Truth for the Training Set was Established
- The method for establishing ground truth for the training set is not explicitly detailed. It can be inferred that it followed a similar process to the test set ground truth (manual contouring by trained operators, potentially reviewed by experts), as it mentions "development data cohorts," but the specifics are absent.