Search Results
Found 3 results
510(k) Data Aggregation
MR Contour DL
(189 days)
MR Contour DL generates a Radiotherapy Structure Set (RTSS) DICOM with segmented organs at risk which can be used by trained medical professionals. It is intended to aid in radiation therapy planning by generating initial contours to accelerate workflow for radiation therapy planning. It is the responsibility of the user to verify the processed output contours and user-defined labels for each organ at risk and correct the contours/labels as needed. MR Contour DL is intended to be used with images acquired on MR scanners, in adult patients.
MR Contour DL is a post-processing application intended to assist a clinician by generating contours of organs at risk (OARs) from MR images in the form of a DICOM Radiotherapy Structure Set (RTSS) series. MR Contour DL is designed to automatically contour organs in the head/neck and in the pelvis for Radiation Therapy (RT) planning of adult cases. The output of MR Contour DL is intended to be used by RT practitioners after they review the contours, edit them if necessary, and confirm their accuracy for use in radiation therapy planning.
MR Contour DL uses customizable input parameters that define RTSS description, RTSS labeling, organ naming and coloring. MR Contour DL does not have a user interface of its own and can be integrated with other software and hardware platforms. MR Contour DL has the capability to transfer the input and output series to the customer desired DICOM destination(s) for review.
MR Contour DL uses deep learning segmentation algorithms that have been designed and trained specifically for the task of generating organ at risk contours from MR images. MR Contour DL is designed to contour 37 different organs or structures using the deep learning algorithms in the application processing workflow.
The input of the application is MR DICOM images in adult patients acquired from compatible MR scanners. In the user-configured profile, the user has the flexibility to choose both the covered anatomy of input scan and the specific organs for segmentation. The proposed device has been tested on GE HealthCare MR data.
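As an illustration of this output format only (a sketch assuming the open-source pydicom library and a hypothetical file path, neither of which is mentioned in the document), the structures in a generated RTSS file could be listed for review like this:

```python
import pydicom

# Load an RT Structure Set (RTSS) series produced by an auto-contouring
# tool; the file path here is hypothetical.
ds = pydicom.dcmread("rtss_output.dcm")

# Each item in StructureSetROISequence names one contoured structure,
# e.g. an organ at risk such as "Brainstem" or "Bladder".
for roi in ds.StructureSetROISequence:
    print(roi.ROINumber, roi.ROIName)
```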
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) clearance letter for MR Contour DL:
1. Table of Acceptance Criteria and Reported Device Performance
Device: MR Contour DL
| Metric | Organ Anatomy Region | Acceptance Criteria | Reported Performance (Mean) | Outcome |
|---|---|---|---|---|
| DICE Similarity Coefficient (DSC) | Small Organs (e.g., chiasm, inner-ear) | ≥ 50% | 67.4% - 98.8% (across all organs) | Met |
| DICE Similarity Coefficient (DSC) | Medium Organs (e.g., brainstem, eye) | ≥ 65% | 79.6% - 95.5% (across relevant organs) | Met |
| DICE Similarity Coefficient (DSC) | Large Organs (e.g., bladder, head-body) | ≥ 80% | 90.3% - 99.3% (across relevant organs) | Met |
| 95th percentile Hausdorff Distance (HD95) Comparison | All Organs | Improved or Equivalent to Predicate Device | Improved or Equivalent in 24/28 organs analyzed; average HD95 of 4.7 mm (< predicate average) | Met |
| Likert Score (Reader Study) | All Organs | Mean Likert Score ≥ 3.0 (where 3 = good, some correction needed) | 3.0 - 4.5 (across all organs) | Met |
Note: The HD95 values for specific organs are provided in Table 4 of the document, showing individual comparisons (Improved, Not-Improved, Equivalent, N/A). The overall performance for HD95 is summarized as met based on the text "improved or equivalent HD95 value in 24/28 of the organs analyzed and an average HD95 performance of 4.7 mm, which is smaller than the average corresponding HD95 values of the predicate device."
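Both metrics are standard segmentation measures. As a minimal sketch of how DSC and HD95 are typically computed from binary masks (NumPy/SciPy; illustrative only, not the submitter's evaluation code):

```python
import numpy as np
from scipy import ndimage

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric surface distance (HD95), in the units
    of `spacing` (e.g. mm per voxel along each axis)."""
    a, b = a.astype(bool), b.astype(bool)
    surf_a = a ^ ndimage.binary_erosion(a)  # surface voxels of each mask
    surf_b = b ^ ndimage.binary_erosion(b)
    # Distance from every voxel to the nearest surface voxel of each mask.
    dist_to_a = ndimage.distance_transform_edt(~surf_a, sampling=spacing)
    dist_to_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing)
    d = np.concatenate([dist_to_b[surf_a], dist_to_a[surf_b]])
    return np.percentile(d, 95)
```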
2. Sample Sizes and Data Provenance
- Test Set (Non-Clinical/Bench Testing):
- Total Cases: 105 retrospectively collected exams.
- Head/Neck: 50 cases (23 from independently collected cohorts, 27 separated from development data)
- Pelvis: 55 cases (32 from independently collected cohorts, 23 separated from development data)
- Data Provenance:
- Country of Origin: USA (Head/Neck: 72%, Pelvis: 58%) and Europe (Netherlands: 28% of Head/Neck; United Kingdom: 42% of Pelvis)
- Retrospective/Prospective: Retrospectively collected
- Test Set (Clinical/Reader Study):
- Total Cases: 70 cases (a subset of the non-clinical test data).
- Head/Neck: 30 cases
- Pelvis: 40 cases
- Data Provenance: Same as non-clinical testing, as it was a subset.
- Training Set: Not explicitly stated. The document mentions "separated from the development data cohorts before the models were trained," implying a training set existed but its size is not given.
3. Number of Experts and Qualifications for Ground Truth (Test Set)
- Number of Experts: Three (3) board-certified radiation oncologists.
- Qualifications: All three were board-certified radiation oncologists; two (2) from the USA and one (1) from Europe. Experience level (e.g., years in practice) is not specified beyond board certification.
4. Adjudication Method (Test Set)
- Non-Clinical/Bench Testing (Ground Truth Generation):
- Manual contours delineated by GEHC operators trained using international guidelines (DAHANCA, RTOG).
- Manual contours were revised (corrected and approved) by the three board-certified radiation oncologists.
- All three independently validated ground-truth contours were incorporated in the performance evaluation. This suggests a consensus or combination approach, but the exact adjudication scheme (e.g., 2+1, averaging) is not detailed; the final ground truth appears to derive from the combination of all three expert reviews.
- Clinical/Reader Study:
- Automated contours were scored by the three certified radiation oncologists.
- Readers completed their assessments independently and were blinded to the results of other readers' assessments.
- All three independently provided Likert scores were incorporated in the performance evaluation. As with ground-truth generation, the method of combining scores is not specified beyond "incorporated"; the final reported value is a mean Likert score per organ.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done? Yes, a multi-reader study was conducted to assess the adequacy of the contours. The readers were board-certified radiation oncologists who provided Likert-scale assessments of the AI-generated contours.
- Effect Size of Human Readers' Improvement with AI vs. without AI Assistance: Not measured. The study evaluated whether pre-generated AI contours were adequate for use in RT planning after human review and correction; it was not a comparative effectiveness study of human-only contouring vs. AI-assisted contouring, so no effect size for reader improvement (in speed or accuracy) is reported.
6. Standalone Performance (Algorithm Only, Without Human-in-the-Loop)
- Was it done? Yes, the "Non-Clinical Testing" or "Bench Testing" section directly assesses the algorithm's standalone performance using DSC and HD95 metrics, comparing its output to expert-generated ground truth. The algorithm generates the initial contours, which are then evaluated for accuracy against the established ground truth.
7. Type of Ground Truth Used
- Non-Clinical/Bench Testing: Expert consensus (manual contours by trained operators, revised and approved by three board-certified radiation oncologists).
- Clinical/Reader Study: Expert opinion/assessment (Likert scores provided by three board-certified radiation oncologists on the adequacy of the AI-generated contours).
8. Sample Size for the Training Set
- The sample size for the training set is not explicitly provided in the document. It only states that the test data cases (27 head/neck, 23 pelvis) were "separated from the development data cohorts before the models were trained."
9. How the Ground Truth for the Training Set was Established
- The method for establishing ground truth for the training set is not explicitly detailed. It can be inferred that it followed a similar process to the test set ground truth (manual contouring by trained operators, potentially reviewed by experts), as it mentions "development data cohorts," but the specifics are absent.
EFAI RTSuite CT HCAP-Segmentation System (EFAI HCAPSeg)
(87 days)
EFAI HCAPSeg is a software device intended to assist trained radiation oncology professionals, including, but not limited to, radiation oncologists, medical physicists, and dosimetrists, during their clinical workflows of radiation therapy treatment planning by providing initial contours of organs at risk on non-contrast CT images. EFAI HCAPSeg is intended to be used on adult patients only.
The contours are generated by deep-learning algorithms and then transferred to radiation therapy treatment planning systems. EFAI HCAPSeg must be used in conjunction with a DICOM-compliant treatment planning system to review and edit results generated. EFAI HCAPSeg is not intended to be used for decision making or to detect lesions.
EFAI HCAPSeg is an adjunct tool and is not intended to replace a clinician's judgment and manual contouring of the normal organs on CT. Clinicians must not use the software generated output alone without review as the primary interpretation.
EFAI RTSuite CT HCAP-Segmentation System, herein referred to as EFAI HCAPSeg, is a standalone software that is designed to be used by trained radiation oncology professionals to automatically delineate organs-at-risk (OARs) on CT images. This auto-contouring of OARs is intended to facilitate radiation therapy workflows.
The device receives CT images in DICOM format as input and automatically generates the contours of OARs, which are stored in DICOM format and in RTSTRUCT modality. The device does not offer a user interface and must be used in conjunction with a DICOM-compliant treatment planning system to review and edit results. Once data is routed to EFAI HCAPSeg, the data will be processed and no user interaction is required, nor provided.
The deployment environment is recommended to be a local network with an existing hospital-grade IT system in place. EFAI HCAPSeg should be installed on a specialized server supporting deep-learning processing. The following configurations are operated only by the manufacturer:
- Local network setting of input and output destinations;
- Presentation of labels and their color;
- Processed image management and output (RTSTRUCT) file management.
Here's a detailed breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
| Acceptance Criteria Category | Specific Criteria | Reported Device Performance (EFAI HCAPSeg) | Statistical Result (p-value) |
|---|---|---|---|
| OARs Present in Both EFAI HCAPSeg and Comparison Device | The mean Dice Coefficient (DSC) of OARs for each body part (Head & Neck, Chest, Abdomen & Pelvis) should be non-inferior to that of the comparison device, with a pre-specified margin (a sketch of such a test follows this table). | Overall Mean DSC: 0.83; per body part, EFAI HCAPSeg vs. comparison device: Head & Neck 0.80 vs. 0.75, Chest 0.90 vs. 0.84, Abdomen & Pelvis 0.90 vs. 0.82 | <.0001 (for all three body parts, indicating non-inferiority) |
| OARs Unique to EFAI HCAPSeg | The mean DSC of unique OARs should be superior to a pre-specified value. | Mean DSC: 0.82 | <.0001 (indicating superiority to pre-specified value) |
| Individual OAR DSC Performance | For OARs present in both devices, the benchmark is non-inferiority to the comparison device; for unique OARs, the benchmark DSC is 0.80/0.65/0.50 for large/medium/small volume structures. | Detailed in Table G of the original document. EFAI HCAPSeg met or exceeded its referenced performance for all individual OARs where a comparison or benchmark was set (e.g., A_Aorta: Mean DSC 0.86 vs. a referenced performance of 0.26, presumably from the comparison device; Brain: 0.99 vs. 0.86). | No per-OAR p-values are given in the summary tables; the overall group p-values and the statement that EFAI HCAPSeg "successfully met the primary endpoint across all body parts" cover these comparisons. |
| Overall Median 95% Hausdorff Distance (HD) | No explicit acceptance criterion stated as a specific value, but performance is compared against the comparison device. | Overall Median 95% HD: 2.23 mm | N/A (Overall median is reported, not directly compared to a specific acceptance value in this summary) |
| 95% HD for OARs Present in Both EFAI HCAPSeg and Comparison Device | Not stated as a numerical threshold in the same way as DSC, but the statistical results demonstrate significant improvement. | Median 95% HD, Head & Neck: 2.17 mm (vs. 3.09 mm for comparison device); Chest: 2.23 mm (vs. 3.87 mm); Abdomen & Pelvis: 2.28 mm (vs. 3.90 mm) | <.0001 (for all three body parts, indicating superiority) |
| 95% HD for OARs Unique to EFAI HCAPSeg | No explicit acceptance criterion stated, but the performance is presented. | Median 95% HD: 2.24 mm | <.0001 (indicating superiority to pre-specified performance for unique OARs, likely using an implicit benchmark similar to DSC) |
| Performance Across Subgroups | Device should consistently and reliably perform under varying gender, age group, CT manufacturer, and CT slice thickness. | Consistently High Performance (DSC: 0.82-0.89; HD: 2.20-2.30 mm) across all tested subgroups (Gender: Female, Male; Age: 18-49, 50-69, ≥70; CT Manufacturer: GE, Philips, Siemens, Others; CT Slice Thickness: 0.5-3.0 mm, 3.1-5.0 mm). | N/A (Subgroup analyses demonstrate consistency rather than direct pass/fail criteria) |
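The primary endpoint above is a non-inferiority test on mean DSC with a pre-specified margin. The document does not disclose the test statistic or the margin; a minimal sketch of one common approach (a paired one-sided t-test, with a purely hypothetical margin of 0.05):

```python
import numpy as np
from scipy import stats

def noninferiority_p(dsc_new, dsc_ref, margin=0.05):
    """One-sided paired test of H0: mean(new - ref) <= -margin against
    H1: mean(new - ref) > -margin. A small p-value supports
    non-inferiority of the new device within the margin."""
    diff = np.asarray(dsc_new) - np.asarray(dsc_ref)
    return stats.ttest_1samp(diff + margin, 0.0, alternative="greater").pvalue
```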
Study Details
- Sample Size and Data Provenance (Test Set):
- Sample Size: 157 non-contrast CT cases.
- Data Provenance: Consecutively collected from the United States (U.S.). All data was acquired independently of product-development training and internal testing; the summary does not state whether collection was retrospective or prospective.
- Demographics:
- 30.57% females, 57.96% males, 11.46% gender not reported.
- Mean age: 61.69 years (SD 11.90 years).
- CT Manufacturer: GE (43.31%), Philips (36.30%), Siemens (9.55%), Toshiba (2.55%), PNS (0.64%), not reported (7.64%).
- Location, Race, and Ethnic distribution: Unavailable.
- Cancer types: Head-and-neck, pancreas, colorectal, breast, bladder, prostate, stomach, gynecologic, sarcoma, and metastatic tumors from multiple origins.
- Number of Experts and Qualifications (Ground Truth for Test Set):
- Number of Experts: Three (3) board-certified radiation oncologists.
- Qualifications: "Board-certified radiation oncologists." (Specific years of experience are not mentioned, but board certification implies a high level of expertise). Adjudication method is described next.
- Adjudication Method (Test Set):
- The OAR contouring for the ground truth was generated by "three board-certified radiation oncologists as the ground truth (GT)." The text does not specify a formal adjudication method (e.g., 2+1 or 3+1); it states only that the oncologists' combined contouring constituted the GT.
- Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:
- No. An MRMC comparative effectiveness study focusing on human readers improving with AI vs. without AI assistance was not described.
- The study performed a standalone performance test comparing the algorithm's performance (EFAI HCAPSeg) against a "comparison device" (another algorithm/device, not human readers). The "comparison device" results are used as a benchmark for EFAI HCAPSeg's performance.
- Standalone Performance Study (Algorithm Only):
- Yes, a standalone performance test was done.
- Method: The EFAI HCAPSeg's OAR contouring capabilities were compared against a "comparison device."
- Results (EFAI HCAPSeg vs. Comparison Device, Mean DSC):
- Head & Neck: 0.80 (EFAI HCAPSeg) vs. 0.75 (Comparison Device)
- Chest: 0.90 (EFAI HCAPSeg) vs. 0.84 (Comparison Device)
- Abdomen & Pelvis: 0.90 (EFAI HCAPSeg) vs. 0.82 (Comparison Device)
- Unique OARs of EFAI HCAPSeg: 0.82 (compared against a pre-specified value, not the comparison device)
- Results (EFAI HCAPSeg vs. Comparison Device, Median 95% HD mm):
- Head & Neck: 2.17 (EFAI HCAPSeg) vs. 3.09 (Comparison Device)
- Chest: 2.23 (EFAI HCAPSeg) vs. 3.87 (Comparison Device)
- Abdomen & Pelvis: 2.28 (EFAI HCAPSeg) vs. 3.90 (Comparison Device)
- Unique OARs of EFAI HCAPSeg: 2.24 (compared against a pre-specified value)
- Type of Ground Truth Used (Test Set):
- Expert Consensus: The ground truth was established by "three board-certified radiation oncologists" who manually contoured each Organ-at-Risk (OAR).
- Sample Size for Training Set:
- Total Cases: 1,410 adult cases.
- Number of Images: Varies significantly by OAR, ranging from hundreds to over 100,000 images per OAR (e.g., SpinalCord had 139,337 images). (Refer to Table C for detailed counts per OAR).
- How the Ground Truth for the Training Set Was Established:
- The text states, "The process of ground-truthing, involving the manual contouring of each OAR, was undertaken by three board-certified radiation oncologists." It further adds, "The data collection and ground truth protocol was done following the identical procedures as those of the predicate device." While not explicitly stated for the training set ground truth, it is highly implied that the same process (manual contouring by three board-certified radiation oncologists) was used for both training and testing datasets for consistency, especially given the statement about identical procedures to the predicate device. The demographic information in Table B covers both training and testing datasets, reinforcing this.
ClearCheck Model RADCC V2
(175 days)
ClearCheck is intended to assist radiation therapy professionals in generating and assessing the quality of radiotherapy treatment plans. ClearCheck is also intended to assist radiation treatment planners in predicting when a treatment plan might result in a collision between the treatment machine and the patient or support structures.
The ClearCheck Model RADCC V2 device is software that uses treatment data, image data, and structure set data obtained from supported Treatment Planning System and Application Programming Interfaces to present radiotherapy treatment plans in a user-friendly way for user approval of the treatment plan. The ClearCheck device (Model RADCC V2) is also intended to assist users to identify where collisions between the treatment machine and the patient or support structures may occur in a treatment plan.
It is designed to run on Windows Operating Systems. ClearCheck Model RADCC V2 performs calculations on the incoming supported treatment data. Supported Treatment Planning Systems are used by trained medical professionals to simulate radiation therapy treatments for malignant or benign diseases.
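The collision-prediction feature is described only at the intended-use level. As a purely illustrative toy sketch of the general idea (not ClearCheck's actual method; the axis convention and the 400 mm bore radius are assumptions), a minimal gantry-bore clearance check could look like:

```python
import numpy as np

def bore_clearance_ok(surface_points_mm, isocenter_mm, bore_radius_mm=400.0):
    """Toy clearance check: flag any patient/support surface point whose
    radial distance from the gantry rotation axis exceeds the bore radius.
    Assumes the rotation axis runs along y through the isocenter; a real
    collision model would use the full machine and couch geometry."""
    p = np.asarray(surface_points_mm, float) - np.asarray(isocenter_mm, float)
    radial = np.hypot(p[:, 0], p[:, 2])  # distance in the gantry plane
    return bool(np.all(radial < bore_radius_mm))
```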
The provided text describes the acceptance criteria and study for the ClearCheck Model RADCC V2 device, which assists radiation therapy professionals in generating and assessing treatment plans, including predicting potential collisions.
Here's a breakdown of the requested information:
1. Table of Acceptance Criteria and Reported Device Performance
| Acceptance Criteria | Reported Device Performance |
|---|---|
| BED / EQD2 Functionality | |
| Passing criterion for dose-type constraints: ≤ 0.5% difference vs. hand calculations using the well-known BED/EQD2 formulas (sketched after this table) | All dose-type constraint test cases passed. |
| Passing criterion for volume-type constraints: ≤ 3% difference vs. the same hand calculations | All volume-type constraint test cases passed. |
| Deformed Dose Functionality | |
| Qualitative DVH analysis | Good agreement for all cases compared to known dose deformations. |
| Passing criterion for quantitative Dmax and Dmin differences: within ±3% of known dose deformations | All deformed-dose test cases passed. |
| Overall Verification & Validation Testing | All test cases for BED/EQD2 and Deformed Dose functionalities passed. Overall software verification tests were performed to ensure intended functionality, and pass/fail criteria were used to verify requirements. |
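The "well-known BED/EQD2 formulas" referenced above are the standard radiobiology relations BED = n·d·(1 + d/(α/β)) and EQD2 = BED / (1 + 2/(α/β)), for n fractions of dose d. A minimal sketch of the hand calculation (illustrative only, not ClearCheck's code):

```python
def bed(n, d, alpha_beta):
    """Biologically effective dose: BED = n*d*(1 + d/(alpha/beta))."""
    return n * d * (1 + d / alpha_beta)

def eqd2(n, d, alpha_beta):
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2/(alpha/beta))."""
    return bed(n, d, alpha_beta) / (1 + 2 / alpha_beta)

# Example: 20 fractions of 3 Gy to a tissue with alpha/beta = 3 Gy
print(bed(20, 3.0, 3.0))   # 120.0 (Gy)
print(eqd2(20, 3.0, 3.0))  # 72.0 (Gy)
```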
2. Sample Size Used for the Test Set and Data Provenance
- Test Set Sample Size: The document does not explicitly state a specific numerical sample size for the test set used for the BED/EQD2 and Deformed Dose functionality validation. It mentions "all cases" for Deformed Dose and "a plan and plan sum" for BED/EQD2. This implies testing was done on an unspecified number of representative cases, but not a statistically powered cohort.
- Data Provenance: Not specified in the provided text. It does not mention the country of origin or whether the data was retrospective or prospective.
3. Number of Experts and Qualifications for Ground Truth (Test Set)
- The document does not mention the use of experts to establish ground truth for the test set.
- For BED/EQD2, the ground truth was established by "values calculated by hand using the well-known BED / EQD2 formulas."
- For Deformed Dose, the ground truth was established by "known dose deformations."
4. Adjudication Method for the Test Set
- Not applicable as there is no mention of expert review or adjudication for the test set. Ground truth was established by calculation or "known" deformations.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No, a Multi Reader Multi Case (MRMC) comparative effectiveness study was not performed. The document explicitly states: "no clinical trials were performed for ClearCheck Model RADCC V2." The device is intended to "assist radiation therapy professionals," but its impact on human reader performance was not evaluated through a clinical study.
6. Standalone Performance (Algorithm Only, Without Human-in-the-Loop)
- Yes, performance evaluations for the de novo functionalities (BED/EQD2 and Deformed Dose) appear to be standalone algorithm performance assessments. The device's calculations were compared against established mathematical formulas (BED/EQD2) or known deformations (Deformed Dose) without human intervention in the evaluation process.
7. Type of Ground Truth Used
- BED/EQD2: Ground truth was based on "values calculated by hand using the well-known BED / EQD2 formulas." This is a computational/mathematical ground truth.
- Deformed Dose: Ground truth was based on "known dose deformations." This implies a physically or computationally derived ground truth where the expected deformation results were already established.
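For context on the DVH analysis and the Dmax/Dmin comparison above, a minimal sketch of how a cumulative DVH and Dmax/Dmin percent differences might be computed from a dose grid and a structure mask (illustrative only; the document does not describe ClearCheck's implementation):

```python
import numpy as np

def cumulative_dvh(dose, mask, n_bins=200):
    """Cumulative DVH: fraction of structure volume receiving at least
    each dose level."""
    d = dose[mask]
    levels = np.linspace(0.0, d.max(), n_bins)
    return levels, np.array([(d >= lv).mean() for lv in levels])

def dmax_dmin_pct_diff(dose_a, dose_b, mask):
    """Percent differences in Dmax and Dmin between two dose grids over
    the same structure (e.g., deformed dose vs. known deformation)."""
    a, b = dose_a[mask], dose_b[mask]
    return (100.0 * (a.max() - b.max()) / b.max(),
            100.0 * (a.min() - b.min()) / b.min())
```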
8. Sample Size for the Training Set
- The document does not specify a sample size for the training set. It primarily focuses on the validation of new features against calculated or known results, rather than reporting on a machine learning model's training data.
9. How the Ground Truth for the Training Set Was Established
- The document does not provide information on how the ground truth for any training set was established. Given the nature of the device (software for calculations and collision prediction, building on predicate devices), it's possible that analytical methods and established physics/dosimetry principles form the basis, rather than a large labeled training dataset in the typical machine learning sense for image interpretation.