Search Results
Found 30 results
510(k) Data Aggregation
(54 days)
QDQ
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
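As context for the DICOM packaging step described above, here is a minimal, hypothetical pydicom sketch (assuming pydicom 2.x) of wrapping an 8-bit summary image in a Secondary Capture object. It illustrates the SC mechanism only; it is not Saige-Dx's actual implementation, and all patient and UID values are placeholders.

```python
# Hypothetical sketch: wrap an 8-bit grayscale summary image in a minimal
# DICOM Secondary Capture (SC) dataset. Not Saige-Dx's implementation.
import numpy as np
from pydicom.dataset import Dataset, FileMetaDataset
from pydicom.uid import (ExplicitVRLittleEndian, SecondaryCaptureImageStorage,
                         generate_uid)

def results_to_secondary_capture(summary_image: np.ndarray, path: str) -> None:
    meta = FileMetaDataset()
    meta.MediaStorageSOPClassUID = SecondaryCaptureImageStorage
    meta.MediaStorageSOPInstanceUID = generate_uid()
    meta.TransferSyntaxUID = ExplicitVRLittleEndian

    ds = Dataset()
    ds.file_meta = meta
    ds.is_little_endian = True            # matches ExplicitVRLittleEndian
    ds.is_implicit_VR = False
    ds.SOPClassUID = SecondaryCaptureImageStorage
    ds.SOPInstanceUID = meta.MediaStorageSOPInstanceUID
    ds.StudyInstanceUID = generate_uid()  # in practice, copied from the source DBT study
    ds.SeriesInstanceUID = generate_uid()
    ds.Modality = "OT"                    # "other" is conventional for SC objects
    ds.PatientName = "Anon^Example"       # placeholder demographics
    ds.PatientID = "000000"
    ds.SamplesPerPixel = 1
    ds.PhotometricInterpretation = "MONOCHROME2"
    ds.Rows, ds.Columns = summary_image.shape
    ds.BitsAllocated = 8
    ds.BitsStored = 8
    ds.HighBit = 7
    ds.PixelRepresentation = 0
    ds.PixelData = summary_image.astype(np.uint8).tobytes()
    ds.save_as(path, write_like_original=False)

results_to_secondary_capture(np.zeros((512, 512), dtype=np.uint8), "saige_summary_sc.dcm")
```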
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) clearance letter for Saige-Dx:
1. Table of Acceptance Criteria and Reported Device Performance
The provided document indicates that the primary endpoint of the standalone performance testing was to demonstrate non-inferiority of the subject device (new Saige-Dx version) to the predicate device (previous Saige-Dx version). Specific quantitative acceptance criteria (e.g., AUC, sensitivity, specificity thresholds) are not explicitly stated in the provided text. However, the document states:
"The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams."
Acceptance Criteria (Not explicitly quantified in source) | Reported Device Performance |
---|---|
Non-inferiority of subject device performance to predicate device performance. | "The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams." |
Performance across breast densities, ages, race/ethnicities, and lesion types and sizes. | Subgroup analyses "demonstrated similar standalone performance trends across breast densities, ages, race/ethnicities, and lesion types and sizes." |
Software design and implementation meeting requirements. | Verification testing including unit, integration, system, and regression testing confirmed "the software, as designed and implemented, satisfied the software requirements and has no unintentional differences from the predicate device." |
2. Sample Size for the Test Set and Data Provenance
- Sample Size for Test Set: 2,002 DBT screening mammograms from unique women.
- 259 cancer cases
- 1,743 non-cancer cases
- Data Provenance:
- Country of Origin: United States (cases collected from 12 diverse clinical sites).
- Retrospective or Prospective: Retrospective.
- Acquisition Equipment: Hologic (standard definition and high definition) and GE images.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
The document mentions: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)."
- While the specific number and qualifications of experts for the ground truth of the current test set are not explicitly detailed in this document, it refers back to K220105 for those details. It implies that a standardized process involving experts was used.
4. Adjudication Method for the Test Set
The document does not explicitly describe the adjudication method (e.g., 2+1, 3+1) used for establishing ground truth for the test set. It states: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)." This suggests a pre-defined and presumably robust method for ground truth establishment.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done? Yes.
- Effect Size: The document states: "a multi-reader multi-case (MRMC) study was previously conducted for the predicate device and remains applicable to the subject device." It does not provide details on the effect size (how much human readers improve with AI vs. without AI assistance) within this document. Readers would need to refer to the K220105 submission for that information if it was presented there.
6. Standalone (Algorithm Only) Performance Study
- Was it done? Yes.
- Description: "Validation of the software was conducted using a retrospective and blinded multicenter standalone performance testing under an IRB approved protocol..."
- Primary Endpoint: "to demonstrate that the performance of the subject device was non-inferior to the performance of the predicate device."
7. Type of Ground Truth Used
- The ground truth involved the presence or absence of cancer, with cases categorized as 259 cancer and 1,743 non-cancer. The mention of "ground truth lesion localization processes" implies a detailed assessment of findings, likely involving expert consensus and/or pathology/biopsy results to confirm malignancy. Given it's a diagnostic aid for cancer, pathology is the gold standard for confirmation.
8. Sample Size for the Training Set
- Training Dataset: 161,323 patients and 300,439 studies.
9. How the Ground Truth for the Training Set Was Established
- The document states: "The Saige-Dx algorithm was trained on a robust and diverse dataset of mammography exams acquired from multiple vendors including GE and Hologic equipment."
- The document does not explicitly detail how ground truth was established for the training set (e.g., expert consensus, pathology reports). As with the test set, it is highly probable that the training labels were derived from rigorous clinical assessments, including follow-up, biopsy results, and/or expert interpretations, to accurately label cancer and non-cancer cases for the algorithm to learn from. The description of the training data as "robust and diverse" suggests a comprehensive approach to ground truth.
(279 days)
QDQ
Genius AI Detection is a computer-aided detection and diagnosis (CADe/CADx) software device intended to be used with compatible digital breast tomosynthesis (DBT) systems to identify and mark regions of interest including soft tissue densities (masses, architectural distortions and asymmetries) and calcifications in DBT exams from compatible DBT systems and provide confidence scores that offer assessment for Certainty of Findings and a Case Score.
The device intends to aid in the interpretation of digital breast tomosynthesis exams in a concurrent fashion, where the interpreting physician confirms or dismisses the findings during the reading of the exam.
Genius AI Detection 2.0 is a software device intended to identify potential abnormalities in breast tomosynthesis images. Genius AI Detection 2.0 analyzes each standard mammographic view in a digital breast tomosynthesis examination using deep learning networks. For each detected lesion, Genius AI Detection 2.0 produces CAD results that include:
- the location of the lesion;
- an outline of the lesion;
- a confidence score for the lesion.

Genius AI Detection 2.0 also produces a case score for the entire breast tomosynthesis exam.
Genius AI Detection 2.0 packages all CAD findings derived from the corresponding analysis of a tomosynthesis exam into a DICOM Mammography CAD SR object and distributes it for display on DICOM compliant review workstations. The interpreting physician will have access to the CAD findings concurrently to the reading of the tomosynthesis exam. In addition, a combination of peripheral information such as number of marks and case scores may be used on the review workstation to enhance the interpreting physician's workflow by offering a better organization of the patient worklist.
Here's a breakdown of the acceptance criteria and study details for Genius AI Detection 2.0, based on the provided FDA 510(k) clearance letter:
Acceptance Criteria and Device Performance for Genius AI Detection 2.0
1. Table of Acceptance Criteria and Reported Device Performance
The provided document describes a non-inferiority study to demonstrate that the performance of Genius AI Detection 2.0 on Envision (ENV) images is equivalent to its performance on the predicate's Standard of Care (SOC) images (Hologic's Selenia Dimensions systems). The primary acceptance criterion was non-inferiority of the Area Under the Curve (AUC) of the ROC curve, with a 5% margin. Secondary metrics included sensitivity, specificity, and false marker rate per view.
Acceptance Criteria Category | Specific Metric | Predicate Device Performance (SOC Images) | Subject Device Performance (ENV Images) | Acceptance Criteria Met? |
---|---|---|---|---|
Primary Endpoint (Non-Inferiority) | AUC of ROC Curve (ENV − SOC) | N/A (comparison study) | -0.0017 (95% CI: -0.023 to 0.020) | Yes (p-value for difference = 0.87, indicating no significant difference, and within the 5% non-inferiority margin) |
Secondary Metrics | Sensitivity | N/A (comparison study) | No significant difference reported between modalities | Yes |
Secondary Metrics | Specificity | N/A (comparison study) | No significant difference reported between modalities | Yes |
Secondary Metrics | False Marker Rate per View | N/A (comparison study) | No significant difference reported between modalities | Yes |
CC-MLO Correlation | Accuracy on Malignant Lesions | N/A | 90% | Yes (considered accurate) |
CC-MLO Correlation | Accuracy on Negative Cases (correlated pairs) | N/A | 73% | Yes (considered accurate) |
Implant Cases | Location-specific cancer detection sensitivity | N/A | 76% (CI: 68%-84%) | Yes (considered acceptable based on confidence intervals) |
Implant Cases | Specificity | N/A | 67% (CI: 62%-72%) | Yes (considered acceptable based on confidence intervals) |
(Note: The document focuses on demonstrating equivalence to the predicate's performance on a new platform rather than absolute performance against a fixed threshold for all metrics, except for the implant case where specific CIs are given and deemed acceptable.)
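The primary endpoint above is a non-inferiority comparison of AUCs with a 5% margin. As a rough illustration of how such an endpoint can be evaluated, the sketch below bootstraps a 95% CI for the paired AUC difference; this is a hedged example, not the statistical plan actually used in the submission (which is not disclosed).

```python
# Sketch: bootstrap CI for the AUC difference between two readings of the
# same cases, then check the lower bound against a non-inferiority margin.
import numpy as np
from sklearn.metrics import roc_auc_score

def delta_auc_non_inferior(y, scores_new, scores_ref, margin=0.05,
                           n_boot=2000, seed=0):
    y = np.asarray(y)
    scores_new, scores_ref = np.asarray(scores_new), np.asarray(scores_ref)
    rng = np.random.default_rng(seed)
    deltas, n = [], len(y)
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, n)          # resample cases with replacement
        if len(np.unique(y[idx])) < 2:       # AUC needs both classes present
            continue
        deltas.append(roc_auc_score(y[idx], scores_new[idx]) -
                      roc_auc_score(y[idx], scores_ref[idx]))
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    return lo > -margin, (lo, hi)            # non-inferior if lower bound > -margin
```

With the numbers above (Δ = -0.0017, 95% CI -0.023 to 0.020), the lower bound of -0.023 sits well inside a 5% margin, consistent with the "criteria met" conclusion.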
2. Sample Size Used for the Test Set and Data Provenance
- Sample Size (Main Comparison Study): 1475 subjects
- 200 biopsy-proven cancer subjects
- 275 biopsy-proven benign subjects
- 78 BI-RADS 3 subjects (considered BI-RADS 1 or 2 upon diagnostic workup)
- 922 BI-RADS 1 and 2 subjects (at screening)
- Implant Case Test Set: 480 subjects
- 132 biopsy-proven cancer subjects
- 348 negative subjects (119 biopsy-proven benign, 229 screening negative)
- Data Provenance:
- Country of Origin: Not explicitly stated, but collected from a "national multi-center breast imaging network" within the U.S., implying U.S. origin.
- Retrospective or Prospective: The main comparison study data was collected for evaluating the safety and effectiveness of the Envision platform, with an IRB approved protocol. This suggests a retrospective study design, where existing images were gathered for evaluation. The implant cases were collected between 2015 and 2022, also indicating a retrospective approach.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications
- Number of Experts: Two
- Qualifications: Both were MQSA-certified radiologists with over 20 years of experience.
4. Adjudication Method for the Test Set
The document explicitly states that the "ground truthing to evaluate performance metrics including the locations of cancer lesions was done by two MQSA-certified radiologists with over 20 years of experience."
- Adjudication Method: It does not specify a particular adjudication method (e.g., 2+1, 3+1). It simply states that ground truthing was done by two experts. This implies either consensus was reached between the two, or potentially an unstated arbitration method if they disagreed, or that their individual findings were used for analysis. Given the phrasing, expert consensus is the most likely implied method, but not explicitly detailed.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done
- No, an MRMC comparative effectiveness study was NOT done. The study described is a standalone performance comparison of the AI algorithm on images from different modalities (Envision vs. Standard of Care), not a study involving human readers with and without AI assistance to measure effect size.
6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was done
- Yes, a standalone study WAS done. The document explicitly states, "A standalone study was conducted to compare the detection performance of FDA cleared Genius AI Detection 2.0 (K221449) using Standard of Care (SOC) images acquired on the Dimensions systems against images acquired on the FDA approved Envision Mammography Platform (P080003/S009)." This study evaluated the algorithm's performance (fROC, ROC, sensitivity, specificity, false marker rate) directly against the ground truth without human intervention.
7. The Type of Ground Truth Used
- Ground Truth Type: A combination of biopsy-proven cancer and biopsy-proven benign cases, along with BI-RADS diagnostic outcomes (for negative cases). For the cancer cases, the "locations of cancer lesions" were part of the ground truth.
8. The Sample Size for the Training Set
- Not provided. The document states that the test dataset was "sequestered from any training datasets by isolating it on a secured server with controlled access permissions" and that the data for implant cases was "sequestered from the training datasets for Genius AI Detection." However, the actual sample size of the training set is not mentioned.
9. How the Ground Truth for the Training Set Was Established
- Not provided. Since the training set sample size and details are not disclosed, the method for establishing its ground truth is also not mentioned in this document. It is generally assumed that similar rigorous methods (e.g., biopsy-proven truth, expert review) would have been used for training data, but this specific filing does not detail it.
(216 days)
QDQ
MammoScreen® 4 is a concurrent reading and reporting aid for physicians interpreting screening mammograms. It is intended for use with compatible full-field digital mammography and digital breast tomosynthesis systems. The device can also use compatible prior examinations in the analysis.
Output of the device includes graphical marks of findings as soft-tissue lesions or calcifications on mammograms along with their level of suspicion scores. The lesion type is characterized as mass/asymmetry, distortion, or calcifications for each detected finding. The level of suspicion score is expressed at the finding level, for each breast, and overall for the mammogram.
The location of findings, including quadrant, depth, and distance from the nipple, is also provided. This adjunctive information is intended to assist interpreting physicians during reporting.
Patient management decisions should not be made solely based on the analysis by MammoScreen 4.
MammoScreen 4 is a concurrent reading medical software device using artificial intelligence to assist radiologists in the interpretation of mammograms.
MammoScreen 4 processes the mammogram(s) and detects findings suspicious for breast cancer. Each detected finding gets a score called the MammoScreen Score™. The score was designed such that findings with a low score have a very low level of suspicion. As the score increases, so does the level of suspicion. For each mammogram, MammoScreen 4 outputs the detected findings with their associated score, a score per breast, driven by the highest finding score for each breast, and a score per case, driven by the highest finding score overall. The MammoScreen Score goes from one to ten.
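The aggregation rule just described (each breast score driven by the highest finding score, and the case score by the highest score overall) is simple to state in code. A minimal sketch with a hypothetical finding schema; only the max rule and the 1-10 scale come from the text:

```python
def aggregate_scores(findings):
    """findings: list of {"breast": "L" or "R", "score": 1..10} dicts (hypothetical schema)."""
    per_breast = {}
    for f in findings:
        # each breast score is the max of its finding scores
        per_breast[f["breast"]] = max(per_breast.get(f["breast"], 1), f["score"])
    # the case score is the highest finding score overall; the floor of 1 for a
    # breast with no findings is an assumption, not stated in the text
    case_score = max(per_breast.values(), default=1)
    return per_breast, case_score

# Example: two left-breast findings (scores 4 and 7) and one right (score 2)
per_breast, case = aggregate_scores([
    {"breast": "L", "score": 4}, {"breast": "L", "score": 7}, {"breast": "R", "score": 2},
])
# per_breast == {"L": 7, "R": 2}; case == 7
```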
MammoScreen 4 is available for 2D (FFDM images) and 3D processing (FFDM & DBT or 2DSM & DBT). Optionally, MammoScreen 4 can use prior examinations in the analysis.
The results indicating potential breast cancer, identified by MammoScreen 4, are accessible via a dedicated user interface and can seamlessly integrate into DICOM viewers (using DICOM-SC and DICOM-SR). Reporting aid outputs can be incorporated into the practice's reporting system to generate a preliminary report.
Note that the MammoScreen 4 outputs should be used as complementary information by radiologists while interpreting mammograms. For all cases, the medical professional interpreting the mammogram remains the sole decision-maker.
The provided text describes the acceptance criteria and a study to prove that MammoScreen® 4 meets these criteria. Here is a breakdown of the requested information:
Acceptance Criteria and Device Performance
1. Table of Acceptance Criteria and Reported Device Performance
Rationale for using "MammoScreen 2" data for comparison: The document states that the standalone testing for MammoScreen 4 compared its performance against "MammoScreen 2 on Dimension". While MammoScreen 3 is the predicate device, the provided performance data in the standalone test section specifically refers to MammoScreen 2. The PCCP section later references performance targets for MammoScreen versions 1, 2, and 3, but the actual "Primary endpoint" results for the current device validation are given in comparison to MammoScreen 2. Therefore, the table below uses the reported performance against MammoScreen 2 as per the "Primary endpoint" section.
Metric | Acceptance Criteria | Reported Device Performance (MammoScreen 4 vs. MammoScreen 2) |
---|---|---|
Primary Objective | Non-inferiority in standalone cancer detection performance compared to the previous version of MammoScreen (specifically MammoScreen 2 on Dimension). | Achieved. |
AUC at the mammogram level | Positive lower bound of the 95% CI of the difference in endpoints between MammoScreen 4 and MammoScreen 2. | MS4: 0.894 (0.870, 0.919); MS2: 0.867 (0.839, 0.896); Δ: 0.027 (0.002, 0.052), statistically significant (the exact p-value is cut off in the provided text, but the CI excluding zero implies p < 0.05). |
(193 days)
QDQ
QP-Prostate® CAD is a Computer-Aided Detection and Diagnosis (CADe/CADx) image processing software that automatically detects and identifies suspected lesions in the prostate gland based on bi-parametric prostate MRI. The software is intended to be used as a concurrent read by physicians with proper training in a clinical setting as an aid for interpreting prostate MRI studies. The results can be displayed in a variety of DICOM outputs, including identified suspected lesions marked as an overlay onto source MR images. The output can be displayed on third-party DICOM workstations and Picture Archive and Communication Systems (PACS). Patient management decisions should not be based solely on the results of QP-Prostate® CAD.
QP-Prostate® CAD is an artificial intelligence-based Computer-Aided Detection and Diagnosis (CADe/CADx) image processing software. QP-Prostate® CAD uses AI-based algorithms trained with pathology data to detect suspicious lesions for clinically significant prostate cancer. The device automatically detects and identifies suspected lesions in the prostate gland based on bi-parametric prostate MRI and provides marks over regions of the images that may contain suspected lesions. There are two possible markers that are provided in different colors suggesting different levels of suspicion of clinically significant prostate cancer (moderate or high suspicion level).
The software is intended to be used as a concurrent read by physicians with proper training in a clinical setting as an aid for interpreting prostate MRI studies. The results can be displayed in a variety of DICOM outputs, including identified suspected lesions marked as an overlay onto source MR images. The output can be displayed on third-party DICOM workstations and Picture Archive and Communication Systems (PACS). Based on biparametric input consisting of T2W and DWI series, the analysis is run automatically, and the output in standard DICOM formats is returned to PACS.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
Table 1: Acceptance Criteria and Reported Device Performance (Standalone)
Metric (lesion level) | Acceptance Criterion (Implicit) | Reported Device Performance |
---|---|---|
AUC-ROC | Evidence of good discriminatory ability (e.g., above a certain threshold) | 0.732 (95% CI: 0.668-0.791) |
Sensitivity (high suspicion marker) | Evidence of good detection rate for clinically significant findings | 0.677 (95% CI: 0.593-0.761) |
False Positive Rate per Case (high suspicion marker, any biopsy source) | Evidence of acceptable false positive rate | 0.417 (95% CI: 0.313-0.522) |
Sensitivity (high and moderate suspicion markers) | Evidence of good detection rate for clinically significant findings | 0.795 (95% CI: 0.722-0.861) |
False Positive Rate per Case (high and moderate suspicion markers, any biopsy source) | Evidence of acceptable false positive rate | 0.855 (95% CI: 0.709-0.996) |
Note: The document does not explicitly state numerical acceptance criteria thresholds for the standalone performance metrics (AUC-ROC, Sensitivity, FPR). Instead, it presents the results and implies that these values "demonstrate the safety and effectiveness" in comparison to the predicate device. The general implicit acceptance criterion for these metrics would be that they exhibit performance levels indicative of a useful diagnostic aid.
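For reference, the lesion-level metrics in Table 1 follow the standard CADe definitions (the letter does not restate them):

$$\text{Sensitivity} = \frac{\#\,\text{ground-truth lesions correctly marked by the device}}{\#\,\text{ground-truth lesions}}, \qquad \text{FP rate per case} = \frac{\#\,\text{false-positive marks}}{\#\,\text{cases}}$$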
Table 2: Acceptance Criteria and Reported Device Performance (Multi-Reader Multi-Case Study)
Metric | Acceptance Criterion (Explicit) | Reported Device Performance |
---|---|---|
ΔAUC (AUCaided − AUCunaided) (Primary Endpoint) | A statistically significant improvement in AUC with AI assistance (the p-value threshold is cut off in the provided text). | Not legible in the provided text. |
(258 days)
QDQ
Prostate MR AI is a plug-in Radiological Computer Assisted Detection and Diagnosis Software device intended to be used:

- with a separate hosting application
- as a concurrent reading aid to assist radiologists in the interpretation of a prostate MRI examination acquired according to the PI-RADS standard
- in adult men (40 years and older) with suspected cancer in treatment naïve prostate glands

The plug-in software analyzes non-contrast T2 weighted (T2W) and diffusion weighted image (DWI) series to segment the prostate gland and to provide an automatic detection and segmentation of regions suspicious for cancer. For each suspicious region detected, the algorithm moreover provides a lesion Score, by way of PI-RADS interpretation suggestion. Outputs of the device should be interpreted consistently with ACR recommendations using all available MR data (e.g., dynamic contrast enhanced images [if available]). Patient management decisions should not be made solely based on analysis by the Prostate MR AI algorithm.
This premarket notification addresses the Siemens Healthineers Prostate MR AI (VA10A) Radiological Computer Assisted Detection and Diagnosis Software (CADe/CADx). Prostate MR AI is a Computer Assisted Detection and Diagnosis algorithm designed to plug into a hosting workflow that assists radiologists in the detection of suspicious lesions and their classification. It is used as a concurrent reading aid to assist radiologists in the interpretation of a prostate MRI examination acquired according to the PI-RADS standard. The automatic lesion detection requires transversal T2W and DWI series as inputs. The device automatically exports a list of detected prostate regions that are suspicious for cancer (each list entry consists of contours and a classification by Score and Level of Suspicion (LoS)), a computed suspicion map, and a per-case LoS. The results of the Prostate MR AI plug-in (with the case-level LoS, lesion center points, lesion diameters, lesion ADC median, lesion 10th percentile, suspicion map, and non-PZ segmentation considered optional) are to be shown in a hosting application that allows the radiologist to view the original case, as well as confirm, reject, or edit lesion candidates with their contours and Scores as generated by the Prostate MR AI plug-in. Moreover, the radiologist can add lesions with contours and PI-RADS scores and finalize the case. In addition, the outputs include an automatically computed prostate segmentation, as well as sub-segmentations of the peripheral zone and the rest of the prostate (non-PZ). The algorithm will augment the prostate workflow of currently cleared syngo.MR General Engine if activated via a separate license on the General Engine.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
Acceptance Criteria | Reported Device Performance |
---|---|
Automatic Prostate Segmentation | |
Median Dice score between AI algorithm results and ground truth masks exceeds 0.9. | The median of the Dice score between the AI algorithm results and the corresponding ground truth masks exceeds the threshold of 0.9. |
Median normalized volume difference between algorithm results and ground truth masks is within ±5%. | The median of the normalized volume difference between the algorithm results and the corresponding ground truth masks is within a ±5% range. |
AI algorithm results are statistically non-inferior to individual reader variability (5% margin of error, 5% significance level). | The AI algorithm results as compared to any individual reader are statistically non-inferior based on variabilities that existed among the individual readers within the 5% margin of error and 5% significance level. |
Prostate Lesion Detection and Classification | |
Case-level sensitivity of lesion detection ≥ 0.80 for both radiology and pathology ground truth. | The case-level sensitivity of the lesion detection is equal or greater than 0.80 for both radiology and pathology ground truth. |
False positive rate per case of lesion detection below the pre-specified threshold (the threshold and the reported result are cut off in the provided text). | Not legible in the provided text. |
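Both segmentation criteria in the table use standard definitions. A minimal numpy sketch, assuming binary masks on the same voxel grid (the submission's exact computation is not shown):

```python
import numpy as np

def dice_score(ai_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Dice overlap between an AI mask and a ground-truth mask (acceptance: median > 0.9)."""
    ai, gt = ai_mask.astype(bool), gt_mask.astype(bool)
    intersection = np.logical_and(ai, gt).sum()
    return 2.0 * intersection / (ai.sum() + gt.sum())

def normalized_volume_difference(ai_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Signed volume difference relative to ground truth (acceptance: median within ±5%)."""
    v_ai = ai_mask.astype(bool).sum()
    v_gt = gt_mask.astype(bool).sum()
    return (v_ai - v_gt) / v_gt
```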
(20 days)
QDQ
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
The provided text describes the Saige-Dx (v.3.1.0) device and its performance testing as part of an FDA 510(k) submission (K243688). However, it does not contain specific acceptance criteria values or the quantitative results of the device's performance against those criteria. It states that "All tests met the pre-specified performance criteria," but does not list those criteria or the measured performance metrics.
Therefore, while I can extract information related to the different aspects of the study, I cannot create a table of acceptance criteria and reported device performance with specific values.
Here's a breakdown of the information available based on your request:
1. A table of acceptance criteria and the reported device performance
- Acceptance Criteria: Not explicitly stated in quantitative terms. The document only mentions that "All tests met the pre-specified performance criteria."
- Reported Device Performance: Not explicitly stated in quantitative terms (e.g., specific sensitivity, specificity, AUC values, or improvements in human reader performance).
2. Sample size used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective)
- Test Set Sample Size: Not explicitly stated for the validation performance study. The text mentions "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study and standalone performance testing conducted under approved IRB protocols (K220105 and K241747)." It also mentions that the tests included "DBT screening mammograms with Hologic standard definition and HD images, GE images, exams with unilateral breasts, and from patients with breast implants (on implant displaced views)."
- Data Provenance: The data for the training set was collected from "multiple vendors including GE and Hologic equipment" and from "diverse practices with the majority from geographically diverse areas within the United States, including New York and California." For the test set, it is implied to be similar in nature as it's part of the overall "performance testing," but specific details for the test set alone are not provided regarding country of origin or retrospective/prospective nature. However, since it involves IRB protocols, it suggests a structured, likely prospective collection or at least a carefully curated retrospective collection.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)
- Not explicitly stated for the test set. The document indicates that a Multi-Reader Multi-Case (MRMC) study was performed, which implies the involvement of expert readers, but the number of experts and their qualifications are not detailed.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
- Not explicitly stated for the test set. The involvement of an MRMC study suggests a structured interpretation process, potentially including adjudication, but the method (e.g., consensus, majority rule with an adjudicator) is not described.
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance
- Yes, an MRMC study was done: "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study..."
- Effect Size: The document does not provide the quantitative effect size of how much human readers improved with AI vs. without AI assistance. It broadly states that Saige-Dx "can help improve reader performance, while also reducing time."
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Yes, standalone performance testing was done: "...and standalone performance testing conducted under approved IRB protocols..."
- Results: The document states that "All tests met the pre-specified performance criteria" for the standalone performance, but does not provide the specific quantitative results (e.g., sensitivity, specificity, AUC).
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)
- Not explicitly stated. For a device identifying "soft tissue lesions and calcifications that may be indicative of cancer," ground truth would typically involve a combination of biopsy/pathology results, clinical follow-up, and potentially expert consensus on imaging in cases without definitive pathology. However, the document doesn't specify the exact method for establishing ground truth for either the training or test sets.
8. The sample size for the training set
- Training Set Sample Size: "A total of nine datasets comprising 141,768 patients and 316,166 studies were collected..."
9. How the ground truth for the training set was established
- Not explicitly stated. The document mentions the collection of diverse datasets for training but does not detail how the ground truth for these 141,768 patients and 316,166 studies was established (e.g., through radiologists' interpretations, pathology reports, clinical outcomes).
(153 days)
QDQ
Transpara software is intended for use as a concurrent reading aid for physicians interpreting screening full-field digital mammography exams and digital breast tomosynthesis exams from compatible FFDM and DBT systems, to identify regions suspicious for breast cancer and assess their likelihood of malignancy. Output of the device includes locations of calcifications groups and soft-tissue regions, with scores indicating the likelihood that cancer is present, and an exam score indicating the likelihood that cancer is present in the exam. Patient management decisions should not be made solely on the basis of analysis by Transpara.
Transpara is a software-only application designed to be used by physicians to improve interpretation of full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT). Deep learning algorithms are applied to images for recognition of suspicious calcifications and soft tissue lesions (including densities, masses, architectural distortions, and asymmetries). Algorithms are trained with a large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue.
Transpara offers the following functions which may be used at any time in the reading process, to improve detection and characterization of abnormalities and enhance workflow:
- AI findings for display in the images to highlight locations where the device detects suspicious calcifications or soft tissue lesions, along with region scores per finding on a scale ranging from 1-100, with higher scores indicating a higher level of suspicion.
- Links between corresponding regions in different views of the breast, which may be utilized to enhance user interfaces and workflow.
- An exam-based score which categorizes exams with increasing likelihood of cancer on a scale of 1-10 or in three risk categories labeled as 'low', 'intermediate' or 'elevated'.
The concurrent use indication implies that it is up to the users to decide how to use Transpara in the reading process. Transpara functions can be used before, during or after visual interpretation of an exam by a user.
Results of Transpara are computed in a standalone processing appliance which accepts mammograms in DICOM format as input, processes them, and sends the processing output to a destination using the DICOM protocol in a standardized mammography CAD DICOM format. Common destinations are medical workstations, PACS and RIS. The system can be configured using a service interface. Implementation of a user interface for end users in a medical workstation is to be provided by third parties.
The provided text describes the acceptance criteria and a study that proves the device, Transpara (2.1.0), meets these criteria.
Here's an organized breakdown of the information requested:
Acceptance Criteria and Reported Device Performance
The acceptance criteria are implicitly defined by the reported performance metrics. The study aims to demonstrate non-inferiority and superiority to the predicate device, Transpara 1.7.2. The key metrics reported are sensitivity at various specificity levels and Exam-based Area Under the Receiver Operating Characteristic Curve (AUC).
Table 1: Acceptance Criteria (Implied by Performance Goals) and Reported Device Performance (Standalone without Temporal Analysis)
Metric | Acceptance Criteria (Implied/Target) | Reported Performance (FFDM) | Reported Performance (DBT) |
---|---|---|---|
Sensitivity (Sensitive Mode @ 70% Specificity) | Non-inferior & Superior to Predicate Device 1.7.2 (quantitative value not specified, but implied by comparison) | 97.4% (96.3 - 98.5) | 96.9% (95.5 - 98.3) |
Sensitivity (Specific Mode @ 80% Specificity) | Non-inferior & Superior to Predicate Device 1.7.2 | 95.2% (93.7 - 96.7) | 95.1% (93.3 - 96.8) |
Sensitivity (Elevated Risk @ 97% Specificity) | Non-inferior & Superior to Predicate Device 1.7.2 | 80.8% (78.0 - 83.6) | 78.4% (75.1 - 81.7) |
Exam-based AUC | Non-inferior & Superior to Predicate Device 1.7.2 | 0.960 (0.953 - 0.966) | 0.955 (0.947 - 0.963) |
Table 2: Acceptance Criteria (Implied by Performance Goals) and Reported Device Performance (Standalone with Temporal Analysis - TA)
Metric | Acceptance Criteria (Implied/Target) | Reported Performance (FFDM with TA) | Reported Performance (DBT with TA) |
---|---|---|---|
Sensitivity (Sensitive Mode @ 70% Specificity) | Superior to performance without temporal comparison | 95.7% (93.7 - 97.6) | 94.6% (91.2 - 98.0) |
Sensitivity (Specific Mode @ 80% Specificity) | Superior to performance without temporal comparison | 95.4% (93.4 - 97.4) | 91.0% (86.7 - 95.4) |
Sensitivity (Elevated Risk @ 97% Specificity) | Superior to performance without temporal comparison | 82.7% (79.1 - 86.4) | 74.9% (68.3 - 81.4) |
Exam-based AUC | Superior to performance without temporal comparison | 0.958 (0.946 - 0.969) | 0.941 (0.921 - 0.958) |
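The operating points in these tables (sensitivity at 70%, 80%, and 97% specificity) correspond to thresholds on the exam score. A minimal sketch of how such a point can be read off held-out scores, assuming binary labels and continuous scores; this is illustrative and is not Transpara's actual calibration procedure:

```python
import numpy as np

def sensitivity_at_specificity(y, scores, target_specificity=0.70):
    """Pick the score threshold that achieves the target specificity on
    non-cancer exams, then measure sensitivity on cancer exams at that threshold."""
    y, scores = np.asarray(y), np.asarray(scores)
    # the target_specificity quantile of negative scores leaves that fraction
    # of negatives below the threshold, i.e. correctly called negative
    threshold = np.quantile(scores[y == 0], target_specificity)
    sensitivity = float((scores[y == 1] > threshold).mean())
    return sensitivity, threshold
```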
Study Details
1. Sample Size Used for the Test Set and Data Provenance:
- Main Test Set (without temporal analysis): 10,207 exams (5,730 FFDM, 4,477 DBT).
- Normal: 8,587 exams
- Benign: 270 exams
- Cancer: 1,350 exams (750 FFDM, 600 DBT)
- Temporal Analysis Test Set: 5,724 exams (4,266 FFDM, 1,458 DBT).
- Normal: 4,998 exams
- Benign: 83 exams
- Cancer: 643 exams (471 FFDM, 172 DBT)
- Data Provenance: Independent dataset acquired from multiple centers in seven EU countries and the US. Retrospective in nature: the data were acquired separately from algorithm development, and normal exams required at least one year of follow-up. The data included images from various manufacturers (Hologic, GE, Philips, Siemens, Fujifilm).
- Main Test Set (without temporal analysis): 10,207 exams (5,730 FFDM, 4,477 DBT).
2. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of those Experts:
- The document states that the cancer cases in the test set were "biopsy-proven cancer." It does not specify the number or qualifications of experts used to establish the ground truth for the entire test set (including normal and benign cases, and detailed lesion characteristics). The mechanism for establishing the "normal" and "benign" status is not explicitly detailed beyond "normal follow-up of at least one year."
3. Adjudication Method for the Test Set:
- The document does not explicitly describe an adjudication method involving multiple readers for establishing ground truth for the test set. The ground truth for cancer cases is stated as "biopsy-proven."
4. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done:
- No, the document does not describe a Multi-Reader Multi-Case (MRMC) comparative effectiveness study. The performance assessment is a standalone evaluation of the algorithm's performance, not a human-in-the-loop study comparing human readers with and without AI assistance.
5. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was done:
- Yes, standalone performance tests were conducted. The results presented in Tables 2 and 5 are for the algorithm's performance only.
6. The Type of Ground Truth Used:
- The primary ground truth for cancer cases is biopsy-proven cancer. For normal exams within the test set, the ground truth was established by "a normal follow-up of at least one year," implying outcomes data (absence of diagnosed cancer over a follow-up period).
7. The Sample Size for the Training Set:
- The document does not explicitly state the sample size of the training set. It mentions "Deep learning algorithms are applied to images for recognition of suspicious calcifications and soft tissue lesions... Algorithms are trained with a large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue."
8. How the Ground Truth for the Training Set Was Established:
- The ground truth for the training set was established using a "large database of biopsy-proven examples of breast cancer, benign abnormalities, and examples of normal tissue." This implies a similar methodology to the test set for cancer cases (biopsy verification) and likely clinical follow-up or expert consensus for benign/normal cases, though not explicitly detailed for the training set.
(153 days)
QDQ
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
1. A table of acceptance criteria and the reported device performance
Acceptance Criteria (Endpoint) | Reported Device Performance |
---|---|
Substantial equivalence demonstrating non-inferiority of the subject device (Saige-Dx) on compatible exams compared to the predicate device's performance on previously compatible exams. | The study endpoint was met: the lower bound of the 95% CI around the delta AUC between Hologic and GE cases, compared to Hologic-only exams, was greater than the non-inferiority margin. Case-level AUC on compatible exams: 0.910 (95% CI: 0.886, 0.933). |
Generalizable standalone performance across confounders for GE and Hologic exams. | Demonstrated generalizable standalone performance on GE and Hologic exams across patient age, breast density, breast size, race, ethnicity, exam type, pathology classification, lesion size, and modality. |
Performance on Hologic HD images. | Met pre-specified performance criteria. |
Performance on unilateral breasts. | Met pre-specified performance criteria. |
Performance on breast implants (implant displaced views). | Met pre-specified performance criteria. |
2. Sample size used for the test set and the data provenance
- Sample Size: 1,804 women (236 cancer exams and 1,568 non-cancer exams).
- Data Provenance: Collected from 12 clinical sites across the United States. It's a retrospective dataset, as indicated by the description of cancer exams being confirmed by biopsy pathology and non-cancer exams by negatively interpreted subsequent screens.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of Experts: At least two independent truthers, plus an additional adjudicator if needed (implying a minimum of two, potentially three).
- Qualifications of Experts: MQSA qualified, breast imaging specialists.
4. Adjudication method for the test set
- Adjudication Method: "Briefly, each cancer exam and supporting medical reports were reviewed by two independent truthers, plus an additional adjudicator if needed." This describes a 2+1 adjudication method.
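A minimal sketch of this 2+1 flow; function and label names are hypothetical, and the actual truthing protocol is only summarized in the text:

```python
def adjudicate_2_plus_1(truther_a, truther_b, adjudicator_read=None):
    """Two independent truthers label each exam; an adjudicator resolves
    disagreements. Labels are hypothetical, e.g. 'cancer' / 'non-cancer'."""
    if truther_a == truther_b:
        return truther_a                   # agreement: no adjudication needed
    if adjudicator_read is None:
        raise ValueError("Truthers disagree; an adjudicator read is required.")
    return adjudicator_read                # disagreement resolved by the third reader
```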
5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance
- The provided text describes a standalone performance study ("The pivotal study compared the standalone performance between the subject device"). It does not mention an MRMC comparative effectiveness study and therefore no effect size for human reader improvement with AI assistance is reported. The device is intended as a concurrent reading aid, but the reported study focused on the algorithm's standalone performance.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Yes, a standalone performance study was done. The text states: "Validation of the software was performed using standalone performance testing..." and "The pivotal study compared the standalone performance between the subject device."
7. The type of ground truth used
- For Cancer Exams: Confirmed by biopsy pathology.
- For Non-Cancer Exams: Confirmed by a negatively interpreted exam on the subsequent screen and without malignant biopsy pathology.
- For Lesions: Lesions for cancer exams were established by MQSA qualified breast imaging specialists, likely based on radiological findings and pathology reports.
8. The sample size for the training set
- Sample Size: 121,348 patients and 122,252 studies.
9. How the ground truth for the training set was established
- The document does not explicitly detail the method for establishing ground truth for the training set. It mentions the training dataset was "robust and diverse." However, given the rigorous approach described for the test set's ground truth (biopsy pathology, negative subsequent screens, expert review), it is reasonable to infer a similar, if not identical, standard was applied to the training data. The text emphasizes "no exam overlap between the training and testing datasets," indicating a careful approach to data separation.
(269 days)
QDQ
ProFound Detection V4.0 is a computer-assisted detection and diagnosis (CAD) software device intended to be used concurrently by interpreting physicians while reading digital breast tomosynthesis (DBT) exams from compatible DBT systems. The device detects soft tissue densities (masses, architectural distortions and asymmetries) and calcifications in the 3D DBT slices. The detections and Certainty of Finding and Case Scores assist interpreting physicians in identifying soft tissue densities and calcifications that may be confirmed or dismissed by the interpreting physician.
ProFound Detection V4.0 is a computer-assisted detection and diagnosis (CAD) software device that detects malignant soft-tissue densities and calcifications in digital breast tomosynthesis (DBT) images. The ProFound Detection V4.0 software allows an interpreting physician to quickly identify suspicious soft tissue densities and calcifications by marking the detected areas in the tomosynthesis images. When the ProFound Detection V4.0 marks are displayed by a user, the marks will appear as overlays on the tomosynthesis images. Each detected finding will also be assigned a "score" that corresponds to the ProFound Detection V4.0 algorithm's confidence that the detected finding is a cancer (Certainty of Finding). Certainty of Finding scores are a percentage in range of 0% to 100% to indicate CAD's confidence that the finding is malignant. ProFound Detection V4.0 also assigns a score to each case (Case Score) as a percentage in range of 0% to 100% to indicate CAD's confidence that the case has malignant findings. The higher the Certainty of Finding or Case Score, the higher the confidence that the detected finding is a cancer or that the case has malignant findings.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
Acceptance Criteria and Device Performance
The core acceptance criterion is non-inferiority to the predicate device (ProFound AI V3.0) on key performance metrics.
Table of Acceptance Criteria and Reported Device Performance
Metric | Acceptance Criteria (Non-inferior to Predicate) | Reported ProFound Detection V4.0 Performance (with priors) | Reported ProFound Detection V4.0 Performance (without priors) | Reported Predicate Performance (ProFound AI V3.0) |
---|---|---|---|---|
Sensitivity | Not inferior to 0.8725 | 0.9004 (0.8633-0.9374) | 0.9004 (0.8633-0.9374) | 0.8725 (0.8312-0.9138) |
Specificity | Not inferior to 0.5278 | 0.6205 (0.5846-0.6565) | 0.5863 (0.5498-0.6228) | 0.5278 (0.4909-0.5648) |
AUC | Not inferior to 0.8230 | 0.8753 (0.8475-0.9032) | 0.8714 (0.8423-0.9007) | 0.8230 (0.7878-0.8570) |
Summary of Performance vs. Criteria:
The study demonstrated that ProFound Detection V4.0, particularly when using prior images, achieved superior performance across all three metrics (Sensitivity, Specificity, and AUC) compared to the predicate device, thus meeting the non-inferiority acceptance criteria and additionally showing superiority in specificity.
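As a side note on the intervals in the table: the reported sensitivity of 0.9004 with CI (0.8633-0.9374) is numerically consistent with 226 of the 251 cancer cases being detected and a normal-approximation interval. The sketch below reproduces that arithmetic with statsmodels; the detected count and the interval method are inferred for illustration, not stated in the submission.

```python
from statsmodels.stats.proportion import proportion_confint

n_cancer = 251     # cancer cases in the test set (from the text)
detected = 226     # hypothetical count consistent with 226/251 = 0.9004

sens = detected / n_cancer
lo, hi = proportion_confint(detected, n_cancer, alpha=0.05, method="normal")
print(f"sensitivity = {sens:.4f} (95% CI {lo:.4f}-{hi:.4f})")
# -> sensitivity = 0.9004 (95% CI 0.8633-0.9374), matching the table to rounding
```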
Study Details
2. Sample size used for the test set and the data provenance:
- Sample Size: 952 cases
- 251 biopsy-proven cancer cases (with 256 malignant lesions)
- 701 non-cancer cases
- Data Provenance:
- Country of Origin: U.S. image acquisition sites
- Retrospective or Prospective: Retrospectively collected
- Independence: The data was collected from sites independent of those included in the training and development sets. iCAD ensured this independence by sequestering the data.
- Manufacturer: 100% Hologic DBT system exam data.
- Exam Dates: 2018 - 2022.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:
- Number of Experts: The text states, "Each cancer case was a biopsy proven positive, truthed by an expert breast imaging radiologist". While it explicitly mentions "an expert breast imaging radiologist" in the singular for truthing, it does not specify the exact number of unique "expert breast imaging radiologists" involved in truthing the entire dataset or their specific years of experience.
4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:
- The text does not specify a formal adjudication method (like 2+1 or 3+1) for establishing ground truth from multiple readers. Ground truth was established based on clinical data including radiology report, follow-up biopsy, and pathology data, and then truthed by an expert radiologist.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, if so, what was the effect size of how much human readers improve with AI vs. without AI assistance:
- No, an MRMC comparative effectiveness study was NOT done. The study described is a standalone performance assessment of the AI algorithm itself, comparing it to a predicate AI algorithm. It does not evaluate the performance of human readers, either with or without AI assistance.
6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:
- Yes, a standalone study was done. The text explicitly states: "A standalone study was conducted, which evaluated the performance of ProFound Detection version 4.0 without an interpreting physician." This study directly compared the algorithm's performance (V4.0) against the predicate (V3.0) on an independent test set.
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
- The ground truth was a combination of biopsy-proven pathology data and clinical data, including radiology reports and follow-up data. Specifically, "These reference standards were derived from clinical data including radiology report, follow-up biopsy and pathology data. Each cancer case was a biopsy proven positive, truthed by an expert breast imaging radiologist who outlined the location and extent of cancer lesions in the case."
8. The sample size for the training set:
- The sample size for the training set is not provided. The text only refers to the test set being "independent of those included in the training and development" and that iCAD "ensures the independence of this dataset by sequestering the data and keeping it separate from the test and development datasets."
9. How the ground truth for the training set was established:
- How the ground truth for the training set was established is not explicitly detailed. The text mentions that the test set's ground truth was established by "biopsy proven cancer cases" and "truthed by an expert breast imaging radiologist." While it implies a similar process would likely be used for training data, the specific method for the training set's ground truth establishment is not provided in the submitted document.
(30 days)
QDQ
Lunit INSIGHT DBT is a computer-assisted detection and diagnosis (CADe/x) software intended to be used concurrently by interpreting physicians to aid in the detection and characterization of suspected lesions for breast cancer in digital breast tomosynthesis (DBT) exams from compatible DBT systems. Through the analysis, the regions of soft tissue lesions and calcifications are marked with an abnormality score indicating the likelihood of the presence of malignancy for each lesion. Lunit INSIGHT DBT uses screening mammograms of the female population.
Lunit INSIGHT DBT is not intended as a replacement for a complete interpreting physician's review or their clinical judgment that takes into account other relevant information from the image or patient history.
Lunit INSIGHT DBT is a computer-assisted detection/diagnosis (CADe/x) software as a medical device that provides information about the presence, location and characteristics of lesions suspicious for breast cancer to assist interpreting physicians in making diagnostic decisions when reading digital breast tomosynthesis (DBT) images. The software automatically analyzes digital breast tomosynthesis slices via artificial intelligence technology that has been trained via deep learning.
For each DBT case, Lunit INSIGHT DBT generates an artificial intelligence analysis results that include the lesion type, location, lesion-level/case-level score, and outline of the regions suspected of breast cancer. This peripheral information intends to augment the physician's workflow to better aid in detection and diagnosis of breast cancer.
The provided text describes the 510(k) submission for Lunit INSIGHT DBT v1.1, a computer-assisted detection and diagnosis (CADe/x) software for breast cancer in digital breast tomosynthesis (DBT) exams. The document primarily focuses on demonstrating substantial equivalence to its predicate device, Lunit INSIGHT DBT v1.0.
Here's an analysis of the acceptance criteria and the study that proves the device meets them, based on the provided text:
Acceptance Criteria and Reported Device Performance
The core acceptance criterion explicitly mentioned for the standalone performance testing is an AUROC (Area Under the Receiver Operating Characteristic curve) greater than 0.903. This is directly compared to the predicate device's performance.
Acceptance Criterion (Primary Endpoint) | Reported Device Performance (Lunit INSIGHT DBT v1.1) |
---|---|
AUROC in standalone performance > 0.903 | AUROC = 0.931 (95% CI: 0.920 - 0.941) |
Statistical Significance | Reported as statistically significant (the exact p-value is cut off in the provided text) |

In summary, the submission demonstrates that the device met its predefined acceptance criterion (standalone AUROC > 0.903), which was the same criterion used for its predicate device. However, it lacks detailed information regarding the specifics of the data used (sample sizes, provenance) and the ground truth establishment process (experts, adjudication), and the absence of an MRMC study is notable for a CADe/x device, though not explicitly required for this specific 510(k) submission, which highlights substantial equivalence based on standalone performance relative to a predicate.
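Checking an acceptance criterion like "standalone AUROC > 0.903" is mechanically simple once case-level scores and labels exist. A minimal sketch on synthetic data; only the 0.903 threshold comes from the text:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
labels = np.concatenate([np.ones(300), np.zeros(700)])   # synthetic case labels
scores = np.concatenate([rng.normal(2.0, 1.0, 300),      # synthetic cancer case scores
                         rng.normal(0.0, 1.0, 700)])     # synthetic non-cancer case scores

auroc = roc_auc_score(labels, scores)
print(f"standalone AUROC = {auroc:.3f}; criterion (> 0.903) met: {auroc > 0.903}")
```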
Page 1 of 3