Search Results
Found 4 results
(54 days)
Saige-Dx
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
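The finding-to-case roll-up described above can be sketched in a few lines. This is a minimal illustration, assuming a max-over-findings rule and a `Finding` type of my own invention; the summary does not specify how Saige-Dx actually combines finding-level scores into the Case Suspicion Level.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    # Bounding box in image pixel coordinates (illustrative fields).
    x: int
    y: int
    width: int
    height: int
    suspicion: float  # Finding Suspicion Level, assumed normalized to [0, 1]

def case_suspicion_level(findings: list[Finding]) -> float:
    """Aggregate finding-level suspicion into one case-level score.

    Taking the maximum over findings is one plausible rule; the actual
    Saige-Dx aggregation is not described in the clearance summary.
    """
    if not findings:
        return 0.0  # no detected findings: lowest case suspicion
    return max(f.suspicion for f in findings)

findings = [Finding(120, 340, 32, 40, 0.18), Finding(410, 95, 55, 60, 0.72)]
print(case_suspicion_level(findings))  # 0.72
```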
Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) clearance letter for Saige-Dx:
1. Table of Acceptance Criteria and Reported Device Performance
The provided document indicates that the primary endpoint of the standalone performance testing was to demonstrate non-inferiority of the subject device (new Saige-Dx version) to the predicate device (previous Saige-Dx version). Specific quantitative acceptance criteria (e.g., AUC, sensitivity, specificity thresholds) are not explicitly stated in the provided text. However, the document states:
"The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams."
| Acceptance Criteria (not explicitly quantified in source) | Reported Device Performance |
|---|---|
| Non-inferiority of subject device performance to predicate device performance. | "The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams." |
| Performance across breast densities, ages, race/ethnicities, and lesion types and sizes. | Subgroup analyses "demonstrated similar standalone performance trends across breast densities, ages, race/ethnicities, and lesion types and sizes." |
| Software design and implementation meeting requirements. | Verification testing including unit, integration, system, and regression testing confirmed "the software, as designed and implemented, satisfied the software requirements and has no unintentional differences from the predicate device." |
2. Sample Size for the Test Set and Data Provenance
- Sample Size for Test Set: 2,002 DBT screening mammograms from unique women.
- 259 cancer cases
- 1,743 non-cancer cases
- Data Provenance:
- Country of Origin: United States (cases collected from 12 diverse clinical sites).
- Retrospective or Prospective: Retrospective.
- Acquisition Equipment: Hologic (standard definition and high definition) and GE images.
3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications
The document mentions: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)."
- While the specific number and qualifications of experts for the ground truth of the current test set are not explicitly detailed in this document, it refers readers back to K220105 for those details, implying that a standardized process involving experts was used.
4. Adjudication Method for the Test Set
The document does not explicitly describe the adjudication method (e.g., 2+1, 3+1) used for establishing ground truth for the test set. It states: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)." This suggests a pre-defined and presumably robust method for ground truth establishment.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- Was it done? Yes.
- Effect Size: The document states: "a multi-reader multi-case (MRMC) study was previously conducted for the predicate device and remains applicable to the subject device." It does not provide details on the effect size (how much human readers improve with AI vs. without AI assistance) within this document. Readers would need to refer to the K220105 submission for that information if it was presented there.
6. Standalone (Algorithm Only) Performance Study
- Was it done? Yes.
- Description: "Validation of the software was conducted using a retrospective and blinded multicenter standalone performance testing under an IRB approved protocol..."
- Primary Endpoint: "to demonstrate that the performance of the subject device was non-inferior to the performance of the predicate device."
7. Type of Ground Truth Used
- The ground truth involved the presence or absence of cancer, with cases categorized as 259 cancer and 1,743 non-cancer. The mention of "ground truth lesion localization processes" implies a detailed assessment of findings, likely involving expert consensus and/or pathology/biopsy results to confirm malignancy. Given it's a diagnostic aid for cancer, pathology is the gold standard for confirmation.
8. Sample Size for the Training Set
- Training Dataset: 161,323 patients and 300,439 studies.
9. How the Ground Truth for the Training Set Was Established
- The document states: "The Saige-Dx algorithm was trained on a robust and diverse dataset of mammography exams acquired from multiple vendors including GE and Hologic equipment."
- While it doesn't explicitly detail the method of ground truth establishment for the training set (e.g., expert consensus, pathology reports), similar to the test set, for a cancer detection AI, it is highly probable that the ground truth for the training data was derived from rigorous clinical assessments, including follow-up, biopsy results, and/or expert interpretations, to accurately label cancer and non-cancer cases for the algorithm to learn from. The implied "robust and diverse" nature of the training data suggests a comprehensive approach to ground truth.
(20 days)
Saige-Dx (3.1.0)
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
The provided text describes the Saige-Dx (v.3.1.0) device and its performance testing as part of an FDA 510(k) submission (K243688). However, it does not contain specific acceptance criteria values or the quantitative results of the device's performance against those criteria. It states that "All tests met the pre-specified performance criteria," but does not list those criteria or the measured performance metrics.
Therefore, while I can extract information related to the different aspects of the study, I cannot create a table of acceptance criteria and reported device performance with specific values.
Here's a breakdown of the information available based on your request:
1. A table of acceptance criteria and the reported device performance
- Acceptance Criteria: Not explicitly stated in quantitative terms. The document only mentions that "All tests met the pre-specified performance criteria."
- Reported Device Performance: Not explicitly stated in quantitative terms (e.g., specific sensitivity, specificity, AUC values, or improvements in human reader performance).
2. Sample size used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective)
- Test Set Sample Size: Not explicitly stated for the validation performance study. The text mentions "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study and standalone performance testing conducted under approved IRB protocols (K220105 and K241747)." It also mentions that the tests included "DBT screening mammograms with Hologic standard definition and HD images, GE images, exams with unilateral breasts, and from patients with breast implants (on implant displaced views)."
- Data Provenance: The data for the training set was collected from "multiple vendors including GE and Hologic equipment" and from "diverse practices with the majority from geographically diverse areas within the United States, including New York and California." For the test set, the provenance is implied to be similar, since it is part of the same overall "performance testing," but the country of origin and retrospective/prospective nature are not specified for the test set alone. The use of IRB-approved protocols suggests a structured collection, whether prospective or a carefully curated retrospective one.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)
- Not explicitly stated for the test set. The document indicates that a Multi-Reader Multi-Case (MRMC) study was performed, which implies the involvement of expert readers, but the number of experts and their qualifications are not detailed.
4. Adjudication method (e.g. 2+1, 3+1, none) for the test set
- Not explicitly stated for the test set. The involvement of an MRMC study suggests a structured interpretation process, potentially including adjudication, but the method (e.g., consensus, majority rule with an adjudicator) is not described.
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
- Yes, an MRMC study was done: "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study..."
- Effect Size: The document does not provide the quantitative effect size of how much human readers improved with AI vs. without AI assistance. It broadly states that Saige-Dx "can help improve reader performance, while also reducing time."
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Yes, standalone performance testing was done: "...and standalone performance testing conducted under approved IRB protocols..."
- Results: The document states that "All tests met the pre-specified performance criteria" for the standalone performance, but does not provide the specific quantitative results (e.g., sensitivity, specificity, AUC).
7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)
- Not explicitly stated. For a device identifying "soft tissue lesions and calcifications that may be indicative of cancer," ground truth would typically involve a combination of biopsy/pathology results, clinical follow-up, and potentially expert consensus on imaging in cases without definitive pathology. However, the document doesn't specify the exact method for establishing ground truth for either the training or test sets.
8. The sample size for the training set
- Training Set Sample Size: "A total of nine datasets comprising 141,768 patients and 316,166 studies were collected..."
9. How the ground truth for the training set was established
- Not explicitly stated. The document mentions the collection of diverse datasets for training but does not detail how the ground truth for these 141,768 patients and 316,166 studies was established (e.g., through radiologists' interpretations, pathology reports, clinical outcomes).
(153 days)
Saige-Dx
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:
1. A table of acceptance criteria and the reported device performance
| Acceptance Criteria (Endpoint) | Reported Device Performance |
|---|---|
| Substantial equivalence demonstrating non-inferiority of the subject device (Saige-Dx) on compatible exams compared to the predicate device's performance on previously compatible exams. | The study endpoint was met: the lower bound of the 95% CI around the delta AUC between Hologic and GE cases, compared to Hologic-only exams, was greater than the non-inferiority margin. Case-level AUC on compatible exams: 0.910 (95% CI: 0.886, 0.933). |
| Generalizable standalone performance across confounders for GE and Hologic exams. | Demonstrated generalizable standalone performance on GE and Hologic exams across patient age, breast density, breast size, race, ethnicity, exam type, pathology classification, lesion size, and modality. |
| Performance on Hologic HD images. | Met pre-specified performance criteria. |
| Performance on unilateral breasts. | Met pre-specified performance criteria. |
| Performance on breast implants (implant displaced views). | Met pre-specified performance criteria. |
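The endpoint logic in the first table row, that the lower bound of the 95% CI around the AUC delta must clear the non-inferiority margin, can be sketched with a case-level bootstrap. Everything here (the margin value, the function names, and the bootstrap approach itself) is illustrative; the submission does not describe the actual statistical method used.

```python
import random

def auc(scores, labels):
    """Empirical AUC via the Mann-Whitney U statistic (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def noninferior(subject, predicate, labels, margin=0.05, n_boot=1000, seed=0):
    """Bootstrap the 95% CI of delta AUC (subject minus predicate) over
    cases and test whether its lower bound exceeds -margin."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    while len(deltas) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample again: AUC needs both classes present
        deltas.append(auc([subject[i] for i in idx], ys)
                      - auc([predicate[i] for i in idx], ys))
    deltas.sort()
    return deltas[int(0.025 * n_boot)] > -margin
```

With identical subject and predicate scores the delta is zero on every resample, so the check passes; a subject that inverts a perfect predicate fails it.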
2. Sample size used for the test set and the data provenance
- Sample Size: 1,804 women (236 cancer exams and 1,568 non-cancer exams).
- Data Provenance: Collected from 12 clinical sites across the United States. It's a retrospective dataset, as indicated by the description of cancer exams being confirmed by biopsy pathology and non-cancer exams by negatively interpreted subsequent screens.
3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts
- Number of Experts: At least two independent truthers, plus an additional adjudicator if needed (implying a minimum of two, potentially three).
- Qualifications of Experts: MQSA qualified, breast imaging specialists.
4. Adjudication method for the test set
- Adjudication Method: "Briefly, each cancer exam and supporting medical reports were reviewed by two independent truthers, plus an additional adjudicator if needed." This describes a 2+1 adjudication method.
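The 2+1 scheme can be sketched as a small helper. The `TrutherRead` fields and the equality-based disagreement check are illustrative assumptions, not DeepHealth's actual truthing tooling.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TrutherRead:
    density: str          # hypothetical field, e.g. a breast density category
    lesion_type: str      # e.g. "mass" or "calcifications"
    lesion_location: str  # e.g. laterality/view/slice reference

def adjudicate(read1: TrutherRead, read2: TrutherRead,
               adjudicator: Optional[TrutherRead] = None) -> TrutherRead:
    """2+1 adjudication: accept when the two independent truthers agree;
    otherwise the third (adjudicator) read decides."""
    if read1 == read2:
        return read1
    if adjudicator is None:
        raise ValueError("truthers disagree; an adjudicator read is required")
    return adjudicator
```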
5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, the effect size of how much human readers improve with AI vs. without AI assistance
- The provided text describes a standalone performance study ("The pivotal study compared the standalone performance between the subject device"). It does not mention an MRMC comparative effectiveness study and therefore no effect size for human reader improvement with AI assistance is reported. The device is intended as a concurrent reading aid, but the reported study focused on the algorithm's standalone performance.
6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done
- Yes, a standalone performance study was done. The text states: "Validation of the software was performed using standalone performance testing..." and "The pivotal study compared the standalone performance between the subject device."
7. The type of ground truth used
- For Cancer Exams: Confirmed by biopsy pathology.
- For Non-Cancer Exams: Confirmed by a negatively interpreted exam on the subsequent screen and without malignant biopsy pathology.
- For Lesions: Lesions for cancer exams were established by MQSA qualified breast imaging specialists, likely based on radiological findings and pathology reports.
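The two case-level labeling rules above reduce to a small decision function. This is a sketch of the stated criteria only; the treatment of cases meeting neither criterion (returning `None`, presumably exclusion) is my assumption, not something the document states.

```python
from typing import Optional

def case_ground_truth(malignant_biopsy_pathology: bool,
                      subsequent_screen_negative: bool) -> Optional[str]:
    """Label a case per the ground-truth rules described above.

    "cancer": confirmed by malignant biopsy pathology.
    "non-cancer": negatively interpreted subsequent screen with no
    malignant pathology. None: neither criterion met (assumed excluded).
    """
    if malignant_biopsy_pathology:
        return "cancer"
    if subsequent_screen_negative:
        return "non-cancer"
    return None  # indeterminate: no pathology and no negative follow-up screen
```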
8. The sample size for the training set
- Sample Size: 121,348 patients and 122,252 studies.
9. How the ground truth for the training set was established
- The document does not explicitly detail the method for establishing ground truth for the training set. It mentions the training dataset was "robust and diverse." However, given the rigorous approach described for the test set's ground truth (biopsy pathology, negative subsequent screens, expert review), it is reasonable to infer a similar, if not identical, standard was applied to the training data. The text emphasizes "no exam overlap between the training and testing datasets," indicating a careful approach to data separation.
(120 days)
Saige-Dx
Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.
Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
Here's a breakdown of the acceptance criteria and the study proving the device meets those criteria, based on the provided text:
Acceptance Criteria and Reported Device Performance
| Acceptance Criteria (Implicit) | Reported Device Performance |
|---|---|
| Reader Performance Improvement (MRMC Study): increase in radiologist AUC when aided by Saige-Dx. | The average AUC of radiologists increased from 0.865 (unaided) to 0.925 (aided), a difference of 0.06 (95% CI: 0.041, 0.079). |

- Number of Experts for Ground Truth: Two MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists, plus a third as an adjudicator.
- Qualifications of Experts for Ground Truth: MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists.
- Adjudication Method: For exams with discrepancies between the two truthers' assessment of density, lesion type, and/or lesion location, a third truther served as the adjudicator.
- MRMC Comparative Effectiveness Study: Yes.
- Effect Size (Human Reader Improvement with AI vs. without AI):
- Average AUC increased by 0.06 (from 0.865 unaided to 0.925 aided).
- Average reader sensitivity increased by 8.8%.
- Average reader specificity increased by 0.9%.
- Standalone Performance: No, this specific study was for human reader performance with and without AI.
- Type of Ground Truth: Expert consensus with pathology confirmation for cancer cases. Each mammogram had a ground truth status of "cancer" or "non-cancer." For cancer exams, malignant lesions were annotated based on the biopsied location that led to malignant pathology.
- Sample Size for Training Set: Not explicitly stated, but the text mentions "six datasets across various geographic locations in the US and the UK," indicating a large, diverse dataset.
- How Ground Truth for Training Set was Established: Not explicitly detailed for the training set, but it is stated that "DeepHealth ensured that there was no overlap between the data used to train and test the Saige-Dx Al algorithm." It can be inferred that similar robust methods (likely expert review and pathology confirmation) were used, given the thoroughness described for the test set.
2. Standalone Study (Performance Testing: Standalone Study)
- Sample Size for Test Set: 1304 cases (136 cancer, 1168 non-cancer).
- Data Provenance: Retrospective, blinded, multi-center study. Collected from 9 clinical sites in the United States. All data came from clinical sites that had never been used previously for training or testing of the Saige-Dx AI algorithm.
- Number of Experts for Ground Truth: "Truthed using similar procedures to those used for the reader study," which implies two highly experienced breast imaging specialists and a third adjudicator.
- Qualifications of Experts for Ground Truth: Implied to be MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists, consistent with the reader study.
- Adjudication Method: Implied to be consistent with the reader study (third truther for discrepancies).
- MRMC Comparative Effectiveness Study: No, this was a standalone performance study of the algorithm only.
- Standalone Performance: Yes. Saige-Dx exhibited an AUC of 0.930 (95% CI: 0.902, 0.958).
- Type of Ground Truth: Implied to be expert consensus with pathology confirmation, consistent with the reader study, as data was "collected and truthed using similar procedures."
- Sample Size for Training Set: Not explicitly stated; the text confirms only that the training data was specifically excluded from the test set for this study, ensuring separation.
- How Ground Truth for Training Set was Established: Implied to be through expert review and pathology confirmation, given the "similar procedures" used for test set truthing and the isolation of training data.