510(k) Data Aggregation

Search Results: Found 8 results

    K Number: K251873
    Device Name: Saige-Dx
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2025-08-11 (54 days)
    Regulation Number: 892.2090
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.

    Device Description

    Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.
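
    The description above implies a simple roll-up from finding-level outputs to a single case-level output. The following Python sketch illustrates only that shape; the class names, fields, and the max() aggregation rule are assumptions for illustration, not DeepHealth's published method, and the real device additionally encodes results as DICOM SR and SC objects.

```python
# Minimal sketch of the finding -> case roll-up described above.
# Class names, fields, and the max() rule are illustrative assumptions;
# the actual model and its DICOM SR/SC encoding are proprietary.
from dataclasses import dataclass

@dataclass
class Finding:
    image_uid: str        # SOP Instance UID of the source image
    bbox: tuple           # (x, y, width, height) bounding box in pixels
    suspicion_level: int  # ordinal Finding Suspicion Level

def case_suspicion(findings: list) -> int:
    """Assumed rule: a case is as suspicious as its most suspicious
    finding; 0 means no findings were detected."""
    return max((f.suspicion_level for f in findings), default=0)

findings = [
    Finding(image_uid="1.2.840.1", bbox=(412, 300, 58, 44), suspicion_level=2),
    Finding(image_uid="1.2.840.2", bbox=(120, 755, 31, 29), suspicion_level=4),
]
print(case_suspicion(findings))  # -> 4
```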

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study that proves the device meets them, based on the provided FDA 510(k) clearance letter for Saige-Dx:

    1. Table of Acceptance Criteria and Reported Device Performance

    The provided document indicates that the primary endpoint of the standalone performance testing was to demonstrate non-inferiority of the subject device (new Saige-Dx version) to the predicate device (previous Saige-Dx version). Specific quantitative acceptance criteria (e.g., AUC, sensitivity, specificity thresholds) are not explicitly stated in the provided text. However, the document states:

    "The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams."

    Acceptance Criteria (not explicitly quantified in source) and Reported Device Performance:

    • Criterion: Non-inferiority of subject device performance to predicate device performance.
      Result: "The test met the pre-specified performance criteria, and the results support the safety and effectiveness of Saige-Dx updated AI model on Hologic and GE exams."
    • Criterion: Performance across breast densities, ages, race/ethnicities, and lesion types and sizes.
      Result: Subgroup analyses "demonstrated similar standalone performance trends across breast densities, ages, race/ethnicities, and lesion types and sizes."
    • Criterion: Software design and implementation meeting requirements.
      Result: Verification testing, including unit, integration, system, and regression testing, confirmed "the software, as designed and implemented, satisfied the software requirements and has no unintentional differences from the predicate device."

    2. Sample Size for the Test Set and Data Provenance

    • Sample Size for Test Set: 2,002 DBT screening mammograms from unique women.
      • 259 cancer cases
      • 1,743 non-cancer cases
    • Data Provenance:
      • Country of Origin: United States (cases collected from 12 diverse clinical sites).
      • Retrospective or Prospective: Retrospective.
      • Acquisition Equipment: Hologic (standard definition and high definition) and GE images.

    3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications

    The document mentions: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)."

    • While the specific number and qualifications of experts for the ground truth of the current test set are not explicitly detailed in this document, the letter refers back to K220105 for those details, implying that a standardized process involving experts was used.

    4. Adjudication Method for the Test Set

    The document does not explicitly describe the adjudication method (e.g., 2+1, 3+1) used for establishing ground truth for the test set. It states: "The case collection and ground truth lesion localization processes of the newly collected cases were the same processes used for the previously collected test dataset (details provided in K220105)." This suggests a pre-defined and presumably robust method for ground truth establishment.

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    • Was it done? Yes.
    • Effect Size: The document states: "a multi-reader multi-case (MRMC) study was previously conducted for the predicate device and remains applicable to the subject device." It does not provide details on the effect size (how much human readers improve with AI vs. without AI assistance) within this document. Readers would need to refer to the K220105 submission for that information if it was presented there.

    6. Standalone (Algorithm Only) Performance Study

    • Was it done? Yes.
    • Description: "Validation of the software was conducted using a retrospective and blinded multicenter standalone performance testing under an IRB approved protocol..."
    • Primary Endpoint: "to demonstrate that the performance of the subject device was non-inferior to the performance of the predicate device."

    7. Type of Ground Truth Used

    • The ground truth involved the presence or absence of cancer, with cases categorized as 259 cancer and 1,743 non-cancer. The mention of "ground truth lesion localization processes" implies a detailed assessment of findings, likely involving expert consensus and/or pathology/biopsy results to confirm malignancy. Given it's a diagnostic aid for cancer, pathology is the gold standard for confirmation.

    8. Sample Size for the Training Set

    • Training Dataset: 161,323 patients and 300,439 studies.

    9. How the Ground Truth for the Training Set Was Established

    • The document states: "The Saige-Dx algorithm was trained on a robust and diverse dataset of mammography exams acquired from multiple vendors including GE and Hologic equipment."
    • The document does not explicitly detail how ground truth was established for the training set (e.g., expert consensus, pathology reports). As with the test set, for a cancer-detection AI it is highly probable that the training labels were derived from rigorous clinical assessments, including follow-up, biopsy results, and/or expert interpretations, to accurately label cancer and non-cancer cases for the algorithm to learn from. The described "robust and diverse" nature of the training data suggests a comprehensive approach to ground truth.

    K Number: K243703
    Device Name: TechLive
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2025-06-05 (188 days)
    Regulation Number: 892.2050
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    TechLive is a software application intended to provide remote access for real-time image acquisition, assistance, review, monitoring and standardization of imaging devices across multiple locations. It is a vendor neutral solution allowing read-only or full access control to connected devices. TechLive is also intended for training of medical personnel working on medical imaging devices. TechLive is not intended for diagnostic use.

    Device Description

    TechLive is a software application intended to provide remote access for real-time image acquisition, assistance, review, monitoring and standardization of imaging devices across multiple locations. It is a vendor neutral solution allowing read-only or full access control to connected devices. TechLive is also intended for training of medical personnel working on medical imaging devices. TechLive is not intended for diagnostic use.

    Clinical users can remotely access imaging devices from a computer via a secure software connection that streams video and audio, including access to keyboard and mouse controls. This setup allows remote users to assist local in-suite assistants or other clinical users by means of audio/video connection or perform remote acquisitions themselves. Depending upon the device used to acquire the images, the remote access can allow for "view mode", to support the in-suite assistant, or "control mode", where the remote user controls the console software of the imaging device. Remote access to the local imaging device can only be granted by the local in-suite assistant and can be granted or revoked as needed using an on-premises computer installed with the assistance touch interface. "Control mode" access may be granted on-demand or prior to acquisition to access the scanner. TechLive offers a secure collaboration platform with live audio and video, enabling remote acquisition and seamless collaboration among healthcare professionals.

    TechLive is vendor-neutral and compatible with existing imaging devices, including Ultrasound (US), Magnetic Resonance Imaging (MRI), Computer Tomography (CT), and Positron Emission Tomography (PET/CT).
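
    As an illustration of the access model described above (access granted or revoked only by the local in-suite assistant, with distinct "view" and "control" modes), here is a minimal Python sketch. The names and structure are assumptions for illustration, not TechLive's actual API.

```python
# Illustrative sketch (not DeepHealth's implementation) of the access
# model described above: only the local in-suite assistant can grant
# or revoke remote "view" or "control" access.
class TechLiveSession:
    MODES = {"none", "view", "control"}

    def __init__(self):
        self.remote_mode = "none"

    def grant(self, mode: str, granted_by_local_assistant: bool) -> None:
        if not granted_by_local_assistant:
            raise PermissionError("Only the local in-suite assistant may grant access")
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.remote_mode = mode

    def revoke(self) -> None:
        # Access can be revoked at any time from the on-premises console.
        self.remote_mode = "none"

session = TechLiveSession()
session.grant("view", granted_by_local_assistant=True)     # remote user assists
session.grant("control", granted_by_local_assistant=True)  # remote acquisition
session.revoke()
```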

    AI/ML Overview

    The provided FDA 510(k) clearance letter for DeepHealth, Inc.'s TechLive device does not contain the specific acceptance criteria or detailed study results related to numerical performance metrics. The document primarily focuses on establishing substantial equivalence to a predicate device, outlining the device's intended use, technological characteristics, and a summary of non-clinical testing performed.

    Based on the provided text, here's what can be extracted and what information is missing:

    1. Table of Acceptance Criteria and Reported Device Performance

    This information is not explicitly provided in the document. The clearance letter states that "Verification and validation demonstrated that the TechLive software, as designed and implemented, meets all pre-defined performance specifications, user needs and its intended use," but it does not list these specific performance specifications or their corresponding results in a quantified manner.

    2. Sample Size Used for the Test Set and Data Provenance

    This information is not explicitly provided in the document. The text mentions "Verification and validation testing" and "Usability Studies" but does not detail the sample sizes for these tests or the provenance of any data used. Given the device is for remote access and not diagnostic use, it's possible the "test set" might refer to testing with various imaging devices and configurations, rather than a dataset of patient images.

    3. Number of Experts Used to Establish Ground Truth and Qualifications

    This information is not explicitly provided in the document. Since the device is "not intended for diagnostic use," the concept of "ground truth" related to medical diagnosis as established by experts would likely not apply in the traditional sense for direct performance evaluation. The "ground truth" for this device would more likely be related to the successful operation of remote access, control, and communication functionalities.

    4. Adjudication Method for the Test Set

    This information is not explicitly provided in the document.

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    A MRMC comparative effectiveness study was not conducted and is not applicable for this device. The document explicitly states: "TechLive is not intended for diagnostic use." MRMC studies are typically performed to assess the diagnostic performance of a device when used by multiple readers across various cases, often comparing it to human performance with or without AI assistance for diagnostic tasks. Since TechLive is for operational support and remote collaboration, not diagnosis, such a study would not be relevant.

    6. Standalone (Algorithm Only) Performance Study

    A standalone performance study in the context of diagnostic accuracy was not conducted and is not applicable for this device. Again, the device is not intended for diagnostic use. The non-clinical verification and validation testing would have assessed the standalone functional performance of the software (e.g., successful remote connection, video/audio quality, control fidelity, security), but not diagnostic performance.

    7. Type of Ground Truth Used

    The type of "ground truth" used for this device would likely relate to objective measures of functional performance, system integration, and user interaction, rather than clinical outcomes or pathology. Examples of "ground truth" would be:

    • Successful establishment and maintenance of remote connection.
    • Accurate and responsive remote control of imaging devices.
    • Clear and synchronized audio/video communication.
    • Successful completion of tasks during usability testing.
      However, the document does not explicitly state the specific ground truths used for its non-clinical tests.

    8. Sample Size for the Training Set

    This information is not explicitly provided in the document. As TechLive is a software application for remote access and control, it may not involve a "training set" in the traditional machine learning sense for diagnostic image analysis. If it uses AI components for other functionalities (e.g., adaptive streaming), this information is not disclosed.

    9. How Ground Truth for the Training Set Was Established

    This information is not explicitly provided and is likely not applicable or detailed given the stated purpose of the device.


    Summary of Device Acceptance Criteria and Study Information from the Document:

    Given the lack of specific numerical performance data, the "acceptance criteria" can be inferred from the stated purpose and the type of testing performed, rather than quantified metrics.

    Feature / Criterion and Description from Document (Inferred Acceptance):

    • Intended Use Fulfillment: Device provides remote access for real-time image acquisition, assistance, review, monitoring, and standardization. Not for diagnostic use.
    • Vendor Neutrality: Interoperable across scanner types and manufacturers (Ultrasound, MRI, CT, PET/CT).
    • Functional Performance: Meets all specifications for functional, long-duration, and interoperability tests; seamless console operation.
    • Safety & Effectiveness: Differences from predicate device do not affect safety and effectiveness.
    • Risk Management: Conforms to ISO 14971:2019 (Medical Devices – Application of Risk Management to Medical Devices).
    • Software Life Cycle: Conforms to IEC 62304:2015 (Medical Device Software – Software Life Cycle Processes).
    • Product Safety: Conforms to IEC 82304-1 (Health Software – Part 1: General Requirements for Product Safety).
    • Premarket Submissions for Device Software Functions: Developed in accordance with FDA Guidance (June 2023).
    • Cybersecurity: Developed in accordance with FDA Guidance (September 2023); uses secure connections (WebRTC over HTTPS).
    • Usability (Human Factors): Users can successfully and safely perform critical tasks with high efficiency and minimal errors.
    • Equivalence to Predicate: Functionally and performance-equivalent to Syngo Virtual Cockpit (K232744).

    Study Information:

    • Clinical Testing: None performed.
    • Non-Clinical Verification and Validation Testing: Conducted to demonstrate meeting pre-defined performance specifications, user needs, and intended use. Included functional, long-duration, and interoperability tests.
    • Usability Studies (Human Factors): Conducted (Formative and Summative testing) to demonstrate safe and effective performance of critical tasks.
    • Vendor Neutrality and Multi-Vendor Validation Testing: Performed to confirm interoperability across various scanner types and manufacturers.

    Missing Key Information (Relevant to the questions asked):

    • Specific, quantified acceptance criteria for functional performance (e.g., latency, video resolution, control responsiveness).
    • Numerical results of performance testing against any defined criteria.
    • Sample sizes for non-clinical verification and validation testing, and usability studies.
    • Data provenance for any test data.
    • Details on "ground truth" establishment for these functional tests (e.g., how "seamless console operation" was objectively measured and verified).
    • Any information regarding training sets or AI model performance, as the device's primary function is remote access and not diagnostic AI.

    In summary, the provided document focuses on regulatory compliance through substantial equivalence and adherence to standards for software development, risk management, and cybersecurity, rather than detailed quantitative performance metrics typically seen for diagnostic AI products.


    K Number: K243688
    Device Name: Saige-Dx (3.1.0)
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2024-12-19 (20 days)
    Regulation Number: 892.2090
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.

    Device Description

    Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.

    AI/ML Overview

    The provided text describes the Saige-Dx (v.3.1.0) device and its performance testing as part of an FDA 510(k) submission (K243688). However, it does not contain specific acceptance criteria values or the quantitative results of the device's performance against those criteria. It states that "All tests met the pre-specified performance criteria," but does not list those criteria or the measured performance metrics.

    Therefore, while I can extract information related to the different aspects of the study, I cannot create a table of acceptance criteria and reported device performance with specific values.

    Here's a breakdown of the information available based on your request:

    1. A table of acceptance criteria and the reported device performance

    • Acceptance Criteria: Not explicitly stated in quantitative terms. The document only mentions that "All tests met the pre-specified performance criteria."
    • Reported Device Performance: Not explicitly stated in quantitative terms (e.g., specific sensitivity, specificity, AUC values, or improvements in human reader performance).

    2. Sample size used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective)

    • Test Set Sample Size: Not explicitly stated for the validation performance study. The text mentions "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study and standalone performance testing conducted under approved IRB protocols (K220105 and K241747)." It also mentions that the tests included "DBT screening mammograms with Hologic standard definition and HD images, GE images, exams with unilateral breasts, and from patients with breast implants (on implant displaced views)."
    • Data Provenance: The data for the training set was collected from "multiple vendors including GE and Hologic equipment" and from "diverse practices with the majority from geographically diverse areas within the United States, including New York and California." For the test set, it is implied to be similar in nature as it's part of the overall "performance testing," but specific details for the test set alone are not provided regarding country of origin or retrospective/prospective nature. However, since it involves IRB protocols, it suggests a structured, likely prospective collection or at least a carefully curated retrospective collection.

    3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)

    • Not explicitly stated for the test set. The document indicates that a Multi-Reader Multi-Case (MRMC) study was performed, which implies the involvement of expert readers, but the number of experts and their qualifications are not detailed.

    4. Adjudication method (e.g. 2+1, 3+1, none) for the test set

    • Not explicitly stated for the test set. The involvement of an MRMC study suggests a structured interpretation process, potentially including adjudication, but the method (e.g., consensus, majority rule with an adjudicator) is not described.

    5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, what was the effect size of how much human readers improve with AI vs. without AI assistance

    • Yes, an MRMC study was done: "Validation of the software was previously conducted using a multi-reader multi-case (MRMC) study..."
    • Effect Size: The document does not provide the quantitative effect size of how much human readers improved with AI vs. without AI assistance. It broadly states that Saige-Dx "can help improve reader performance, while also reducing time."

    6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done

    • Yes, standalone performance testing was done: "...and standalone performance testing conducted under approved IRB protocols..."
    • Results: The document states that "All tests met the pre-specified performance criteria" for the standalone performance, but does not provide the specific quantitative results (e.g., sensitivity, specificity, AUC).

    7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)

    • Not explicitly stated. For a device identifying "soft tissue lesions and calcifications that may be indicative of cancer," ground truth would typically involve a combination of biopsy/pathology results, clinical follow-up, and potentially expert consensus on imaging in cases without definitive pathology. However, the document doesn't specify the exact method for establishing ground truth for either the training or test sets.

    8. The sample size for the training set

    • Training Set Sample Size: "A total of nine datasets comprising 141,768 patients and 316,166 studies were collected..."

    9. How the ground truth for the training set was established

    • Not explicitly stated. The document mentions the collection of diverse datasets for training but does not detail how the ground truth for these 141,768 patients and 316,166 studies was established (e.g., through radiologists' interpretations, pathology reports, clinical outcomes).

    K Number: K243705
    Device Name: Saige-Density
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2024-12-19 (20 days)
    Regulation Number: 892.2050
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Density is a software application intended for use with compatible full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) systems. Saige-Density provides an ACR BI-RADS Atlas 5th Edition breast density category to aid interpreting physicians in the assessment of breast tissue composition. Saige-Density produces adjunctive information. It is not a diagnostic aid.

    Device Description

    Saige-Density is Software as a Medical Device that processes screening and diagnostic digital mammograms using deep learning techniques and generates outputs that serve as an aid for interpreting radiologists in assessing breast density. The software takes as input a single x-ray mammogram study and processes all acceptable 2D image DICOM files (FFDM and/or 2D synthetics) and generates a single study-level breast density category. Two DICOM files are outputted as a result: 1) a structured report (SR) DICOM object containing the case-level breast density category and 2) a secondary capture (SC) DICOM object containing a summary report with the study-level density category. Both output files contain the same breast density category ranging from "A" through "D" following Breast Imaging Reporting and Data System (BI-RADS) 5th Edition reporting guidelines. The SC report and/or the SR file may be viewed on a mammography viewing workstation.
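
    The description specifies a single study-level category derived from multiple 2D images but does not state the aggregation rule. Purely as a labeled assumption, the sketch below majority-votes hypothetical per-image categories and breaks ties toward the denser category.

```python
# Illustrative only: majority vote over hypothetical per-image
# categories, with ties broken toward the denser category. The actual
# Saige-Density aggregation rule is not described in the text.
from collections import Counter

CATEGORIES = "ABCD"  # BI-RADS 5th Edition density categories

def study_level_density(per_image_categories):
    counts = Counter(per_image_categories)
    top = max(counts.values())
    # Among tied categories, return the densest one (assumption).
    return max(c for c in CATEGORIES if counts.get(c, 0) == top)

print(study_level_density(["B", "C", "C", "B"]))  # -> "C" under this rule
```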

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets those criteria, based on the provided text:

    Acceptance Criteria and Reported Device Performance

    The provided text doesn't explicitly list a table of acceptance criteria with specific numerical targets. Instead, it states that the device was validated through a retrospective study (as described in a prior submission, K222275) and that "Verification and Validation testing conducted to support this submission confirm that Saige-Density is safe and effective for its intended use."

    The key performance described is the ability to produce an ACR BI-RADS Atlas 5th Edition breast density category to aid interpreting physicians. The device outputs a study-level breast density category ranging from "A" through "D."

    To infer the de facto acceptance criterion for performance, we must assume it aligns with demonstrating substantial equivalence to the predicate device (Saige-Density v2.0.0, K222275). This implies that the current version (v2.5.0) performs at least as well as, or equivalently to, the predicate in its ability to classify breast density according to the BI-RADS standard. While no specific performance metrics (like accuracy, sensitivity, specificity, or agreement rates) are stated in this document for this specific submission's validation, the statement of substantial equivalence implies that these metrics were deemed acceptable in the original K222275 submission.

    Study Details:

    The provided text primarily refers back to the validation performed for the predicate device (K222275) for its clinical performance data. The current submission focuses on verifying that minor technological changes in v2.5.0 do not impact safety or effectiveness.

    1. A table of acceptance criteria and the reported device performance:
      As noted above, no explicit table of numerical acceptance criteria or performance metrics for this specific submission is provided. The acceptance hinges on demonstrating "safety and effectiveness for its intended use" and "substantial equivalence" to the predicate, which implies the previous validation (K222275) satisfied performance requirements.

    2. Sample size used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective):

      • Sample Size (Test Set): Not explicitly stated in this document. It refers to the validation study described in K222275.
      • Data Provenance: Retrospective study. Data was obtained from "different clinical sites than those used to develop the Saige-Density algorithm." Geographic locations for the training data included "various geographic locations within the US, including racially diverse regions such as New York City and Los Angeles." It's reasonable to infer the test set likely drew from similar diverse US populations to ensure generalizability.
    3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g., radiologist with 10 years of experience):
      Not explicitly stated in this document. This information would typically be found in the K222275 submission details.

    4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:
      Not explicitly stated in this document. This information would typically be found in the K222275 submission details.

    5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, what was the effect size of how much human readers improve with AI vs. without AI assistance:
      Not explicitly stated in this document. The device "provides an ACR BI-RADS Atlas 5th Edition breast density category to aid interpreting physicians," suggesting it's an adjunctive tool, but this document does not describe an MRMC study comparing human performance with and without the AI.

    6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done:
      Yes, the device outputs "a single study-level breast density category" and DICOM files containing this category. The validation study referenced in K222275 would have assessed the algorithm's performance in categorizing density. The use of "retrospective study" suggests an assessment of the algorithm's output against a ground truth.

    7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):
      Not explicitly stated in this document. Given that the output is an "ACR BI-RADS Atlas 5th Edition breast density category," the ground truth was most likely established by expert radiologists (likely through consensus or a similar process using their interpretation of the mammograms). Pathology or outcomes data are less likely to directly establish BI-RADS density categories.

    8. The sample size for the training set:
      Not explicitly stated in this document. It mentions the training data consisted of "four datasets across various geographic locations within the US."

    9. How the ground truth for the training set was established:
      Not explicitly stated in this document. It is implied that the ground truth for training would also be established by similar expert interpretation of BI-RADS density categories. The text notes "DeepHealth ensured that there was no overlap between the data used to train and test the Saige-Density algorithm," indicating good practice in study design.


    K Number: K241747
    Device Name: Saige-Dx
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2024-11-18 (153 days)
    Regulation Number: 892.2090
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence or absence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.

    Device Description

    Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

    1. A table of acceptance criteria and the reported device performance

    Acceptance Criteria (Endpoint) and Reported Device Performance:

    • Criterion: Substantial equivalence demonstrating non-inferiority of the subject device (Saige-Dx) on compatible exams compared to the predicate device's performance on previously compatible exams.
      Result: The study endpoint was met. The lower bound of the 95% CI around the delta AUC between Hologic and GE cases, compared to Hologic-only exams, was greater than the non-inferiority margin. Case-level AUC on compatible exams: 0.910 (95% CI: 0.886, 0.933).
    • Criterion: Generalizable standalone performance across confounders for GE and Hologic exams.
      Result: Demonstrated generalizable standalone performance on GE and Hologic exams across patient age, breast density, breast size, race, ethnicity, exam type, pathology classification, lesion size, and modality.
    • Criterion: Performance on Hologic HD images.
      Result: Met pre-specified performance criteria.
    • Criterion: Performance on unilateral breasts.
      Result: Met pre-specified performance criteria.
    • Criterion: Performance on breast implants (implant displaced views).
      Result: Met pre-specified performance criteria.
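
    The endpoint logic in the table (pass when the lower bound of the 95% CI around the delta AUC exceeds the non-inferiority margin) can be illustrated with a bootstrap sketch. The data, the -0.05 margin, and the bootstrap approach below are all hypothetical; the submission does not disclose the actual margin or analysis method.

```python
# Hypothetical illustration of a non-inferiority check on delta AUC:
# pass when the lower bound of the bootstrap 95% CI exceeds the margin.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def delta_auc_ci(y, s_subject, s_predicate, n_boot=2000, alpha=0.05):
    deltas = []
    n = len(y)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        if len(set(y[idx])) < 2:             # AUC needs both classes present
            continue
        deltas.append(roc_auc_score(y[idx], s_subject[idx])
                      - roc_auc_score(y[idx], s_predicate[idx]))
    return np.percentile(deltas, [100 * alpha / 2, 100 * (1 - alpha / 2)])

y = rng.integers(0, 2, 500)                  # synthetic cancer labels
s_pred = y + rng.normal(0.0, 0.80, 500)      # predicate-like scores
s_subj = y + rng.normal(0.0, 0.75, 500)      # subject-like scores
lo, hi = delta_auc_ci(y, s_subj, s_pred)
print("non-inferior" if lo > -0.05 else "inconclusive")  # margin is made-up
```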

    2. Sample size used for the test set and the data provenance

    • Sample Size: 1,804 women (236 cancer exams and 1,568 non-cancer exams).
    • Data Provenance: Collected from 12 clinical sites across the United States. It's a retrospective dataset, as indicated by the description of cancer exams being confirmed by biopsy pathology and non-cancer exams by negatively interpreted subsequent screens.

    3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

    • Number of Experts: At least two independent truthers, plus an additional adjudicator if needed (implying a minimum of two, potentially three).
    • Qualifications of Experts: MQSA qualified, breast imaging specialists.

    4. Adjudication method for the test set

    • Adjudication Method: "Briefly, each cancer exam and supporting medical reports were reviewed by two independent truthers, plus an additional adjudicator if needed." This describes a 2+1 adjudication method.

    5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, what was the effect size of how much human readers improve with AI vs. without AI assistance

    • The provided text describes a standalone performance study ("The pivotal study compared the standalone performance between the subject device"). It does not mention an MRMC comparative effectiveness study and therefore no effect size for human reader improvement with AI assistance is reported. The device is intended as a concurrent reading aid, but the reported study focused on the algorithm's standalone performance.

    6. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done

    • Yes, a standalone performance study was done. The text states: "Validation of the software was performed using standalone performance testing..." and "The pivotal study compared the standalone performance between the subject device."

    7. The type of ground truth used

    • For Cancer Exams: Confirmed by biopsy pathology.
    • For Non-Cancer Exams: Confirmed by a negatively interpreted exam on the subsequent screen and without malignant biopsy pathology.
    • For Lesions: Lesions for cancer exams were established by MQSA qualified breast imaging specialists, likely based on radiological findings and pathology reports.
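
    A small sketch of the labeling rule these bullets describe, with hypothetical field names; the handling of exams satisfying neither condition is an assumption the text does not spell out.

```python
# Sketch of the ground-truth labeling rule described above; the
# "excluded" fallback is an assumption, not stated in the source.
def ground_truth_label(malignant_biopsy_pathology: bool,
                       negative_subsequent_screen: bool) -> str:
    if malignant_biopsy_pathology:
        return "cancer"
    if negative_subsequent_screen:
        return "non-cancer"
    return "indeterminate (presumably excluded)"

print(ground_truth_label(False, True))  # -> "non-cancer"
```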

    8. The sample size for the training set

    • Sample Size: 121,348 patients and 122,252 studies.

    9. How the ground truth for the training set was established

    • The document does not explicitly detail the method for establishing ground truth for the training set. It mentions the training dataset was "robust and diverse." However, given the rigorous approach described for the test set's ground truth (biopsy pathology, negative subsequent screens, expert review), it is reasonable to infer a similar, if not identical, standard was applied to the training data. The text emphasizes "no exam overlap between the training and testing datasets," indicating a careful approach to data separation.

    K Number: K222275
    Device Name: Saige-Density
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2022-12-16 (140 days)
    Regulation Number: 892.2050
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Density is a software application intended for use with compatible full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) systems. Saige-Density provides an ACR BI-RADS Atlas 5th Edition breast density category to aid interpreting physicians in the assessment of breast tissue composition. Saige-Density produces adjunctive information. It is not a diagnostic aid.

    Device Description

    Saige-Density is Software as a Medical Device that processes screening and diagnostic digital mammograms using deep learning techniques and generates outputs that serve as an aid for interpreting radiologists in assessing breast density. The software takes as input a single x-ray mammogram study and processes all acceptable 2D image DICOM files (FFDM and/or 2D synthetics) and generates a single study-level breast density category. Two DICOM files are outputted as a result: 1) a structured report (SR) DICOM object containing the case-level breast density category and 2) a secondary capture (SC) DICOM object containing a summary report with the study-level density category. Both output files contain the same breast density category ranging from "A" through "D" following Breast Imaging Reporting and Data System (BI-RADS) 5th Edition reporting guidelines. The SC report and/or the SR file may be viewed on a mammography viewing workstation.

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

    Acceptance Criteria and Reported Device Performance

    The acceptance criteria are implied by the reported performance metrics of the Saige-Density device. The primary objective of the standalone performance testing was to quantify the accuracy of Saige-Density's density category outputs. The reported performance is the accuracy of the device in classifying breast density into four categories (A, B, C, or D) and two categories (nondense: A, B; dense: C, D) compared to a consensus ground truth.

    Acceptance Criteria and Reported Device Performance:

    • Accuracy (four-class categorization: A, B, C, D) vs. ground truth: 81.28% (95% CI: 78.42, 83.84)
    • Accuracy (two-class categorization: nondense vs. dense) vs. ground truth: implicitly represented by the confusion matrix rather than a single stated percentage; nondense correctly classified: 87.8%; dense correctly classified: 95.2%
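
    The arithmetic behind these metrics, together with the median-of-five consensus described under Study Details below, can be illustrated as follows. The reader panels and predictions are made-up; only the median rule and the nondense (A, B) vs. dense (C, D) split come from the text.

```python
# Made-up reader panels and predictions; only the median-consensus rule
# and the nondense (A, B) vs. dense (C, D) split come from the text.
import numpy as np

ORDINAL = {"A": 0, "B": 1, "C": 2, "D": 3}

def median_consensus(reader_categories):
    vals = sorted(ORDINAL[c] for c in reader_categories)
    return "ABCD"[vals[len(vals) // 2]]  # median of an odd-sized panel

panels = [["B", "B", "C", "B", "C"],
          ["D", "C", "D", "D", "D"],
          ["A", "A", "B", "A", "A"]]
truth = np.array([median_consensus(p) for p in panels])  # -> B, D, A
pred = np.array(["B", "D", "B"])

four_class_accuracy = np.mean(pred == truth)
is_dense = lambda c: c in ("C", "D")
two_class_accuracy = np.mean([is_dense(p) == is_dense(t)
                              for p, t in zip(pred, truth)])
print(four_class_accuracy, two_class_accuracy)  # 0.666..., 1.0
```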

    Study Details

    1. Sample size used for the test set and the data provenance:

      • Sample Size: A total of 796 mammogram cases (representing 6,170 images) were retrospectively collected for the standalone performance testing.
      • Data Provenance: The data was collected from five breast imaging centers in the United States. The collection sites selected for the pivotal study did not overlap with those used previously to collect data for training or testing the Saige-Density AI algorithm.
    2. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

      • Number of Experts: Five expert radiologists were used to establish the ground truth.
      • Qualifications of Experts: The text refers to them as "expert radiologists," implying they are qualified to interpret mammograms, but specific details about their experience (e.g., years of experience) are not provided.
    3. Adjudication method for the test set:

      • Adjudication Method: Ground truth for each case was established as the consensus of the five expert radiologists' breast density categories on the same set of cases, and calculated as the median of the reported categories for each case. This suggests a form of consensus-based adjudication, specifically using the median.
    4. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, what was the effect size of how much human readers improve with AI vs. without AI assistance:

      • The provided text does not indicate that an MRMC comparative effectiveness study was conducted to evaluate human readers' improvement with AI assistance. The performance testing described is "Standalone Performance Testing," focusing on the algorithm's performance only.
    5. If a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done:

      • Yes, a standalone performance study was explicitly conducted and detailed: "Standalone Performance Testing: A multi-site retrospective study was conducted to evaluate the standalone performance of Saige-Density on DBT and FFDM mammograms."
    6. The type of ground truth used:

      • The type of ground truth used was expert consensus of five expert radiologists, based on ACR BI-RADS 5th Edition guidelines.
    7. The sample size for the training set:

      • The exact sample size for the training set is not explicitly stated. However, the text mentions that the training data consisted of "four datasets across various geographic locations within the US."
    8. How the ground truth for the training set was established:

      • The text does not explicitly describe how the ground truth for the training set was established. It only states that the data used for training the algorithm was distinct from the test set and came from "four datasets across various geographic locations within the US, including racially diverse regions such as New York City and Los Angeles."

    K Number: K220105
    Device Name: Saige-Dx
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2022-05-12 (120 days)
    Regulation Number: 892.2090
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Dx analyzes digital breast tomosynthesis (DBT) mammograms to identify the presence of soft tissue lesions and calcifications that may be indicative of cancer. For a given DBT mammogram, Saige-Dx analyzes the DBT image stacks and the accompanying 2D images, including full field digital mammography and/or synthetic images. The system assigns a Suspicion Level, indicating the strength of suspicion that cancer may be present, for each detected finding and for the entire case. The outputs of Saige-Dx are intended to be used as a concurrent reading aid for interpreting physicians on screening mammograms with compatible DBT hardware.

    Device Description

    Saige-Dx is a software device that processes screening mammograms using artificial intelligence to aid interpreting radiologists. By automatically detecting the presence or absence of soft tissue lesions and calcifications in mammography images, Saige-Dx can help improve reader performance, while also reducing reading time. The software takes as input a set of x-ray mammogram DICOM files from a single digital breast tomosynthesis (DBT) study and generates finding-level outputs for each image analyzed, as well as an aggregate case-level assessment. Saige-Dx processes both the DBT image stacks and the associated 2D images (full-field digital mammography (FFDM) and/or synthetic 2D images) in a DBT study. For each image, Saige-Dx outputs bounding boxes circumscribing any detected findings and assigns a Finding Suspicion Level to each finding, indicating the degree of suspicion that the finding is malignant. Saige-Dx uses the results of the finding-level analysis to generate a Case Suspicion Level, indicating the degree of suspicion for malignancy across the case. Saige-Dx encapsulates the finding and case-level results into a DICOM Structured Report (SR) object containing markings that can be overlaid on the original mammogram images using a viewing workstation and a DICOM Secondary Capture (SC) object containing a summary report of the Saige-Dx results.

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets those criteria, based on the provided text:

    Acceptance Criteria and Reported Device Performance

    Acceptance Criteria (Implicit) and Reported Device Performance:

    • Reader Performance Improvement (MRMC Study): increase in radiologist AUC when aided by Saige-Dx. The average AUC of radiologists increased from 0.865 (unaided) to 0.925 (aided), a difference of 0.06 (95% CI: 0.041, 0.079).

    1. Reader Study (MRMC)

    • Number of Experts for Ground Truth: Two MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists, plus a third as an adjudicator.
    • Qualifications of Experts for Ground Truth: MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists.
    • Adjudication Method: For exams with discrepancies between the two truthers' assessments of density, lesion type, and/or lesion location, a third truther served as the adjudicator.
    • MRMC Comparative Effectiveness Study: Yes.
      • Effect Size (Human Reader Improvement with AI vs. without AI):
        • Average AUC increased by 0.06 (from 0.865 unaided to 0.925 aided).
        • Average reader sensitivity increased by 8.8%.
        • Average reader specificity increased by 0.9%.
    • Standalone Performance: No, this specific study was for human reader performance with and without AI.
    • Type of Ground Truth: Expert consensus with pathology confirmation for cancer cases. Each mammogram had a ground truth status of "cancer" or "non-cancer." For cancer exams, malignant lesions were annotated based on the biopsied location that led to malignant pathology.
    • Sample Size for Training Set: Not explicitly stated, but the text mentions "six datasets across various geographic locations in the US and the UK," indicating a large, diverse dataset.
    • How Ground Truth for Training Set was Established: Not explicitly detailed for the training set, but it is stated that "DeepHealth ensured that there was no overlap between the data used to train and test the Saige-Dx AI algorithm." It can be inferred that similar robust methods (likely expert review and pathology confirmation) were used, given the thoroughness described for the test set.
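
    A minimal sketch of the 2+1 truthing flow these bullets describe: two independent truthers assess each exam, and a third adjudicates only on discrepancies. The field names are hypothetical.

```python
# Illustrative 2+1 truthing flow: two independent truthers per exam,
# with a third adjudicating only on discrepancies in density, lesion
# type, or lesion location. Field names are hypothetical.
DISPUTED_FIELDS = ("density", "lesion_type", "lesion_location")

def resolve(truther_1, truther_2, adjudicator):
    if all(truther_1[f] == truther_2[f] for f in DISPUTED_FIELDS):
        return truther_1            # concordant: no adjudication needed
    return adjudicator              # discordant: third truther decides

a = {"density": "C", "lesion_type": "mass", "lesion_location": "left MLO"}
b = {"density": "C", "lesion_type": "calcifications", "lesion_location": "left MLO"}
print(resolve(a, b, adjudicator=a))  # adjudicator settles the lesion-type split
```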

    2. Standalone Study (Performance Testing: Standalone Study)

    • Sample Size for Test Set: 1304 cases (136 cancer, 1168 non-cancer).
      • Data Provenance: Retrospective, blinded, multi-center study. Collected from 9 clinical sites in the United States. All data came from clinical sites that had never been used previously for training or testing of the Saige-Dx AI algorithm.
    • Number of Experts for Ground Truth: "Truthed using similar procedures to those used for the reader study," which implies two highly experienced breast imaging specialists and a third adjudicator.
    • Qualifications of Experts for Ground Truth: Implied to be MQSA qualified, highly experienced (>10 years in practice) breast imaging specialists, consistent with the reader study.
    • Adjudication Method: Implied to be consistent with the reader study (third truther for discrepancies).
    • MRMC Comparative Effectiveness Study: No, this was a standalone performance study of the algorithm only.
    • Standalone Performance: Yes. Saige-Dx exhibited an AUC of 0.930 (95% CI: 0.902, 0.958).
    • Type of Ground Truth: Implied to be expert consensus with pathology confirmation, consistent with the reader study, as data was "collected and truthed using similar procedures."
    • Sample Size for Training Set: Not explicitly stated, but the data used was specifically excluded from the test set for this study, confirming separation.
    • How Ground Truth for Training Set was Established: Implied to be through expert review and pathology confirmation, given the "similar procedures" used for test set truthing and the isolation of training data.

    K Number: K203517
    Device Name: Saige-Q
    Manufacturer: DeepHealth, Inc.
    Date Cleared: 2021-04-16 (137 days)
    Regulation Number: 892.2080
    Why did this record match? Applicant Name (Manufacturer): DeepHealth, Inc.

    Intended Use

    Saige-Q is a software workflow tool designed to aid radiologists in prioritizing exams within the standard-of-care image worklist for compatible full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) screening mammograms. Saige-Q uses an artificial intelligence algorithm to generate a code for a given mammogram, indicative of the software's suspicion that the mammogram contains at least one suspicious finding. Saige-Q makes the assigned codes available to a PACS/EPR/RIS/workstation for worklist prioritization or triage.

    Saige-Q is intended for passive notification only and does not provide any diagnostic information beyond triage and prioritization. Thus, it is not intended to replace the review of images or be used on a stand-alone basis for clinical decision-making. The decision to use Saige-Q codes and how to use those codes is ultimately up to the interpreting radiologist. The interpreting radiologist is reviewing each exam on a diagnostic viewer and evaluating each patient according to the current standard of care.

    Device Description

    Saige-Q is a software workflow device that processes Digital Breast Tomosynthesis (DBT) and Full-Field Digital Mammography (FFDM) screening mammograms using artificial intelligence to act as a prioritization tool for interpreting radiologists. By automatically indicating whether a given mammogram is suspicious for malignancy, Saige-Q can help the user prioritize or triage cases in their worklist (or queue) that may benefit from prioritized review.

    Saige-Q takes as input a set of x-ray mammogram DICOM files from a single screening mammography study (FFDM or DBT). The software first checks that the study is appropriate for Saige-Q analysis and then extracts, processes and analyzes the DICOM images using an artificial intelligence algorithm. As a result of the analysis, the software generates a Saige-Q code indicating the software's suspicion of the presence of findings suggestive of breast cancer. For mammograms given a Saige-Q code of "Suspicious," the software also generates a compressed preview image, which is for informational purposes only and is not intended for diagnostic use.

    The Saige-Q code can be viewed by radiologists on a picture archiving and communication system (PACS), Electronic Patient Record (EPR), and/or Radiology Information System (RIS) worklist and can be used to reorder the worklist. As a software-only device, Saige-Q can be hosted on a compatible host server connected to the necessary clinical IT systems such that DICOM studies can be received and the resulting outputs returned where they can be incorporated into the radiology worklist.

    The Saige-Q codes can be used for triage or prioritization. For example, "Suspicious" studies could be given prioritized review. With a worklist that supports sorting, batches of mammograms could also be sorted based on the Saige-Q code.
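
    As a sketch of the passive-prioritization behavior described above, the snippet below stably sorts a hypothetical worklist so "Suspicious" exams surface first. The field names and code strings are illustrative; the actual codes and PACS/RIS integration are not detailed in the text.

```python
# Illustrative worklist prioritization: the list stays intact, and
# exams flagged "Suspicious" simply sort to the top for earlier review.
worklist = [
    {"accession": "A100", "saige_q": "Not Suspicious"},
    {"accession": "A101", "saige_q": "Suspicious"},
    {"accession": "A102", "saige_q": "Not Suspicious"},
    {"accession": "A103", "saige_q": "Suspicious"},
]

# sorted() is stable, so within each group the original order is kept.
prioritized = sorted(worklist, key=lambda e: e["saige_q"] != "Suspicious")
print([e["accession"] for e in prioritized])  # ['A101', 'A103', 'A100', 'A102']
```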

    AI/ML Overview

    Here's a breakdown of the acceptance criteria and the study proving the device meets them, based on the provided text:

    Acceptance Criteria and Device Performance

    1. Table of Acceptance Criteria and Reported Device Performance

    Acceptance Criterion / Saige-Q FFDM Performance / Saige-Q DBT Performance / Target:

    • Overall AUC: FFDM 0.966 (95% CI: 0.957, 0.975); DBT 0.985 (95% CI: 0.979, 0.990); target >0.95 (QFM product code requirement for effective triage); meets or exceeds predicate (cmTriage) performance.
    • Specificity at 86.9% sensitivity: FFDM 92.2% (95% CI: 90.2%, 93.8%); DBT 98.3% (95% CI: 97.3%, 99.0%); target >80%.
    • Sensitivity at 88.9% specificity: FFDM 91.2% (95% CI: 88.4%, 93.4%); DBT 95.7% (95% CI: 93.6%, 97.2%); target >80%.
    • Median processing time: FFDM 15.5 seconds; DBT 196.8 seconds; within clinical operational expectations.
    • Performance by lesion type (soft tissue densities), AUC: FFDM 0.964 (95% CI: 0.954, 0.974); DBT 0.983 (95% CI: 0.977, 0.990); similar performance across subcategories.
    • Performance by lesion type (calcifications), AUC: FFDM 0.973 (95% CI: 0.958, 0.988); DBT 0.989 (95% CI: 0.983, 0.996); similar performance across subcategories.
    • Performance by breast density (dense), AUC: FFDM 0.959 (95% CI: 0.945, 0.973); DBT 0.980 (95% CI: 0.971, 0.988); similar performance across subcategories.
    • Performance by breast density (non-dense), AUC: FFDM 0.972 (95% CI: 0.961, 0.984); DBT 0.988 (95% CI: 0.981, 0.996); similar performance across subcategories.
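
    The paired operating-point metrics in the table (specificity at a fixed sensitivity, and vice versa) come from thresholding standalone scores. The sketch below shows that computation; the labels, scores, and threshold are synthetic.

```python
# Synthetic labels, scores, and threshold; shows how paired
# operating-point metrics like those in the table are computed.
import numpy as np

def sens_spec_at(threshold, y, scores):
    pred = scores >= threshold
    sensitivity = np.mean(pred[y == 1])     # true positive rate
    specificity = np.mean(~pred[y == 0])    # true negative rate
    return sensitivity, specificity

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 1000)
scores = y * 0.6 + rng.normal(0.2, 0.25, 1000)
print(sens_spec_at(0.5, y, scores))
```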

    2. Sample Size Used for the Test Set and Data Provenance

    • FFDM Study Test Set:
      • Malignant Exams: 501
      • Normal Exams: 832
      • Total: 1333
    • DBT Study Test Set:
      • Malignant Exams: 517
      • Normal Exams: 1011
      • Total: 1528
    • Data Provenance:
      • Country of Origin: United States (across two states)
      • Retrospective or Prospective: Retrospective
      • Sites: Data was collected from eight clinical sites for FFDM and six clinical sites for DBT. DeepHealth had never collected data from these sites previous to this study for either training or testing, ensuring an independent test set.

    3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications

    • Number of Experts: Two independent expert radiologists.
    • Qualifications of Experts: The document does not explicitly state the qualifications (e.g., years of experience) of the expert radiologists.

    4. Adjudication Method for the Test Set

    • Adjudication Method: 2+1 (Two independent expert radiologists reviewed each case. If discordance was observed between the two initial readers, an adjudicator was used to establish the final reference standard).

    5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

    • Was an MRMC study done? No, the document describes retrospective, blinded, multi-center studies to evaluate the standalone performance of Saige-Q. It does not mention a comparative effectiveness study involving human readers with and without AI assistance.
    • Effect Size of Human Improvement with AI vs. Without AI Assistance: Not applicable, as no MRMC study was conducted to assess human reader improvement with AI assistance.

    6. Standalone (Algorithm Only) Performance Study

    • Was a standalone study done? Yes, the document explicitly states: "DeepHealth conducted two retrospective, blinded, multi-center studies to evaluate the standalone performance of Saige-Q..."

    7. Type of Ground Truth Used

    • Ground Truth Type:
      • Malignant Exams: Confirmed using pathology reports from biopsied lesions.
      • Normal Exams: Confirmed with a negative clinical interpretation (BI-RADS 1 or 2) followed by another negative clinical interpretation at least two years later.
      • Expert Consensus: Each case in the test set was reviewed by two independent expert radiologists (and an adjudicator if discordance was observed) to establish the reference standard for each case, building upon the pathology/clinical follow-up.

    8. Sample Size for the Training Set

    • The document states that the AI algorithm was trained on "large numbers of mammograms where cancer status is known." However, it does not provide a specific sample size for the training set.

    9. How the Ground Truth for the Training Set Was Established

    • The document implies the ground truth for the training set was established based on "cancer status is known" for the mammograms used for training. While not explicitly detailed, this would typically involve a combination of:
      • Pathology reports for confirmed cancers.
      • Long-term clinical follow-up for confirmed benign cases.
        It's also mentioned that the AI algorithm uses "deep neural networks that have been trained on large numbers of mammograms where cancer status is known," suggesting similar rigorous ground truth establishment as for the test set, but no specific methodology for the training set's ground truth is provided.
