
Koios DS is an artificial intelligence (AI)/machine learning (ML)-based computer-aided diagnosis (CADx) software device intended for use as an adjunct to diagnostic ultrasound examinations of lesions or nodules suspicious for breast or thyroid cancer.

Koios DS allows the user to select or confirm regions of interest (ROIs) within an image representing a single lesion or nodule to be analyzed. The software then automatically characterizes the selected image data to generate an AI/ML-derived cancer risk assessment and selects applicable lexicon-based descriptors designed to improve overall diagnostic accuracy as well as reduce interpreting physician variability.

Koios DS may also be used as an image viewer of multi-modality digital images, including ultrasound and mammography. The software includes tools that allow users to adjust, measure and document images, and output into a structured report.

Koios DS software is designed to assist trained interpreting physicians in analyzing the breast ultrasound images of adult (>= 22 years) female patients with soft tissue breast lesions and/or thyroid ultrasounds of all adult (>= 22 years) patients with thyroid nodules suspicious for cancer. When utilized by an interpreting physician who has completed the prescribed training, this device provides information that may be useful in recommending appropriate clinical management.

Limitations:

· Patient management decisions should not be made solely on the results of the Koios DS analysis.

· Koios DS software is not to be used for the evaluation of normal tissue, on sites of post-surgical excision, or images with doppler, elastography, or other overlays present in them.

· Koios DS software is not intended for use on portable handheld devices (e.g. smartphones or tablets) or as a primary diagnostic viewer of mammography images.

· The software does not predict the thyroid nodule margin descriptor, extra-thyroidal extension. In the event that this condition is present, the user may select this category manually from the margin descriptor list.

Device Description

Koios DS is a software application designed to assist trained interpreting physicians in analyzing breast and thyroid ultrasound images. The software device is a web application that is deployed to a Microsoft IIS web server and accessed by a user through a compatible client. Once logged in and granted access to the Koios DS application, the user examines selected breast or thyroid ultrasound DICOM images. The user selects Regions of Interest (ROIs) of orthogonal views of a breast lesion or thyroid nodule for processing by Koios DS. The ROI(s) are transmitted electronically to the Koios DS server for image processing and the results are returned to the user for review.

AI/ML Overview

The Koios Medical, Inc. Koios DS device is an AI/ML-based computer-aided diagnosis (CADx) software that assists in the analysis of breast and thyroid ultrasound images.

The following is an overview of the device's acceptance criteria and the studies offered as evidence that it meets them:

1. Acceptance Criteria and Reported Device Performance

Criteria (Metric)	Acceptance Criteria (Target)	Reported Device Performance (Koios DS)
Breast Functionality	(Based on predicate device K190442 performance)
System AUC (Standalone)	Not explicitly stated as a minimum threshold, but improvement expected over predicate.	0.929 [0.913, 0.945] (95% CI, on 900 cases). Compared to predicate (Koios DS Breast v2.0): significant increase in AUC (5%), no change in sensitivity, significant increase in specificity (24%). 0.930 [0.914, 0.946] (95% CI, on 50 additional cases, demonstrating robustness to dataset drift).
System Sensitivity (Standalone)	Not explicitly stated as a minimum threshold.	0.97 [0.96, 0.99]
System Specificity (Standalone)	Not explicitly stated as a minimum threshold.	0.61 [0.57, 0.66]
Reader AUC Improvement (MRMC)	Significant improvement in AUC with Koios DS assistance.	0.0370 [0.030, 0.044] (mean AUC improvement at α = .05) from an earlier study (K190442). The subject device's updated breast engine showed superior standalone performance, implying equivalent or greater benefit in reader performance.
Inter-operator Variability	Reduction in variability.	Average Kendall Tau-B of USE + DS was 0.6797 [0.6653, 0.6941] compared to USE Alone at 0.5404 [0.5301, 0.5507], demonstrating a significant increase in inter-operator agreement (i.e., a reduction in variability).
Intra-operator Variability	Reduction in variability.	USE + DS class switching rate was 10.8% compared to USE Alone at 13.6% (p = 0.042), demonstrating a statistically significant reduction.
Thyroid Functionality	(New functionality, establishing performance thresholds)
System AUC (Standalone)	Not explicitly stated as a minimum threshold, but acceptable performance.	0.798 when applied to ACR TI-RADS guidelines.
System Sensitivity (Standalone) (Biopsy recommendation)	Not explicitly stated as a minimum threshold.	0.644 [0.545, 0.744]
System Specificity (Standalone) (Biopsy recommendation)	Not explicitly stated as a minimum threshold.	0.612 [0.566, 0.658]
Reader AUC Improvement (MRMC) (All readers, all data)	Significant improvement in AUC with Koios DS assistance.	+0.083 [0.066, 0.099] (parametric); +0.079 [0.062, 0.096] (non-parametric).
Reader AUC Improvement (MRMC) (US readers, US data)	Significant improvement in AUC with Koios DS assistance.	+0.074 [0.051, 0.098] (parametric); +0.073 [0.049, 0.096] (non-parametric). This met the explicit criterion for the Thyroid module.
Reader AUC Improvement (MRMC) (EU readers, EU data)	Significant improvement in AUC with Koios DS assistance.	+0.022 [0.005, 0.039] (parametric); +0.019 [0.001, 0.037] (non-parametric).
Inter-Reader Variability	Reduction in variability.	40.7% relative change (all readers, all data); 37.4% (US readers, US data); 49.7% (EU Readers, EU Data) in association of TI-RADS points assigned.
Interpretation Time (MRMC)	Reduction in interpretation time.	-23.6% (all readers, all data); -22.7% (US readers, US data); -32.4% (EU Readers, EU Data).
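
The inter-operator row above reports agreement as Kendall's Tau-B. As a rough illustration of what that statistic measures, the sketch below computes Tau-B for one pair of readers using the standard tie-corrected formula. The ratings are invented for illustration, and the function name `kendall_tau_b` is an assumption. The summary does not describe how the study averaged the statistic across reader pairs, so that step is not shown.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's Tau-b for two equal-length sequences of ordinal ratings.

    Tau-b = (C - D) / sqrt((C + D + Tx) * (C + D + Ty)), where C and D are
    the concordant and discordant pair counts, Tx counts pairs tied only in
    x, and Ty counts pairs tied only in y.
    """
    if len(x) != len(y):
        raise ValueError("rating sequences must have the same length")

    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both ratings: contributes to neither term
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1

    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical ordinal category ratings from two readers on five cases.
reader_a = [2, 3, 4, 4, 5]
reader_b = [2, 3, 4, 5, 5]
tau = kendall_tau_b(reader_a, reader_b)
```

A value closer to 1 means the two readers rank the cases more consistently, which is why the reported rise from 0.5404 to 0.6797 is read as reduced variability.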

Study Details:

2. Sample Sizes and Data Provenance:

Test Set (Clinical Study):
- Breast Functionality: 900 lesions from 900 different patients. (From predicate K190442, used for comparison). An additional 50 new cases were added to the breast set to test for robustness to dataset drift.
- Thyroid Functionality: 650 retrospectively collected cases (lesions) from 650 different patients.
  - 500 cases from United States locations.
  - 150 cases from European locations.
- Data Provenance: Retrospective for both breast and thyroid. Sourced from a wide variety of ultrasound hardware.
Training Set:
- Breast Engine: "A large database of known cases." (Specific number not provided in the summary, but the test set of 900 lesions was "set aside from the system's training data").
- Thyroid Engine: "A large database of known cases." (Specific number not provided, but the test set of 500 lesions was "set aside from the system's training data"). The training data was separate from the independent site data used in bench testing.

3. Number of Experts and Qualifications for Ground Truth:

The document implies that ground truth for the clinical studies relied on pathology/follow-up outcomes, meaning clinical experts (pathologists, clinicians) established the definitive diagnosis.
For the reader studies (MRMC), the "readers" themselves were the experts whose performance was being evaluated.
- Breast Study (K190442): 15 readers. Their qualifications varied:
  - Board Certification/Specialty: Diagnostic Radiology, Breast Surgeon, OB/GYN, Interventional Radiology.
  - Breast Fellowship Trained and/or Dedicated Breast Imager: 6 out of 15 had this.
  - Years of Experience (Mammography and/or Breast Ultrasound): Ranged from 0 years to 30 years.
  - Academic Institution Affiliation: Mixed (Yes/No).
  - MQSA Qualified Interpreting Physician: Mixed (Yes/No).
- Thyroid Study (CRRS-3): 15 readers. Their qualifications varied:
  - Reader Category: Domestic Endocrinologist (End), Domestic Radiologist (Rad), European Rad, European End.
  - Experience (post-residency): Ranged from < 10 years to ≥ 20 years.
  - 11/15 (73%) were US-based, and 4/15 (27%) were European.

4. Adjudication Method for the Test Set:

Not explicitly stated for the clinical reader studies. However, the use of "ground truth" (pathology/follow-up) suggests that reader interpretations were compared against this established truth, not necessarily adjudicated among themselves for the purpose of determining the definitive diagnosis for study cases. The MRMC study design inherently handles variability across readers statistically.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:

Yes, MRMC studies were performed for both breast (from predicate K190442, results replicated/improved upon) and thyroid functionalities.
Effect Size of Human Reader Improvement with AI vs. without AI Assistance:
- Breast: The prior study (K190442) demonstrated a mean AUC improvement of 0.0370 [0.030, 0.044] with Koios DS assistance (USE + DS) compared to USE Alone. The updated breast engine in the subject device showed statistically significant standalone performance improvements, implying superior or equivalent reader benefit.
- Thyroid:
  - All readers, all data: Mean AUC improvement of +0.083 [0.066, 0.099] (parametric) and +0.079 [0.062, 0.096] (non-parametric).
  - US readers, US data: Mean AUC improvement of +0.074 [0.051, 0.098] (parametric) and +0.073 [0.049, 0.096] (non-parametric). This absolute improvement (0.074) was larger than seen in the predicate breast study (0.037).
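
One way to read the effect sizes above: an AUC improvement is conventionally called significant when its confidence interval excludes zero. The sketch below applies that check to the figures quoted in this section and computes the ratio behind the thyroid-versus-breast comparison. The `Estimate` type and helper names are illustrative assumptions, and the thyroid intervals are assumed to be 95% intervals, which the summary does not state explicitly.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    label: str
    mean: float
    lower: float  # lower bound of the reported interval
    upper: float  # upper bound of the reported interval


def excludes_zero(e: Estimate) -> bool:
    """True when the whole interval lies on one side of zero."""
    return e.lower > 0 or e.upper < 0


# Values copied from the summary text above.
BREAST = Estimate("Breast (K190442)", 0.0370, 0.030, 0.044)
THYROID_US = Estimate("Thyroid, US readers/US data, parametric", 0.074, 0.051, 0.098)
REPORTED = [
    BREAST,
    Estimate("Thyroid, all readers/all data, parametric", 0.083, 0.066, 0.099),
    Estimate("Thyroid, all readers/all data, non-parametric", 0.079, 0.062, 0.096),
    THYROID_US,
    Estimate("Thyroid, US readers/US data, non-parametric", 0.073, 0.049, 0.096),
]

# Ratio of the thyroid (US) improvement to the breast improvement.
thyroid_to_breast_ratio = THYROID_US.mean / BREAST.mean

if __name__ == "__main__":
    for e in REPORTED:
        print(f"{e.label}: +{e.mean:.3f} [{e.lower}, {e.upper}] "
              f"excludes zero: {excludes_zero(e)}")
    print(f"Thyroid (US) / breast improvement ratio: {thyroid_to_breast_ratio:.2f}")
```

Every interval listed here sits entirely above zero, and the thyroid (US) improvement is about twice the breast improvement in absolute AUC terms.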

6. Standalone (Algorithm Only) Performance Study:

Yes, standalone performance was evaluated for both breast and thyroid engines through "bench testing."
- Breast Engine: Reported AUC of 0.929, Sensitivity of 0.97, and Specificity of 0.61.
- Thyroid Engine: Reported AUC of 0.798 (with AI Adapter and descriptor predictors applied to ACR TI-RADS guidelines).
- This evaluation helped establish the device's inherent capability to characterize lesions/nodules.
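
For readers unfamiliar with the standalone metrics above, the following minimal sketch shows how sensitivity, specificity, and AUC are typically derived from ground-truth labels and algorithm scores. The data, the 0.5 threshold, and the function names are all hypothetical. The summary does not disclose the device's score scale or operating point, so this is not a description of how Koios DS computes its outputs.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive.

    labels: 1 for malignant (per ground truth), 0 for benign.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Empirical AUC: probability that a random malignant case scores higher
    than a random benign case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three malignant and four benign cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
```

AUC is a proportion between 0 and 1 and summarizes performance across all thresholds, whereas sensitivity and specificity describe a single operating point.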

7. Type of Ground Truth Used for Test Set:

Breast Functionality: Pathology or 1-year follow-up for cases that were not biopsied.
Thyroid Functionality: Exclusively via histo/cyto-pathology and/or surgical excision.

8. Sample Size for the Training Set:

The summary states that the test sets (900 breast lesions, 500 thyroid lesions) were "set aside from the system's training data." It does not provide the total number of cases used for training, only that it was a "large database of known cases."

9. How Ground Truth for the Training Set was Established:

"The underlying breast and thyroid engines draw upon knowledge learned from a large database of known cases, tying image features to their eventual diagnosis, to form a predictive model." This implies that the training data's ground truth was established through a similar process to the test set, i.e., confirmed clinical diagnoses, likely including pathology and/or clinical follow-up for a sufficiently long period to ascertain benignity or malignancy.
