Search Results

Koios Decision Support (DS) is an artificial intelligence (AI)/machine learning (ML)-based computer-aided diagnosis (CADx) software device intended for use as an adjunct to diagnostic ultrasound examinations of lesions or nodules suspicious for breast or thyroid cancer.

Koios DS allows the user to select or confirm regions of interest (ROIs) within an image representing a single lesion or nodule to be analyzed. The software then automatically characterizes the selected image data to generate an AI/ML-derived cancer risk assessment and selects applicable lexicon-based descriptors designed to improve overall diagnostic accuracy as well as reduce interpreting physician variability.

Koios DS software may also be used as an image viewer of multi-modality digital images, including ultrasound and mammography. The software includes tools that allow users to adjust, measure and document images, and output into a structured report.

Koios DS software is designed to assist trained interpreting physicians in analyzing the breast ultrasound images of adult (>= 22 years) female patients with soft tissue breast lesions and/or thyroid ultrasounds of all adult (>= 22 years) patients with thyroid nodules suspicious for cancer. When utilized by an interpreting physician who has completed the prescribed training, this device provides information that may be useful in recommending appropriate clinical management.

Limitations:
· Patient management decisions should not be made solely on the results of the Koios DS analysis.
· Koios DS software is not to be used for the evaluation of normal tissue, on sites of post-surgical excision, or images with doppler, elastography, or other overlays present in them.
· Koios DS software is not intended for use on portable handheld devices (e.g. smartphones or tablets) or as a primary diagnostic viewer of mammography images.
· The software does not predict the presence of the thyroid nodule margin descriptor, extra-thyroidal extension. In the event that this condition is present, the user may select this category manually from the margin descriptor list.

Device Description

Koios Decision Support (DS) is a software application designed to assist trained interpreting physicians in analyzing breast and thyroid ultrasound images. The software device is a web application that is deployed to a Microsoft IIS web server and accessed by a user through a compatible client. Once logged in and granted access to the Koios DS application, the user examines selected breast or thyroid ultrasound DICOM images. The user selects Regions of Interest (ROIs) of orthogonal views of a breast lesion or thyroid nodule for processing by Koios DS. The ROI(s) are transmitted electronically to the Koios DS server for image processing and the results are returned to the user for review.

AI/ML Overview

Here's a summary of the acceptance criteria and the study proving the device meets them, based on the provided text:

Device Name: Koios DS Version 3.6

1. Table of Acceptance Criteria and Reported Device Performance (Combining Breast and Thyroid where applicable):

Acceptance Criteria Category: Standalone Performance (AI Engine)

Breast Engine
Specific Metric (Breast Engine)	Reported Device Performance (Breast Engine)
Malignancy Risk Classifier AUC	0.945 [0.932, 0.959] (increased from 0.929)
Categorical Output Sensitivity	0.976 [0.960, 0.992] (increased from 0.97)
Categorical Output Specificity	0.632 [0.588, 0.676] (increased from 0.61)
Sensitivity to Region of Interest	0.012 (decreased from 0.019)
Sensitivity to Transducer Frequency (High freq, >=15MHz)	AUC = 0.948 [0.917, 0.978] (increased from 0.940)
Sensitivity to Transducer Frequency (Low freq, <15MHz)	AUC = 0.940 [0.925, 0.956] (increased from 0.924)
Single Image vs Orthogonal Image Pair	Single Image: 0.932 [+/- 0.003] (not directly comparable, but improved standalone AUC overall)
Operating Point (PLR, NLR, PPV, NPV)	Improved from predicate: PLR = 2.661 [2.338, 2.984]; NLR = 0.039 [0.013, 0.064]; PPV = 0.708 [0.672, 0.743]; NPV = 0.966 [0.944, 0.988]
Overall Breast Engine Met or Exceeded Performance Requirements in all tests.

Thyroid Engine
Specific Metric (Thyroid Engine)	Reported Device Performance (Thyroid Engine)
AUC (ACR TI-RADS, with AI Adapter)	79.8% (significant increase over average physician AUC)
Sensitivity (ACR TI-RADS, biopsy rec., with AI Adapter)	0.644 [0.545, 0.744] (non-significant improvement over avg physician)
Specificity (ACR TI-RADS, biopsy rec., with AI Adapter)	0.612 [0.566, 0.658] (significant improvement over avg physician)
Sensitivity (ACR TI-RADS, follow-up rec., with AI Adapter)	0.879 [0.812, 0.946] (non-significant improvement)
Specificity (ACR TI-RADS, follow-up rec., with AI Adapter)	0.495 [0.446, 0.544] (significant improvement)
AUC (ATA, with AI Adapter)	Significant increase of 9.135% [5.975, 12.294] over physician AUC
Sensitivity (ATA, with AI Adapter)	Non-significant increase of 0.511% [-5.182, 6.204]
Specificity (ATA, with AI Adapter)	Significant increase of 18.741% [9.885, 27.596]
Overall Thyroid Engine Met or Exceeded Performance Requirements in all tests.

Smart Click
Acceptance Criteria (Smart Click)	Reported Device Performance (Smart Click)
Non-inferiority Test - Sensitivity / Specificity	Sensitivity: Difference = -0.009 [-0.036, 0.018] (Non-inferior); Specificity: Difference = -0.018 [-0.041, 0.005] (Non-inferior)
Non-inferiority Test - AUC	Difference = -0.012 [-0.029, 0.006] (Non-inferior)
Sub-optimal ROI Test	Difference = 0.026 [-0.009, 0.062] (Non-inferior)
Detection DICE Coefficient	DICE = 0.913 +/- 0.075 (demonstrating precise approximation to physician ROIs)
Non-inferiority Test - Descriptor Agreement (per descriptor, e.g., Composition)	Demonstrated non-inferiority for all listed descriptors (Composition, Echogenicity, Shape, Margin, Echogenic Foci subcategories). Examples: Composition: 0.018 [0.001, 0.035]; Echogenicity: -0.005 [-0.022, 0.011]
Overall Thyroid Engine Met or Exceeded Performance Requirements in all tests.

Image Registration & Matching
Acceptance Criteria (Image Registration & Matching)	Reported Device Performance (Image Registration & Matching)
No Match Rate	0.32%
Average Time for Study Preprocessing	2.39 +/- 0.48 seconds
Average Time for Image Matching	0.22 +/- 0.12 seconds
End-to-End Breast Engine Performance	AUC = 0.946; Sensitivity = 0.975; Specificity = 0.637
End-to-End Thyroid Engine Performance	AUC = 0.801; Sensitivity = 0.670; Specificity = 0.603
Breast Image Matching Outcomes	Successful Match: 99.5% (2018/2028 ROIs); No Match: 0.5% (10/2028 ROIs); Incorrect Match: 0.0%; Incorrect Image: 0.0%
Breast Image Matching DICE Coefficient	0.995 +/- 0.005
Thyroid Image Matching Outcomes	Successful Match: 100% (1288/1288 ROIs); No Match: 0.0%; Incorrect Match: 0.0%; Incorrect Image: 0.0%
Thyroid Image Matching DICE Coefficient	0.996 +/- 0.004

OCR
Acceptance Criteria (OCR)	Reported Device Performance (OCR)
Breast Freetext Identification (by field)	Breast Side: 0.983; Location Type: 0.948; Clock Hour: 0.926; Clock Minute: 0.934; CMFN: 0.944; Plane: 0.976
Thyroid Freetext Identification (by field)	Thyroid Side: 0.965; Pole: 0.976; Region: 0.998; Plane: 0.970
Measurement Text Identification (by field)	Measurement Description: 0.943; Measurement Value: 0.948; Unit of Measurement: 0.967
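
The breast engine's operating-point figures (PLR, NLR) relate to sensitivity and specificity through the standard likelihood-ratio definitions. The following is a minimal sketch, assuming the rounded point estimates from the table (sensitivity 0.976, specificity 0.632) as inputs; the function names are illustrative and are not part of the Koios DS software. Because the inputs are rounded, the results only approximate the reported PLR of 2.661 and NLR of 0.039.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """PLR = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """NLR = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


# Rounded point estimates taken from the table above.
sens, spec = 0.976, 0.632
plr = positive_likelihood_ratio(sens, spec)
nlr = negative_likelihood_ratio(sens, spec)
print(f"PLR ~ {plr:.3f}, NLR ~ {nlr:.3f}")
```

PPV and NPV are not recomputed here because they also depend on the malignancy prevalence in the test set, which the summary does not state.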

2. Sample Sizes and Data Provenance for Test Sets:

Breast Engine Standalone/Clinical Test Set:
- Sample Size: 900 lesions from 900 different patients. An expanded validation set of 1014 cases (900 + 114 additional) was used for dataset drift.
- Data Provenance: Retrospective. Images sourced from a wide variety of ultrasound hardware. Patient demographic distribution was based upon data from the Breast Cancer Surveillance Consortium (2006-2009) to ensure representativeness of diverse populations.
Thyroid Engine Standalone/Clinical Test Set:
- Sample Size: 650 retrospectively collected cases (nodules) from 650 different patients. Each lesion represented by two orthogonal images, totaling 1000 images for standalone testing.
- Data Provenance: Retrospective. Data analysis cases involved images from both the US (500 cases, 77%) and Europe (150 cases). Tested on images from a wide variety of ultrasound hardware.
Thyroid Smart Click Test Set:
- Sample Size: 650 nodules.
- Data Provenance: Not explicitly stated, but likely the same validation dataset as the Thyroid Engine, derived retrospectively from US and European locations.
Image Registration and Matching Test Set:
- Sample Size: 1,600 ultrasound studies (950 breast, 650 thyroid), involving 2028 breast ROIs and 1288 thyroid ROIs.
- Data Provenance: Not explicitly stated, but likely drawn from the same validation datasets for breast and thyroid as mentioned above.
OCR Test Set:
- Sample Size: 1910 ultrasound B-Scans (mix of thyroid and breast images). A subset of 1226 images from supported machines was used for evaluation.
- Data Provenance: Not explicitly stated, but derived from a variety of machines.

3. Number of Experts and Qualifications for Ground Truth - Test Set:

Breast Engine Standalone: Not applicable for malignancy risk classification ground truth, which was pathology or 1-year follow-up. For categorical agreement metrics (Shape, Orientation), it mentions "agreement with the subjective categorizations assigned by physicians," implying experts, but the number and specific qualifications are not detailed beyond "trained interpreting physicians."
Thyroid Engine Standalone: Not applicable for malignancy risk classification ground truth, which was "pathology results only." For descriptor predictions, it states they were tested "objectively – against ground truth pathology" and "subjectively and met the requirements for agreement with readers' descriptor categorizations," implying experts, but number and specific qualifications are not detailed.
Breast Clinical Study: 15 readers. Qualifications varied (Diagnostic Radiology, Breast Surgeon, OB/GYN). Years of experience ranged from 0 to 30 years. Some were Breast Fellowship Trained and/or Dedicated Breast Imagers, and some were MQSA Qualified Interpreting Physicians.
Thyroid Clinical Study: 15 readers (11 US-based, 4 European-based). Qualifications included Endocrinologists (End) and Radiologists (Rad). Experience ranged from < 10 years to ≥ 20 years (post-residency).

4. Adjudication Method for Test Set:

Breast Engine Standalone/Clinical:
- Malignancy Ground Truth: Determined by pathology or 1-year follow-up. No explicit adjudication method amongst multiple experts for this final ground truth is mentioned, implying a single, definitive source.
- Reader Study: Not explicitly stated for establishing a ground truth for individual cases based on reader input. The study collected reader interpretations and compared them to the established ground truth (pathology/follow-up).
Thyroid Engine Standalone/Clinical:
- Malignancy Ground Truth: Determined by "pathology results only." No explicit adjudication method amongst multiple experts is mentioned.
- Reader Study: Not explicitly stated for establishing a ground truth for individual cases based on reader input. The study collected reader interpretations and compared them to pathological ground truth.
Other Standalone Tests (Smart Click, Image Registration, OCR): Ground truth was based on defined metrics (e.g., DICE coefficient for ROI matching, manual annotation for OCR, physician-drawn ROIs, pathology for descriptor agreement). No multi-expert adjudication mentioned.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:

Yes, for both Breast and Thyroid.
Effect Size of Human Reader Improvement with AI vs. without AI Assistance:
- Breast (from K190442, still applicable to K242130 as performance is superior):
  - Change in average AUC (USE + DS vs. USE Alone): 0.0370 [0.030, 0.044] at α = .05 (a significant increase).
  - Average Kendall Tau-B (measure of inter-operator variability):
    - USE Alone: 0.5404 (.5301, .5507)
    - USE + DS: 0.6797 (.6653, .6941) => Significant increase in agreement.
  - Intra-operator variability (class switching rate):
    - USE Alone: 13.6%
    - USE + DS: 10.8% (p = 0.042) => Statistically significant reduction.
- Thyroid (CRRS-3 Study):
  - Change in average AUC (USE + DS vs. USE Alone, all readers, all data): +0.083 [0.066, 0.099] (parametric) / +0.079 [0.062, 0.096] (non-parametric)
  - Specifically for US readers, US data: +0.074 [0.051, 0.098] (parametric) / +0.073 [0.049, 0.096] (non-parametric). This demonstrates a statistically significant improvement in overall reader performance.
  - Change in average Sensitivity/Specificity of FNA (with AI Adapter + size criteria):
    - All readers, all data: +0.084 (sensitivity), +0.140 (specificity)
    - US readers, US data: +0.058 (sensitivity), +0.130 (specificity)
  - Change in average Sensitivity/Specificity of Follow-up (with AI Adapter + size criteria):
    - All readers, all data: +0.060 (sensitivity), +0.206 (specificity)
    - US readers, US data: +0.053 (sensitivity), +0.180 (specificity)
  - Inter-Reader Variability (relative change in TI-RADS points association): 40.7% (all readers, all data), 37.4% (US readers, US data), 49.7% (EU Readers, EU Data)
  - Impact on Interpretation Time: -23.6% (all readers, all data), -22.7% (US readers, US data), -32.4% (EU Readers, EU Data).

6. Standalone (Algorithm Only without Human-in-the-loop) Performance:

Yes, for both Breast and Thyroid AI Engines, Smart Click, Image Registration and Matching, and OCR.
- Breast Engine: AUC = 0.945; Sensitivity = 0.976; Specificity = 0.632.
- Thyroid Engine (ACR TI-RADS, biopsy recommendation): Sensitivity = 0.644; Specificity = 0.612.
- Thyroid Smart Click: Demonstrated non-inferiority for Sensitivity, Specificity, AUC, and descriptor agreement compared to physician-selected calipers. Detection DICE = 0.913.
- Image Registration and Matching: Very high DICE coefficients (Breast 0.995, Thyroid 0.996) and successful match rates (>99.5%).
- OCR Engine: High accuracy rates for identification of various freetext and measurement fields (e.g., Breast Side 0.983, Measurement Value 0.948).
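
The DICE coefficients quoted above measure overlap between two regions. The following is a minimal sketch, assuming ROIs are axis-aligned rectangles given as (x0, y0, x1, y1) in pixel coordinates; the summary does not say whether the study computed DICE on rectangles or on pixel masks, and the function names are illustrative.

```python
def box_area(box):
    """Area of an (x0, y0, x1, y1) box; zero if the box is empty or inverted."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def dice_coefficient(box_a, box_b):
    """DICE = 2 * |A intersect B| / (|A| + |B|) for two rectangular ROIs."""
    intersection = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    total = box_area(box_a) + box_area(box_b)
    if total == 0:
        return 0.0  # both boxes empty: treated as no overlap in this sketch
    return 2.0 * box_area(intersection) / total


# Hypothetical physician-drawn ROI and automatically detected ROI.
physician_roi = (0, 0, 10, 10)
detected_roi = (5, 0, 15, 10)
dice = dice_coefficient(physician_roi, detected_roi)
print(f"DICE = {dice:.3f}")
```

A value of 1.0 means the two regions coincide exactly and 0.0 means they do not overlap at all.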

7. Type of Ground Truth Used:

Malignancy Risk Classification (Breast & Thyroid AI Engines):
- Breast: Pathology or 1-year follow-up.
- Thyroid: Pathology results only (for standalone). Clinical study also used cyto-/histological or excisional pathology.
Descriptor Predictions (Thyroid Standalone): Tested objectively against ground truth pathology and subjectively for agreement with readers' descriptor categorizations.
Smart Click, Image Registration, OCR: Ground truth was established by manual annotations, physician-drawn ROIs, or defined objective metrics (like DICE coefficient against a reference ROI).

8. Sample Size for Training Set:

Not explicitly stated for either Breast or Thyroid engines. The text mentions drawing upon a "large database of known cases" for the underlying engines and that the test sets were "set aside from the system's training data." However, the exact number of cases/images in the training set is not provided.

9. How Ground Truth for Training Set was Established:

Not explicitly detailed for either Breast or Thyroid engines. The text states the engines "draw upon knowledge learned from a large database of known cases, tying image features to their eventual diagnosis, to form a predictive model." This implies that the training data had associated definitive diagnoses (e.g., from pathology or follow-up), but the process of establishing this ground truth (e.g., expert review, adjudication) for the training data is not described.

K Number

K212616

Device Name

Koios DS

Manufacturer

Koios Medical, Inc.

Date Cleared

2021-12-16

(120 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K190442

Predicate For

K242130,K243614

Intended Use

Koios DS is an artificial intelligence (AI)/machine learning (ML)-based computer-aided diagnosis (CADx) software device intended for use as an adjunct to diagnostic ultrasound examinations of lesions or nodules suspicious for breast or thyroid cancer.

Koios DS may also be used as an image viewer of multi-modality digital images, including ultrasound and mammography. The software includes tools that allow users to adjust, measure and document images, and output into a structured report.

Limitations:

· Patient management decisions should not be made solely on the results of the Koios DS analysis.

· Koios DS software is not to be used for the evaluation of normal tissue, on sites of post-surgical excision, or images with doppler, elastography, or other overlays present in them.

· Koios DS software is not intended for use on portable handheld devices (e.g. smartphones or tablets) or as a primary diagnostic viewer of mammography images.

· The software does not predict the thyroid nodule margin descriptor, extra-thyroidal extension. In the event that this condition is present, the user may select this category manually from the margin descriptor list.

Device Description

Koios DS is a software application designed to assist trained interpreting physicians in analyzing breast and thyroid ultrasound images. The software device is a web application that is deployed to a Microsoft IIS web server and accessed by a user through a compatible client. Once logged in and granted access to the Koios DS application, the user examines selected breast or thyroid ultrasound DICOM images. The user selects Regions of Interest (ROIs) of orthogonal views of a breast lesion or thyroid nodule for processing by Koios DS. The ROI(s) are transmitted electronically to the Koios DS server for image processing and the results are returned to the user for review.

AI/ML Overview

The Koios Medical, Inc. Koios DS device is an AI/ML-based computer-aided diagnosis (CADx) software that assists in the analysis of breast and thyroid ultrasound images.

Here's an overview of its acceptance criteria and the studies proving it meets them:

Acceptance Criteria and Reported Device Performance

Criteria (Metric)	Acceptance Criteria (Target)	Reported Device Performance (Koios DS)
Breast Functionality	(Based on predicate device K190442 performance)
System AUC (Standalone)	Not explicitly stated as a minimum threshold, but improvement expected over predicate.	0.929 [95% CI: 0.913, 0.945] (on 900 cases). Compared to predicate (Koios DS Breast v2.0): significant increase in AUC (5%), no change in sensitivity, significant increase in specificity (24%). 0.930 [95% CI: 0.914, 0.946] (on 50 additional cases, demonstrating robustness to dataset drift).
System Sensitivity (Standalone)	Not explicitly stated as a minimum threshold.	0.97 [0.96, 0.99]
System Specificity (Standalone)	Not explicitly stated as a minimum threshold.	0.61 [0.57, 0.66]
Reader AUC Improvement (MRMC)	Significant improvement in AUC with Koios DS assistance.	0.0370 [0.030, 0.044] (mean AUC improvement at α = .05) from an earlier study (K190442). The subject device's updated breast engine showed superior standalone performance, implying equivalent or greater benefit in reader performance.
Inter-operator Variability	Reduction in variability.	Average Kendall Tau-B of USE + DS was 0.6797 [0.6653, 0.6941] compared to USE Alone at 0.5404 [0.5301, 0.5507], demonstrating a significant increase (reduction in variability).
Intra-operator Variability	Reduction in variability.	USE + DS class switching rate was 10.8% compared to USE Alone at 13.6% (p = 0.042), demonstrating a statistically significant reduction.
Thyroid Functionality	(New functionality, establishing performance thresholds)
System AUC (Standalone)	Not explicitly stated as a minimum threshold, but acceptable performance.	0.798 when applied to ACR TI-RADS guidelines.
System Sensitivity (Standalone) (Biopsy recommendation)	Not explicitly stated as a minimum threshold.	0.644 [0.545, 0.744]
System Specificity (Standalone) (Biopsy recommendation)	Not explicitly stated as a minimum threshold.	0.612 [0.566, 0.658]
Reader AUC Improvement (MRMC) (All readers, all data)	Significant improvement in AUC with Koios DS assistance.	+0.083 [0.066, 0.099] (parametric); +0.079 [0.062, 0.096] (non-parametric).
Reader AUC Improvement (MRMC) (US readers, US data)	Significant improvement in AUC with Koios DS assistance.	+0.074 [0.051, 0.098] (parametric); +0.073 [0.049, 0.096] (non-parametric). This met the explicit criterion for the Thyroid module.
Reader AUC Improvement (MRMC) (EU readers, EU data)	Significant improvement in AUC with Koios DS assistance.	+0.022 [0.005, 0.039] (parametric); +0.019 [0.001, 0.037] (non-parametric).
Inter-Reader Variability	Reduction in variability.	40.7% relative change (all readers, all data); 37.4% (US readers, US data); 49.7% (EU Readers, EU Data) in association of TI-RADS points assigned.
Interpretation Time (MRMC)	Reduction in interpretation time.	-23.6% (all readers, all data); -22.7% (US readers, US data); -32.4% (EU Readers, EU Data).

Study Details:

2. Sample Sizes and Data Provenance:

Test Set (Clinical Study):
- Breast Functionality: 900 lesions from 900 different patients. (From predicate K190442, used for comparison). An additional 50 new cases were added to the breast set to test for robustness to dataset drift.
- Thyroid Functionality: 650 retrospectively collected cases (lesions) from 650 different patients.
  - 500 cases from United States locations.
  - 150 cases from European locations.
- Data Provenance: Retrospective for both breast and thyroid. Sourced from a wide variety of ultrasound hardware.
Training Set:
- Breast Engine: "A large database of known cases." (Specific number not provided in the summary, but the test set of 900 lesions was "set aside from the system's training data").
- Thyroid Engine: "A large database of known cases." (Specific number not provided, but the test set of 500 lesions was "set aside from the system's training data"). The training data was separate from the independent site data used in bench testing.

3. Number of Experts and Qualifications for Ground Truth:

The document implies that ground truth for the clinical studies relied on pathology/follow-up outcomes, meaning clinical experts (pathologists, clinicians) established the definitive diagnosis.
For the reader studies (MRMC), the "readers" themselves were the experts whose performance was being evaluated.
- Breast Study (K190442): 15 readers. Their qualifications varied:
  - Board Certification/Specialty: Diagnostic Radiology, Breast Surgeon, OB/GYN, Interventional Radiology.
  - Breast Fellowship Trained and/or Dedicated Breast Imager: 6 out of 15 had this.
  - Years of Experience (Mammography and/or Breast Ultrasound): Ranged from 0 years to 30 years.
  - Academic Institution Affiliation: Mixed (Yes/No).
  - MQSA Qualified Interpreting Physician: Mixed (Yes/No).
- Thyroid Study (CRRS-3): 15 readers. Their qualifications varied:
  - Reader Category: Domestic Endocrinologist (End), Domestic Radiologist (Rad), European Rad, European End.
  - Experience (post-residency): Ranged from < 10 years to ≥ 20 years.
  - 11/15 (73%) were US-based, and 4/15 (27%) were European.

4. Adjudication Method for the Test Set:

Not explicitly stated for the clinical reader studies. However, the use of "ground truth" (pathology/follow-up) suggests that reader interpretations were compared against this established truth, not necessarily adjudicated among themselves for the purpose of determining the definitive diagnosis for study cases. The MRMC study design inherently handles variability across readers statistically.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study:

Yes, MRMC studies were performed for both breast (from predicate K190442, results replicated/improved upon) and thyroid functionalities.
Effect Size of Human Reader Improvement with AI vs. without AI Assistance:
- Breast: The prior study (K190442) demonstrated a mean AUC improvement of 0.0370 [0.030, 0.044] with Koios DS assistance (USE + DS) compared to USE Alone. The updated breast engine in the subject device showed statistically significant standalone performance improvements, implying superior or equivalent reader benefit.
- Thyroid:
  - All readers, all data: Mean AUC improvement of +0.083 [0.066, 0.099] (parametric) and +0.079 [0.062, 0.096] (non-parametric).
  - US readers, US data: Mean AUC improvement of +0.074 [0.051, 0.098] (parametric) and +0.073 [0.049, 0.096] (non-parametric). This absolute improvement (0.074) was larger than seen in the predicate breast study (0.037).
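
The AUC improvements above compare each reader's ability to rank malignant cases above benign ones with and without assistance. The following is a minimal sketch of an empirical (rank-based) AUC for one reader, assuming made-up suspicion scores; it does not reproduce the study's MRMC analysis, whose parametric and non-parametric methods are not detailed in the summary.

```python
def empirical_auc(malignant_scores, benign_scores):
    """Fraction of (malignant, benign) pairs ranked correctly; ties count half."""
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical scores from one reader, unaided and then aided.
benign = [0.5, 0.3, 0.2]
auc_unaided = empirical_auc([0.9, 0.8, 0.4], benign)
auc_aided = empirical_auc([0.9, 0.8, 0.6], benign)
print(f"AUC shift = {auc_aided - auc_unaided:+.3f}")
```

In an MRMC study this per-reader shift is averaged over readers, and the confidence interval accounts for variability across both readers and cases.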

6. Standalone (Algorithm Only) Performance Study:

Yes, standalone performance was evaluated for both breast and thyroid engines through "bench testing."
- Breast Engine: Reported AUC of 0.929, Sensitivity of 0.97, and Specificity of 0.61.
- Thyroid Engine: Reported AUC of 0.798 (with AI Adapter and descriptor predictors applied to ACR TI-RADS guidelines).
- This evaluation helped establish the device's inherent capability to characterize lesions/nodules.

7. Type of Ground Truth Used for Test Set:

Breast Functionality: Pathology or 1-year follow-up for cases that were not biopsied.
Thyroid Functionality: Exclusively via histo/cyto-pathology and/or surgical excision.

8. Sample Size for the Training Set:

The summary states that the test sets (900 breast lesions, 500 thyroid lesions) were "set aside from the system's training data." It does not provide the total number of cases used for training, only that it was a "large database of known cases."

9. How Ground Truth for the Training Set was Established:

"The underlying breast and thyroid engines draw upon knowledge learned from a large database of known cases, tying image features to their eventual diagnosis, to form a predictive model." This implies that the training data's ground truth was established through a similar process to the test set, i.e., confirmed clinical diagnoses, likely including pathology and/or clinical follow-up for a sufficiently long period to ascertain benignity or malignancy.

K Number

K190442

Device Name

Koios DS for Breast

Manufacturer

Koios Medical, Inc.

Date Cleared

2019-07-03

(128 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

DEN170022,K161959

Predicate For

K201555,K212616

Intended Use

Koios Decision Support (DS) for Breast is a software application designed to assist trained interpreting physicians in analyzing the breast ultrasound images of patients with soft tissue breast lesions who are being referred for further diagnostic ultrasound examination. Koios DS for Breast is a machine learning-based decision support system, indicated as an adjunct to diagnostic ultrasound for breast cancer. Koios DS for Breast automatically classifies user-selected region(s) of interest (ROIs) containing a breast lesion into four BI-RADS-aligned categories (Benign, Probably Benign, Suspicious, Probably Malignant), and displays a continuous graphical confidence level indicator of where the lesion falls across all categories. Koios DS for Breast also automatically classifies lesion shape and orientation according to BI-RADS descriptors.

The software requires a user to select up to two orthogonal views, that represent a single lesion to be selected and processed. When utilized by an interpreting physician who has completed training, this device provides information that may be useful in rendering an accurate diagnosis. Patient management decisions should not be made solely on the results of the Koios DS for Breast analysis. This device is intended to help trained interpreting physicians improve their overall accuracy as well as reduce inter- and intra-operator variability.

Koios DS for Breast may also be used as an image viewer of multi-modality digital images, including ultrasound and mammography. The software includes tools that allow users to adjust, measure and document images, and output into a structured report.

Limitations: Koios DS for Breast is not to be used on sites of post-surgical excision, or images with doppler, elastography, or other overlays present in them. Koios DS for Breast is not intended for the primary interpretation of digital mammography images. Koios DS for Breast is not intended for use on mobile devices.

Device Description

Koios Decision Support (DS) for Breast is a software application designed to assist trained interpreting physicians in analyzing breast ultrasound images. The system provides image derived data via web triggering and remote analysis. The software device is a web application that is deployed to a Microsoft IIS web server and accessed by a user through a compatible client. Once logged in and granted access to the Koios DS for Breast application, the user examines selected breast ultrasound DICOM images. The user selects up to two Regions of Interest (ROIs) of two orthogonal views of a breast lesion for processing by Koios DS for Breast. The ROI(s) are transmitted electronically to the Koios DS for Breast server for image processing and the results are returned to the user for review.

The Koios DS for Breast core engine characterizes image features using the ROI data to generate a categorical output that aligns to BI-RADS categories. The engine uses computer vision and machine learning techniques embedded within a software capable of reading, interpreting, analyzing, and generating findings from ultrasound data. The underlying engine draws upon knowledge learned from a large database of known cases, tying image features to their eventual diagnosis, to form a predictive model. The categorical output of the Koios DS for Breast engine is divided into four categories (Benign, Probably Benign, Suspicious, Probably Malignant), separated by three operating points, aligning with or exceeding the sensitivity and specificity of radiologist chosen BI-RADS categorizations. The output of the system is a digital display to be used as a concurrent read. Koios DS for Breast is intended to support compliance with the ACR BI-RADS ultrasound lexicon classification form. The engine additionally classifies the region of interest on the basis of shape (Oval, Round, Irregular) and orientation (Parallel to Skin, Not Parallel).

Koios DS for Breast results can be saved or transferred in three separate ways: in-transit transmission, PACS saving, and exporting results to third-party reporting software. In-transit transmission may be utilized when users wish to share analyses across viewing workstations. Results can be stored in in-transit memory for a preset period of time defined by a system administrator. These results are never locally cached, written to disk, or otherwise stored outside of in-transit memory. After that preset period of time, all results are wiped from the local memory.

Another method of saving is storing a report in the patient series on the PACS. After single or multiple breast lesion analyses have been performed and ultimately accepted by a trained interpreting physician, Koios DS for Breast can export a summary report to PACS as an addendum to the DICOM series that was selected for processing. This report serves as future reference and aid in comparison of cases requiring follow up. This functionality is strictly reserved for approved users.

Koios DS for Breast also supports exporting results to third-party reporting software to facilitate the reporting process. Saving or exporting preferences can be configured by the system administrator and user.

The Koios DS for Breast software is an ASP.NET web application that is deployed to an IIS Web Server inside a Windows operating system environment. The software does not require any specialized hardware, but the time to process ROIs will vary depending on the hardware specifications. If utilizing the recommended technical specifications, the time to generate and present results for two analyzed ROIs will be <= 2 seconds. The Koios DS for Breast processing software is a platform agnostic web service that queries and accepts DICOM compliant digital medical files from any DICOM compliant device.

Koios DS for Breast is intended to be used on patients with soft tissue breast lesions who are being referred for further diagnostic ultrasound examination.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study proving the device's performance, based on the provided FDA 510(k) summary for Koios DS for Breast:

Table of Acceptance Criteria and Reported Device Performance

Performance Metric	Acceptance Criteria	Reported Device Performance	Conclusion
MRMC Study (AUC Shift)	Statistically significant increase in AUC (Quantitative - implied by comparison to predicate and similar success criteria)	Mean AUC shift of +0.037 (95% CI: 0.030, 0.044) at α = 0.05	Satisfied (Primary Endpoint)
Inter-operator Variability (Kendall Tau-B)	Statistically significant increase in Kendall Tau-B (implied by demonstrating an improvement over USE Alone)	Average Kendall Tau-B increased from 0.5404 (USE Alone) to 0.6797 (USE + DS) (95% CI: 0.6653, 0.6941) with a significant increase demonstrated (α = 0.05)	Satisfied
Intra-operator Variability (Class Switching Rate)	Statistically significant reduction in intra-reader variability (implied by demonstrating an improvement over USE Alone)	Class switching rate reduced from 13.6% (USE Alone) to 10.8% (USE + DS) (p = 0.042)	Satisfied
Standalone Performance (AUC)	Implying high AUC given the clinical context of a diagnostic aid	88.2% AUC	Achieved
BI-RADS Descriptors - Lesion Orientation (Accuracy)	Overall accuracy falls within the 95% confidence interval of radiologists' performance. (Specific numerical cut-off: 86.12%)	91.12% (95% CI: 89.43% - 92.60%)	Within criteria established for clinical equivalence
BI-RADS Descriptors - Lesion Shape (Accuracy)	Overall accuracy falls within the 95% confidence interval of radiologists' performance. (Specific numerical cut-off: 83.54%)	87.62% (95% CI: 85.68% - 89.36%)	Within criteria established for clinical equivalence
BI-RADS Descriptors - Shape (Cohen's Kappa)	No statistical difference between reader vs. reader agreement and system vs. reader agreement	0.769 (Reader vs Reader) vs. 0.738 (System vs Reader) (95% CI overlap)	Not statistically different
BI-RADS Descriptors - Orientation (Cohen's Kappa)	No statistical difference between reader vs. reader agreement and system vs. reader agreement	0.728 (Reader vs Reader) vs. 0.744 (System vs Reader) (95% CI overlap)	Not statistically different
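The AUC figures in the table can be read as the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as half. A minimal sketch of that pairwise definition follows. The function name `auc` and the 0/1 label encoding are assumptions for the example, and this is not the study's analysis code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (1 = malignant, 0 = benign)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one malignant and one benign case")
    # Each malignant/benign pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a test set of a few hundred lesions; a rank-based computation gives the same value more efficiently on larger sets.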

Study Details

1. Sample Sizes for Test Sets and Data Provenance:

MRMC Clinical Study:
- Cases: 900 patient cases. Each case included up to two orthogonal views of a single breast lesion.
- Data Provenance: Not explicitly stated. The patient demographics (race, age distribution, breast density) are presented in a way that suggests a diverse, likely multi-center, retrospective dataset representative of national rates (the Breast Cancer Surveillance Consortium 2006-2009 is the cited reference). A retrospective design is implied because patients "present with a soft tissue breast lesion who are being referred for further diagnostic ultrasound examination" and their ground truth was determined by pathology or 1-year follow-up.
Standalone Performance (Malignancy Risk Classification):
- Cases: 900 lesions from 900 different patients. Each lesion represented by two orthogonal images (total 1800 images). This data was specifically "set aside from the system's training data" for validation.
- Data Provenance: Not explicitly stated, but implies similar provenance to the MRMC study's test set as it's a validation set from their overall data.
BI-RADS Descriptors Bench Testing (Shape and Orientation):
- Cases: 1300 cases.
- Data Provenance: Not explicitly stated.

2. Number of Experts Used to Establish Ground Truth for Test Sets and Qualifications:

MRMC Clinical Study: The ground truth for the 900 cases in the MRMC study was "determined by pathology or 1-year follow-up for cases that were not biopsied". This is stated under "Malignancy Risk Classification" for the standalone test, but it applies to the definition of malignant versus benign across the test set. Human experts were not used to establish the ground truth for the malignancy outcome in either the MRMC study or the standalone performance validation.
BI-RADS Descriptors Bench Testing (Shape and Orientation):
- Number of Experts: Three MQSA certified radiologists.
- Qualifications: All with over 20 years of experience and at least 3000 images read per year.

3. Adjudication Method for the Test Set:

MRMC Clinical Study & Standalone Malignancy: Ground truth was established by pathology or 1-year follow-up. This is an objective ground truth, so no adjudication among experts was needed for disease status.
BI-RADS Descriptors Bench Testing: Ground truth for shape and orientation was established by majority decision of the three radiologists. For the second test comparing agreement (Cohen's Kappa), "majority agreement was not enforced and all cases were analyzed for reader and reader-system agreement," implying individual assessments were used for pairwise comparisons.
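A majority decision among three readers can be sketched as below. The function name `majority_label` is hypothetical, and returning `None` when all three readers disagree (possible for shape, which has three classes) is an assumption; the source does not say how such cases were handled.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by more than half of the readers, else None."""
    label, count = Counter(votes).most_common(1)[0]
    # A strict majority is required; a three-way split has no majority.
    return label if count > len(votes) / 2 else None
```

For example, `majority_label(["Oval", "Oval", "Round"])` yields `"Oval"`, while three different shape labels yield `None`.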

4. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done, and its effect size:

Yes, an MRMC comparative effectiveness study was done.
Study Objective: To determine the impact on Interpreting Physician (Reader) performance (measured by AUC) when Koios DS for Breast and an ultrasound examination are combined (USE + DS), compared to USE Alone.
Effect Size: The study showed a mean AUC shift of +0.037 (from 0.836 for USE Alone to 0.873 for USE + DS). This was found to be statistically significant.
Inter-reader Variability: The average Kendall Tau-B coefficient significantly increased from 0.5404 (USE Alone) to 0.6797 (USE + DS), indicating a reduction in inter-reader variability with AI assistance.
Intra-reader Variability: The class switching rate (a measure of intra-reader variability) was statistically significantly reduced from 13.6% (USE Alone) to 10.8% (USE + DS), demonstrating improvement.
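Kendall Tau-B measures how consistently two sets of ordinal ratings rank the same cases, with a correction for tied ratings; higher values mean closer agreement. The sketch below computes it for a single pair of readers. How the study averaged the coefficient across readers is not described in the source, and the function name and numeric rating encoding are assumptions.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of ordinal ratings."""
    if len(x) != len(y):
        raise ValueError("ratings must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied for both readers: excluded from every count
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denom == 0:
        raise ValueError("tau-b is undefined when a reader gives a constant rating")
    return (concordant - discordant) / denom
```

Two readers who rank every case in the same order score 1.0, and two who rank them in exactly opposite order score -1.0.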

5. If a Standalone (algorithm only without human-in-the-loop performance) was done:

Yes, standalone performance was evaluated for malignancy risk classification and BI-RADS descriptors.
- Malignancy Risk Classification (algorithm only): Achieved an AUC of 88.2% on a validation set of 900 lesions.
- BI-RADS Descriptors (algorithm only): The system's accuracy for shape (87.62%) and orientation (91.12%) classification was compared with human radiologists' performance, using the radiologists' majority decisions as ground truth, and fell within the criteria established for clinical equivalence. The agreement (Cohen's Kappa) between the system and readers was also not statistically different from the agreement between pairs of readers.
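Cohen's Kappa, used above to compare system-versus-reader agreement with reader-versus-reader agreement, is the observed agreement between two raters corrected for the agreement expected by chance. A minimal sketch of the standard formula follows; the function name and the example labels are illustrative and are not drawn from the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (observed - expected) / (1.0 - expected)
```

A value of 1.0 means perfect agreement and 0.0 means agreement no better than chance, which is why the reported values of roughly 0.73 to 0.77 indicate substantial but not complete agreement.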

6. Type of Ground Truth Used:

For Malignancy Risk Classification (both MRMC and standalone): Ground truth was determined objectively by pathology or 1-year clinical follow-up for cases not biopsied. This is a robust form of ground truth based on definitive outcomes.
For BI-RADS Descriptors (Shape and Orientation): Ground truth was established by expert consensus (majority decision) of three highly experienced, MQSA-certified radiologists.

7. Sample Size for Training Set:

The document states that the underlying engine "draws upon knowledge learned from a large database of known cases" but does not specify the exact sample size of the training set. It only mentions that the 900 cases for standalone validation were "set aside from the system's training data."

8. How Ground Truth for Training Set was Established:

The document states that the engine "draws upon knowledge learned from a large database of known cases, tying image features to their eventual diagnosis, to form a predictive model." While not explicitly detailed, this implies that the ground truth for training data was also established using definitive outcomes like pathology or clinical follow-up, similar to the validation set, to ensure accurate "known cases." For the BI-RADS descriptors, it's mentioned that Koios DS for Breast and cCAD (its predecessor) "share a similar algorithm set, training data, and validation approach for automated shape and orientation assessment," which generally suggests the use of expert annotations for these subjective descriptors.
