The Sepsis ImmunoScore is an Artificial Intelligence/Machine Learning (AI/ML)-Based Software that identifies patients at risk for having or developing sepsis.

The Sepsis ImmunoScore uses up to 22 predetermined inputs from the patient's electronic health record to generate a risk score and to assign the patient to one of four discrete risk stratification categories, based on the increasing risk of sepsis.

The Sepsis ImmunoScore is intended to be used in conjunction with other laboratory findings and clinical assessments to aid in the risk assessment for presence of or progression to sepsis within 24 hours of patient assessment. It is intended to be used for patients admitted to the Emergency Department or hospital for whom sepsis is suspected, and a blood culture was ordered as part of the evaluation for sepsis. It should not be used as the sole basis to determine the presence of sepsis or risk of developing sepsis within 24 hours.

Device Description

The Sepsis ImmunoScore device is software as a medical device intended to aid in the risk assessment for progression to sepsis in patients aged 18 and older in an emergency department or hospital. The device is intended to identify patients who have a blood culture ordered as part of their evaluation for sepsis and who are at risk of having or developing sepsis within the next 24 hours. The software uses 22 parameters from the hospital's electronic medical record (EMR), including demographics, vitals, labs, and sepsis biomarkers, and outputs the Sepsis Patient View.

The Sepsis Patient View can be viewed in the EMR system or through a web interface. It displays a sepsis risk score and a risk stratification category, along with other supplemental information. There are four risk stratification categories (Low, Medium, High, or Very High). The device uses a locked artificial intelligence/machine learning (AI/ML) based algorithm to compute the risk score and place the patient in a risk category. The Sepsis ImmunoScore is intended to be used in conjunction with other laboratory findings and clinical assessments.
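The mapping from a continuous risk score to one of the four ordered categories can be sketched as a simple threshold lookup. The source does not give the score range or the cut points, so the 0-100 range and the `CUTOFFS` values below are hypothetical placeholders chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical cut points on an assumed 0-100 score.
# The device's actual thresholds are not stated in the source.
CUTOFFS = (25.0, 50.0, 75.0)
CATEGORIES = ("Low", "Medium", "High", "Very High")


def risk_category(score: float, cutoffs=CUTOFFS) -> str:
    """Map a risk score to one of four ordered risk stratification categories."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score outside the assumed 0-100 range")
    # bisect_right counts how many cut points are <= score,
    # which is the index of the category the score falls into.
    return CATEGORIES[bisect_right(cutoffs, score)]


if __name__ == "__main__":
    for s in (10, 25, 60, 90):
        print(s, risk_category(s))
```

Because the algorithm is described as locked, fixed thresholds of this kind would not change between patients or over time; only the inputs would.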

AI/ML Overview

The provided document outlines the analytical and clinical studies performed to demonstrate that the Sepsis ImmunoScore device meets its acceptance criteria.

1. Acceptance Criteria and Reported Device Performance

The primary clinical acceptance criteria for the Sepsis ImmunoScore were:

A pre-specified performance goal of AUROC ≥ 0.75.
A monotonic increase in the sepsis diagnostic predictive value across risk stratification categories of increasing severity.
Non-overlapping predictive value (PV) 95% confidence intervals (CIs) between the low and high, and medium and very high risk stratification categories.

The primary non-clinical acceptance criteria included demonstrating sufficient performance under various conditions, such as:

Robustness to input parameter error (precision/sensitivity and reproducibility of outputs).
Acceptable performance with missing input values (feature imputation study).
Monotonicity of risk scores.
Reproducibility of SHAP values.
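The performance table names the Cochran-Armitage test as the measure for risk score monotonicity, but the source does not describe how the test was applied. As a generic illustration only, the sketch below implements the standard two-sided Cochran-Armitage test for a trend in proportions across ordered groups. The function name, the default equally spaced group scores, and the counts in the usage example are all assumptions, not values from the submission.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage trend test across ordered groups.

    events[i] is the number of positive cases in group i, totals[i] the
    group size, and scores[i] the ordinal score (defaults to 0, 1, 2, ...).
    Returns (z, p_value).
    """
    if scores is None:
        scores = list(range(len(events)))
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(variance)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts for four ordered groups of 100 subjects each.
    z, p = cochran_armitage(events=[3, 13, 37, 70], totals=[100, 100, 100, 100])
    print(f"z = {z:.2f}, p = {p:.3g}")
```

A positive `z` with a small p-value indicates that the proportion of events rises with the group score; a flat set of proportions gives `z` equal to zero.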

Here's a table summarizing the reported device performance against these criteria:

Acceptance Criterion (Clinical)	Target Measure	Acceptance Value	Reported Performance (Adjudicated Forced Majority)	Meets Criteria?
Primary Endpoints
AUROC	AUC	≥ 0.75	0.81 [0.76, 0.86]	Yes
Monotonic Increase in PV	PV for each category	Monotonic increase with risk	Low: 3.02%, Medium: 12.74%, High: 36.59%, Very High: 69.70%	Yes
Non-overlapping 95% CIs in PV	PV [95% CI]	Non-overlapping between low/high and medium/very high categories	Low: [1.22%, 6.12%], High: [30.90%, 42.58%]; Medium: [7.96%, 18.99%], Very High: [51.29%, 84.41%]	Yes
Secondary Endpoints (e.g., ICU Transfer)	PV [95% CI]	Monotonic increase & non-overlapping CIs	Demonstrated monotonic increase. Non-overlapping CIs met for most, with the exception of Mechanical Ventilation, likely due to low sample size/power.	Mostly Yes
Acceptance Criterion (Non-Clinical)	Target Measure	Acceptance Value	Reported Performance	Meets Criteria?
Input Parameter Robustness	Sepsis Risk Score Standard Deviation	Low std dev	As shown in Table 10, std dev is low across score intervals.	Yes
Reproducibility of outputs (ICC)	ICC for Sepsis Risk Score	High ICC	Slope of regression lines close to 1, intercept close to 0, indicating robustness to perturbations.	Yes
Impact of Input Parameter Bias	ICC vs. Bias	ICC > 0.966 for all parameters	Figure 14 shows ICCs are consistently high.	Yes
Feature Imputation Study	PV [95% CI] for imputed data	Primary endpoint criteria met	Tables 23 & 24 demonstrate criteria met for varying imputation scenarios.	Yes
Risk Score Monotonicity	Cochran-Armitage Test (p-value)	p 0.90 for top 10, >0.75 for others (no 0.90; all others >0.75 except Temp. Temp. remained > 0.5.	Yes
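
The three primary clinical criteria can be checked directly against the figures reported in the table. The sketch below copies the AUROC, predictive values, and confidence intervals from the table; the function and variable names are illustrative. The source does not say whether the AUROC goal applies to the point estimate or to the lower confidence bound, so the point estimate is used here (the reported lower bound of 0.76 would also pass).

```python
# Values copied from the reported performance (Adjudicated Forced Majority).
AUROC = 0.81
AUROC_GOAL = 0.75
ORDER = ["Low", "Medium", "High", "Very High"]
PV = {"Low": 3.02, "Medium": 12.74, "High": 36.59, "Very High": 69.70}
PV_CI = {
    "Low": (1.22, 6.12),
    "Medium": (7.96, 18.99),
    "High": (30.90, 42.58),
    "Very High": (51.29, 84.41),
}


def strictly_increasing(values):
    """True if each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def intervals_disjoint(a, b):
    """True if the closed intervals a and b share no point."""
    return a[1] < b[0] or b[1] < a[0]


def check_primary_endpoints():
    """Evaluate each primary clinical acceptance criterion separately."""
    return {
        "auroc_goal": AUROC >= AUROC_GOAL,
        "pv_monotonic": strictly_increasing([PV[c] for c in ORDER]),
        "ci_low_vs_high": intervals_disjoint(PV_CI["Low"], PV_CI["High"]),
        "ci_medium_vs_very_high": intervals_disjoint(
            PV_CI["Medium"], PV_CI["Very High"]
        ),
    }


if __name__ == "__main__":
    for name, passed in check_primary_endpoints().items():
        print(f"{name}: {'pass' if passed else 'fail'}")
```

All four checks pass with the reported numbers, which matches the "Yes" entries in the table.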

2. Sample Sizes and Data Provenance

Test Set (Clinical Validation Study): 746 patients.
Data Provenance: The data for the clinical validation study came from a subset of the NOSIS dataset and biobank. The clinical data were collected prospectively and analyzed retrospectively. The clinical sites for the validation study were Beth Israel Deaconess Medical Center, Jesse Brown VA - Chicago, IL, and Beaumont - Royal Oak, MI. This provided geographic diversity and diversity in EHR systems. Critically, the validation sites were independent of the sites used for algorithm training and tuning.
Training Set: 2,366 patients.
Training Data Provenance: From the NOSIS dataset, specifically OSF - Peoria, IL, Mercy Health - St. Louis, MO, and Carle Foundation Hospital - Urbana, IL. As with the test set, the data were collected prospectively and used retrospectively.

3. Number and Qualifications of Experts for Ground Truth

Number of Experts: A team of three physicians established the ground truth for the test set.
Qualifications of Experts: The document states they were "a team of three physicians." While specific years of experience or subspecialties are not explicitly mentioned for the adjudicating physicians, the context implies they are qualified medical doctors capable of performing detailed chart reviews and applying the Sepsis-3 definition. They were working at the healthcare institutions from which the subjects received care.

4. Adjudication Method for the Test Set

The adjudication method used was physician adjudication based on Retrospective Chart Diagnosis (RCD) Determination.

The entirety of the patient's record was sent to an adjudication committee of three physicians.
They determined the presence or absence of sepsis and its timing based on the Sepsis-3 definition (presence of infection, occurrence of organ dysfunction, and causality of organ dysfunction due to infection).
The onset of sepsis was defined by an increase of at least 2 points in the Sequential Organ Failure Assessment (SOFA) score due to infection.
Adjudicators were instructed to label cases as "Septic," "Non-Septic," or "Indeterminate."
For "Indeterminate" cases, adjudicators were also asked to provide a "forced decision."
Two primary analysis groups were established based on adjudication:
- "Adjudicated Forced Majority": Sepsis-3 determination was defined by the majority rule of diagnosis by the three physicians.
- "Adjudicated Forced Unanimous": All three physicians agreed on the diagnosis.
The physicians were blinded to the results of the ImmunoScore.
Subjects were randomized for adjudication.
A verification bias study was conducted to assess potential bias introduced by using same-site adjudicators. This study compared the original method (same-site adjudicators, full EMR access) against Method A (independent site adjudicators, abstracted data) and Method B (same-site adjudicators, abstracted data). The agreement between methods was high (e.g., Original vs. Method A: 97.1% [91.7%, 100%]), and the results did not indicate significant bias that would warrant re-adjudication of the entire cohort.
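
The two analysis groups described above can be sketched as simple rules over the three physicians' forced decisions. The function names are illustrative. Returning `None` for a non-unanimous case, meaning the case is left out of the unanimous group, is an assumption; the source does not say how such cases were handled.

```python
from collections import Counter

VALID_LABELS = ("Septic", "Non-Septic")


def _validate(labels):
    """Require exactly three forced decisions with recognised labels."""
    if len(labels) != 3 or any(label not in VALID_LABELS for label in labels):
        raise ValueError("expected three 'Septic'/'Non-Septic' decisions")


def forced_majority(labels):
    """Label chosen by at least two of the three adjudicators."""
    _validate(labels)
    # With three binary votes a strict majority always exists.
    return Counter(labels).most_common(1)[0][0]


def forced_unanimous(labels):
    """Shared label if all three adjudicators agree, otherwise None.

    Assumption: None means the case is excluded from the unanimous group.
    """
    _validate(labels)
    return labels[0] if len(set(labels)) == 1 else None


if __name__ == "__main__":
    votes = ["Septic", "Non-Septic", "Septic"]
    print(forced_majority(votes), forced_unanimous(votes))
```

Under this reading, the unanimous group is a subset of the cohort used for the majority analysis.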

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No explicit Multi-Reader Multi-Case (MRMC) comparative effectiveness study comparing human readers with AI vs. without AI assistance was detailed in the provided text. The study focused on the standalone performance of the AI algorithm and its correlation with clinical outcomes, rather than the improvement in human reader performance when assisted by AI. The device is intended for "adjunctive use" and "in conjunction with other laboratory findings and clinical assessments," implying a human-in-the-loop context, but this specific study design was not performed or reported.

6. Standalone (Algorithm-Only) Performance Study

Yes, a standalone performance study was conducted. The clinical validation study directly evaluated the performance of the Sepsis ImmunoScore algorithm in classifying patients as "Septic" or "Non-Septic" based on the adjudicated ground truth. The reported AUROC, PV, and SSLR values (Table 7 and Table 8) represent the algorithm's performance without direct human intervention in the classification process for the test set.
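
For reference, AUROC can be computed from risk scores and adjudicated labels as the probability that a randomly chosen septic patient receives a higher score than a randomly chosen non-septic patient. The sketch below is a minimal pure-Python version of that definition, with ties counted as one half. It is not the study's analysis code, and the example scores are made up.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores, higher meaning higher predicted risk.
    labels: 1 for adjudicated septic, 0 for non-septic.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up scores for four patients, two septic and two non-septic.
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a test set of a few hundred cases; a rank-based formulation would scale better.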

7. Type of Ground Truth Used

The primary ground truth used was expert consensus via physician adjudication, specifically based on a software-encoded version of the Sepsis-3 criteria after detailed retrospective chart review.

This involved determining the presence of infection (Infection Possible, Probable, Definite), occurrence of organ dysfunction, and causality of organ dysfunction due to infection.
The onset time of sepsis was adjudicated based on the timing of a SOFA score increase (at least 2 points) consequent to infection.
For indeterminate cases, a "forced decision" was also made.
Secondary endpoints (in-hospital mortality, ICU admission, mechanical ventilation usage, vasopressor usage, median length of stay) served as objective clinical outcomes that correlated with the risk categories, further supporting the clinical relevance of the device's output.
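
The onset rule described above, a SOFA increase of at least 2 points, can be sketched as a search over time-stamped SOFA scores. The function name, the use of an explicit baseline value, and the example data are assumptions for illustration. Whether the increase is caused by infection is a clinical judgment made by the adjudicators and is not modelled here.

```python
def sofa_onset_time(baseline_sofa, timed_scores, min_increase=2):
    """Earliest time at which SOFA has risen by min_increase over baseline.

    timed_scores: iterable of (hours_since_presentation, sofa_score).
    Returns the time of the first qualifying measurement, or None.
    """
    for hours, score in sorted(timed_scores):
        if score - baseline_sofa >= min_increase:
            return hours
    return None


if __name__ == "__main__":
    # Made-up SOFA trajectory for one patient with a baseline score of 1.
    trajectory = [(0, 1), (6, 2), (12, 3), (18, 5)]
    print(sofa_onset_time(1, trajectory))  # 12
```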

8. Sample Size for the Training Set

The training set included 2,366 patients.

9. How the Ground Truth for the Training Set Was Established

For the training set, the presence of a sepsis event was determined using two methods:

Medical record analysis: Using a software-encoded version of the Sepsis-3 criteria.
Retrospective chart review: Done by a team of three physicians. These physicians were blinded to the ImmunoScore results.

This dual approach to establishing ground truth for the training set was intended to provide robust labels for model development.
