Search Results

eCART is a software product that provides automated risk stratification and early warning for impending patient deterioration, signified as the composite outcome of death or ICU transfer. It is intended to be used on hospitalized ward patients 18 years of age or older by trained medical professionals.

As a clinical decision support device, eCART's risk score and trend analysis is intended to aid clinical teams in identifying which patients are most likely to clinically deteriorate. eCART provides additional information and does not replace the standard of care or clinical judgment.

eCART scoring is initiated by the documentation of any vital sign on an adult ward patient. The device calculates risk only from validated EHR data, such as vitals that have been confirmed by a registered nurse (RN); unvalidated data streaming from monitors/devices will not be used until confirmed by a healthcare professional. The product predictions are for reference only and no therapeutic decisions should be made based solely on the eCART scores.

Device Description

The AgileMD eCARTv5 Clinical Deterioration Suite ("eCART") is a cloud-based software device that is integrated into the electronic health record ("EHR") in order to anticipate clinical deterioration in adult ward patients, which is signified as either of the following two predicted outcomes: (1) death or (2) ICU transfer. The tool synthesizes routine vital signs, laboratory data, and patient demographics into a single value that can be used to flag patients at-risk of the composite outcome of clinical deterioration for additional evaluation and monitoring. eCARTv5 requires the healthcare system within which it will be used, to provide an EHR connection and data interfaces through which the patient data necessary to run the software will be transmitted.

The primary functions of the system are imparted by the Gradient Boosted Machine ("GBM") learning algorithm that takes input directly from the EHR, in real time, to provide an assessment of patients and displays its outputs in an intuitive user interface which drives providers to follow standardized clinical workflows (established by their institutions) for elevated-risk patients.

eCARTv5's end users include med-surg nursing staff, physicians and other providers caring for these patients. The eCARTv5 composite score is determined from the model output (predicted probability of deterioration) scaled from 0-100, based on the specificity (true negative rate). The observed rate of deterioration at each eCART score threshold, displayed as the odds of deterioration in the next 24 hours, is presented to the user along with the scaled score. Default thresholds are set to an eCART of 93 and 97, respectively, for moderate and high risk categorization.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study details for the eCARTv5 Clinical Deterioration Suite, based on the provided FDA 510(k) summary:

Acceptance Criteria and Reported Device Performance

The acceptance criteria for the eCARTv5 device are implicitly defined by the performance metrics reported in the validation studies, specifically Area Under the Receiver Operating Characteristic curve (AUROC), Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) for two risk thresholds (Moderate-risk at eCART ≥93 and High-risk at eCART ≥97). The composite outcome of interest is "Deterioration" (death or ICU transfer within 24 hours).

Table: Acceptance Criteria (Implicit) and Reported Device Performance

Performance Metric	Acceptance Criteria (Implicit Target)	Retrospective Cohort (Deterioration)	Prospective Cohort (Deterioration)
AUROC	(Target > 0.82)	0.835 (0.834, 0.835)	0.828 (0.827, 0.829)
Moderate-risk threshold (eCART ≥93)
Sensitivity	(Target ~48-52%)	51.8% (51.7%, 51.8%)	48.8% (48.7%, 49.0%)
Specificity	(Target ~93-94%)	93.1% (93.1%, 93.1%)	93.3% (93.3%, 93.3%)
PPV	(Target ~8-9%)	9.0% (9.0%, 9.1%)	8.9% (8.8%, 8.9%)
NPV	(Target ~99%)	99.3% (99.3%, 99.3%)	99.3% (99.3%, 99.3%)
High-risk threshold (eCART ≥97)
Sensitivity	(Target ~33-38 %)	38.6% (38.5%, 38.7%)	33.7% (33.6%, 33.9%)
Specificity	(Target ~96-97%)	96.9% (96.9%, 96.9%)	97.3% (97.3%, 97.3%)
PPV	(Target ~14%)	14.2% (14.1%, 14.2%)	14.2% (14.1%, 14.3%)
NPV	(Target ~99%)	99.2% (99.2%, 99.2%)	99.1% (99.1%, 99.1%)

Note: The "Acceptance Criteria (Implicit Target)" values are inferred based on the consistently reported values that demonstrate performance above random chance and clinical utility for risk stratification. The document does not explicitly state pre-defined quantitative acceptance criteria but rather presents the achieved performance as a demonstration of substantial equivalence.

Study Details

Sample sizes used for the test set and data provenance:
- Retrospective Test Set:
  - Encounters (N): 1,769,461 unique hospitalizations.
  - Observations (n): 132,873,833 eCART scores.
  - Unique Patients: 934,454
  - Data Provenance: Admissions between 2009 and 2023 from three geographically distinct health systems. The specific countries are not mentioned, but "US" is inferred from typical FDA submissions. It is retrospective.
- Prospective Test Set:
  - Encounters (N): 205,946 unique hospitalizations.
  - Observations (n): 21,516,964 eCART scores.
  - Unique Patients: 151,233
  - Data Provenance: Non-overlapping admissions between 2023 and 2024 from the same three healthcare systems as the retrospective analysis. It is prospective.
Number of experts used to establish the ground truth for the test set and qualifications of those experts:
- The document does not specify the number of experts used or their qualifications for establishing ground truth. The ground truth (death or ICU transfer) appears to be derived directly from Electronic Health Record (EHR) data, which is objective outcome data, rather than requiring expert labeling.
Adjudication method for the test set:
- The document does not specify an adjudication method. Given that the ground truth is death or ICU transfer from EHR, a formal adjudication process involving multiple experts for each case may not have been necessary, as these are typically clear clinical outcomes documented in the EHR.
If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and the effect size:
- A multi-reader multi-case (MRMC) comparative effectiveness study was not explicitly mentioned as being performed to compare human readers with and without AI assistance. The performance data presented is for the standalone algorithm.
If a standalone (i.e., algorithm only without human-in-the-loop performance) was done:
- Yes, a standalone study was presented. The performance metrics (AUROC, Sensitivity, Specificity, PPV, NPV) are reported for the eCART algorithm itself, without human intervention in the reported performance. The device is intended as a "clinical decision support device" to "aid clinical teams in identifying which patients are most likely to clinically deteriorate."
The type of ground truth used:
- The ground truth used is outcomes data derived from the Electronic Health Record (EHR). Specifically, "Deterioration is defined as death or ward to ICU transfer within 24 hours following a score." Mortality is defined as "death within 24 hours following a score." These are objective clinical events.
The sample size for the training set:
- The document states: "eCART's algorithm was trained on ward patients..." but does not explicitly provide the sample size of the training set. It only provides details for the retrospective and prospective validation cohorts which are distinct from the training set.
How the ground truth for the training set was established:
- The document does not explicitly detail how the ground truth for the training set was established. However, given the nature of the ground truth for the test set (death or ICU transfer from validated EHR data), it is reasonable to infer that the ground truth for the training set would have been established similarly using objective patient outcomes data from EHRs.

Ask a Question

Ask a specific question about this device

K Number

K233216

Device Name

CLEWICU System

Manufacturer

Clew Medical Ltd.

Date Cleared

2024-01-13

(107 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K200717

Predicate For

N/A

Intended Use

CLEWICU provides the clinician with physiological insight into a patient's likelihood of future hemodynamic instability. CLEWICU is intended for use in hospital critical care settings for patients 18 years and over. CLEWICU is considered to provide additional information regarding the patient's predicted future risk for clinical deterioration, as well as identifying patients at low risk for deterioration. The product predictions are for reference only and no therapeutic decisions should be made based solely on the CLEWICU predictions.

Device Description

Intensive Care Unit (ICU) .
. Emergency Department's (ED) Critical Care or Resuscitation area
Post-Anesthesia Care Unit (PACU) .
. Step-Down Unit
Post-Surgical Recovery Unit .
. Other Specialized Care Units (e.g., Cardiac Care Unit (CCU), Neurocritical Care Unit (NCU), High-dependency Care Unit (HDU)
ClewICUServer and ClewICUnitor are software-only devices that are run on a user-provided server or cloud-infrastructure.
The ClewICUServer is a backend software platform that imports patient data from various sources including Electronic Health Record (EHR) data and patient monitoring devices through an HL7 data connection. The data are then used by models operating within the ClewICUServer to compute and store the CLEWHI index (likelihood of hemodynamic instability requiring vasopressor / inotrope support), and CLEWLR (indication that the patient is at "low risk" for deterioration).
The ClewICUnitor is the web-based user interface displaying CLEWHI, and CLEWLR associated notifications and related measures, as well as presenting the overall unit status.

AI/ML Overview

Here is an analysis of the acceptance criteria and study proving the device meets them, based on the provided text:

1. Table of Acceptance Criteria and Reported Device Performance

The CLEWICU System includes two models: CLEWHI (predicts hemodynamic instability) and CLEWLR (identifies low risk for deterioration).

Model	Metric	Acceptance Criteria (Target Point Estimate)	UMASS Study Performance (95% CI)	MIMIC Study Performance (95% CI)	Met Criteria?
CLEWHI	Sensitivity	60%	63% (59-67%)	69% (66-73%)	Yes
	PPV	10%	12% (11-14%)	10% (9-11%)	Yes
CLEWLR	Specificity	90%	90.5% (89.6-91.4%)	90% (89.1-80.9%)	Yes
	Sensitivity	25%	47% (46.8-47.2%)	35.5% (35.3-35.7%)	Yes

Note on CLEWLR Specificity (MIMIC): The provided 95% CI for MIMIC Specificity for CLEWLR is stated as (89.1 - 80.9). This appears to be a typo, as the lower bound (89.1%) is higher than the upper bound (80.9%). Assuming the intent was 89.1-90.9 or similar, and given the point estimate is 90% (meeting the target), it is considered to have met the criteria. The text explicitly states "The model validation test results demonstrate that the clinical performance of the CLEWICU models continue to meet the pre-defined acceptance criteria."

Minimum Required Performance Specifications for PCCP (Post-clearance models):

Model	Metric	Minimum Required Performance
CLEWHI	Sensitivity	0.6 (60%)
	PPV	0.1 (10%)
CLEWLR	Sensitivity	0.25 (25%)
	Specificity	0.9 (90%)

2. Sample Size Used for the Test Set and Data Provenance

Sample Sizes:
- UMass dataset: 6534 unique patient stays
- MIMIC-III dataset: 5069 unique patient stays
Data Provenance: Retrospective cohort study. The text explicitly states "This was a retrospective cohort study that involved two separate health care systems, each evaluated independently."
- UMass dataset: From the University of Massachusetts elCU dataset.
- MIMIC-III dataset: From the MIMIC-III dataset (general knowledge indicates this is a publicly available dataset primarily from Beth Israel Deaconess Medical Center, USA). The country of origin for both is implicitly the USA, as these are US-based datasets/institutions.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The document does not specify the number or qualifications of experts used to establish the ground truth for the test set. It describes the models predicting "hemodynamic instability requiring vasopressor / inotrope support" and "low risk for deterioration," but it doesn't detail how these ground truth labels were derived.

4. Adjudication Method for the Test Set

The document does not describe any adjudication method for establishing the ground truth for the test set.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done, and Effect Size of Human Improvement with AI vs Without AI Assistance

No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. The study focuses purely on the standalone performance of the algorithm. There is no mention of human readers or AI assistance for human readers, nor any effect size for human improvement.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was Done

Yes, a standalone study was done. The entire study described focuses on the direct performance of the CLEWHI and CLEWLR models against predefined criteria. The device is a "stand-alone analytical software product."

7. The Type of Ground Truth Used (expert consensus, pathology, outcomes data, etc.)

The ground truth used appears to be outcomes data based on clinical events. The CLEWHI model predicts "likelihood of occurrence of certain clinically significant events... including hemodynamic instability requiring vasopressor / inotrope support." The CLEWLR model identifies patients at "low risk for deterioration." These are objective clinical outcomes that can be derived from EHRs and patient monitoring data, which are the sources for "patient data from various sources including Electronic Health Record (EHR) data and patient monitoring devices." The document does not mention expert consensus or pathology for ground truth.

8. The Sample Size for the Training Set

The document states that the models were "re-trained using a reduced set of features." However, it does not explicitly state the sample size of the training set(s) used for this re-training. It only provides the sample sizes for the independent test sets (UMass and MIMIC-III). The PCCP section mentions "one dataset for training and a different, completely independent, dataset for testing." This implies a separate training dataset was used, but its size is not given.

9. How the Ground Truth for the Training Set was Established

The document does not explicitly describe how the ground truth for the training set was established. Given the nature of the ground truth for the test sets (clinical outcomes like hemodynamic instability or low risk of deterioration), it can be inferred that the training set's ground truth was established by similar objective clinical event definitions derived from historical patient data (EHR, monitoring devices).

Ask a Question

Ask a specific question about this device

K Number

K230842

Device Name

SignalHF (IM008)

Manufacturer

Implicity Inc.

Date Cleared

2023-10-25

(211 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K203224

Predicate For

N/A

Intended Use

The SignalHF System is intended for use by qualified healthcare professionals (HCP) managing patients over 18 years old who are receiving physiological monitoring for Heart Failure surveillance and implanted with a compatible Cardiac Implantable Electronic Devices (CIED) (i.e., compatible pacemakers, ICDs, and CRTs).

The SignalHF System provides additive information to use in conjunction with standard clinical evaluation.

The SignalHF HF Score is intended to calculate the risk of HF for a patient in the next 30 days.

This System is intended for adjunctive use with other physiological vital signs and patient symptoms and is not intended to independently direct therapy.

Device Description

SignalHF is a software as medical device (SaMD) that uses a proprietary and validated algorithm, the SignalHF HF Score, to calculate the risk of a future worsening condition related to Heart Failure (HF). The algorithm computes this HF score using data obtained from (i) a diverse set of physiologic measures generated in the patient's remotely accessible pre-existing cardiac implant (activity, atrial burden, heart rate variability, heart rate, heart rate at rest, thoracic impedance (for fluid retention), and premature ventricular contractions per hour), and (ii) his/her available Personal Health Records (demographics). SignalHF provides information regarding the patient's health status (like a patient's stable HF condition) and also provides alerts based on the SignalHF HF evaluation. Based on an alert and a recovery threshold on the SignalHF score established during the learning phase of the algorithm and fixed for all patients, our monitoring system is expected to raise an alert 30 days (on median) before a predicted HF hospitalization event.

SignalHF does not provide a real-time alert. Rather, it is designed to detect chronic worsening of HF status. SignalHF is designed to provide a score linked to the probability of a future decompensated heart failure event specific to each patient. Using this adjunctive information, healthcare professionals can make adjustments for the patient based on their clinical judgement and expertise.

The score and score-based alerts provided through SignalHF can be displayed on any compatible HF monitoring platform, including the Implicity platform. The healthcare professional (HCP) can utilize the SignalHF HF score as adjunct information when monitoring CIED patients with remote monitoring capabilities.

The HCP's decision is not based solely on the device data which serves as adjunct information, but rather on the full clinical and medical picture and record of the patient.

AI/ML Overview

Here's a summary of the acceptance criteria and the study proving the device meets them, based on the provided FDA 510(k) summary for SignalHF:

Acceptance Criteria and Device Performance for SignalHF

The SignalHF device was evaluated through the FORESEE-HF Study, a non-interventional clinical retrospective study.

1. Table of Acceptance Criteria and Reported Device Performance

For ICD/CRT-D Devices:

Endpoints	Acceptance Criteria (Objective)	SignalHF Performance (ICD/CRT-D Devices)
Sensitivity for detecting HF hospitalization (%)	> 40%	59.8% [54.0%; 65.4%]
Unexplained Alert Rate PPY	< 2.0	0.654 [0.614; 0.692]
Lower quartile on alerting time (in days)	> 15 days	35.0 [27.0; 52.0]

For Pacemaker/CRT-P Devices:

Endpoints	Acceptance Criteria (Objective)	SignalHF Performance (Pacemaker/CRT-P Devices)
Sensitivity for detecting HF hospitalization (%)	> 30%	45.9% [38.1%; 53.8%]
Unexplained Alert Rate PPY	< 2.0	0.470 [0.441; 0.502]
Lower quartile on alerting time (in days)	> 15 days	37 [24.5; 53.0]

2. Sample Size and Data Provenance for the Test Set

Test Set (Clinical Cohort) Sample Size: 6,740 patients (comprising PM 7,360, ICD 5,642, CRT-D 4,116 and CRT-P 856 - Note: there appears to be a discrepancy in the total sum provided, however, "6,740" is explicitly stated as the 'Clinical cohort' which is the test set).
Data Provenance: Retrospective study using data from the French national health database "SNDS" (SYSTÈME NATIONAL DES DONNÉES DE SANTÉ) and Implicity proprietary databases. The follow-up period was 2017-2021.

3. Number of Experts and Qualifications for Ground Truth

The document does not explicitly state the number of experts used to establish ground truth or their specific qualifications (e.g., radiologist with 10 years of experience). However, the ground truth was "hospitalizations with HF as primary diagnosis" as recorded in the national health database, implying that these diagnoses were made by qualified healthcare professionals as part of routine clinical care documented within the SNDS.

4. Adjudication Method for the Test Set

The document does not specify an adjudication method like 2+1 or 3+1 for establishing the ground truth diagnoses. The study relies on “hospitalizations with HF as primary diagnosis” from the national health database, suggesting that these are established clinical diagnoses within the healthcare system.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

There is no indication that a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was done to evaluate human reader improvement with AI assistance. The study focuses solely on the standalone performance of the SignalHF algorithm.

6. Standalone Performance

Yes, a standalone (algorithm only without human-in-the-loop performance) study was done. The FORESEE-HF study evaluated the SignalHF algorithm's performance in predicting heart failure hospitalizations based on CIED data and personal health records.

7. Type of Ground Truth Used

The ground truth used was outcomes data, specifically "hospitalizations with HF as primary diagnosis" recorded in the French national health database (SNDS).

8. Sample Size for the Training Set

Training Cohort Sample Size: 7,556 patients

9. How the Ground Truth for the Training Set Was Established

The document states that the algorithm computes the HF score using physiological measures from compatible CIEDs and available Personal Health Records (demographics). It also mentions that the "recovery threshold on the SignalHF score established during the learning phase of the algorithm and fixed for all patients". This implies that the ground truth for the training set, similar to the test set, was derived from the same data sources: "hospitalizations with HF as primary diagnosis" documented within the SNDS database. The training process would have used these documented HF hospitalizations as the target outcome for the algorithm to learn from.

Ask a Question

Ask a specific question about this device

K Number

K231038

Device Name

Global Hypoperfusion Index (GHI) Algorithm

Manufacturer

Edwards Lifesciences, LLC

Date Cleared

2023-07-26

(105 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K203687,K200717

Predicate For

K232294

Intended Use

The Global Hypoperfusion Index (GHI) algorithm provides the clinician with physiological insight into a patient's likelihood of future hemodynamic instability. The GHI algorithm provides the risk of a global hypoperfusion event (defined as SvO2 ≤ 60% for at least 1 minute) occurring in the next 10-15 minutes.

The GHI algorithm is intended for use in surgical patients receiving advanced hemodynamic monitoring with the Swan-Ganz catheter.

The GHI algorithm is considered to provide additional information regarding the patient's predicted future risk for clinical deterioration, as well as identifying patients at low risk for deterioration. The product predictions are for reference only and no therapeutic decisions should be made based solely on the GHI algorithm predictions.

Device Description

The Global Hypoperfusion Index (GHI) parameter provides the clinician with physiological insight into a patient's likelihood of a global hypoperfusion event on average 10-15 minutes before mixed venous oxygen saturation (SvO2) reaches 60%. The GHI feature is intended for use in surgical or nonsurgical patients. The product predictions are adjunctive for reference only and no therapeutic decisions should be made based solely on the GHI parameter.

AI/ML Overview

The provided text does not contain detailed acceptance criteria or a comprehensive study report with all the requested information. It primarily presents the FDA's 510(k) clearance letter and a summary of the device, its indications for use, and a comparison to predicate devices, stating that performance testing was executed and that no clinical trial was performed for the 510(k) submission.

However, based on the available information, here's what can be extracted and inferred:

1. A table of acceptance criteria and the reported device performance:

The document mentions that the GHI algorithm provides the risk of a global hypoperfusion event (defined as SvO2 ≤ 60% for at least 1 minute) occurring in the next 10-15 minutes and alerts the clinician on average 10-15 minutes before SvO2 reaches 60%. It also states that the GHI algorithm provides an index from 0 to 100 where the higher the value, the increased likelihood that a global hypoperfusion event will occur.

While specific numerical acceptance criteria (e.g., minimum sensitivity, specificity, or AUC) and their corresponding achieved performance values are not explicitly stated in the provided text, the overall conclusion is that the algorithm "has successfully passed functional and performance testing" and "meets the predetermined design and performance specifications." This implies that internal acceptance criteria were met, even if they are not detailed here.

Example (Hypothetical, as not provided in text):

Metric	Acceptance Criteria (Hypothetical)	Reported Device Performance (Implied as "met")
Time to Alert	Average 10-15 minutes before event	Achieved average 10-15 minutes before event
Ability to Identify Risk	GHI 0-100, higher = increased risk	GHI provides increased likelihood with higher values
Overall Performance	Meets predetermined specifications	Met predetermined specifications

2. Sample size used for the test set and the data provenance:

Sample Size: The text states, "Prospective analyses of retrospective clinical data from multiple independent datasets, comprised of data from a diverse set of patients over the age of 18 years undergoing surgical procedures with invasive monitoring, were analyzed to verify the safety and performance of the subject device." However, the exact sample size (number of patients or data points) for the test set is not specified.
Data Provenance:
- Country of Origin: Not specified in the provided text.
- Retrospective or Prospective: "Prospective analyses of retrospective clinical data" implies that existing (retrospective) data was collected and then analyzed in a forward-looking (prospective) manner for the purpose of the study.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

The text does not provide any information regarding the number of experts, their qualifications, or their involvement in establishing ground truth for the test set.

4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:

The text does not provide any information regarding an adjudication method for the test set.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance:

The text explicitly states: "No clinical trial was performed in support of the subject 510(k)." This indicates that an MRMC comparative effectiveness study involving human readers and AI assistance was not conducted for this submission.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:

Yes, a standalone performance evaluation was conducted. The text states:

"Algorithm performance was tested using clinical data."
"The algorithm was tested at the algorithm level to ensure the safety of the device. All tests passed."
"Prospective analyses of retrospective clinical data... were analyzed to verify the safety and performance of the subject device."

This confirms that the algorithm's performance was assessed independently.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.):

The ground truth for a "global hypoperfusion event" is explicitly defined in the Indications for Use as: "SvO2 ≤ 60% for at least 1 minute." This is an objective physiological measurement (outcomes data) rather than expert consensus or pathology.

8. The sample size for the training set:

While the text mentions that "patient waveforms were collected in support of the development and validation of the GHI algorithm," the sample size for the training set is not specified.

9. How the ground truth for the training set was established:

Given that the ground truth for the device's output is based on SvO2 measurements, it is highly probable that the ground truth for the training set was established using the same objective physiological measurement: SvO2 ≤ 60% for at least 1 minute. The text implies that clinical data (patient waveforms) were used for both development and validation.

Ask a Question

Ask a specific question about this device

K Number

K200717

Device Name

CLEWICU System (ClewICUServer and ClewICUnitor)

Manufacturer

CLEW Medical Ltd.

Date Cleared

2021-01-09

(297 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

DEN160044,K183646,K190205

Predicate For

K212219,K231038,K233216,K233253

Intended Use

CLEWICU provides the clinician with physiological insight into a patient's likelihood of future hemodynamic instability. CLEWICU is intended for use with intensive care unit (ICU) patients 18 years and over. CLEWICU is considered to provide additional information regarding the patient's predicted future risk for clinical deterioration, as well as identifying patients at low risk for deterioration. The product predictions are for reference only and no therapeutic decisions should be made based solely on the CLEWICU predictions.

Device Description

The CLEWICU System is a stand-alone analytical software product that includes the ClewICUServer and the ClewICUnitor. It uses models derived from machine learning to calculate the likelihood of occurrence of certain clinically significant events for patients in the intensive care unit (ICU). ClewICUServer and ClewICUnitor are software-only devices that are installed on user-provided hardware. The ClewICUServer is a backend software platform that imports patient data from various sources including Electronic Health Record (EHR) data and medical device data. The data are then used by models operating within the ClewICUServer to compute and store the CLEWHI index (likelihood of hemodynamic instability requiring vasopressor / inotrope support), and CLEWLR (indication that the patient is at "low risk" for deterioration). The ClewICUnitor is the web-based user interface displaying CLEWHI, and CLEWLR associated notifications and related measures, as well as presenting the overall unit status.

AI/ML Overview

Here's a breakdown of the acceptance criteria and the study details for the CLEWICU System, based on the provided text:

1. Acceptance Criteria and Reported Device Performance

The acceptance criteria are implied by the reported performance metrics of the device, particularly the ranges for the 95% Confidence Intervals (CI). The study aimed to demonstrate acceptable performance for predicting hemodynamic instability (CLEWHI) and identifying low-risk patients (CLEWLR).

Metric	Acceptance Criteria (Implied by 95% CI)	Reported Device Performance
Hemodynamic Instability Model (CLEWHI)
Sensitivity	≥ 56.9%	60.6%
Positive Predictive Value (PPV)	≥ 20.7%	22.3%
Lead-time (true positive alerts)	Not explicitly quantified as a range, but reported as central tendencies for acceptable lead time.	Median: 3.0 hours, 25th Percentile: 1.6 hours, 75th Percentile: 4.8 hours
Low Risk Model (CLEWLR)
Specificity	≥ 94.8%	95.7%
Sensitivity	≥ 21.2%	21.4%

2. Sample Size Used for the Test Set and Data Provenance

Sample Size (Test Set): Not explicitly stated as a single number. The study utilized a dataset from the WakeMed Health System, including patient stays in 7 intensive care units across 2 hospitals. The number of patients or patient stays for the retrospective clinical validation is not precisely quantified, but it was used for both training and validation.
Data Provenance:
- Country of Origin: United States (WakeMed Health System indicated).
- Retrospective or Prospective: Retrospective.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

Number of Experts: Not explicitly stated as a specific number. The text mentions a "tagging system was developed and validated (against human physician readers as ground truth)." This implies multiple physician readers were involved in establishing or validating the ground truth for selected cases.
Qualifications of Experts: "Human physician readers." No further specific qualifications (e.g., years of experience, subspecialty) are provided.

4. Adjudication Method for the Test Set

Adjudication Method: Not explicitly detailed. The text states "a tagging system was developed and validated (against human physician readers as ground truth)." This suggests an indirect method where the tagging system learned from or was compared against physician assessments, rather than direct physician adjudication of every case in the test set.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

MRMC Study: No, an MRMC comparative effectiveness study was not explicitly mentioned or described. The study focused on the performance of the standalone AI system.
Effect Size with AI vs. Without AI Assistance: Not applicable, as an MRMC study was not performed.

6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study

Standalone Study: Yes, the described clinical validation is a standalone performance study. The reported metrics (Sensitivity, PPV, Specificity, Lead-time) directly reflect the algorithm's performance without explicit human intervention or assistance during the evaluation. The device is described as "a stand-alone analytical software product."

7. Type of Ground Truth Used

Type of Ground Truth: The ground truth was established by "a tagging system... validated (against human physician readers as ground truth)." This suggests a hybrid approach where an automated tagging system, verified by human expert consensus (physician readers), was used to create the labels for the "events of interest" (hemodynamic instability, low risk).

8. Sample Size for the Training Set

Sample Size (Training Set): Not explicitly stated as a specific number. The text mentions "the WakeMed dataset included patient stays in 7 intensive care units across 2 hospitals between 5 November 2019 and 30 June 2020." This dataset was used for "training of the CLEWICU predictive models," but the specific portion or number of cases allocated for training is not provided.

9. How the Ground Truth for the Training Set Was Established

Ground Truth for Training Set: "Once validated, the tagging system was used to generate the clinical truth labels needed, both for training of the CLEWICU predictive models and for validation of the clinical performance of those models." This indicates that the same "tagging system" (validated against human physician readers) was used to establish the ground truth for the training set.

Ask a Question

Ask a specific question about this device

Page 1 of 1