Search Results

The TSHL method is an in vitro diagnostic test for the quantitative measurement of Thyroid Stimulating Hormone (TSH, thyrotropin) in human serum and plasma on the Dimension® EXL™ integrated chemistry system with LOCI® Module. Measurements of TSH are used in the diagnosis and monitoring of thyroid disease.

The FT4L method is an in vitro diagnostic test for the quantitative measurement of Free Thyroxine in human serum and plasma on the Dimension® EXL™ integrated chemistry system with LOCI® Module. Measurements of free thyroxine are used in the diagnosis and monitoring of thyroid disease.

Device Description

The Dimension® LOCI® Thyroid Stimulating Hormone Flex® reagent cartridge (TSHL) and Dimension® LOCI® Free Thyroxine Flex® reagent cartridge (FT4L) assays were cleared under K081074 and K073604, respectively. The components of the cleared assays were modified to reduce biotin interference.

The modified assays comprise the following components:

Dimension® LOCI® Thyroid Stimulating Hormone Flex® reagent cartridge (TSHL): prepackaged liquid reagents in a plastic eight-well cartridge. Wells 1-2 contain Biotinylated TSH antibody (7.5 µg/mL mouse monoclonal), wells 3-4 contain TSH antibody coated Chemibeads (200 µg/mL mouse monoclonal), and wells 5-6 contain Streptavidin Sensibeads (1400 µg/mL recombinant E. coli). Wells 1-6 contain buffers, stabilizers and preservatives. Wells 7-8 are empty.

Dimension® LOCI® Free Thyroxine Flex® reagent cartridge (FT4L): prepackaged liquid reagents in a plastic eight-well cartridge. Wells 1-2 contain Streptavidin Sensibeads (225 µg/mL recombinant E. coli), wells 3-4 contain T3 Chemibeads (200 µg/mL), and wells 5-6 contain FT4 Biotinylated antibody (50 ng/mL mouse monoclonal). Wells 1-6 contain buffers, stabilizers and preservatives. Wells 7-8 are empty.

Test Principle: Both devices use a homogeneous chemiluminescent immunoassay based on LOCI® technology.
For TSHL, it's a sandwich immunoassay where sample is incubated with biotinylated antibody and Chemibeads to form bead-TSH-biotinylated antibody sandwiches. Sensibeads are added and bind to the biotin to form bead-pair immunocomplexes. Illumination at 680 nm generates singlet oxygen from Sensibeads which diffuses into Chemibeads, triggering a chemiluminescent reaction. The resulting signal is measured at 612 nm and is a direct function of TSH concentration.
For FT4L, it's a sequential immunoassay where sample is incubated with biotinylated antibody. T3 Chemibeads are added and form bead/biotinylated antibody immunocomplexes with the non-saturated fraction of the biotinylated antibody. Sensibeads are then added and bind to the biotin to form bead pair immunocomplexes. Illumination at 680 nm generates singlet oxygen from Sensibeads which diffuses into the Chemibeads, triggering a chemiluminescent reaction. The resulting signal is measured at 612 nm and is an inverse function of FT4 concentration.

AI/ML Overview

The document provided is a 510(k) clearance letter from the FDA for two in-vitro diagnostic (IVD) devices: Dimension® LOCI® Thyroid Stimulating Hormone Flex® reagent cartridge (TSHL) and Dimension® LOCI® Free Thyroxine Flex® reagent cartridge (FT4L). It describes the devices, their intended use, and the performance characteristics tested to demonstrate substantial equivalence to previously cleared predicate devices.

However, it's crucial to understand that this document describes a reagent cartridge, which is a laboratory assay, not an AI/ML-driven device or an imaging device. Therefore, many of the requested criteria (e.g., sample size for training/test sets for AI, data provenance like country of origin for AI, ground truth establishment by experts, adjudication methods, MRMC studies, standalone AI performance) are not applicable to this type of device. The document details the performance of the assay itself in measuring biomarker concentrations, not an AI's ability to interpret images or assist human readers.

I will interpret the request based on the information provided for this specific IVD device, noting where certain requested details are not relevant to the nature of the device.

Acceptance Criteria and Study to Prove Device Meets Criteria (for an IVD Reagent Cartridge)

The device in question, a reagent cartridge for quantitative measurement of TSH and FT4, is a laboratory assay, not an AI/ML or imaging interpretation device. Therefore, the "acceptance criteria" and "study" are focused on analytical performance characteristics (accuracy, precision, linearity, interference, detection limits, etc.) compared to a predicate device, rather than diagnostic accuracy metrics of an AI.

1. Table of Acceptance Criteria and Reported Device Performance

For an IVD reagent cartridge, "acceptance criteria" are typically defined by ranges, limits, or statistical agreements demonstrating analytical performance comparable or superior to the predicate device and meeting relevant clinical or analytical standards (e.g., CLSI guidelines). The reported performance demonstrates that the modified devices meet these standards.

Performance Characteristic	Acceptance Criteria (Implicit from CLSI Guidelines/Predicate Comparison)	Reported Device Performance (TSHL)	Reported Device Performance (FT4L)
Detection Limits	Meet/Be comparable to predicate; within acceptable analytical ranges.	LoB: 0.003 µIU/mL; LoD: 0.005 µIU/mL; LoQ: 0.007 µIU/mL	LoB: 0.03 ng/dL; LoD: 0.05 ng/dL; LoQ: 0.06 ng/dL
Linearity / Measuring Interval	Linear across the claimed measuring range with acceptable bias.	0.007 – 100 µIU/mL	0.1 – 8.0 ng/dL
Method Comparison (vs. Predicate)	High correlation (r close to 1), slope close to 1, small y-intercept.	N=145 serum samples; y = 0.99x + 0.039 µIU/mL (Correlation (r) implicitly high, as regression equation suggests strong agreement)	N=146 serum samples; y = 1.02x + 0.03 ng/dL (Correlation (r) implicitly high, as regression equation suggests strong agreement)
Precision (Repeatability)	Within-run and total precision (SD/CV) within acceptable clinical laboratory limits.	TSHL: Levels 0.110-88.676 µIU/mL; Within-Run %CV: 2.6-4.4%; Total %CV: 1.1-3.0% (Note: Table 5 "Total" %CV for Level 1 is 2.6%, matching within-run %CV, but for others, it's lower. This might be a typo in the table, typically Total CV > Within-Run CV).	FT4L: Levels 0.81-6.41 ng/dL; Within-Run %CV: 2.2-2.6%; Total %CV: 0.9-1.1%
Precision (Reproducibility)	Total reproducibility (SD/CV) across lots and systems within acceptable clinical laboratory limits.	TSHL: Levels 0.094-81.372 µIU/mL; Reproducibility %CV: 4.6-7.6%	FT4L: Levels 0.70-6.49 ng/dL; Reproducibility %CV: 1.8-2.4%
Recovery (Dilution)	For TSHL, diluted samples should show recovery close to 100% of the true value.	TSHL: Recovery ranged from 100% to 106% for various samples diluted 5x.	N/A (FT4L not described for dilution recovery)
Interference (Biotin)	Modified assay shows significantly reduced interference compared to predicate.	TSHL & FT4L: Specimens with biotin up to 1200 ng/mL demonstrate ≤10% change in results (significant improvement from predicate's 250 ng/mL for TSHL and 100 ng/mL for FT4L).	TSHL & FT4L: Specimens with biotin up to 1200 ng/mL demonstrate ≤10% change in results.
Reference Range Verification	Results from healthy samples confirm the established reference intervals.	TSHL: Verified for adults (0.358-3.74 µIU/mL) and pediatric populations.	FT4L: Verified for adults (0.76-1.46 ng/dL) and pediatric populations.
Matrix Comparison	Comparable performance across different sample matrices.	Comparable values to serum samples for lithium heparin, sodium heparin, and K2-EDTA plasma.	Same as TSHL.
Hook Effect	No significant hook effect within specified range.	No hook effect observed up to 30,000 µIU/mL.	N/A (FT4L not described for hook effect)
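
The interference and dilution-recovery rows above rest on two simple ratios. The following is a minimal sketch of how such figures are commonly computed, not the calculation used in the submission. The function names and all sample values are hypothetical; only the 10% limit comes from the table.

```python
def percent_recovery(measured: float, expected: float) -> float:
    """Recovery of a diluted sample as a percentage of the expected value."""
    return 100.0 * measured / expected


def percent_change(with_interferent: float, control: float) -> float:
    """Signed change of a spiked result relative to the unspiked control."""
    return 100.0 * (with_interferent - control) / control


def passes_interference(with_interferent: float, control: float,
                        limit_pct: float = 10.0) -> bool:
    """True when the absolute change stays within the limit (10% in the table)."""
    return abs(percent_change(with_interferent, control)) <= limit_pct


# Hypothetical TSH results in µIU/mL, for illustration only.
control_tsh = 4.00
spiked_tsh = 3.72          # same specimen with biotin added
diluted_measured = 10.3    # back-calculated result after a 5x dilution
diluted_expected = 10.0

print(round(percent_change(spiked_tsh, control_tsh), 1))               # -7.0
print(passes_interference(spiked_tsh, control_tsh))                    # True
print(round(percent_recovery(diluted_measured, diluted_expected), 1))  # 103.0
```

A recovery of 103% would fall inside the 100% to 106% range reported for TSHL, and a 7% drop would satisfy the ≤10% biotin criterion.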

2. Sample Sizes and Data Provenance for the Test Set

The concept of a "test set" in the context of an IVD reagent cartridge refers to the set of samples used for various analytical performance studies. These are not typically split into "training" and "test" sets as in AI/ML.

Method Comparison:
- TSHL: 145 patient samples (serum)
- FT4L: 146 patient samples (serum)
Precision (Repeatability): 5 serum samples (TSHL), 3 serum samples (FT4L)
Precision (Reproducibility): 5 serum samples (TSHL), 3 serum samples (FT4L)
Linearity: Low and high human serum pools used to create dilution series (TSHL: 12 levels, FT4L: 10 levels)
Interference (Biotin and HIL): Samples spiked with interferents, specific TSH/FT4 levels tested.
Dilution Recovery: 7 samples (TSHL)
Reference Range Verification: "Apparently healthy samples" (specific N not provided, but typically a statistically significant number for verification per CLSI EP28-A3C).
Matrix Comparison: Samples of various tube types (Serum, lithium heparin, sodium heparin, K2-EDTA plasma)

Data Provenance: The document does not specify the country of origin of the patient samples. The studies are explicitly described as analytical performance studies rather than clinical outcome studies, and they are retrospective (samples tested in the lab, not followed prospectively).

3. Number of Experts and Qualifications for Ground Truth

This is not applicable as the device is a quantitative IVD assay (reagent cartridge), not an AI/ML device requiring expert interpretation of complex clinical data or images. The "ground truth" for this device is the actual concentration of TSH or FT4 in the sample, typically established either by:

Reference methods (e.g., mass spectrometry, although not explicitly stated as the ground truth method here).
The predicate device itself (as used in method comparison studies, where the predicate is the "comparison assay").
Spiking known concentrations into matrices.

4. Adjudication Method for the Test Set

This is not applicable for a quantitative IVD reagent. Adjudication methods (e.g., 2+1, 3+1) are typically used in scenarios where human experts interpret data (like medical images), and their disagreements need to be resolved to establish a definitive ground truth for AI model evaluation.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

This is not applicable. An MRMC study is designed to evaluate the diagnostic performance of human readers, often with and without AI assistance, on a set of cases. This device is a reagent cartridge that provides a quantitative measurement, not an AI that assists human interpretation.

6. Standalone Performance (Algorithm Only Without Human-in-the-Loop)

This is not applicable. This device is a reagent cartridge that runs on an automated system, providing a quantitative result. It's inherently "standalone" in providing the measurement, but it's not an "algorithm only" in the sense of an AI interpreting complex data. The performance metrics listed (precision, accuracy relative to predicate, linearity, etc.) are its "standalone" performance.

7. Type of Ground Truth Used

The "ground truth" for this type of quantitative diagnostic test is based on:

Comparison to a legally marketed predicate device: The current, FDA-cleared versions of the TSHL and FT4L assays (K081074 and K073604) acted as the "gold standard" or comparison method for the method comparison studies.
Known concentrations: For linearity, recovery, and interference studies, samples were prepared with known concentrations or spiked with known amounts of analytes or interferents.
Analytically verified samples: Samples used for precision studies have mean values derived from repeated measurements.
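
When the predicate serves as the comparison method, agreement is usually summarized by a regression line and a correlation coefficient. The sketch below uses ordinary least squares as an assumption; the summary does not say which regression model (for example Deming or Passing-Bablok) the submission used. The function name and the paired values are hypothetical.

```python
from math import sqrt
from statistics import fmean


def compare_methods(predicate, candidate):
    """Ordinary least-squares fit of candidate (y) on predicate (x).

    Returns (slope, intercept, r). OLS is an illustrative choice only.
    """
    mean_x, mean_y = fmean(predicate), fmean(candidate)
    sxx = sum((x - mean_x) ** 2 for x in predicate)
    syy = sum((y - mean_y) ** 2 for y in candidate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicate, candidate))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical paired TSH results in µIU/mL (predicate vs. modified assay).
predicate = [0.5, 1.2, 2.8, 5.0, 12.0, 40.0]
candidate = [0.53, 1.23, 2.80, 5.01, 11.90, 39.70]

slope, intercept, r = compare_methods(predicate, candidate)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.4f}")
```

A slope near 1 and an intercept near 0 indicate that the two assays report similar values. The correlation coefficient has to be computed from the paired data and cannot be inferred from the regression equation alone.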

8. Sample Size for the Training Set

This is not applicable as the device is a non-AI/ML IVD reagent cartridge. There is no concept of a "training set" for this type of product. The development and optimization of the reagent formulation are internal processes, but they don't involve "training" a model on a dataset in the AI sense.

9. How Ground Truth for the Training Set Was Established

This is not applicable for the same reason as point 8.

K Number

K234091

Device Name

Maverick Diagnostic System TC1000; Maverick Test Panel A0.B0

Manufacturer

Genalyte, Inc.

Date Cleared

2024-07-22

(209 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

N/A

Predicate For

N/A

Intended Use

The Maverick Test Panel A0.B0 is an immunoassay for the quantitative determination of human thyroid stimulating hormone (thyrotropin, TSH) in human serum and K2EDTA plasma on the Maverick Diagnostic System TC1000. Measurements of thyroid stimulating hormone produced by the anterior pituitary are used in the diagnosis of thyroid or pituitary disorders.

Device Description

The Maverick Diagnostic System TC1000 is an automated immunoassay analyzer intended for in vitro diagnostic use to determine analytes in a clinical laboratory. The system's assay applications utilize silicon photonics technology. The Maverick Test Panel A0.B0 is an immunoassay for the quantitative determination of human thyroid stimulating hormone (thyrotropin, TSH) in human serum and K2EDTA plasma on the Maverick Diagnostic System TC1000.

AI/ML Overview

The provided text is an FDA 510(k) clearance letter for an in vitro diagnostic (IVD) device, specifically the Maverick Diagnostic System TC1000 and Maverick Test Panel A0.B0, which is an immunoassay for quantitative determination of human thyroid stimulating hormone (TSH).

Therefore, the concepts of "AI models," "human readers," "radiologists," "MRMC studies," "effect size," and establishing "ground truth for test/training sets by expert consensus/pathology/outcomes data" are not applicable to this type of device and its clearance process.

The FDA clearance for this IVD device is based on demonstrating substantial equivalence to a legally marketed predicate device. This typically involves performance studies (e.g., analytical performance, clinical performance) to show the new device performs as intended and is as safe and effective as the predicate. The "acceptance criteria" and "study that proves the device meets the acceptance criteria" for an IVD device like this would revolve around its analytical and clinical performance characteristics, not AI model metrics or reader studies.

Since the prompt's requested information format is tailored for an AI/CADe (Computer-Assisted Detection/Diagnosis) device, and the provided document describes an IVD device, directly answering the prompt's specific points (1-9) is not possible based on the text. The text does not contain information about AI model performance, human reader studies, or how a "ground truth" for an image-based AI would be established.

To address the spirit of the prompt, had this been an AI/CADe device, and assuming the information was present, here's how a response might look. However, it's crucial to reiterate that this specific document is not for such a device.

(Hypothetical response if the document were about an AI/CADe device, assuming the information was present within the document)

Disclaimer: The provided document is an FDA 510(k) clearance for an In Vitro Diagnostic (IVD) device (Maverick Diagnostic System TC1000 for TSH immunoassay), not an AI/CADe medical device. Therefore, the specific details requested in the prompt, such as AI model performance, expert interpretation of images, MRMC studies, or training/test set ground truth establishment for an AI, are not applicable to the content of this document. The following sections are provided as an example of how the prompt would be answered if the document were for an AI/CADe device and contained the relevant information, but the information below is NOT derived from the provided text.

(Hypothetical/Illustrative Answer - NOT based on the provided document)

(1) A table of acceptance criteria and the reported device performance

Acceptance Criterion (e.g., Performance Metric Threshold)	Reported Device Performance (e.g., AI Model X)
Sensitivity ≥ 90% for detecting Condition A	Sensitivity: 92.5%
Specificity ≥ 85% for Condition A	Specificity: 88.0%
AUC (Area Under the ROC Curve) ≥ 0.90	AUC: 0.915
False Positive Rate ≤ 5 per image	False Positive Rate: 4.2 per image
Mean processing time ≤ 5 seconds per image	Mean processing time: 3.8 seconds

(2) Sample size used for the test set and the data provenance

Test Set Sample Size: 500 cases (e.g., 250 positive for Condition A, 250 negative).
Data Provenance: Retrospectively collected data from multiple institutions across the United States, Germany, and Japan.

(3) Number of experts used to establish the ground truth for the test set and the qualifications of those experts

Number of Experts: 3 independent expert readers.
Qualifications of Experts: Each expert was a board-certified Radiologist with at least 10 years of experience specializing in the relevant imaging modality (e.g., thoracic imaging for lung nodules, breast imaging for mammography).

(4) Adjudication method for the test set

Adjudication Method: 2+1 adjudication. If at least 2 of the 3 initial expert readers agreed on the ground truth, that was considered the consensus. If no two readers agreed (e.g., each of the 3 assigned a different finding), a fourth, highly experienced senior expert (or an expert panel) performed a final review and adjudication.
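
The majority-then-escalate rule in this illustrative answer can be sketched as follows. Like the answer itself, this is hypothetical and not derived from the document; the function names and labels are assumptions.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def adjudicate(reads: Sequence[Hashable],
               tiebreak: Callable[[Sequence[Hashable]], Hashable]) -> Hashable:
    """Return the label at least two readers share, else defer to a tiebreaker."""
    label, votes = Counter(reads).most_common(1)[0]
    if votes >= 2:
        return label
    # No two readers agree: escalate to the senior expert (hypothetical callback).
    return tiebreak(reads)


def senior_expert(reads):
    """Stand-in for the final reviewer; a real review would inspect the case."""
    return "indeterminate"


print(adjudicate(["positive", "positive", "negative"], senior_expert))  # positive
print(adjudicate(["benign", "malignant", "uncertain"], senior_expert))  # indeterminate
```

With binary labels, three readers always produce a majority, so the escalation path only matters when more than two categories are possible.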

(5) Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance

MRMC Study Status: Yes, an MRMC comparative effectiveness study was conducted.
Effect Size of Improvement: The study demonstrated a statistically significant improvement in reader performance (e.g., AUC). When interpreting cases for Condition A, human readers assisted by the AI model showed a mean increase of 0.05 in AUC (from 0.85 without AI to 0.90 with AI assistance). This corresponded to a reduction in diagnostic error rate of 15%.

(6) Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done

Standalone Performance: Yes, standalone performance was evaluated. The algorithm's standalone AUC for Condition A was 0.915.

(7) The type of ground truth used

Type of Ground Truth: Expert consensus with confirmation by pathology for positive cases of Condition A. Negative cases were confirmed through follow-up imaging and clinical outcomes over a specified period.

(8) The sample size for the training set

Training Set Sample Size: 10,000 cases.

(9) How the ground truth for the training set was established

Training Ground Truth Establishment: The ground truth for the training set was primarily established by a single expert radiologist's initial review, followed by confirmation from a second expert. Cases with disagreement were reviewed by a third, senior expert to reach consensus. A subset of cases (e.g., 20%) had pathology confirmation available. Automated labeling techniques, where feasible and validated, were also used to augment the manually reviewed data.

K Number

K233050

Device Name

ADVIA Centaur® TSH3-Ultra II (TSH3ULII)

Manufacturer

Siemens Healthcare Diagnostics, Inc.

Date Cleared

2024-04-04

(192 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K083844, K150403

Predicate For

N/A

Intended Use

The ADVIA Centaur® TSH3-Ultra II (TSH3ULII) assay is for in vitro diagnostic use in the quantitative determination of thyroid-stimulating hormone (TSH, thyrotropin) in human serum and plasma (EDTA and lithium heparin) using the ADVIA Centaur® XP system. Measurements of thyroid stimulating hormone produced by the anterior pituitary are used in the diagnosis of thyroid or pituitary disorders.

Device Description

This assay is a third-generation assay that employs anti-FITC monoclonal antibody covalently bound to paramagnetic particles, an FITC-labeled anti-TSH capture mouse monoclonal antibody, and a tracer consisting of a proprietary acridinium ester and an anti-TSH mouse monoclonal antibody conjugated to bovine serum albumin (BSA) for chemiluminescent detection.

AI/ML Overview

The provided text describes the ADVIA Centaur® TSH3-Ultra II (TSH3ULII) assay, a device for in vitro diagnostic quantitative determination of thyroid-stimulating hormone (TSH). The document covers the device's indications for use, comparison with a predicate device, and performance characteristics data.

Here's an analysis of the acceptance criteria and the study proving the device meets them, based on the provided text:

1. Table of Acceptance Criteria and Reported Device Performance

Performance Characteristic	Acceptance Criteria (Design Goal)	Reported Device Performance
Detection Capability	N/A (LoB, LoD, LoQ are reported values, not acceptance criteria for determination)	- Limit of Blank (LoB): 0.005 µIU/mL (mIU/L) - Limit of Detection (LoD): 0.008 µIU/mL (mIU/L) - Limit of Quantitation (LoQ): 0.010 µIU/mL (mIU/L) (Higher than the predicate device's LoQ of 0.008 µIU/mL, but matching the lower limit of the new device's measuring interval)
Precision	- Repeatability (Within-Run): - ≤ 12% CV for 0.020–0.299 µIU/mL (mIU/L) - ≤ 6% CV for ≥ 0.300–90.000 µIU/mL (mIU/L) - ≤ 7% CV for > 90.000 µIU/mL (mIU/L) - Within-Laboratory (Total Precision): - ≤ 16% CV for 0.020–0.299 µIU/mL (mIU/L) - ≤ 8% CV for ≥ 0.300–90.000 µIU/mL (mIU/L) - ≤ 10% CV for > 90.000 µIU/mL (mIU/L)	Reported values (all calculated Repeatability CV and Within-Laboratory CV values are within the specified limits): - Serum A (0.088 µIU/mL): Repeatability CV 2.5%, Within-Lab CV 3.6% - Serum B (0.196 µIU/mL): Repeatability CV 1.8%, Within-Lab CV 3.1% - Serum C (0.507 µIU/mL): Repeatability CV 1.7%, Within-Lab CV 2.6% - Serum D (4.752 µIU/mL): Repeatability CV 2.3%, Within-Lab CV 2.7% - Serum E (46.749 µIU/mL): Repeatability CV 2.4%, Within-Lab CV 4.0% - Serum F (97.929 µIU/mL): Repeatability CV 2.2%, Within-Lab CV 3.5% Similar acceptable results for Plasma and Controls.
Reproducibility	- Reproducibility (Total): - ≤ 18.5% CV for 0.020-0.299 µIU/mL (mIU/L) - ≤ 10.5% CV for ≥ 0.300-90.000 µIU/mL (mIU/L) - ≤ 12.5% CV for > 90.000 µIU/mL (mIU/L)	Reported values (all calculated Reproducibility CV values are within the specified limits): - Serum A (0.090 µIU/mL): Reproducibility CV 3.11% - Serum B (0.178 µIU/mL): Reproducibility CV 4.87% - Serum C (0.474 µIU/mL): Reproducibility CV 2.21% - Serum D (4.684 µIU/mL): Reproducibility CV 2.47% - Serum E (56.562 µIU/mL): Reproducibility CV 2.33% - Serum F (99.522 µIU/mL): Reproducibility CV 4.12% Similar acceptable results for Plasma and Controls.
Assay Comparison	- Correlation coefficient (r) ≥ 0.95 - Slope of 1.0 ± 0.1	- Correlation coefficient (r): 0.999 - Regression Equation (Slope): 0.97 (within 1.0 ± 0.1)
Specimen Equivalency	- Correlation coefficient (r) ≥ 0.95 - Slope of 0.95-1.05	- Plasma, EDTA vs. Serum: r = 0.999, Slope = 0.99 (within 0.95-1.05) - Plasma, lithium heparin vs. Serum: r = 0.990, Slope = 1.01 (within 0.95-1.05)
Interferences (HIL)	Bias due to hemoglobin, bilirubin (conjugated/unconjugated), and Intralipid does not exceed 10%	Hemoglobin (500mg/dL), Bilirubin (40mg/dL), Intralipid (1000mg/dL) do not exceed 10% bias at TSH concentrations of ~0.900 µIU/mL and ~8.000 µIU/mL.
Interferences (Other Substances)	Bias due to various common substances does not exceed 10%	Various substances (e.g., Acetaminophen, Aspirin, Biotin, Heparin, Ibuprofen, Levothyroxine) at specified concentrations do not exceed 10% bias at TSH concentrations of ~0.900 µIU/mL and ~8.000 µIU/mL.
Cross-Reactivity	Cross-reactivity of hCG, FSH, and LH does not exceed 5%	hCG (200000 mIU/mL), FSH (1500 mIU/mL), LH (600 mIU/mL) at specified concentrations do not exceed 5% cross-reactivity at TSH concentrations of ~0.400, 5.00, 17.00, and 90.00 µIU/mL.
Linearity	Device is linear throughout its measuring interval (0.010–150.000 µIU/mL)	"The assay is linear for the measuring interval of 0.010–150.000 µIU/mL (mIU/L)."
High-Dose Hook Effect	Results for TSH concentrations above the measuring interval and up to 3000 µIU/mL should report > 150 µIU/mL (not a paradoxical decrease)	"Patient samples with TSH concentrations above the measuring interval and as high as 3000 µIU/mL will report > 150 µIU/mL (mIU/L)." (This confirms the absence of a significant high-dose hook effect within this specified range, meaning the device displays the result as above the measurable range.)

The study that proves the device meets the acceptance criteria is detailed across the "Performance Characteristics Data" section (Section 8) of the 510(k) Summary.
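
The precision row of the table can be checked mechanically. The sketch below encodes the stated repeatability and within-laboratory CV limits by concentration band and applies them to the six serum results quoted in the table. The function names are hypothetical, and assigning concentrations between 0.299 and 0.300 µIU/mL to the lowest band is an assumption, since the stated bands leave that gap open.

```python
def cv_limits(conc: float) -> tuple[float, float]:
    """Return (repeatability limit %, within-lab limit %) for a TSH level in µIU/mL."""
    if conc < 0.020:
        raise ValueError("below the lowest band with a stated limit")
    if conc < 0.300:       # stated band: 0.020-0.299 µIU/mL
        return 12.0, 16.0
    if conc <= 90.000:     # stated band: >= 0.300-90.000 µIU/mL
        return 6.0, 8.0
    return 7.0, 10.0       # stated band: > 90.000 µIU/mL


# Serum results from the table: (mean µIU/mL, repeatability CV %, within-lab CV %).
SERUM = {
    "A": (0.088, 2.5, 3.6),
    "B": (0.196, 1.8, 3.1),
    "C": (0.507, 1.7, 2.6),
    "D": (4.752, 2.3, 2.7),
    "E": (46.749, 2.4, 4.0),
    "F": (97.929, 2.2, 3.5),
}


def check_precision(samples: dict) -> dict:
    """Map each sample name to True when both CVs are within their limits."""
    results = {}
    for name, (conc, rep_cv, lab_cv) in samples.items():
        rep_limit, lab_limit = cv_limits(conc)
        results[name] = rep_cv <= rep_limit and lab_cv <= lab_limit
    return results


print(check_precision(SERUM))  # every sample maps to True
```

All six serum samples pass, consistent with the table's statement that the calculated CV values are within the specified limits.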

2. Sample Size Used for the Test Set and Data Provenance

Detection Capability (LoQ): Not specified (CLSI Document EP17-A2 was followed).
Precision: 80 measurements (replicates of 2, 2 runs/day, 20-day protocol) for each of the 6 serum samples, 5 plasma samples, and 5 control samples. Total N for Precision study is 80 x (6+5+5) = 1280 measurements.
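Reproducibility: 225 measurements for each of the 6 serum samples, 5 plasma samples, and 5 control samples. The stated design (replicates of 5, 1 run/day, 5-day protocol) accounts for only 25 measurements per sample, so the figure of 225 implies additional factors, such as multiple sites, instruments, or reagent lots, that are not identified here. Total N for Reproducibility study is 225 x (6+5+5) = 3600 measurements.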
Assay Comparison: 973 samples.
Specimen Equivalency:
- Plasma, EDTA vs. Serum: 52 samples.
- Plasma, lithium heparin vs. Serum: 57 samples.
Interferences (HIL and Other Substances): Samples at two TSH concentrations (~0.900 µIU/mL and ~8.000 µIU/mL) were tested for each interfering substance. The exact number of individual samples tested is not given, but multiple samples would be required for the two TSH levels per substance.
Cross-Reactivity: Samples at four TSH concentrations (~0.400, 5.00, 17.00, and 90.00 µIU/mL) were spiked with hCG, FSH, or LH. The exact number of individual samples is not given.
Linearity: Not explicitly stated, but performed in accordance with CLSI Document EP06-ed2, which involves testing multiple diluted samples.
High-Dose Hook Effect: Samples with TSH concentrations up to 3000 µIU/mL were evaluated. The number of samples tested is not explicitly stated.
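
The measurement counts quoted for the precision and reproducibility studies can be reproduced with a few lines of arithmetic. This is a sketch for checking the totals; the function name is hypothetical, and the extra factor in the reproducibility design is inferred from the numbers rather than stated in the document.

```python
def measurements_per_sample(replicates: int, runs_per_day: int, days: int,
                            other_factors: int = 1) -> int:
    """Measurements for one sample under a nested replicate/run/day design."""
    return replicates * runs_per_day * days * other_factors


n_samples = 6 + 5 + 5  # serum + plasma + control samples

# Precision: replicates of 2, 2 runs/day, 20 days.
precision = measurements_per_sample(2, 2, 20)
print(precision, precision * n_samples)  # 80 1280

# Reproducibility: replicates of 5, 1 run/day, 5 days gives 25 per sample,
# while the stated per-sample count is 225.
listed = measurements_per_sample(5, 1, 5)
stated = 225
print(listed, stated // listed, stated * n_samples)  # 25 9 3600
```

The precision totals follow directly from the stated design. The reproducibility count is nine times what the listed factors alone produce, which would be consistent with, for instance, a 3 x 3 arrangement of sites and lots, although the summary does not say so.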

Data Provenance: The document does not explicitly state the country of origin or whether the data was retrospective or prospective. Given it's an in vitro diagnostic device, the samples would generally be human biological specimens, likely collected from a clinical laboratory setting. The use of CLSI documents (Clinical and Laboratory Standards Institute) suggests standard laboratory practices.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Their Qualifications

This information is not provided in the document. For an in vitro diagnostic assay like TSH, "ground truth" is typically established by:

The reference method against which the new assay is compared (for accuracy/assay comparison). In this case, "ADVIA Centaur TSH3-UL assay" is used as the comparative assay (the predicate device).
Traceability to an international standard (WHO 3rd International Reference Preparation for human TSH (IRP 81/565)), which implies that the TSH values are calibrated against a universally accepted standard, rather than expert consensus on individual patient cases.

4. Adjudication Method for the Test Set

This refers to the process of resolving discrepancies in expert opinions, which is not applicable here as it is an analytical performance study for an IVD, not an interpretative AI device requiring human expert label agreement.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. This type of study is typically performed for AI-assisted diagnostic imaging devices where human readers interpret medical images with and without AI assistance. This document pertains to an in vitro diagnostic assay, which involves automated quantitative measurement of a biomarker.

6. Standalone (Algorithm Only Without Human-in-the-Loop Performance) Study

Yes, implicitly. The entire performance characterization (Detection Capability, Precision, Reproducibility, Assay Comparison, Specimen Equivalency, Interferences, Cross-Reactivity, Linearity, High-Dose Hook Effect) is describing the standalone performance of the TSH3-Ultra II assay as an automated laboratory test. There is no mention of "human-in-the-loop" for this device's intended diagnostic function.

7. Type of Ground Truth Used

The ground truth for the performance studies is established by:

Reference materials/standards: For accuracy, the assay is traceable to the World Health Organization (WHO) 3rd International Reference Preparation for human TSH (IRP 81/565).
Comparative method: For assay comparison, the predicate device (ADVIA Centaur TSH3-UL assay) results serve as the comparative standard.
Defined concentrations: For precision, linearity, interferences, and cross-reactivity, samples with known or spiked concentrations of TSH or interfering substances are used.

8. Sample Size for the Training Set

The document does not report a training set sample size. This is because the ADVIA Centaur TSH3-Ultra II is a chemical immunoassay, not a machine learning or AI-based device that would typically involve a "training set" in the computational sense. The "development" process for such an assay involves reagent formulation, assay optimization, and calibration curve development, which are distinct from training an AI model.

9. How the Ground Truth for the Training Set Was Established

As there is no "training set" in the context of an AI/ML model, this question is not applicable. The assay's analytical characteristics are established through various studies (precision, accuracy, linearity, etc.) using calibrated materials and established reference methods, as detailed in the performance characteristics.

K Number

K222116

Device Name

Atellica® CI Analyzer, Atellica® IM Thyroid Stimulating Hormone 3-Ultra (TSH3-UL), Atellica® CH Albumin BCP (AlbP)

Manufacturer

Siemens Healthcare Diagnostics Inc.

Date Cleared

2023-07-13

(360 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K151792, K151767

Predicate For

N/A

Intended Use

The Atellica® CI Analyzer is an automated, integrated system designed to perform in vitro diagnostic tests on clinical specimens. The system is intended for the qualitative and quantitative analysis of various body fluids, using photometric, turbidimetric, chemiluminescent, and integrated ion-selective electrode technology for clinical use.

The Atellica® IM Thyroid Stimulating Hormone 3-Ultra (TSH3-UL) assay is for in vitro diagnostic use in the quantitative determination of thyroid-stimulating hormone (TSH, thyrotropin) in human serum and plasma (EDTA and lithium heparin) using the Atellica® CI Analyzer. Measurements of thyroid stimulating hormone produced by the anterior pituitary are used in the diagnosis of thyroid or pituitary disorders.

The Atellica® CH Albumin BCP (AlbP) assay is for in vitro diagnostic use in the quantitative measurement of albumin in human serum and plasma (lithium heparin, potassium EDTA) using the Atellica® CI Analyzer. Albumin measurements are used in the diagnosis and treatment of numerous diseases primarily involving the liver or kidneys.

Device Description

The Atellica® CI Analyzer is an automated, integrated system designed to perform in vitro diagnostic tests on clinical specimens. The system is intended for the qualitative and quantitative analysis of various body fluids, using photometric, turbidimetric, chemiluminescent, and integrated ion-selective electrode technology for clinical use.

The Atellica CI Analyzer with Atellica® Rack Handler supports both clinical chemistry (CH) and Immunoassay (IM) features and contains all the necessary hardware, electronics, and software to automatically process samples and generate results, including sample and reagent dispensing, mixing, and incubating.

The Atellica IM TSH3-UL assay is a third-generation assay that employs anti-FITC monoclonal antibody covalently bound to paramagnetic particles, an FITC-labeled anti-TSH capture mouse monoclonal antibody, and a tracer consisting of a proprietary acridinium ester and an anti-TSH mouse monoclonal antibody conjugated to bovine serum albumin (BSA) for chemiluminescent detection.

The Atellica CH Albumin BCP (AlbP) assay is an adaptation of the bromocresol purple dye-binding method reported by Carter and Louderback et al. In the Atellica CH AlbP assay, serum or plasma albumin quantitatively binds to BCP to form an albumin-BCP complex that is measured as an endpoint reaction at 596/694 nm.

AI/ML Overview

The document provided is a 510(k) summary for in vitro diagnostic devices (IVDs), specifically the Atellica® CI Analyzer and its associated assays for Thyroid Stimulating Hormone (TSH3-UL) and Albumin (AlbP). IVDs, by their nature, measure specific analytes in biological samples and are evaluated against performance criteria such as precision, accuracy, linearity, and interference, rather than diagnostic accuracy metrics like sensitivity and specificity that would typically apply to AI/ML software. Therefore, many of the requested elements pertaining to AI/ML acceptance criteria and human-in-the-loop studies are not applicable to this type of device.

Here's a breakdown of the relevant information provided:

1. A table of acceptance criteria and the reported device performance:

The document describes the performance characteristics for the Atellica IM Thyroid Stimulating Hormone 3-Ultra (TSH3-UL) assay and the Atellica CH Albumin BCP (AlbP) assay. These are performance criteria, which serve as the acceptance criteria for the device's analytical performance.

Atellica IM Thyroid Stimulating Hormone 3-Ultra (TSH3-UL) Assay:

Performance Characteristic	Acceptance Criteria (Implied)	Reported Device Performance
Limit of Blank (LoB)	Must meet defined statistical criteria (CLSI EP17-A2)	0.004 µIU/mL (mIU/L)
Limit of Detection (LoD)	Must meet defined statistical criteria (CLSI EP17-A2)	0.008 µIU/mL (mIU/L)
Limit of Quantitation (LoQ)	Within-laboratory CV ≤ 20%	0.008 µIU/mL (mIU/L)
Precision (Serum Samples)	Repeatability and Within-Laboratory CVs within acceptable ranges	Ranges from 1.1% to 1.5% for CV (Repeatability) and 1.9% to 3.3% for CV (Within-Laboratory) across various concentrations.
Assay Comparison (Serum)	Correlation coefficient (r) > 0.960 (per AlbP section, assumed similar for TSH3-UL)	r = 0.996 (compared to Atellica IM Analyzer)
Interfering Substances	Bias due to interfering substances ≤ 10% (for specific concentrations)	Hemoglobin, Bilirubin (conjugated/unconjugated), Lipemia (Intralipid®) show biases of -0.1% to -3%.
Other Substances	Bias due to these substances ≤ 10% (at specified TSH concentrations)	No interference (bias ≤ 10%) from listed substances (e.g., Biotin, Cholesterol, Acetaminophen, etc.) at tested concentrations.
Specimen Equivalency	Correlation coefficient (r) indicative of equivalence	Plasma (Lithium heparin) vs. Serum: r = 1.00; Plasma (EDTA) vs. Serum: r = 1.00
High-Dose Hook Effect	Report > 150.000 µIU/mL (mIU/L) for high TSH concentrations	Samples with TSH concentrations as high as 3000 µIU/mL (mIU/L) will report > 150.000 µIU/mL (mIU/L).
Cross-Reactivity	Bias due to cross-reacting substances ≤ 5%	Human Chorionic Gonadotropin, Follicle Stimulating Hormone, Luteinizing Hormone show differences of -2.1% to 1.7%.
Onboard Dilution Recovery	Recovery within an acceptable range (e.g., 90-110%)	Mean recovery of 99.3% and 100.1% for serum, 100.5% and 99.3% for plasma across dilutions.
Linearity	Demonstrated linearity over the claimed measuring range (0.008-150.000 µIU/mL)	Y=0.9945*X-0.0011, demonstrating linearity.

Atellica CH Albumin BCP (AlbP) Assay:

Performance Characteristic	Acceptance Criteria	Reported Device Performance
Limit of Blank (LoB)	≤ 0.1 g/dL (≤ 1 g/L)	0.1 g/dL (1 g/L)
Limit of Detection (LoD)	≤ 0.6 g/dL (≤ 6 g/L)	0.5 g/dL (5 g/L)
Limit of Quantitation (LoQ)	Within-laboratory precision ≤ 10%	0.5 g/dL (5 g/L)
Precision (Serum Samples)	Repeatability and Within-Laboratory CVs within acceptable ranges	Ranges from 0.6% to 1.3% for CV (Repeatability) and 1.7% to 2.6% for CV (Within-Laboratory) across various concentrations.
Reproducibility	Repeatability, Between-Day, Between-Instrument, Between-Lot, Total Reproducibility within acceptable ranges	Total Reproducibility CVs range from 1.4% to 1.9%.
Assay Comparison	Correlation coefficient (r) > 0.960 and slope 1.00 ± 0.10	r = 0.999; y = 0.98x + 0.0 g/dL (compared to Atellica CH Analyzer)
Specimen Equivalency	Correlation coefficient (r) indicative of equivalence	Plasma (Lithium heparin) vs. Serum: r = 0.995; Plasma (Potassium EDTA) vs. Serum: r = 0.997
Hemolysis, Icterus, Lipemia (HIL)	≤ 10% interference from hemoglobin, bilirubin, and lipemia	Biases typically within 9% for tested concentrations.
Non-Interfering Substances	Bias due to these substances ≤ 10%	Biases typically within 10% for listed substances.
Linearity	Demonstrated linearity over the claimed measuring range (0.5-8.0 g/dL)	Y=0.9984*X+0.2891, demonstrating linearity.
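The assay comparison row above states its acceptance criteria numerically (r > 0.960 and slope 1.00 ± 0.10), so the check can be sketched in a few lines. This is only an illustration: the function name and parameter names are hypothetical, and the 510(k) summary does not describe how the manufacturer actually evaluated the criterion.

```python
def meets_comparison_criteria(r, slope, r_min=0.960, slope_target=1.00, slope_tol=0.10):
    """Return True when r exceeds r_min and the slope lies within target +/- tolerance.

    Default limits are the AlbP assay comparison criteria quoted in the table.
    """
    return r > r_min and abs(slope - slope_target) <= slope_tol


# Reported AlbP assay comparison result: r = 0.999, y = 0.98x + 0.0 g/dL
albp_ok = meets_comparison_criteria(r=0.999, slope=0.98)
print(albp_ok)
```

With the reported values the check passes; a slope of 1.20 or an r of 0.95 would fail it.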

2. Sample sizes used for the test set and the data provenance:

TSH3-UL Assay:
- Precision: 80 samples for each type (Serum A-F, EDTA Plasma A-C, Heparin Plasma A-C, Control 1-3).
- Assay Comparison (Serum): 112 samples.
- Interferences (Specific substances): Not explicitly stated how many samples per substance, but concentrations tested at two analyte levels.
- Specimen Equivalency: 64 samples for Plasma (Lithium heparin) and 64 for Plasma (EDTA).
- Onboard Dilution Recovery: 3 samples (Serum and Plasma) tested at two dilution levels.
- Linearity: Not explicitly stated, but "at least 14 levels created by mixing high and low serum samples" with N=5 replicates per level.
AlbP Assay:
- LoD: 486 determinations (270 blank, 216 low level replicates).
- LoQ: n=5 replicates using 3 reagent lots over 5 days.
- Precision: N ≥ 80 for each sample (Serum 1-3, Serum QC 1).
- Reproducibility: 225 samples for each serum level (assayed n=5 in 1 run for 5 days using 3 instruments and 3 reagent lots).
- Assay Comparison (Serum): 106 samples.
- Specimen Equivalency: 76 samples for Plasma (Lithium heparin) and 55 for Plasma (Potassium EDTA).
- HIL: Not explicitly stated how many samples per interferent, but concentrations tested at two analyte levels.
- Non-Interfering Substances: Not explicitly stated how many samples per substance, but tested at two analyte concentrations.
- Linearity: "at least nine levels created by mixing the high and low pools of serum" with N=5 replicates per level.

Data Provenance: The document does not explicitly state the country of origin or whether the data was retrospective or prospective. Given it's a 510(k) submission for a medical device intended for broad use, it's highly likely the studies were prospective analytical validation studies conducted under controlled laboratory conditions, typically in multiple sites to ensure robustness.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g., radiologist with 10 years of experience):

This information is not applicable to this type of device. The "ground truth" for clinical laboratory assays like TSH and Albumin comes from established analytical methods, reference materials, and accepted scientific principles of chemistry and immunology. It's about measuring the concentration of an analyte, not interpreting an image or diagnosing a condition based on expert consensus. The "experts" involved would be clinical chemists, laboratory scientists, and engineers responsible for assay development and validation, following established guidelines like those from CLSI (Clinical and Laboratory Standards Institute).

4. Adjudication method (e.g., 2+1, 3+1, none) for the test set:

This is not applicable for this type of device. Adjudication methods are used in studies involving subjective interpretations (e.g., image reading) where multiple readers provide opinions that need to be reconciled to establish ground truth. For quantitative chemical assays, the "truth" is determined by reference methods and the intrinsic properties of the analyte, not by human consensus or adjudication.

5. If a multi reader multi case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance:

This is not applicable. An MRMC study is designed for evaluating the impact of a system on human readers' diagnostic performance, typically in the context of imaging. This document describes an automated in vitro diagnostic analyzer and its assays, which do not involve human "readers" in the sense of interpreting outputs like medical images.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:

The performance characteristics presented (precision, linearity, assay comparison, interference, etc.) represent the standalone performance of the device and its assays. The Atellica® CI Analyzer and its assays are automated systems designed to perform measurements without human interpretative input beyond setting up the instrument and following standard laboratory procedures for running samples and quality control.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc):

The ground truth for these quantitative assays is established through:

Reference Methods / Comparability: Performance is evaluated by comparing the new device's results to those of legally marketed predicate devices (Siemens Trinidad systems), which serve as the reference. This establishes the equivalence of the new device to already accepted technology.
Traceability to International Standards: For TSH3-UL, traceability is to the World Health Organization (WHO) 3rd International Standard for human TSH (IRP 81/565). For AlbP, traceability is to ERM-DA470k Reference Material. These international standards or reference materials provide the "true" or accepted values against which the device's measurements are calibrated and verified.
Analytical Procedures: The "ground truth" for characteristics like limit of detection, precision, and linearity is determined by rigorous statistical methods and established protocols (e.g., CLSI guidelines EP05-A3, EP07-ed3, EP09c-ed3, EP17-A2, EP06-ED2) during analytical validation.

8. The sample size for the training set:

This information is not applicable in the context of an IVD where "training set" implies machine learning or AI model development. For an IVD, there is a development and validation process. The number of samples for analytical validation studies (which is what is presented) is given under point 2.

9. How the ground truth for the training set was established:

As this is not an AI/ML device, the concept of a "training set" for an algorithm and its associated ground truth establishment methods (e.g., expert annotations) are not applicable. The "ground truth" or reference for the development and validation of these IVD assays is based on established laboratory practices, chemical principles, certified reference materials, and comparison to predicate devices, as described in point 7.

K Number

K221225

Device Name

Access TSH (3rd IS) Assay, DxI 9000 Access Immunoassay Analyzer

Manufacturer

Beckman Coulter, Inc.

Date Cleared

2022-11-10

(196 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K153651, K121214

Predicate For

N/A

Intended Use

The Access TSH (3rd IS) assay is a paramagnetic particle, chemiluminescent immunoassay for the quantitative determination of human thyroid-stimulating hormone (thyrotropin, TSH, hTSH) levels in human serum and plasma using the Access Immunoassay Systems. Measurements of thyroid stimulating hormone produced by the anterior pituitary are used in the diagnosis of thyroid or pituitary disorders. This assay is capable of providing 3rd generation TSH results.

The DxI 9000 Access Immunoassay Analyzer is an in vitro diagnostic device used for the quantitative, semiquantitative, or qualitative determination of various analyte concentrations found in human body fluids.

Device Description

Access TSH (3rd IS) assay is a two-site immunoenzymatic ("sandwich") assay. The Access TSH (3rd IS) reagent kit is in a liquid ready-to-use format designed for optimal performance on Beckman Coulter's immunoassay analyzers. Each reagent kit contains two reagent packs. Other items needed to run the assay include substrate, calibrators and wash buffer.

The DxI 9000 Access Immunoassay Analyzer is a fully automated, continuous, random-access sample processing and analysis instrument. The DxI 9000 Access Immunoassay Analyzer uses enzyme immunoassays (utilizing paramagnetic particle solid phase and chemiluminescent detection) for the quantitative, semi-quantitative or qualitative determination of various analyte concentrations found in human body fluids.

AI/ML Overview

The provided text describes the performance of the Access TSH (3rd IS) Assay on the DxI 9000 Access Immunoassay Analyzer. This is an in vitro diagnostic device, and the detailed information typically provided for AI/ML devices regarding ground truth establishment, expert qualifications, and MRMC studies is not directly applicable to this type of device.

Here's a breakdown based on the information available:

1. A table of acceptance criteria and the reported device performance

Performance Metric	Acceptance Criteria	Reported Device Performance
Method Comparison	R² ≥ 0.90 and Slope 1.00 ± 0.10	R²: 1.00 Slope: 1.06 (95% CI: 1.04, 1.07) Intercept: -0.019 (95% CI: -0.10, -0.0037)
Imprecision (Within-laboratory/Total)	Not explicitly stated as a single overall acceptance criterion, but implied to be within acceptable limits for a diagnostic assay.	For TSH concentrations > 0.02 µIU/mL: - %CV ranged from 2.5% to 4.5%. For TSH concentrations ≤ 0.02 µIU/mL: - SD ranged from 0.0007 to 0.0014. (Detailed table with specific sample means and SD/CV values provided in the text)
Reproducibility	SD ≤ 0.0038 for values ≤ 0.02 µIU/mL; CV < 13.0% for values > 0.02 µIU/mL	For values ≤ 0.02 µIU/mL (Sample 1, mean 0.024): Overall Reproducibility SD = 0.0010 (meets criteria) For values > 0.02 µIU/mL: - Sample 2 (mean 0.37): Reproducibility CV = 4.3% (meets criteria) - Sample 3 (mean 4.8): Reproducibility CV = 3.7% (meets criteria) - Sample 5 (mean 12): Reproducibility CV = 3.8% (meets criteria) - Sample 4 (mean 46): Reproducibility CV = 3.4% (meets criteria)
Linearity	Not explicitly stated as a numerical criterion, but implies demonstration of accuracy across the measuring range.	Linear throughout the analytical measuring interval of approximately 0.01 - 50.0 µIU/mL (mIU/L).
Limit of Blank (LoB)	Not explicitly stated as a numerical criterion, but implies a low detection threshold.	0.002 µIU/mL
Limit of Detection (LoD)	Not explicitly stated as a numerical criterion, but implies a low detection threshold.	0.003 µIU/mL
Limit of Quantitation (LoQ)	LoQ must be greater than or equal to LoD.	Maximum LoQ determined was 0.001 µIU/mL, but reported as 0.003 µIU/mL to align with LoD, following CLSI EP17-A2 recommendation.
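Two of the rules in the table above are simple enough to sketch: the LoQ is reported no lower than the LoD, and reproducibility CVs for values above 0.02 µIU/mL must stay below 13.0%. The snippet below is a minimal illustration using the values quoted in the table; the function and variable names are hypothetical and not taken from the submission.

```python
CV_LIMIT_PERCENT = 13.0  # criterion for values > 0.02 µIU/mL, as quoted in the table


def reported_loq(determined_loq, lod):
    """LoQ cannot be reported below LoD, so report the larger of the two."""
    return max(determined_loq, lod)


# Reproducibility CVs (%) reported for the samples above 0.02 µIU/mL
reported_cvs = {"Sample 2": 4.3, "Sample 3": 3.7, "Sample 5": 3.8, "Sample 4": 3.4}
cv_results = {name: cv < CV_LIMIT_PERCENT for name, cv in reported_cvs.items()}

print(reported_loq(determined_loq=0.001, lod=0.003))
print(cv_results)
```

Sample 1 is left out because the table evaluates it against the SD criterion rather than the CV criterion.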

2. Sample size used for the test set and the data provenance

Method Comparison Test Set: 111 serum samples.
Imprecision Test Set: 80 replicates per sample level across multiple runs/days.
Reproducibility Test Set: 75 replicates per sample level.
Data Provenance: Not explicitly stated (e.g., country of origin, retrospective/prospective). Given it's an in vitro diagnostic device, these samples are typically laboratory-generated or clinical samples collected for analytical validation.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

This information is not applicable to the analytical performance studies conducted for this in vitro diagnostic device. The "ground truth" for these studies is established by the reference method (the predicate device for method comparison) or by known concentrations/spikes for other performance characteristics like linearity, LoB, LoD, LoQ, and imprecision.

4. Adjudication method (e.g. 2+1, 3+1, none) for the test set

Not applicable. Adjudication methods are typically used in clinical studies or for subjective interpretations of results, not for the analytical performance of an immunoassay system where quantitative measurements are directly compared.

5. If a multi reader multi case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance

Not applicable. This device is an automated immunoassay analyzer, not an AI-assisted diagnostic tool that requires human interpretation or aids human readers.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

Yes, the studies presented as "Summary of studies" directly represent the standalone performance of the Access TSH (3rd IS) Assay on the DxI 9000 Access Immunoassay Analyzer. The device is an automated system, and its performance is evaluated directly without human interpretation in the analytical process.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)

Method Comparison: The predicate device, Access TSH (3rd IS) Assay on UniCel DxI 800 Immunoassay System (K153651), served as the reference or "ground truth" for comparative performance.
Imprecision, Reproducibility, Linearity, LoB, LoD, LoQ: "Ground truth" is established by the study design, using known concentrations, spiked samples, or statistical methods (e.g., CLSI guidelines) to define true values and assess the device's ability to measure them accurately and precisely.

8. The sample size for the training set

Not applicable. This is not an AI/ML device that requires a training set in the typical sense. Its performance is based on the inherent analytical characteristics of the reagents and instrumentation.

9. How the ground truth for the training set was established

Not applicable, as there is no "training set" in the context of an AI/ML algorithm.

K Number

K190773

Device Name

Elecsys TSH

Manufacturer

Roche Diagnostics

Date Cleared

2019-04-16

(21 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

N/A

Predicate For

K191899

Intended Use

Immunoassay for the in vitro quantitative determination of thyrotropin in human serum and plasma. Measurements of TSH are used in the diagnosis of thyroid and pituitary disorders.

The electrochemiluminescence immunoassay "ECLIA" is intended for use on cobas e immunoassay analyzers.

Device Description

The Elecsys TSH immunoassay makes use of a sandwich test principle using monoclonal antibodies specifically directed against human TSH. The antibodies labeled with ruthenium complex consist of a chimeric construct from human- and mouse-specific components. The Elecsys TSH immunoassay is used for the in vitro quantitative determination of thyroid stimulating hormone in human serum and plasma. It is intended for use on the cobas e immunoassay analyzers.

AI/ML Overview

The Elecsys TSH device is an immunoassay for the in vitro quantitative determination of thyrotropin in human serum and plasma, used in the diagnosis of thyroid and pituitary disorders. It is an electrochemiluminescence immunoassay (ECLIA) intended for use on cobas e immunoassay analyzers. The key change in the updated device is a two-step approach to block biotin interference by adding an antibody to bind free biotin in the sample and changing the linker on the biotinylated capture antibody.

Here's an analysis of the acceptance criteria and the study that proves the device meets them:

1. A table of acceptance criteria and the reported device performance

The document provides performance data across various non-clinical studies. The acceptance criteria are generally implied by "All samples met the predetermined acceptance criterion" or "All lots met the predetermined acceptance criterion" for studies like precision, LoB, LoD, LoQ, and linearity. For interference studies, the "No interference seen up to" values represent the performance vs. a defined limit. For lot-to-lot reproducibility, the comparability of SDs and CVs implicitly confirms acceptance. For method comparison, the statistical results (slope, intercept, correlation coefficient, bias) are compared against internal acceptance criteria.

Clinical / Technical Feature	Acceptance Criteria (Explicit or Implied)	Reported Device Performance
Repeatability & Intermediate Precision	All samples met the predetermined acceptance criterion.	CVs for repeatability ranged from 0.7% to 3.4%. CVs for intermediate precision ranged from 1.5% to 11.2%.
Lot-to-Lot Reproducibility	Calculated SDs and CVs for multiple lots comparable to single lot precision study.	Calculated SDs and CVs for multiple lots were comparable.
Limit of Blank (LoB)	All lots met the predetermined acceptance criterion.	0.0025 µIU/mL
Limit of Detection (LoD)	All lots met the predetermined acceptance criterion.	0.005 µIU/mL
Limit of Quantitation (LoQ)	All lots met the predetermined acceptance criterion.	0.005 µIU/mL
Linearity/Assay Reportable Range	All deviations within predetermined acceptance criteria.	Linear in the range from 0.004 - 102 µIU/mL.
High Dose Hook Effect	No hook effect observed up to a specified concentration.	No hook effect up to 1000 µIU/mL TSH.
Biotin Interference (Endogenous)	Biotin interference not exceeding a specified threshold.	No biotin interference in serum concentrations up to 1200 ng/mL. (Previous limitation was ≤ 102 nmol/L or ≤ 25 ng/mL).
Lipemia (Intralipid) Interference	No interference seen up to 1500 mg/dL.	No interference seen up to 2000 mg/dL.
Hemoglobin Interference	No interference seen up to 1000 mg/dL.	No interference seen up to 1000 mg/dL.
Bilirubin Interference	No interference seen up to 41 mg/dL.	No interference seen up to 66 mg/dL.
Rheumatoid Factor (RF) Interference	No interference seen up to 1500 IU/mL.	No interference seen up to 1500 IU/mL.
Immunoglobulin (IgG) Interference	No interference seen up to 2 g/dL.	No interference seen up to 3.98 g/dL.
Immunoglobulin (IgM) Interference	No interference seen up to 0.5 g/dL.	No interference seen up to 0.72 g/dL.
Analytical Specificity (Cross-Reactivity)	All cross-reactivities met the predefined acceptance criterion at the specified concentration.	LH, FSH, hCG showed 0.000% cross-reactivity at high concentrations; hGH not detectable.
Exogenous Interferences (Drugs)	Each compound found to be non-interfering at the drug concentration.	All 30 tested drugs (commonly and specially used) showed no significant interference at concentrations at least 3x maximum daily doses (or 1x for some).
Sample Matrix Comparison	Regression analysis (Passing/Bablok) data consistent with acceptance criteria for various plasma types and different separating gels.	Slope (0.976 to 0.983), Intercept (-0.0006 to -0.021), Correlation (0.999 to 1.00) for serum vs. plasma. Recovery acceptable for PST/SST.
Method Comparison to Predicate	All data met predefined acceptance criteria for agreement between candidate (updated assay) and predicate (current assay).	Passing-Bablok: Slope 0.974, Intercept -0.0002, Correlation 0.999. Bias at 0.27 µIU/mL: -2.7%, Bias at 4.2 µIU/mL: -2.6%.
Stability	Pre-specified acceptance criteria were met.	Stability data supports Roche Diagnostics' claims as reported in package inserts.
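The bias figures in the method comparison row are consistent with evaluating the reported Passing-Bablok line at each decision level. The sketch below assumes that is how they were derived, which the summary does not state; `percent_bias` is a hypothetical helper name.

```python
def percent_bias(x, slope, intercept):
    """Percent difference between the regression prediction and the reference value x."""
    predicted = slope * x + intercept
    return 100.0 * (predicted - x) / x


# Reported Passing-Bablok fit: slope 0.974, intercept -0.0002
for level in (0.27, 4.2):  # decision levels in µIU/mL
    print(level, round(percent_bias(level, slope=0.974, intercept=-0.0002), 1))
```

Under this assumption the computed values round to -2.7% at 0.27 µIU/mL and -2.6% at 4.2 µIU/mL, matching the table.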

2. Sample size used for the test set and the data provenance (e.g. country of origin of the data, retrospective or prospective)

Precision (Repeatability & Intermediate Precision): The test set involved 2 replicates per run for 21 days, across 2 runs per day, for PreciControl Universal, PC Thyro Sensitive, and 5 human serum samples. This is a prospective study design for precision. Sample types were native, single human donors as well as pools.
Lot-to-Lot Reproducibility: 2 replicates of each of 5 human serum samples per run, 2 runs per day, for 21 days (7 days per lot, n=28 determinations per lot). Prospective design. Sample types were native, single human donors as well as pools.
Limit of Blank (LoB): Five blank samples with two replicates each per run, for 6 runs on ≥ 3 days (total 60 determinations for analyte free samples). Prospective design.
Limit of Detection (LoD): Five low analyte samples with two replicates each per run, for 6 runs on ≥ 3 days (total 60 replicates per sample per reagent lot). Prospective design.
Limit of Quantitation (LoQ): 25 replicates per sample per reagent lot, over 5 days (1 run per day). Prospective design.
Linearity/Assay Reportable Range: Three high analyte human serum samples were diluted and measured in 3-fold determination within a single run. Prospective design.
High Dose Hook Effect: Three human serum samples spiked with analyte, dilution series performed, measured in one-fold determination. Prospective design.
Endogenous Interference: Varied by interferent. For Biotin, Lipemia, Hemoglobin, Bilirubin, RF, IgG, IgM, samples were spiked with interfering substances and diluted into a dilution pool in 10% increments. The number of individual samples/pools is not explicitly stated but implies multiple. Prospective design.
Analytical Specificity/Cross-Reactivity: A native human serum sample pool was used for each potential cross-reacting compound. Prospective design.
Exogenous Interferences (Drugs): Two human serum samples (native serum pools) were used. Prospective design.
Sample Matrix Comparison: A minimum of 56 serum/plasma pairs per sample material (Li-heparin, K2-EDTA, K3-EDTA plasma) were tested in singleton. For PST/SST, blood from five donors was used, measured in duplicate. Prospective design.
Method Comparison to Predicate: 138 samples (129 native human serum samples and 9 diluted human serum samples, single donors as well as pools diluted) were measured in singleton. Prospective design.
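The determination counts implied by the study designs above follow from multiplying the design factors. The sketch below is only an arithmetic cross-check; the helper name and keyword names are illustrative, and the 84-determination figure for the precision study is derived here rather than stated in the document.

```python
from math import prod


def determinations(**factors):
    """Multiply the design factors (replicates, runs, days, ...) into a total count."""
    return prod(factors.values())


# Precision: 2 replicates per run, 2 runs per day, 21 days (per sample)
precision_n = determinations(replicates=2, runs_per_day=2, days=21)
# Lot-to-lot: 2 replicates per run, 2 runs per day, 7 days per lot (per sample)
lot_n = determinations(replicates=2, runs_per_day=2, days_per_lot=7)
# LoB: 5 blank samples, 2 replicates per run, 6 runs
lob_n = determinations(samples=5, replicates=2, runs=6)

print(precision_n, lot_n, lob_n)
```

The lot-to-lot and LoB results agree with the stated n=28 per lot and 60 total determinations.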

Data Provenance: The document does not explicitly state the country of origin for the human serum and plasma samples. However, the manufacturer, Roche Diagnostics, operates globally with establishments in Mannheim and Penzberg, Germany, and Indianapolis, USA. The studies typically indicate the use of "human serum" or "human serum samples" without further geographic specification. All described studies appear to be prospective experimental designs conducted in a laboratory setting for device validation.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g. radiologist with 10 years of experience)

This device is an in vitro diagnostic immunoassay testing for a quantifiable biomarker (TSH), not an imaging device or a device requiring expert interpretation of complex clinical data to establish ground truth for its performance characteristics. The ground truth for such assays is established through analytical methods and reference standards (e.g., spectrophotometry for linearity, spiked samples for interference, reference materials for precision, comparison to a predicate device). No human experts are used to "establish primary ground truth" in the sense of clinical diagnosis for these analytical performance studies. The ground truth is the actual concentration of the analyte, or the known characteristics of the samples (e.g., spiked amount of interferent).

4. Adjudication method (e.g. 2+1, 3+1, none) for the test set

Not applicable. As described above, this is an in vitro diagnostic analytical performance study, not a clinical study requiring adjudication of diagnoses or interpretations by experts.

5. If a multi reader multi case (MRMC) comparative effectiveness study was done, and if so, what was the effect size of how much human readers improve with AI vs without AI assistance

Not applicable. This device is an automated immunoassay, an in vitro diagnostic (IVD) test, not an imaging device or AI-driven diagnostic tool that would involve human "readers" or AI assistance.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

Yes, the studies described are all standalone performance studies of the Elecsys TSH immunoassay system. It's an automated device (cobas e immunoassay analyzer) that provides quantitative results without human intervention in the measurement process itself, beyond sample loading and general operation/maintenance.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc)

The ground truth used for these analytical studies includes:

Known concentrations: For LoB, LoD, LoQ, and linearity, the samples are either analyte-free, at known low concentrations, or dilutions from known high concentrations.
Spiked samples: For interference studies (endogenous and exogenous), known amounts of interfering substances are added to samples.
Reference materials/standards: For precision, controls with defined concentrations are used. Traceability is to the 2nd IRP WHO Reference Standard 80/558.
Comparison to a legally marketed predicate device: For method comparison, the results of the new device are compared quantitatively to those of the predicate device.
Clinical samples (native human serum/plasma): These are used to assess the device's performance across a range of physiological concentrations and in real-world matrices for studies like precision, linearity, and matrix comparison.

8. The sample size for the training set

This document describes a 510(k) submission for a revised immunoassay, not a machine learning or AI-based device. Therefore, there is no "training set" in the context of algorithm development. The development of the assay itself would have involved extensive R&D and analytical testing to optimize reagents and protocols, but this is distinct from training an AI model on a dataset.

9. How the ground truth for the training set was established

Not applicable, as there is no "training set" in the AI/ML context for this device. The assay's performance characteristics are established through the non-clinical studies detailed in the summary.

K Number

K170232

Device Name

AFIAS TSH-SP, AFIAS TSH-VB, AFIAS-6/SP Analyzer, AFIAS-6/VB Analyzer

Manufacturer

BODITECH MED INC.

Date Cleared

2017-10-13

(261 days)

Product Code

Regulation Number

Type

Panel

Reference & Predicate Devices

K042281

Predicate For

N/A

Intended Use

AFIAS TSH-SP, for use in conjunction with the AFIAS-6/SP Analyzer, is an immunofluorometric test system intended for in vitro diagnostic use at clinical laboratories and Point-of-Care (POC) sites for the quantitative measurement of thyroid stimulating hormone (TSH) levels in serum, sodium-heparinized plasma samples. The test system is intended for use as an aid in the diagnosis of thyroid or pituitary disorders.

AFIAS-6/SP Analyzer is a fluorescence-scanning instrument for in vitro diagnostic use at clinical laboratories and Point-of-Care (POC) sites in conjunction with various in vitro diagnostic AFIAS immunoassays for measuring the concentration of designated analytes in serum or plasma samples.

AFIAS TSH-VB, for use in conjunction with the AFIAS-6/VB Analyzer, is an immunofluorometric test system intended for in vitro diagnostic use at clinical laboratories and Point-of-Care (POC) sites for the quantitative measurement of thyroid stimulating hormone (TSH) levels in sodium-heparinized or EDTA venous whole blood samples. The test system is intended for the monitoring of TSH levels in euthyroid and hypothyroid individuals.

AFIAS-6/VB Analyzer is a fluorescence-scanning instrument for in vitro diagnostic use at clinical laboratories and Point-of-Care (POC) sites in conjunction with various in vitro diagnostic AFIAS immunoassays for measuring the concentration of designated analytes in venous whole blood samples.

Device Description

The AFIAS TSH-SP and AFIAS TSH-VB Test Cartridges are plastic structures molded as disposable, self-contained, unitized devices that house the 'lyophilized detection buffer', the 'diluent' (i.e., reconstitution buffer), and the 'test strip', all of which are integral components of the test. Each test cartridge is an elongated structure 140 mm long, 17 mm wide, and 17 mm high.

'AFIAS TSH-SP ID Chip' as well as 'AFIAS TSH-VB ID Chip' is a flat, rectangular device with its main body measuring 24 mm × 20 mm × 3 mm. Another rectangular portion measuring 12 mm × 10 mm × 2 mm protrudes out from the apical side of the main body. The ID Chip is an electronic memory device fitted into a plastic matrix. Lot-specific 'ID Chip' is an integral component of the test.

The AFIAS-6/SP and AFIAS-6/VB analyzers are compact, bench-top, automated fluorometric analyzers measuring 42 cm (L) × 33.6 cm (W) × 29.3 cm (H) and weighing 15.1 kg. Each is a closed-system fluorometer.

AI/ML Overview

Here's an analysis of the provided text, outlining the acceptance criteria and study details for the AFIAS TSH devices:

Acceptance Criteria and Device Performance for AFIAS TSH-SP and AFIAS TSH-VB

Note: The document presents acceptance criteria implicitly through performance study results and comparisons to a predicate device. Specific numerical acceptance criteria (e.g., "CV must be <= X%") are not explicitly stated for all metrics but are inferred from the reported "achieved performance" values. Where a direct "acceptance criterion" is not explicitly given, the achieved performance itself serves as the benchmark demonstrated by the study.

1. Table of Acceptance Criteria and Reported Device Performance

Feature/Metric	Acceptance Criterion (Inferred from Predicate/Good Practice)	AFIAS TSH-SP Reported Performance	AFIAS TSH-VB Reported Performance
Limit of Blank (LoB)	Not explicitly stated; expected to be low.	0.03 µIU/ml	0.13 µIU/ml
Limit of Detection (LoD)	Not explicitly stated; expected to be low.	0.05 µIU/ml	0.2 µIU/ml
Limit of Quantitation (LoQ) / Functional Sensitivity	Inter-assay CV ≤ 20% (explicitly stated for calculation)	0.07 µIU/ml with inter-assay CV ≤ 20%	0.3 µIU/ml with inter-assay CV ≤ 20%
Measuring/Reportable Range	Comparable to predicate device.	0.07 - 80.00 µIU/ml	0.3 - 80.0 µIU/ml
Linearity (R²)	High correlation (implied by good linearity).	0.9993 (Serum)	0.9997 (Na-heparinized whole blood)
Susceptibility to High-dose Hook Effect	No hook effect below a very high concentration.	No hook/prozone effect up to 3000 µIU/ml	No hook/prozone effect up to 3000 µIU/ml
Analytical Specificity (Interference/Cross-reactivity)	Analyte recovery within 90-110% in presence of specified interferants/cross-reactants.	Analyte recovery within 90-110%	Analyte recovery within 90-110%
Site-to-Site Precision/Reproducibility (Total Imprecision %CV)	Not explicitly stated; typical clinical assay precision targets vary (e.g., <10-15%).	Individual site total %CVs range from 3.7% to 6.7% (low to moderate TSH). Combined sites total %CVs range from 4.0% to 5.6%.	Individual site total %CVs range from 6.3% to 8.1% (low to moderate TSH). Combined sites total %CVs range from 6.5% to 7.2%.
Matrix Comparison (Correlation Coefficient)	High correlation (e.g., >0.95 or higher).	0.9998 (Serum vs. Sodium heparin plasma), 0.9997 (Serum vs. Di-Potassium EDTA plasma)	0.9998 (Sodium heparin venous whole blood vs. Di-Potassium EDTA venous whole blood)
Clinical Method Comparison (Correlation Coefficient)	High correlation (e.g., >0.95 or higher) with predicate device.	0.9994	0.9999
Clinical Method Comparison (Weighted Deming Regression Slope)	Close to 1 (e.g., 0.9-1.1) to indicate agreement with predicate.	0.976	0.909
Clinical Method Comparison (Weighted Deming Regression Y-intercept)	Close to 0 to indicate agreement with predicate.	-0.003	0.012
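
The method comparison rows above report a weighted Deming regression slope and intercept. As a rough illustration of how such estimates arise, the sketch below implements the simpler unweighted Deming regression. It is not the weighted procedure used in the submission, and the function name, the `delta` parameter, and the sample values are hypothetical.

```python
import math


def deming_regression(x, y, delta=1.0):
    """Unweighted Deming regression of y (candidate) on x (comparator).

    delta is the assumed ratio of the error variance in y to the error
    variance in x; 1.0 treats both methods as equally imprecise.
    Returns (slope, intercept).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired TSH results (µIU/ml), for illustration only.
predicate = [0.5, 2.0, 5.0, 15.0, 40.0, 80.0]
candidate = [0.48, 1.97, 4.90, 14.6, 39.1, 78.3]
slope, intercept = deming_regression(predicate, candidate)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

A slope near 1 and an intercept near 0 indicate agreement with the comparator. Unlike ordinary least squares, Deming regression allows for measurement error in both methods.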

2. Sample Size and Data Provenance

Limit of Blank (LoB):
- Test Set Sample Size: 5 unique blank/TSH-depleted human serum samples (for TSH-SP) and 5 unique blank/TSH-depleted whole blood samples (for TSH-VB). Each tested in 5 replicates, with 3 lots on 3 analyzers for 3 days, leading to 75 replicates per lot/analyzer.
- Data Provenance: Not explicitly stated (e.g., country). Appears to be laboratory-controlled samples (TSH-depleted).
Limit of Detection (LoD):
- Test Set Sample Size: 5 unique low TSH-spiked human serum samples (for TSH-SP) and 5 unique low TSH-spiked whole blood samples (for TSH-VB). Each tested in 5 replicates, with 3 lots on 3 analyzers for 3 days, leading to 75 replicates per lot/analyzer.
- Data Provenance: Not explicitly stated (e.g., country). Appears to be laboratory-controlled samples (TSH-spiked).
Limit of Quantitation (LoQ):
- Test Set Sample Size:
  - TSH-SP: 5 low TSH-spiked serum samples, tested in 2 replicates daily in two runs, for 21 successive days (total 210 measurements per sample per lot/analyzer combination).
  - TSH-VB: 4 low TSH-spiked venous whole blood samples, tested in 5 replicates daily in two runs, for 5 successive days (total 200 measurements per sample per lot/analyzer combination).
- Data Provenance: Not explicitly stated (e.g., country). Appears to be laboratory-controlled samples (TSH-spiked).
Linearity and Reportable Range:
- Test Set Sample Size: 22 test samples each for TSH-SP (serum) and TSH-VB (whole blood), prepared by mixing high and TSH-depleted samples. Each tested in triplicate.
- Data Provenance: Not explicitly stated (e.g., country). Laboratory-prepared samples.
Susceptibility to High-dose Hook Effect:
- Test Set Sample Size: 12 spiked samples (TSH concentrations 25 to 3000 µIU/ml). Tested in triplicate.
- Data Provenance: Not explicitly stated (e.g., country). Laboratory-prepared samples.
Analytical Specificity:
- Test Set Sample Size: Samples spiked with various interferants/cross-reactants. Specific number of samples not detailed, but substances and concentrations are listed.
- Data Provenance: Not explicitly stated (e.g., country). Laboratory-prepared samples.
Site-to-Site Precision/Reproducibility:
- Test Set Sample Size:
  - TSH-SP: 4 serum samples (TSH levels ~0.5, ~5.0, ~15.0 & ~55.0 µIU/ml). Each tested in 5 replicates, with 3 lots on 3 analyzers (1 per site) by 9 operators (3 per site). Total 15 replicates per sample per site, 45 replicates per sample combined.
  - TSH-VB: 4 whole blood samples (TSH levels ~0.5, ~5.0, ~15.0 & ~55.0 µIU/ml). Each tested in 5 replicates, with 3 lots on 3 analyzers (1 per site) by 9 operators (3 per site). Total 15 replicates per sample per site, 45 replicates per sample combined.
- Data Provenance: External point-of-care sites. Not explicitly stated (e.g., country). Uses clinical samples (Clinical Serum Sample 1, 2, Clinical Venous Whole Blood Sample 1, 2) and spiked samples (Spiked Serum Sample 3, 4, Spiked Venous Whole Blood Sample 3, 4).
Matrix Comparison:
- Test Set Sample Size:
  - TSH-SP: 81 matched serum vs. sodium heparin plasma samples; 79 matched serum vs. Di-Potassium EDTA plasma samples.
  - TSH-VB: 63 matched sodium heparin venous whole blood vs. Di-Potassium EDTA venous whole blood samples.
- Data Provenance: Clinical samples from same study subjects. Not explicitly stated (e.g., country).
Adult Reference Interval:
- Test Set Sample Size:
  - TSH-SP: 128 apparently healthy adults (65 males, 63 females, age 21-70 years) for serum samples.
  - TSH-VB: 133 apparently healthy adults (69 males, 64 females, age 21-70 years) for sodium heparin venous whole blood samples.
- Data Provenance: Not explicitly stated (e.g., country). Appears to be prospective collection of healthy adult samples.
Clinical Method Comparison:
- Test Set Sample Size:
  - TSH-SP: 183 serum samples (including 22 spiked).
  - TSH-VB: 157 sodium heparinized venous whole blood samples (including 22 spiked).
- Data Provenance: Clinical sites (three point-of-care sites). Samples collected from across the three study sites. Not explicitly stated (e.g., country).
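
The precision and LoQ studies above are summarized as %CV values, with LoQ tied to an inter-assay CV of at most 20%. The sketch below shows one simplified way to compute %CV and pick the lowest tested level that meets a CV limit. The function names and replicate values are hypothetical, and a real functional-sensitivity analysis may instead fit a precision profile across levels.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / mean


def functional_sensitivity(replicates_by_level, cv_limit=20.0):
    """Return the lowest mean concentration whose %CV is at or below cv_limit.

    replicates_by_level is a list of replicate lists, one per tested level.
    Returns None if no level meets the limit.
    """
    passing = [
        statistics.mean(values)
        for values in replicates_by_level
        if percent_cv(values) <= cv_limit
    ]
    return min(passing) if passing else None


# Hypothetical low-level replicate results (µIU/ml), for illustration only.
levels = [
    [0.02, 0.04, 0.03, 0.05],
    [0.06, 0.07, 0.08, 0.07],
    [0.30, 0.31, 0.29, 0.30],
]
for values in levels:
    print(f"mean={statistics.mean(values):.3f}, CV={percent_cv(values):.1f}%")
print("lowest level meeting the CV limit:", functional_sensitivity(levels))
```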

3. Number of Experts and Qualifications for Ground Truth

The document describes in vitro diagnostic devices for measuring TSH levels. The performance studies for these types of devices primarily rely on established analytical methods and reference standards rather than expert human interpretation of images or clinical cases.
No "experts" were used to establish ground truth in the typical sense of a diagnostic imaging study (e.g., radiologists interpreting images). Instead, ground truth is established by:
- Reference testing (e.g., predicate device, or other established laboratory methods) for method comparison studies.
- Known concentrations for spiked samples (LoD, LoQ, Linearity, Hook Effect, Analytical Specificity).
- Large cohorts of 'apparently healthy adults' for reference intervals.

4. Adjudication Method

None specified. For in vitro diagnostic assays, ground truth is typically analytical (known concentrations, reference method results) rather than requiring adjudication of human interpretations.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No. This is an in vitro diagnostic (IVD) test, not an imaging device that requires human interpretation. Therefore, an MRMC study comparing human readers with and without AI assistance is not applicable. The precision study did involve multiple operators at POC sites using the device, but this is different from an MRMC study for diagnostic interpretation.

6. Standalone Performance Study

Yes. All the analytical and clinical studies described (LoB, LoD, LoQ, Linearity, Hook Effect, Analytical Specificity, Site-to-Site Precision, Matrix Comparison, and Reference Interval determination) assess the algorithm/device performance in a standalone manner. The "human-in-the-loop" aspect is limited to the operator performing the test according to instructions, not interpreting results in a diagnostic imaging sense.
The Clinical Method Comparison study also implicitly evaluates standalone performance by comparing the device's results to a predicate device.

7. Type of Ground Truth Used

Known concentrations: For LoB, LoD, LoQ, Linearity, Hook Effect, and Analytical Specificity, ground truth is established by preparing samples with known or precisely characterized TSH concentrations (e.g., TSH-depleted, TSH-spiked samples).
Predicate device results: For Clinical Method Comparison, the results from the Access Fast hTSH (on the Access 2 system) are used as the reference/ground truth for comparison.
Statistically derived from healthy population: For Adult Reference Interval determination, ground truth is derived from the statistical distribution (2.5th and 97.5th percentiles) of TSH levels in a large cohort of apparently healthy adults.
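
A reference interval defined by the 2.5th and 97.5th percentiles can be estimated nonparametrically from the sorted results. The sketch below uses Python's `statistics.quantiles`, whose default "exclusive" method places the p-th quantile at rank p(n+1). The function name and the data are hypothetical, and the submission does not state which estimator was actually used.

```python
import statistics


def reference_interval(results):
    """Return the (2.5th, 97.5th) percentiles of the results.

    With n=40, statistics.quantiles returns 39 cut points at 2.5% steps,
    so the first and last cut points bound the central 95%.
    """
    cuts = statistics.quantiles(results, n=40)
    return cuts[0], cuts[-1]


# Hypothetical stand-in for TSH results from a healthy cohort.
cohort = [0.4 + 0.03 * k for k in range(128)]
lower, upper = reference_interval(cohort)
print(f"reference interval: {lower:.2f} to {upper:.2f}")
```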

8. Sample Size for the Training Set

The document does not explicitly describe a "training set" in the context of machine learning or AI models, as this is an IVD device and the performance studies focus on analytical validation.
However, the calibration process for the device (Lot-specific master calibration curve encoded in an ID chip) implies that a set of characterized samples would have been used by the manufacturer to establish these curves. The size of this internal calibration data set is not provided in this document.

9. How the Ground Truth for the Training Set Was Established

As above, due to this being an IVD device and not an AI/ML model with a distinct "training set" in the conventional sense, this information is not explicitly provided.
The calibration curves provided on the ID chips would have been established by the manufacturer using a reference method and a range of TSH standards/samples with known concentrations. This would involve a comprehensive analytical process to ensure accuracy and precision across the measuring range.

K Number

K171103

Device Name

Lumipulse G TSH-III Immunoreaction Cartridges

Manufacturer

Fujirebio Diagnostics, Inc.

Date Cleared

2017-07-28

(106 days)

Reference & Predicate Devices

K983442

Predicate For

N/A

Intended Use

For in vitro diagnostic use. Lumipulse G TSH-III is a Chemiluminescent Enzyme Immunoassay (CLEIA) for the quantitative determination of thyroid stimulating hormone (TSH) in human serum on the LUMIPULSE G System. Lumipulse G TSH-III is to be used as an aid in the diagnosis of thyroid or pituitary disorders.

Device Description

Lumipulse G TSH-III is an assay system, including a set of immunoassay reagents, for the quantitative measurement of TSH in specimens based on CLEIA technology by a two-step sandwich immunoassay method on the LUMIPULSE G System. TSH in specimens specifically binds to anti-human TSH monoclonal antibody (mouse) on the particles, and antigen-antibody immunocomplexes are formed. The particles are washed and rinsed to remove unbound materials. Alkaline phosphatase (ALP: calf)-labeled anti-human TSH monoclonal antibody (mouse) specifically binds to TSH of the immunocomplexes on the particles, and additional immunocomplexes are formed. The particles are washed and rinsed to remove unbound materials. Substrate Solution is added and mixed with the particles. AMPPD contained in the Substrate Solution is dephosphorylated by the catalysis of ALP indirectly conjugated to particles. Luminescence (at a maximum wavelength of 477 nm) is generated by the cleavage reaction of dephosphorylated AMPPD. The luminescent signal reflects the amount of TSH.

AI/ML Overview

The Lumipulse® G TSH-III Immunoreaction Cartridges assay is a Chemiluminescent Enzyme Immunoassay (CLEIA) for the quantitative determination of thyroid stimulating hormone (TSH) in human serum on the LUMIPULSE G System. It is intended for in vitro diagnostic use, as an aid in the diagnosis of thyroid or pituitary disorders.

The study presented focuses on demonstrating the analytical performance and method comparison of the Lumipulse G TSH-III assay against the predicate device, Abbott ARCHITECT TSH assay, to establish substantial equivalence.

1. Table of Acceptance Criteria and Reported Device Performance

Performance Characteristic	Acceptance Criteria (Implicit from study results and CLSI guidelines)	Reported Device Performance (Lumipulse G TSH-III)
Precision/Reproducibility
Within-Laboratory (Total) Precision (20-Day)	≤ 6.4% CV (as demonstrated by predicate and industry standards)	≤ 6.4% CV (range 1.9% - 6.4% across 5 panels)
Lot-to-Lot Reproducibility (Total Precision)	≤ 4.6% CV (as demonstrated)	≤ 4.6% CV (range 3.1% - 4.6% across 3 panels)
Between-Lot Precision	≤ 4.0% CV (as demonstrated)	≤ 4.0% CV
Site-to-Site Reproducibility (Total Precision)	≤ 4.3% CV (as demonstrated)	≤ 4.3% CV (range 2.9% - 4.3% across 3 panels)
Between-Site Precision	≤ 2.4% CV (as demonstrated)	≤ 2.4% CV
Linearity/Assay Reportable Range	Linear correlation (R-squared close to 1) over a wide range; no high dose hook effect	Linear in the range of 0.001 to 227.804 µIU/mL (y = 1.03x + 0.001; R-squared: 0.9962); No high dose hook effect observed up to ~5,000 µIU/mL
Detection Limits
Limit of Blank (LoB)	Low as possible for diagnostic utility (consistent with CLSI EP17-A2)	0.0010 µIU/mL
Limit of Detection (LoD)	Low as possible for diagnostic utility (consistent with CLSI EP17-A2)	0.002 µIU/mL
Limit of Quantitation (LoQ)/Functional Sensitivity (FS)	≤ 0.02 µIU/mL for third-generation TSH assays (NACB Guideline)	≤ 0.006 µIU/mL
Analytical Specificity (Interference)	Average interference ≤ 10% for each compound	≤ 10% interference for tested endogenous and therapeutic drug compounds
Method Comparison (vs. Abbott ARCHITECT TSH)	High correlation coefficient (r) and acceptable slope/intercept relative to predicate	n=141; r = 0.9838; Intercept = -0.0037 (95% CI: -0.0064 to -0.0010); Slope = 0.97 (95% CI: 0.93 to 1.01); Average Bias = -1.051 µIU/mL

2. Sample Size for the Test Set and Data Provenance

Precision/Reproducibility (20-Day): 5 human serum-based panels, assayed in replicates of two at two separate times of the day for 20 days (n=80 for each sample). The origin of these human serum-based panels is not explicitly stated regarding country or retrospective/prospective nature, but they are described as "human serum-based panels."
Lot-to-Lot Reproducibility: 3 panels, specific sample size (replicates/days) not explicitly stated but part of a larger precision analysis.
Site-to-Site Reproducibility: 3 panels (Lot A), specific sample size (replicates/days) not explicitly stated but part of a larger precision analysis.
Linearity/Assay Reportable Range: High and low sample pools, number of samples not explicitly stated beyond "patient samples."
Detection Limit (LoB & LoD): Eight low-level specimens tested over 6 weeks using two LUMIPULSE G1200 Systems and two Lumipulse G TSH-III lots, giving 480 determinations for each panel.
Analytical Specificity: Human serum specimens with TSH concentrations of approximately 0.566, 2.530, and 67.515 µIU/mL, supplemented with potentially interfering compounds.
Cross-reactivity: Human serum specimens with TSH concentrations of approximately 0.566, 2.530, and 67.515 µIU/mL, supplemented with potentially cross-reacting compounds (n=3 for each test concentration).
Method Comparison: 141 serum samples, ranging from 0.026 to 84.299 µIU/mL (Lumipulse G TSH-III) and 0.030 to 89.930 µIU/mL (ARCHITECT TSH). The provenance (e.g., country of origin, retrospective/prospective) of these samples is not specified beyond being "patient samples."
Expected values/Reference range: 116 healthy test subjects. Provenance is not specified.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

This type of immunoassay performance study typically does not involve human expert adjudication for ground truth of individual measurements in the same way imaging or diagnostic accuracy studies might. The "ground truth" for the performance characteristics (precision, linearity, detection limits, specificity) is based on the inherent analytical properties of the reference materials, calibrated instruments, and statistical methodologies (e.g., CLSI protocols). For the method comparison, the predicate device (Abbott ARCHITECT TSH) serves as the comparator, and its established performance is implicitly relied upon.

4. Adjudication Method for the Test Set

Not applicable in the conventional sense. The "ground truth" for analytical performance is derived from well-defined reference materials, established concentrations, and statistical analyses following recognized CLSI protocols. For method comparison, the reference measurements from the predicate device serve as the comparison point.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No, an MRMC comparative effectiveness study was not done. This device is an automated in vitro diagnostic assay, not an imaging or diagnostic aid that involves human readers interpreting cases with or without AI assistance. Therefore, there is no effect size reported for human readers improving with or without AI assistance.

6. Standalone Performance

Yes, a standalone performance study was done. All performance characteristics (precision, linearity, detection limits, analytical specificity, and method comparison) evaluate the Lumipulse G TSH-III assay's performance as a standalone algorithm/device. The results reported are direct measurements from the LUMIPULSE G System.

7. Type of Ground Truth Used

Precision/Reproducibility: Based on repeated measurements of samples with established concentrations, and statistical analysis of variability.
Linearity/Assay Reportable Range: Established using prepared high and low sample pools with known TSH concentrations and assessed by linear regression analysis.
Detection Limits (LoB, LoD, LoQ/FS): Determined statistically from measurements of very low concentration samples according to CLSI guidelines.
Analytical Specificity/Cross-reactivity: Determined by measuring samples spiked with known concentrations of interfering or cross-reacting substances.
Method Comparison: Compared against the measurements obtained from a legally marketed predicate device (Abbott ARCHITECT TSH), which itself has established performance characteristics.
Traceability of Calibrators: Traceable to the 3rd International Standard, 2003 (code: 81/565) by the National Institute for Biological Standards and Control (NIBSC).
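
The statistical determination of detection limits can be illustrated with the classical parametric formulas LoB = mean(blank) + 1.645 × SD(blank) and LoD = LoB + 1.645 × SD(low-level sample). The sketch below is a minimal version under that assumption. The function names and replicate values are hypothetical, and the summary does not say whether the parametric or a nonparametric option was applied.

```python
import statistics

Z_95 = 1.645  # one-sided 95% normal quantile


def limit_of_blank(blank_results, z=Z_95):
    """Parametric LoB: mean of blank results plus z times their SD."""
    return statistics.mean(blank_results) + z * statistics.stdev(blank_results)


def limit_of_detection(lob, low_sample_results, z=Z_95):
    """Parametric LoD: LoB plus z times the SD of low-level sample results."""
    return lob + z * statistics.stdev(low_sample_results)


# Hypothetical replicate results (µIU/mL), for illustration only.
blanks = [0.000, 0.002, 0.001, 0.001, 0.000, 0.002]
lows = [0.004, 0.006, 0.005, 0.005, 0.004, 0.006]
lob = limit_of_blank(blanks)
lod = limit_of_detection(lob, lows)
print(f"LoB={lob:.4f} µIU/mL, LoD={lod:.4f} µIU/mL")
```

The normal-theory formulas assume roughly Gaussian replicate results. When blank results are truncated at zero or skewed, a rank-based LoB is usually preferred.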

8. Sample Size for the Training Set

The document describes performance studies, which are typically validation studies. It does not explicitly mention a "training set" in the context of an AI/machine learning model. For a traditional immunoassay, the "training" aspect is more akin to the assay development and optimization process, not a distinct dataset used for machine learning. The calibrators and controls are used for instrument calibration and assay quality control, not as a training set for an AI algorithm.

9. How the Ground Truth for the Training Set Was Established

Not applicable in the context of an AI training set. For an immunoassay, the "ground truth" for calibrators and controls is established through gravimetric preparation and traceability to international standards (e.g., NIBSC 3rd International Standard for TSH).

K Number

K162698

Device Name

MAGLUMI 2000 TSH, MAGLUMI 2000 Immunoassay Analyzer

Manufacturer

SHENZHEN NEW INDUSTRIES BIOMEDICAL ENGINEERING CO.,LTD

Date Cleared

2017-07-14

(290 days)

Reference & Predicate Devices

K083844, K151792

Predicate For

K232587

Intended Use

The MAGLUMI 2000 TSH assay is for in vitro diagnostic use in the quantitative determination of thyroid-stimulating hormone (TSH) in human serum. The measurement of TSH is used in the diagnosis of thyroid disorders.

The MAGLUMI 2000 Immunoassay system is an automated immunoassay analyzer designed to perform in vitro diagnostic tests on clinical serum specimens. The MAGLUMI 2000 Immunoassay system's assay application utilizes chemiluminescent technology for clinical use.

Device Description

The MAGLUMI 2000 system is a floor model, fully automated instrument system that utilizes chemiluminescent technology and uses pre-packaged reagent packs to measure a variety of analytes in human body fluids. It is controlled through a combination of custom and off-the-shelf software.

The MAGLUMI 2000 TSH kit consists of the following reagents:
- Magnetic Microbeads: coated with anti-TSH monoclonal antibody, phosphate buffer, NaN3 (<0.1%) and ProClin® 300.
- Calibrator Low: TSH antigen (human origin), phosphate buffer, bovine serum, NaN3 (<0.1%) and ProClin® 300.
- Calibrator High: TSH antigen (human origin), phosphate buffer, bovine serum, NaN3 (<0.1%) and ProClin® 300.
- Buffer: tris buffer, HAMA Blocker, BSA, NaN3 (<0.1%) and ProClin® 300.
- ABEI Label: labeled with anti-TSH monoclonal antibody (mouse), tris buffer, containing BSA, NaN3 (<0.1%) and ProClin® 300.
- Control 1: TSH antigen (human origin), phosphate buffer, bovine serum, NaN3 (<0.1%) and ProClin® 300.
- Control 2: TSH antigen (human origin), phosphate buffer, bovine serum, NaN3 (<0.1%) and ProClin® 300.
- Control 3: TSH antigen (human origin), phosphate buffer, bovine serum, NaN3 (<0.1%) and ProClin® 300.

AI/ML Overview

The provided text describes the performance characteristics of the MAGLUMI 2000 TSH assay and MAGLUMI 2000 Immunoassay Analyzer. Below is an attempt to extract and organize the information as requested, though some categories may not be fully addressed due to the nature of the document (a 510(k) summary for a diagnostic test, not an AI-powered device).

1. A table of acceptance criteria and the reported device performance

The document does not explicitly state "acceptance criteria" in a table format alongside device performance for all aspects. However, performance characteristics are presented, from which implied acceptance ranges can be inferred (e.g., for precision, interference, and linearity). The comparison study shows the performance relative to a predicate device.

Performance Characteristic	Acceptance Criteria (Implied)	Reported Device Performance
Precision	CV% within acceptable limits for diagnostic assays.	Within-Run CV: 1.38% - 4.73%; Between-Run CV: 0% - 4.38%; Between-Day CV: 0% - 2.90%; Total CV: 2.447% - 5.95% (across various control, calibrator, and serum pools)
Linearity	Good correlation (R²) and close agreement between observed/expected values across the measuring range.	Measuring Range: 0.02 - 91.78 µIU/mL; Relationship: Observed = 1.0001 (Expected) + 0.0474, R² = 0.9990
Detection Limit	Quantifiable detection limits.	LOB: 0.001 µIU/mL; LOD: 0.006 µIU/mL; LOQ: 0.01 µIU/mL
Hook Effect	No hook effect observed within the specified range.	No hook effect observed up to 3000 µIU/mL.
Interference	No significant deviation from expected results with common interfering substances.	No interference observed for: conjugated bilirubin (60 mg/dL); hemoglobin (2000 mg/dL); triglycerides (1000 mg/dL); acetaminophen (20 mg/dL); ibuprofen (50 mg/dL); aspirin (50 mg/dL); biotin (10 ng/mL); unconjugated bilirubin (40 mg/dL); rheumatoid factor (124 IU/mL); human anti-mouse antibodies (HAMA) (300 ng/mL); total protein (12.5 mg/mL)
Specificity (Cross-reactivity)	Low cross-reactivity with structurally related hormones.	Less than 2% cross-reactivity with hCG (200 IU/mL), FSH (1500 mIU/mL), and LH (600 mIU/mL).
Method Comparison	Good correlation with predicate device.	Correlation with ADVIA CENTAUR TSH assay: Y = 1.0178X - 0.0773, R² = 0.9974 (for TSH values 0.02 - 91.78 µIU/mL)
Expected Values/Reference Range	Establishment of a normal range for the intended population.	Normal Range: 0.658 – 4.864 µIU/mL (based on 126 healthy adult samples, central 95% frequency distribution).
Stability (Reagents)	Stable for specified duration under given conditions.	Accelerated Stability (Controls, Calibrators): 12 months at 2-8°C; Shelf Life Stability (Reagent Kit): 12 months at 2-8°C; Open Kit Stability: 4 weeks at 2-8°C.
Traceability	Traceable to an international standard.	Traceable to the WHO international standard for human TSH (IRP 81/565) for controls and calibrators.
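
The linearity and method comparison rows above are each reported as a fitted line plus R². As a minimal sketch of that calculation, the code below fits an ordinary least-squares line to observed versus expected values and computes R². The function name and the dilution values are hypothetical, and the summary does not state which regression procedure produced the reported equations.

```python
def linear_fit(expected, observed):
    """Ordinary least-squares fit of observed on expected.

    Returns (slope, intercept, r_squared).
    """
    n = len(expected)
    if n != len(observed) or n < 3:
        raise ValueError("need at least 3 paired points of equal length")
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(expected, observed))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical dilution series (µIU/mL), for illustration only.
expected = [0.5, 10.0, 20.0, 40.0, 60.0, 90.0]
observed = [0.6, 10.2, 19.7, 40.5, 59.4, 90.8]
slope, intercept, r2 = linear_fit(expected, observed)
print(f"Observed = {slope:.4f} (Expected) + {intercept:.4f}, R² = {r2:.4f}")
```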

2. Sample size used for the test set and the data provenance (e.g., country of origin of the data, retrospective or prospective)

Precision Study:
- Sample Size: 4 controls, 2 calibrators, 6 spiked patient serum pools, and 4 native patient sample pools. Each level analyzed 80 times (20 days x 2 runs/day x duplicate) per instrument. (Total N = 240 measurements per level across 3 instruments for calculated statistics in the table).
- Data Provenance: Not specified (e.g., country of origin). The study design (e.g., "collected over 20 days") suggests prospective testing during the assay development/validation phase.
Linearity Study:
- Sample Size: 2 primary samples to create a series of intermediate serum dilutions (number of intermediate dilutions not explicitly stated, but 11 data points shown in the table).
- Data Provenance: Not specified.
Detection Limit Study:
- Sample Size: Not explicitly stated, derived from CLSI EP17-A guidelines.
- Data Provenance: Not specified.
Interference Study:
- Sample Size: Human serum samples with low and high TSH concentrations were used for each interfering substance (e.g., "0.97 and 5.4" or "0.7 and 6.1" µIU/mL). The number of individual samples is not described, but it involved at least two TSH concentrations for each interferent.
- Data Provenance: Not specified.
Specificity Study:
- Sample Size: Human serum samples with various TSH concentrations (number not specified) spiked with potential cross-reactants.
- Data Provenance: Not specified.
Comparison Studies:
- Sample Size: 337 patient serum samples.
- Data Provenance: Not specified (e.g., country of origin, retrospective or prospective).
Expected Values/Reference Range:
- Sample Size: 126 serum samples.
- Data Provenance: From "normal, apparently healthy adult (22 years and older) individuals." Not specified (e.g., country of origin, retrospective or prospective).
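
The replicate counts quoted for the precision study follow from multiplying out the design factors. The short sketch below reproduces that arithmetic for the 20-day, 2-runs-per-day, duplicate design described above. The function name is hypothetical.

```python
def measurements_per_level(days, runs_per_day, replicates_per_run, instruments=1):
    """Total measurements per sample level for a nested precision design."""
    return days * runs_per_day * replicates_per_run * instruments


per_instrument = measurements_per_level(days=20, runs_per_day=2, replicates_per_run=2)
across_three = measurements_per_level(days=20, runs_per_day=2, replicates_per_run=2,
                                      instruments=3)
print(per_instrument, across_three)
```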

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts (e.g., radiologist with 10 years of experience)

This is a diagnostic assay for a biomarker (TSH), not an image-based AI device requiring expert interpretation for ground truth. Therefore, this information is not applicable and not provided in the document. The "ground truth" for the test sets (e.g., precision, linearity, interference) is based on the known concentrations of controls, calibrators, spiked samples, or comparison to a predicate device.

4. Adjudication method (e.g., 2+1, 3+1, none) for the test set

Not applicable. As a diagnostic assay, "adjudication" in the sense of reconciling divergent expert opinions on an output is not relevant. The performance studies evaluate the assay's analytical characteristics against established scientific protocols and reference methods.

5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance

Not applicable. This document describes a diagnostic immunoassay system, not an AI-assisted diagnostic tool that relies on human readers.

6. If a standalone (i.e., algorithm only without human-in-the-loop performance) was done

The device itself is an automated immunoassay system (MAGLUMI 2000 Immunoassay Analyzer) performing the quantitative determination of TSH using a specific assay (MAGLUMI 2000 TSH assay). Its performance is inherently "standalone" in that it produces a quantitative result from a sample without human interpretive intervention post-assay. The results are then interpreted by clinicians.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc.)

The ground truth for the analytical performance studies (precision, linearity, detection limits, interference, specificity) is primarily based on:

Known concentrations: For controls, calibrators, and spiked samples.
Comparison to a legally marketed predicate device: For method comparison (ADVIA CENTAUR TSH assay).
International standards: Traceability to WHO international standard for human TSH (IRP 81/565).
Established scientific protocols: CLSI guidelines.

8. The sample size for the training set

This document describes the validation of a diagnostic immunoassay system, not a machine learning or AI algorithm. Therefore, there is no "training set" in the context of AI. The performance studies detailed are for validation.

9. How the ground truth for the training set was established

Not applicable, as there is no "training set" for an AI algorithm.

K Number

K162606

Device Name

Elecsys TSH assay, cobas e 801 Immunoassay analyzer

Manufacturer

ROCHE DIAGNOSTICS

Date Cleared

2017-01-23

(126 days)

Reference & Predicate Devices

K100853, K961491

Predicate For

K223690

Intended Use

cobas e 801 immunoassay analyzer is intended for the in-vitro determination of analytes in body fluids.

Elecsys TSH immunoassay is intended for the in vitro quantitative determination of thyrotropin in human serum and plasma. Measurements of TSH are used in the diagnosis of thyroid or pituitary disorders. The Elecsys TSH immunoassay is an electrochemiluminescence immunoassay 'ECLIA', which is intended for use on the cobas e immunoassay analyzers.

Device Description

The cobas e 801 immunoassay analyzer is a fully automated, software controlled analyzer system for in vitro determination of analytes in human body fluids. It is part of the cobas 8000 modular analyzer series cleared under K100853. It uses electrochemiluminescent technology for signal generation and measurement.

AI/ML Overview

Here's a breakdown of the acceptance criteria and study information for the Elecsys TSH assay on the cobas e 801 immunoassay analyzer, based on the provided document:

1. Table of Acceptance Criteria and Reported Device Performance

The document doesn't explicitly state "acceptance criteria" as pass/fail thresholds for each performance characteristic. Instead, it presents the results of various validation studies. The table below lists the reported performance and, where applicable, the acceptance criterion implied by the results being presented as successful.

Performance Characteristic	Acceptance Criteria (Implied)	Reported Device Performance
Repeatability (CV%)	Low CV values, generally < 10% for low concentrations and < 5% for higher concentrations (common for immunoassays)	Human serum 1 (0.00851 µIU/mL): 7.1%; Human serum 2 (0.209 µIU/mL): 1.6%; Human serum 3 (1.88 µIU/mL): 1.4%; Human serum 4 (51.8 µIU/mL): 1.3%; Human serum 5 (90.0 µIU/mL): 1.4%; PC Universal 1 (1.41 µIU/mL): 1.4%; PC Universal 2 (8.18 µIU/mL): 1.6%; PreciControl TS (0.184 µIU/mL): 1.8%
Intermediate Precision (CV%)	Low CV values, generally < 15% for low concentrations and < 10% for higher concentrations	Human serum 1 (0.00851 µIU/mL): 11.3%; Human serum 2 (0.209 µIU/mL): 2.5%; Human serum 3 (1.88 µIU/mL): 2.3%; Human serum 4 (51.8 µIU/mL): 2.0%; Human serum 5 (90.0 µIU/mL): 1.9%; PC Universal 1 (1.41 µIU/mL): 2.1%; PC Universal 2 (8.18 µIU/mL): 2.5%; PreciControl TS (0.184 µIU/mL): 2.3%
Linearity (Pearson's r)	High correlation (e.g., > 0.99) and slope close to 1, intercept close to 0 across the measuring range	Serum 1: Pearson's r = 0.9994, slope = 0.963, intercept = -0.00155; Serum 2: Pearson's r = 0.9992, slope = 0.958, intercept = -0.00193; Serum 3: Pearson's r = 0.9986, slope = 0.952, intercept = -0.00272
Limit of Blank (LoB)	Low value, typically indicative of assay's ability to distinguish analyte-free samples from those with very low levels.	0.0025 µIU/mL
Limit of Detection (LoD)	Low value, indicating sensitivity to low analyte concentrations. Specific 95% probability is a common criterion.	0.005 µIU/mL (detected with 95% probability)
Limit of Quantitation (LoQ)	Low value with acceptable precision (e.g., CV ≤ 20%)	0.005 µIU/mL at a CV ≤ 20%
Endogenous Interferences	No significant interference at specified levels	No interference observed up to the indicated levels for Intralipid (2000 mg/dL), Biotin (56.0 ng/mL), Bilirubin (66.0 mg/dL), Hemoglobin (1000 mg/dL), Rheumatoid Factor (1500 IU/mL), human IgG (2.80 g/dL), human IgM (0.500 g/dL).
Exogenous Interferences (Anticoagulants)	Values obtained from different sample types (serum, plasma with various anticoagulants) should be comparable.	Data supported the use of Serum, Li-Heparin, K2-EDTA, and K3-EDTA plasma tubes, evaluated using Passing/Bablok regression analysis comparing serum/plasma pairs.
Exogenous Interferences (Drugs)	No significant interference at specified drug concentrations	No interference found with 16 commonly used drugs and several special drugs (Amiodarone, Carbimazole, Fluocortolone, Hydrocortisone, Iodide, Levothyroxine, Liothyronine, Methimazole, Octreotide, Prednisolone, Propranolol, Propylthiouracil, Perchlorate) at tested concentrations.
Method Comparison (Correlation)	Strong correlation (e.g., Pearson's r > 0.98) and agreement (slope close to 1, intercept close to 0) with predicate device	N = 130 samples; Passing/Bablok: Slope = 0.936, Intercept = -0.003, Kendall's τ = 0.989; Linear Regression: Slope = 0.958, Intercept = -0.052, Pearson's r = 0.999
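
The CV% figures in the precision rows are coefficients of variation (standard deviation divided by mean, times 100). The following is a minimal sketch of that calculation; the replicate values are hypothetical and are not data from the submission. A full precision study typically separates within-run and between-day variance components, which this sketch does not attempt.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100.0


# Hypothetical replicate results in µIU/mL (illustration only).
replicates = [1.86, 1.90, 1.88, 1.91, 1.85]
cv = cv_percent(replicates)
print(f"mean = {statistics.mean(replicates):.3f} µIU/mL, CV = {cv:.2f}%")
```

A CV computed this way can then be compared against whatever limit the study protocol sets for the concentration level in question.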

2. Sample Size and Data Provenance for Test Set

Repeatability and Intermediate Precision:
- Sample Size:
  - 5 human serum samples (pooled and pooled spiked) and 3 control samples (PC Universal 1, PC Universal 2, PreciControl TS), matching the eight samples reported in the performance table.
  - Each sample tested in 2 replicates per run, 2 runs per day for 21 days (total of 84 replicate measurements per sample type).
Linearity:
- Sample Size: 3 high analyte serum samples diluted to 12 concentrations. Each concentration assayed in 3-fold determination within a single run.
Analytical Sensitivity (LoB, LoD, LoQ):
- LoB: Blank sample tested with 60 replicates (10 replicates per run, 6 days).
- LoD: 5 low-level human serum samples tested with 60 replicates (2 replicates per sample per run, 6 days).
- LoQ: 10 low-level TSH samples tested over 5 days, 5 replicates per sample per day.
Endogenous Interferences:
- Human serum samples. Specific number not provided, but the outcome is qualitative ("No interference observed").
Exogenous Interferences (Anticoagulants):
- A minimum of 40 serum/plasma pairs per sample material (presumably 40 serum, 40 Li-Heparin plasma, 40 K2-EDTA plasma, 40 K3-EDTA plasma). Tested in singleton for each type.
Exogenous Interferences (Drugs):
- 16 commonly used drugs and 13 special drugs. Specific number of samples not provided, but the outcome is qualitative ("No interference").
Method Comparison:
- Sample Size: 130 human serum samples (single donors and serum pools; native, spiked as well as diluted).
Data Provenance: Not explicitly stated (e.g., country of origin, retrospective/prospective). However, the use of "human serum samples" and "human serum/plasma pairs" indicates biological samples. The studies are prospective in nature, conducted specifically to validate the device.
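
The replicate counts quoted in the study designs above follow from simple multiplication. The sketch below reproduces them; the function name is hypothetical, and the LoB, LoD, and LoQ lines assume one run per day, which the summary does not state explicitly.

```python
def replicates_per_sample(per_run, runs_per_day, days):
    """Total replicate measurements of one sample across a study."""
    return per_run * runs_per_day * days


# Precision: 2 replicates per run, 2 runs per day, 21 days.
precision_n = replicates_per_sample(2, 2, 21)

# LoB: 10 replicates per run over 6 days (assumes one run per day).
lob_n = replicates_per_sample(10, 1, 6)

# LoD: 5 samples, 2 replicates per sample per run, 6 days
# (assumes one run per day); the total is summed over all samples.
lod_total = 5 * replicates_per_sample(2, 1, 6)

# LoQ: 5 replicates per sample per day over 5 days (per sample).
loq_n = replicates_per_sample(5, 1, 5)

print(precision_n, lob_n, lod_total, loq_n)
```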

3. Number of Experts Used to Establish Ground Truth for Test Set and Qualifications

This device is an immunoassay for quantitative determination of TSH. The "ground truth" for the test set is established by the reference measurement method or the quantitative value itself. No human experts are used to establish ground truth in the same way they would be for image interpretation. The accuracy of the quantitative measurements is assessed against known concentrations (e.g., in linearity, LoB/LoD studies) or against a predicate device (method comparison).
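
Assessing measurements against known concentrations, as in the linearity study, usually comes down to regressing measured values on expected values and checking slope, intercept, and Pearson's r. The following is a minimal ordinary least-squares sketch; the dilution values are hypothetical, and the submission's own linearity analysis may differ in detail.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, pearson_r).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical dilution series in µIU/mL (illustration only).
expected = [0.5, 1.0, 2.0, 4.0, 8.0]
measured = [0.48, 0.97, 1.93, 3.85, 7.70]
slope, intercept, r = linear_fit(expected, measured)
print(f"slope = {slope:.3f}, intercept = {intercept:.4f}, r = {r:.4f}")
```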

4. Adjudication Method

Not applicable for an immunoassay. Adjudication is typically used when human interpretation of results contributes to the ground truth, which is not the case here.

5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study

No MRMC study was performed. Such studies apply to diagnostic imaging and other tasks that depend on human readers. This document describes an in vitro diagnostic (IVD) device for quantitative measurement, in which human interpretation is not part of the core measurement.

6. Standalone Performance

Yes. The studies describe the standalone performance of the Elecsys TSH assay on the cobas e 801 immunoassay analyzer. In this context, "standalone" (algorithm only) means the device's automated measurement without human intervention after sample loading. The results presented come directly from the instrument's measurements.

7. Type of Ground Truth Used

Reference Materials/Known Concentrations: For precision, linearity, LoB, LoD, LoQ, endogenous and exogenous interference studies, ground truth implicitly refers to the expected concentration/value based on the preparation of known samples or the absence of an analyte (for blanks).
Predicate Device Measurements: For the method comparison study, the "ground truth" for comparison is the measurement obtained from the predicate device (Elecsys TSH on Elecsys 2010 analyzer).
Expected Values: The "Expected Values" section establishes a reference range (0.27-4.20 µIU/mL) based on healthy test subjects. This is a clinical reference range, not a direct ground truth for individual measurements, but rather a benchmark for interpretation.
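
The method comparison against the predicate reports a Passing/Bablok regression. The following is a simplified sketch of that estimator (shifted median of pairwise slopes, then median intercept) without confidence intervals or tie-handling refinements. The paired values are hypothetical, and this is not presented as the manufacturer's implementation.

```python
import statistics


def passing_bablok(x, y):
    """Simplified Passing-Bablok fit; returns (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                s = float("inf") if dy > 0 else float("-inf")
            else:
                s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are excluded
            slopes.append(s)
    if not slopes:
        raise ValueError("no usable pairwise slopes")
    slopes.sort()
    count = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset for the shifted median
    if count % 2 == 1:
        lo = hi = (count - 1) // 2 + k
    else:
        lo = count // 2 - 1 + k
        hi = lo + 1
    if hi >= count:
        raise ValueError("methods do not appear positively correlated")
    slope = 0.5 * (slopes[lo] + slopes[hi])
    intercept = statistics.median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept


# Hypothetical paired results in µIU/mL (illustration only).
predicate = [0.5, 1.2, 2.0, 4.1, 8.3, 15.0]
candidate = [0.47, 1.13, 1.90, 3.90, 7.80, 14.10]
pb_slope, pb_intercept = passing_bablok(predicate, candidate)
print(f"slope = {pb_slope:.3f}, intercept = {pb_intercept:.3f}")
```

Unlike ordinary least squares, this estimator does not treat the predicate's values as error-free, which is why it is commonly used when both methods carry measurement error.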

8. Sample Size for the Training Set

The document describes validation studies for an in vitro diagnostic device (IVD), specifically an immunoassay analyzer and assay. IVDs like this do not typically have a "training set" in the machine learning sense. The device's operational parameters, calibration curves, and algorithms are developed during the product development phase by the manufacturer, which might involve internal data, but generally, the submission focuses on external validation. The document does not provide details on the development data used.

9. How the Ground Truth for the Training Set Was Established

As there is no "training set" described in the context of machine learning, this question about establishing ground truth for it is not applicable here. The device output is a quantitative value based on chemical reactions and detection, not a learned prediction from a ground-truthed dataset in the AI sense.
