Search Results

The GSP Neonatal Total Galactose kit is intended for the quantitative determination of total galactose (galactose and galactose-1-phosphate) concentrations in blood specimens dried on filter paper as an aid in screening newborns for galactosemia using the GSP® instrument.

Device Description

The GSP Neonatal Total Galactose kit contains sufficient reagents to perform 1152 assays. The GSP Neonatal Total Galactose test system measures total galactose, i.e. both galactose and galactose-1-phosphate, using a fluorescent galactose oxidase method. The fluorescence is measured using an excitation wavelength of 505 nm and an emission wavelength of 580 nm. The kit contains Neonatal Total Galactose Assay Reagent 1, Neonatal Total Galactose Assay Reagent 2, Neonatal Total Galactose Assay Buffer, Neonatal Total Galactose Assay Reconstitution Solution, and Neonatal Extraction Solution. Calibrators and Controls are also included.

AI/ML Overview

1. Table of Acceptance Criteria and Reported Device Performance

Characteristic	Reported Device Performance
Precision (Total Variation)	Ranged from 9.3% to 14.1% CV.
Limit of Blank (LoB)	0.34 mg/dL
Limit of Detection (LoD)	0.97 mg/dL
Limit of Quantitation (LoQ)	1.15 mg/dL (defined as the lowest concentration with a total CV equal to or less than 20%).
Linearity	Demonstrated linear performance throughout the measuring range (from 1.15 mg/dL to 50 mg/dL).
Recovery	Average recovery of 109% for galactose, 117% for galactose-1-phosphate, and 103% for both combined from three contrived dried blood spot samples.
Interference	(1) Acetaminophen: Concentrations above 2.75 mg/dL caused a significant decrease (>15%) in measured total galactose. The maximum tested (5.5 mg/dL) caused a decrease of ~20-22%. (2) Conjugated Bilirubin: Concentrations above 16.6 mg/dL caused a significant decrease (>15%) in measured total galactose. At 24.9 mg/dL and above, the decrease was 100% at some total galactose concentrations. (3) Intralipid: Concentrations above 250 mg/dL (at 5 and 10 mg/dL total galactose) or 375 mg/dL (at 15 mg/dL total galactose) caused a significant increase (>15%) in measured total galactose. The maximum tested (1500 mg/dL) caused an increase of ~52-77%. (4) Hemoglobin (with Bilirubin): Hemoglobin levels at 198 g/L and above in combination with an elevated bilirubin level of 15 mg/dL caused a significant increase (>15%) in measured total galactose at certain total galactose concentrations. For example, at 5 mg/dL total galactose, 198 g/L Hb led to a 26.3% increase. (5) Non-Interfering Substances: Unconjugated bilirubin (20 mg/dL), β-Nicotinamide adenine dinucleotide (100 µmol/L), Glutathione (3 mmol/L), Human Serum Albumin (30 mg/mL), Ascorbate (6 mg/dL), D-glucose (1000 mg/dL), D-mannose (100 mg/dL), D-fructose (18 mg/dL), Ampicillin (152 µmol/L), Lithium heparin (0.375 mg/mL), and Hematocrit levels from 30% to 66% (102-230 g/L Hemoglobin) were found not to interfere.
Screening Performance vs. Predicate (95th percentile)	Overall percent agreement = 96.0%; Positive percent agreement = 63.6%; Negative percent agreement = 97.9%
Screening Performance vs. Predicate (99th percentile)	Overall percent agreement = 98.8%; Positive percent agreement = 53.3%; Negative percent agreement = 99.4%
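
The overall, positive, and negative percent agreement figures in the table are typically derived from a 2x2 comparison of the new device against the predicate. The sketch below shows that arithmetic. The function name and the example counts are hypothetical and are not taken from the submission, which reports only the resulting percentages.

```python
def percent_agreement(both_pos, new_pos_only, pred_pos_only, both_neg):
    """Return (overall, positive, negative) percent agreement vs. the predicate.

    The four counts are the cells of a 2x2 table, with the predicate
    result treated as the reference:
      both_pos       new device positive, predicate positive
      new_pos_only   new device positive, predicate negative
      pred_pos_only  new device negative, predicate positive
      both_neg       new device negative, predicate negative
    """
    total = both_pos + new_pos_only + pred_pos_only + both_neg
    overall = 100.0 * (both_pos + both_neg) / total
    positive = 100.0 * both_pos / (both_pos + pred_pos_only)
    negative = 100.0 * both_neg / (both_neg + new_pos_only)
    return overall, positive, negative


# Hypothetical counts for illustration only (not data from the document).
opa, ppa, npa = percent_agreement(both_pos=8, new_pos_only=2,
                                  pred_pos_only=2, both_neg=88)
print(f"OPA={opa:.1f}%  PPA={ppa:.1f}%  NPA={npa:.1f}%")
```

Because screening cutoffs flag only a small fraction of specimens, the positive cell counts are small, which is one reason positive percent agreement can be much lower than the overall figure.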

2. Sample Size Used for the Test Set and Data Provenance

Sample Size for Precision Study: 7 samples, with 216 total measurements per sample (4 replicates per sample, in 27 runs over 21 days using 3 kit lots and 3 GSP instruments).
Sample Size for LoD: 216 determinations of 4 low-level samples.
Sample Size for LoB: 150 blank samples.
Sample Size for Recovery: 3 contrived dried blood spot samples.
Sample Size for Interference Studies: Not explicitly stated for each concentration, but involved various concentrations of interfering substances at three total galactose concentrations (5, 10, and 15 mg/dL).
Sample Size for Internal Method Comparison: 141 routine screening and spiked blood spot specimens.
Sample Size for Screening Performance Study: 2320 samples (6 confirmed positive samples and 2314 routine samples).
Data Provenance: The screening performance study was conducted at "one newborn screening laboratory in the United States." Other non-clinical studies (precision, linearity, LoB/LoD/LoQ, recovery, interference) appear to be internal laboratory studies without specific geographic provenance mentioned, but presumably also conducted in the US or Finland (Wallac Oy headquarters). The studies were retrospective, using banked samples and contrived samples.
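
The document gives the sample sizes for the LoB and LoD studies but not the calculation used. One common parametric approach, assumed here purely for illustration, sets LoB at the blank mean plus 1.645 standard deviations and LoD at LoB plus 1.645 standard deviations of a low-level sample. The function names and the example values are hypothetical.

```python
from statistics import mean, stdev


def limit_of_blank(blank_results, z=1.645):
    """Parametric LoB: mean of blank results plus z sample standard deviations."""
    return mean(blank_results) + z * stdev(blank_results)


def limit_of_detection(lob, low_sample_results, z=1.645):
    """Parametric LoD: LoB plus z sample standard deviations of a low-level sample."""
    return lob + z * stdev(low_sample_results)


# Hypothetical results in mg/dL (not data from the document).
blanks = [0.0, 0.2, 0.4]
low_level = [0.8, 1.0, 1.2]

lob = limit_of_blank(blanks)
lod = limit_of_detection(lob, low_level)
print(f"LoB={lob:.3f} mg/dL  LoD={lod:.3f} mg/dL")
```

A real evaluation would use far more replicates (the document cites 150 blank samples and 216 low-level determinations) and might pool standard deviations across several low-level samples or use a non-parametric LoB.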

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The document does not mention the use of experts to establish ground truth for the test set. For the screening performance study, "6 confirmed positive samples" are mentioned, implying prior clinical diagnosis as the ground truth. The qualifications of who confirmed these positive cases or how the "routine samples" were classified as normal are not specified.

4. Adjudication Method for the Test Set

Not applicable. The document does not describe any expert adjudication process for the test set. Ground truth for confirmed positive samples likely came from clinical diagnosis.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done and, If So, the Effect Size of How Much Human Readers Improve with AI vs. without AI Assistance

Not applicable. This device is an in-vitro diagnostic test kit (laboratory assay) for quantitative determination, not an imaging device or AI-assisted diagnostic tool that would involve human readers interpreting results in an MRMC study.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

This refers to the performance of the assay itself. The entire submission details the standalone performance of the GSP Neonatal Total Galactose kit (assay only) without human-in-the-loop interpretation beyond standard laboratory procedures and reporting. The "GSP® instrument" is automated, as stated in the comparison chart ("GSP instrument, automated").

7. The Type of Ground Truth Used

For Analytical Performance (Precision, LoB, LoD, LoQ, Linearity, Recovery, Interference): Ground truth was established by preparing samples with known concentrations of total galactose or specific interfering substances. For example, for recovery, the "recovery of galactose, galactose-1-phosphate, and both combined was determined from three contrived dried blood spot samples," meaning these samples were prepared with known amounts.
For Screening Performance Study: The ground truth for the 6 positive samples was "confirmed positive." This implies a clinical diagnosis of galactosemia, likely through follow-up diagnostic testing. The other 2314 samples are referred to as "routine samples" and classified as "normal" in the context of screening performance, likely based on either their negative predicate device result or their actual clinical status. The document also compares the new device's results against the predicate device's results as a reference for "Manual result."
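
Recovery figures such as the 109%, 117%, and 103% reported in section 1 compare a measured concentration with the concentration expected from a contrived sample. A minimal sketch follows, assuming recovery is defined as measured divided by expected (endogenous plus spiked). The document does not state the exact formula, and the function name and example values are hypothetical.

```python
def percent_recovery(measured, endogenous, spiked):
    """Recovery (%) = measured / (endogenous + spiked) * 100.

    All three arguments must be in the same concentration unit.
    This definition is an assumption; some laboratories instead use
    (measured - endogenous) / spiked * 100.
    """
    expected = endogenous + spiked
    if expected <= 0:
        raise ValueError("expected concentration must be positive")
    return 100.0 * measured / expected


# Hypothetical values in mg/dL (not data from the document).
print(f"{percent_recovery(measured=10.9, endogenous=2.0, spiked=8.0):.0f}%")
```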

8. The Sample Size for the Training Set

Not applicable in the conventional sense of machine learning training sets. This is a chemical assay, not an AI/ML device that requires a distinct training set for model development. The development and optimization of the assay would involve various experiments, but these are not referred to as "training sets" in this context.

9. How the Ground Truth for the Training Set Was Established

Not applicable, as there is no "training set" in the context of an AI/ML model for this chemical assay. The development of the calibrators and controls (which are prepared with known concentrations of galactose and galactose-1-phosphate) serves an analogous function in ensuring accuracy and consistency of the assay.

K Number

K110274

Device Name

AUTODELFIA NEONATAL IRT KIT

Manufacturer

WALLAC OY, A SUBSIDIARY OF PERKINELMER, INC.

Date Cleared

2011-06-10

(130 days)

Reference & Predicate Devices

N/A

Predicate For

N/A

Intended Use

The AutoDELFIA Neonatal IRT kit is intended for the quantitative determination of human immunoreactive trypsin(ogen) (IRT) in blood specimens dried on filter paper as an aid in screening newborns for cystic fibrosis using the 1235 AutoDELFIA® automatic immunoassay system.

Device Description

The AutoDELFIA Neonatal IRT assay is a solid phase, two-site fluoroimmunometric assay based on the direct sandwich technique in which two monoclonal antibodies (derived from mice) are directed against two separate antigenic determinants on the IRT molecule. Calibrators, controls and test specimens containing IRT are reacted simultaneously with immobilized monoclonal antibodies directed against a specific antigenic site on the IRT molecule and europium-labeled monoclonal antibodies (directed against a different antigenic site) in assay buffer. The assay buffer elutes IRT from the dried blood on filter paper disks. The complete assay requires only one incubation step. Enhancement Solution dissociates europium ions from the labeled antibody into solution where they form highly fluorescent chelates with components of the Enhancement Solution. The fluorescence in each well is then measured. The fluorescence of each sample is proportional to the concentration of IRT in the sample.
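
Because the signal in this sandwich assay rises with IRT concentration, an unknown can be read off a calibration curve built from the calibrators. The sketch below uses simple piecewise-linear interpolation between calibrator points. This is an assumption for illustration: the document does not describe the curve-fitting method the instrument uses, and the function name and calibrator values are hypothetical.

```python
def interpolate_concentration(signal, calibrators):
    """Estimate concentration from a fluorescence signal.

    calibrators: list of (concentration, signal) pairs in which signal
    increases strictly with concentration.
    """
    points = sorted(calibrators, key=lambda p: p[1])
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal is outside the calibrated range")
    # Walk adjacent calibrator pairs and interpolate within the matching one.
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            fraction = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + fraction * (c_hi - c_lo)


# Hypothetical calibrators: (ng/mL blood, fluorescence counts).
curve = [(0.0, 100.0), (100.0, 1100.0), (500.0, 5100.0)]
print(interpolate_concentration(600.0, curve))   # between the first two calibrators
print(interpolate_concentration(3100.0, curve))  # between the last two calibrators
```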

AI/ML Overview

The provided text describes a 510(k) premarket notification for the AutoDELFIA Neonatal IRT kit. This submission focuses on demonstrating substantial equivalence to a predicate device, rather than providing a detailed study report with specific acceptance criteria and performance data in the format typically used for AI/ML device evaluations. As such, many of the requested fields for AI/ML device studies are not directly applicable or explicitly stated in this document.

The document does, however, report the device's analytical performance characteristics, which serve as a form of acceptance criteria for this type of in-vitro diagnostic device.

The table and answers below are based on the provided text and indicate where information is not available.

1. Table of Acceptance Criteria and the Reported Device Performance

For this in-vitro diagnostic device, "acceptance criteria" are generally established by demonstrating performance characteristics that are comparable to or better than a legally marketed predicate device, and that meet the required analytical performance for its intended use.

Characteristic (Feature)	Acceptance Criteria (from Predicate Device)	Reported Device Performance (New Device: B005-212/B005-204)
Measuring Range	4 (as defined by LoB) to 500 (as defined by upper calibrator) ng/mL blood	16 to 480 ng/mL blood
Linearity Range	No claims for linearity in labeling.	16 to 480 ng/mL blood
Analytical Sensitivity / Limit of Blank (LoB)	< 4 ng/mL blood	0.53 ng/mL blood
Limit of Detection (LoD)	Not explicitly stated, implied to be around 4 ng/mL blood (from LoB)	2.9 ng/mL blood
Antibody Cross-Reactions	α2-macroglobulin < 4 ng/ml blood, α1-antitrypsin < 4 ng/ml blood, Phospholipase A2 < 4 ng/ml blood, Chymotrypsin < 4 ng/ml blood, Human IgG < 4 ng/ml blood, Uropepsinogen < 4 ng/ml blood	α2-macroglobulin < 4 ng/ml blood, α1-antitrypsin < 4 ng/ml blood, Phospholipase A2 < 4 ng/ml blood, Chymotrypsin < 4 ng/ml blood, Human IgG < 4 ng/ml blood, Uropepsinogen < 4 ng/ml blood (All "Same" as predicate, which explicitly lists these values)
Hook effect	No hook effect has been found with IRT concentrations up to 40,000 ng/mL	No hook effect has been found with IRT concentrations up to 40,000 ng/mL
Precision (Total Variation CV%)	42.6 ng/mL blood CV% 9.3, 98.8 ng/mL blood CV% 10.0, 266 ng/mL blood CV% 9.6	16.7 ng/mL blood CV% 8.7, 22.5 ng/mL blood CV% 9.6, 48.0 ng/mL blood CV% 9.1, 104 ng/mL blood CV% 8.0, 247 ng/mL blood CV% 8.3, 401 ng/mL blood CV% 8.4, 449 ng/mL blood CV% 9.4

Note on "Acceptance Criteria": For this 510(k) submission, the "acceptance criteria" are implied to be achieving analytical performance characteristics that are comparable to or improved from the predicate device, thereby demonstrating substantial equivalence. The table shows that the new device generally performs comparably or better (e.g., lower LoB, explicit linearity claim, more detailed precision data, and a wider range of concentrations with good precision).

Regarding the study proving the device meets the acceptance criteria:

The document describes the submission as a 510(k) for an in-vitro diagnostic kit. The "study" here refers to the analytical performance evaluation conducted by the manufacturer to demonstrate substantial equivalence to the predicate device. The information provided is a summary of the device's analytical characteristics.

2. Sample size used for the test set and the data provenance:

Sample Size: Not explicitly stated in terms of number of individual patient samples. The precision data lists several concentration levels (e.g., 16.7 ng/mL, 22.5 ng/mL, etc.), implying multiple measurements were taken at each level. The cross-reactivity and hook effect studies would have involved specific spiked samples.
Data Provenance: Not explicitly stated (e.g., country of origin). It's an in-house analytical validation, likely conducted at the manufacturer's facility. It is a retrospective analysis of laboratory-prepared samples or collected blood spots rather than a prospective clinical study involving external patient recruitment.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

This question is more applicable to AI/ML devices that rely on expert interpretation for ground truth. For this in-vitro diagnostic assay, the "ground truth" for reported values (e.g., IRT concentration) is established by the analytical method itself and calibration against known standards. There's no mention of external expert consensus for establishing ground truth for the analytical performance data.

4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:

Not applicable for this type of analytical performance study of an in-vitro diagnostic kit. Adjudication methods like 2+1 or 3+1 are typically used in clinical studies where multiple human readers interpret medical images or clinical data, and a disagreement resolution process is needed to establish a definitive ground truth.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance:

Not applicable. This is not an AI/ML device designed to assist human readers. It's an automated immunoassay system for quantitative measurement.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:

No, this is not an AI/ML algorithm. It is an automated immunoassay kit where the "algorithm" is the biochemical reaction and the instrument's measurement and calculation of IRT concentration. The device operates in a standalone analytical capacity to measure IRT.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc):

The ground truth for the analytical performance characteristics (such as concentration, linearity, limit of blank, limit of detection, cross-reactivity, hook effect, and precision) would be established by:
- Reference materials/known standards: For calibration, linearity, and determining accurate concentrations.
- Spiked samples: For cross-reactivity and hook effect studies where known interferents or high concentrations are added.
- Repeated measurements: For precision studies.
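
For the precision item above, repeated measurements are usually summarized as a coefficient of variation (CV%), the form in which the table in section 1 reports precision. A minimal sketch follows. It computes a single CV over all results, whereas a full precision study would separate within-run, between-run, and other variance components. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(results):
    """Coefficient of variation (%) = sample standard deviation / mean * 100."""
    if len(results) < 2:
        raise ValueError("at least two results are needed")
    return 100.0 * stdev(results) / mean(results)


# Hypothetical replicate results in ng/mL blood (not data from the document).
replicates = [9.0, 10.0, 11.0]
print(f"CV = {cv_percent(replicates):.1f}%")
```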

8. The sample size for the training set:

Not explicitly stated, and the concept of a "training set" as understood in AI/ML is not directly applicable. For this type of device, development involves optimizing the assay components and conditions, which is an iterative process using various samples (e.g., patient samples, spiked samples, controls) but not typically referred to as a discrete "training set" in the AI/ML context.

9. How the ground truth for the training set was established:

As above, the concept of a "training set" with established ground truth in the AI/ML sense is not relevant here. Ground truth for internal development and optimization would be based on the known biochemical properties of the reagents, reference standards, and performance evaluation criteria.

K Number

K103484

Device Name

GSP NEONATAL THYROXINE (T4)

Manufacturer

WALLAC OY, A SUBSIDIARY OF PERKINELMER, INC.

Date Cleared

2011-04-22

(147 days)

Reference & Predicate Devices

K943416

Predicate For

N/A

Intended Use

The GSP Neonatal Thyroxine (T4) kit is intended for the quantitative determination of human thyroxine (T4) in blood specimens dried on filter paper as an aid in screening newborns for congenital (neonatal) hypothyroidism using the GSP instrument.

Device Description

The GSP Neonatal T4 assay is a solid phase time-resolved fluoroimmunoassay based on the competitive reaction between europium-labeled T4 and sample T4 for a limited amount of binding sites on T4 specific monoclonal antibodies (derived from mice). The use of 8-anilino-1-naphthalenesulfonic acid (ANS) and salicylate in the T4 Assay Buffer facilitates the release of T4 from the binding proteins. Thus the assay measures the total amount of T4 in the test specimen. A second antibody, directed against mouse IgG, is coated to the solid phase, and binds the IgG-thyroxine complex, giving convenient separation of the antibody-bound and free antigen. DELFIA Inducer dissociates europium ions from the labeled antibody into solution where they form highly fluorescent chelates with components of DELFIA Inducer. The fluorescence in each well is then measured. The fluorescence of each sample is inversely proportional to the concentration of T4 in the sample.
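
In a competitive assay the signal falls as the analyte concentration rises, so the calibration curve must be inverted to obtain a concentration. One widely used model for such curves is the four-parameter logistic (4PL), assumed here for illustration only: the document does not state which curve model the GSP instrument applies. The function names and parameter values are hypothetical.

```python
def four_pl_signal(conc, top, bottom, ec50, slope):
    """4PL model: signal decreases from `top` toward `bottom` as conc rises."""
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** slope)


def four_pl_concentration(signal, top, bottom, ec50, slope):
    """Invert the 4PL model to recover concentration from a measured signal."""
    if not bottom < signal < top:
        raise ValueError("signal must lie strictly between bottom and top")
    return ec50 * ((top - bottom) / (signal - bottom) - 1.0) ** (1.0 / slope)


# Hypothetical curve parameters (counts; ec50 in µg/dL), not from the document.
params = dict(top=100000.0, bottom=2000.0, ec50=8.0, slope=1.2)

signal = four_pl_signal(5.0, **params)
print(f"signal at 5.0 µg/dL: {signal:.0f} counts")
print(f"recovered concentration: {four_pl_concentration(signal, **params):.2f} µg/dL")
```

The round trip from concentration to signal and back illustrates the inverse relationship described above: a higher T4 concentration yields a lower fluorescence reading.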

AI/ML Overview

The provided text describes a 510(k) premarket notification for an in vitro diagnostic device, the GSP Neonatal Thyroxine (T4) kit. This type of submission focuses on demonstrating substantial equivalence to a legally marketed predicate device rather than conducting a full clinical study with specific acceptance criteria and ground truth for disease diagnosis in the same way an AI/ML powered device might.

Therefore, the requested information regarding "acceptance criteria" for an AI device, "sample size for the test set," "number of experts," "adjudication method," "MRMC study," "standalone performance," and "ground truth for training/testing" in the context of an AI/ML study does not directly apply to this submission.

The document does, however, contain the closest analogous information: the performance characteristics presented to demonstrate equivalence.

The answers below are based on the provided document and interpret "acceptance criteria" as performance metrics for this diagnostic kit.

1. Table of Acceptance Criteria and Reported Device Performance

For an in-vitro diagnostic kit like this, "acceptance criteria" are typically defined by demonstrating that the new device performs comparably to or within acceptable ranges relative to a predicate device and established analytical performance specifications. The document provides a comparison of various features and performance characteristics between the new GSP Neonatal T4 kit and its predicate device, AutoDELFIA Neonatal T4 Kit.

Performance Characteristic	Predicate Device (AutoDELFIA T4) Performance (Analogous to "Acceptance Criteria" for comparison)	GSP Neonatal T4 Kit Reported Performance (Analogous to "Device Performance")
Precision (CVs)	Control 1 (3.95 µg/dL serum): intra-assay variation 14.9%, inter-assay variation 10.0%, total variation 18.0%; Control 2 (8.08 µg/dL serum): intra-assay variation 10.6%, inter-assay variation 7.1%, total variation 12.7%; Control 3 (18.2 µg/dL serum): intra-assay variation 8.2%, inter-assay variation 4.3%, total variation 9.3%	Sample 1 (2.0 µg/dL): within run 1.0%, within lot 15.5%, total variation 15.8%; Sample 2 (4.8 µg/dL): within run 7.3%, within lot 10.7%, total variation 11.4%; Sample 3 (7.5 µg/dL): within run 6.5%, within lot 8.4%, total variation 8.6%; Sample 4 (16.6 µg/dL): within run 4.5%, within lot 7.8%, total variation 8.5%; Sample 5 (19.8 µg/dL): within run 7.2%, within lot 9.9%, total variation 10.3%; Sample 6 (21.4 µg/dL): within run 7.1%, within lot 9.8%, total variation 10.1%
Measuring Range	1.5 µg/dL to the highest level calibrator	1.6 to 30 µg/dL serum
Limit of Blank (LoB)	< 1.5 µg/dL	0.457 µg/dL
Limit of Detection (LoD)	Not available	0.99 µg/dL
Limit of Quantitation (LoQ)	Not available	1.61 µg/dL
Interference	Bilirubin at 20 mg/dL has no significant effect.	Icteric (unconjugated bilirubin ≤ 342 µmol/L, equivalent to 20 mg/dL in serum, and conjugated bilirubin ≤ 237 µmol/L, equivalent to 20 mg/dL in serum), Lipemic (Intralipid¹ ≤ 15 mg/mL in serum), and Hemoglobin up to 15 g/L samples do not interfere. [¹Intralipid is a registered trademark of Fresenius Kabi AB.]
Cross-reactivity	LT3: 0.89%; 3,3',5-Triiodothyroacetic acid: 0.45%; 3,5-Diiodo-L-thyronine: < 0.1%; 3,5-Diiodotyrosine (DIT): < 0.1%; 5,5-Diphenylhydantoin: < 0.1%; 3-iodo-L-tyrosine (MIT): < 0.1%; Phenylbutazone: < 0.1%; 6-n-Propyl-2-thiouracil: < 0.1%; Methimazole: < 0.1%; L-Tyrosine: < 0.1%; Acetylsalicylic acid: < 0.1%	LT3: 1.67%; 3,3',5-Triiodothyroacetic acid: 0.14%; 3,5-Diiodo-L-thyronine: < 0.1%; 3,5-Diiodotyrosine (DIT) dihydrate: < 0.1%; 5,5-Diphenylhydantoin: < 0.1%; 3-iodo-L-tyrosine (MIT): < 0.1%; Phenylbutazone: < 0.1%; 6-n-Propyl-2-thiouracil: < 0.1%; Methimazole: < 0.1%; L-Tyrosine: < 0.1%; Acetylsalicylic acid: < 0.01%
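
Cross-reactivity percentages like those in the last table row are often expressed as the apparent analyte concentration produced by a cross-reactant, relative to the amount of cross-reactant added. The sketch below assumes that definition. The document does not state how its values were derived (other definitions, such as ratios at 50% displacement, exist), and the function name and example values are hypothetical.

```python
def cross_reactivity_percent(apparent_analyte, cross_reactant_added):
    """Cross-reactivity (%) = apparent analyte conc / cross-reactant conc * 100.

    Both concentrations must be expressed in the same unit.
    """
    if cross_reactant_added <= 0:
        raise ValueError("cross-reactant concentration must be positive")
    return 100.0 * apparent_analyte / cross_reactant_added


# Hypothetical example: 200 µg/dL of a cross-reactant reads as 0.5 µg/dL T4.
print(f"{cross_reactivity_percent(0.5, 200.0):.2f}%")
```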

2. Sample size used for the test set and the data provenance

The document does not specify a separate "test set" sample size in the context of an AI/ML algorithm validation. Instead, it describes analytical performance studies.

Precision study: The precision data (Within run, Within lot, Total variation) is presented for six different samples at various concentration levels (2.0, 4.8, 7.5, 16.6, 19.8, 21.4 µg/dL). The number of replicates or runs for each sample is not explicitly stated.
Interference study: Not explicitly stated, but the document mentions testing with specific concentrations of bilirubin, Intralipid, and hemoglobin.
Cross-reactivity study: Not explicitly stated, but specific substances and their cross-reactivity percentages are listed.
Data Provenance: The document does not specify the country of origin or whether the data was retrospective or prospective. Given the nature of a premarket submission for an IVD kit, these studies are typically conducted by the manufacturer as part of the validation process.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

This question is not applicable. For an immunoassay kit like this, the "ground truth" is established by the direct measurement of T4 concentration using the device itself, calibrated against known standards. There are no human "experts" establishing ground truth through image review or clinical assessment in the way an AI/ML device for diagnosis would require. The "qualification" of personnel pertains to "adequately trained laboratory personnel" running the assay.

4. Adjudication method for the test set

This question is not applicable. Adjudication methods are relevant for studies where multiple independent human readers or algorithms are assessing the same data, and their results need to be reconciled (e.g., in medical image interpretation). This is an analytical performance study of an immunoassay.

5. If a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance

This question is not applicable. This is an immunoassay kit, not an AI/ML-powered device intended for human-in-the-loop assistance in clinical decision-making or image interpretation. Therefore, no MRMC study was performed.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

This question is not applicable. The device itself is a standalone diagnostic kit (reagents and instrument) that produces quantitative T4 measurements. It is not an AI/ML algorithm.

7. The type of ground truth used

For an immunoassay, the "ground truth" for evaluating the device's performance is typically established through:

Known Calibrator Concentrations: The device's measurements are compared against a standard curve generated from calibrators with precisely known T4 concentrations.
Reference Methods: If available, comparison to a gold standard reference method for T4 measurement would be part of validation (though not explicitly detailed as "ground truth" in this summary).
Control Materials: Use of quality control materials with target T4 concentrations.

The document discusses "calibrators" and "controls" which are used to establish and verify the accuracy of the measurements.

8. The sample size for the training set

This question is not applicable. This is not an AI/ML device that requires a "training set" for model development. The performance data presented relates to the validation of the manufactured kit through analytical studies.

9. How the ground truth for the training set was established

This question is not applicable as there is no "training set" in the context of an AI/ML model for this type of medical device submission.
