Search Results

The NeoLSD MSMS Kit is intended for the quantitative measurement of the activity of the enzymes acid-β-glucocerebrosidase (ABG), acid-sphingomyelinase (ASM), acid-α-glucosidase (GAA), β-galactocerebrosidase (GALC), α-galactosidase A (GLA) and α-L-iduronidase (IDUA) in dried blood spots (DBS) from newborn babies. The analysis of the enzymatic activity is intended as an aid in screening newborns for the following lysosomal storage disorders (LSD), respectively: Gaucher Disease, Niemann-Pick A/B Disease, Pompe Disease, Krabbe Disease, Fabry Disease, and MPS I Disease.

Device Description

The NeoLSD MSMS test system uses mass spectrometry to quantitatively measure the activity of six lysosomal enzymes simultaneously from a dried blood spot sample. The NeoLSD MSMS test system is comprised of:

1. NeoLSD MSMS kit, including substrates, internal standards, solutions and controls
2. Waters TQD MSMS instrument, comprising:
a. Waters 1525 sample pump
b. Waters 2777c autosampler
c. Waters MassLynx v4.1 firmware C.
d. Power cables, tubing, syringes, connection cables
3. Waters NeoLynx v4.1 software and computer with monitor
4. PerkinElmer MSMS Workstation Software

The NeoLSD MSMS kit evaluates enzyme activities by measuring the product generated when an enzyme reacts with a synthesized substrate to create a specific end product. The activities of the six lysosomal enzymes present in a 3.2 mm punch from a dried blood spot (DBS) are simultaneously measured by the NeoLSD MSMS kit. The punches are incubated with the assay reagent mixture, which contains:
- six substrates, one corresponding to each lysosomal enzyme
- six stable-isotope mass-labeled internal standards (IS), each designed to chemically resemble the corresponding product generated
- a buffer to maintain the reaction pH and to carry inhibitors that limit activity from competing enzymes, if present, and additives that enhance the targeted enzyme reactions.

AI/ML Overview

The NeoLSD MSMS Kit is intended for the quantitative measurement of the activity of six lysosomal enzymes (acid-β-glucocerebrosidase (ABG), acid-sphingomyelinase (ASM), acid-α-glucosidase (GAA), β-galactocerebrosidase (GALC), α-galactosidase A (GLA), and α-L-iduronidase (IDUA)) in dried blood spots (DBS) from newborn babies. The analysis of enzymatic activity serves as an aid in screening newborns for Gaucher Disease, Niemann-Pick A/B Disease, Pompe Disease, Krabbe Disease, Fabry Disease, and MPS I Disease.

Here's an analysis of the acceptance criteria and the study that proves the device meets them:

1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria are generally implied by the performance metrics reported, such as linearity ranges, precision (reproducibility %CV), and LoQ values. Screening performance, particularly sensitivity and specificity, is key for a screening tool.

Performance Characteristic	Acceptance Criteria (Implied)	Reported Device Performance (NeoLSD MSMS Kit)
Linear Range	Broad enough to cover physiological and pathological ranges	IDUA: 0.34 – 17.2 µmol/L/h; GAA: 0.44 – 24.2 µmol/L/h; ABG: 0.69 – 20.1 µmol/L/h; GLA: 0.97 – 20.9 µmol/L/h; ASM: 0.90 – 20.5 µmol/L/h; GALC: 0.63 – 6.3 µmol/L/h
Lower Limit of Quantitation (LoQ)	Low enough to detect deficient enzyme activity (within acceptable CV%)	IDUA: 0.44 µmol/L/h (CV% at LoQ: 18.2%); GAA: 0.63 µmol/L/h (CV% at LoQ: 17.5%); ABG: 0.69 µmol/L/h (CV% at LoQ: 21.7%); GLA: 0.97 µmol/L/h (CV% at LoQ: 17.5%); ASM: 0.90 µmol/L/h (CV% at LoQ: 20.0%); GALC: 0.34 µmol/L/h (CV% at LoQ: 20.6%)
Reproducibility (%CV)	Within acceptable limits for a diagnostic assay (e.g., <20-30%)	Within-Laboratory CV% Range: IDUA 4.7 – 6.9%; GAA 4.2 – 5.5%; ABG 11.6 – 13.8%; GLA 5.0 – 13.3%; ASM 7.3 – 11.0%; GALC 7.9 – 19.5%. Between-Laboratory CV% Range: IDUA 4.4 – 8.1%; GAA 3.5 – 7.6%; ABG 4.7 – 15.8%; GLA 5.6 – 8.4%; ASM 1.8 – 6.6%; GALC 2.1 – 7.0%. Overall Reproducibility CV% Range: IDUA 6.9 – 10.0%; GAA 5.6 – 9.4%; ABG 13.0 – 21.0%; GLA 8.6 – 15.7%; ASM 7.6 – 11.4%; GALC 9.3 – 20.7%
Sensitivity (overall)	High, to minimize false negatives in screening (e.g., >90%)	92.9% (76.5%-99.1%) (excluding invalid and lost-to-follow-up, including 2 Fabry females that were false negatives) With female Fabry subjects excluded, the test system has no false negative results for any of the enzymes.
Specificity (overall)	High, to minimize false positives (e.g., >95%)	99.4% (99.1%-99.6%) (excluding invalid and lost-to-follow-up)
False Positive Rate (overall)	Low, to minimize unnecessary follow-up (e.g., <5%)	0.6% (0.4% - 0.9%)
False Negative Rate (overall)	Very low, critical for screening (e.g., <1%)	7.1%* (0.9% - 23.5%) (*includes 2 Fabry females). When female Fabry subjects are excluded, the test system has no false negative results for any of the enzymes.
Interference	Minimal, or clearly identified and manageable	Several potential interferents identified (e.g., Glucose, Hematocrit, Hemoglobin, Triglycerides, EDTA), with their effects and implications described. For most, the interferences are not pronounced enough to impair affected/unaffected separation or occur at clinically irrelevant concentrations. Specific warnings are provided for high glucose, hematocrit, and triglyceride levels near cut-off values.

Note: The document provides performance metrics, implying these are the acceptance criteria that the device has met or is expected to meet for its intended use as a newborn screening aid.
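The overall sensitivity interval quoted in the table can be reproduced with an exact (Clopper-Pearson) binomial interval. The sketch below assumes 26 true positives among 28 evaluable confirmed-positive specimens. Those counts are not stated in the document; they are inferred from the reported 92.9% sensitivity and 2 false negatives. The interval method used in the submission is also not stated, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float, hi: float, iters: int = 100) -> float:
    """Root of a decreasing function that is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for the proportion x/n."""
    # Lower bound solves P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)), 0.0, 1.0)
    # Upper bound solves P(X <= x | p) = alpha / 2.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


# Assumed counts (not given in the document): 26 detected out of 28 affected.
sens_lo, sens_hi = clopper_pearson(26, 28)
print(f"sensitivity {26 / 28:.1%} ({sens_lo:.1%} - {sens_hi:.1%})")
```

Under the same assumption, the false negative rate interval is the complement of this one, which is why the table's 7.1% (0.9% - 23.5%) mirrors the sensitivity figures.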

2. Sample Size Used for the Test Set and Data Provenance

Sample Size for Screening Performance Study:
- Routine Samples: 4011 newborn specimens (retrospective; 4 years old at the time of the study, which allowed follow-up of clinical status).
- Confirmed LSD Positive Samples (enriched): 30 newborn DBS specimens (from the site's biobank, ranging from 5.8 to 17.6 years of age).
- Total Test Set: 4041 specimens (4011 routine + 30 confirmed positive).
Data Provenance:
- Routine Samples: Retrospective routine newborn screening samples, 4 years old, from an EU (European) newborn screening laboratory.
- Confirmed LSD Positive Samples: From the site's biobank (likely the same EU lab), with ages ranging from 5.8 to 17.6 years.

3. Number of Experts Used to Establish Ground Truth for the Test Set and Qualifications

The document does not explicitly state the number or qualifications of experts used to establish the ground truth for the test set.

Instead, the ground truth for the 4011 routine samples was established based on:

Clinical outcome: "Clinical outcome was used as a comparator for all samples, including the 4011 routine screening samples, as derived from the civil registry status and national hospital registry. Subject's survival at 4 years of age without LSD diagnosis or clinical signs suggestive of an LSD was used as clinical confirmation of an unaffected newborn."
For the 30 confirmed LSD positive samples: Their status was "known" as "confirmed LSD positive newborn DBS specimens."

Therefore, the ground truth relies on clinical follow-up data and prior confirmed diagnoses, rather than a panel of experts adjudicating each case for the study.

4. Adjudication Method for the Test Set

No explicit "adjudication method" in the sense of expert review (e.g., 2+1, 3+1) is described for the test set. The ground truth was established by:

Clinical outcome and registry data for routine samples to determine "unaffected" status.
Known (prior confirmed) diagnoses for the "confirmed positive" samples.
Screening algorithm: For routine samples, those below the initial cut-off were re-tested in duplicate to classify as normal, presumptive positive, or invalid.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done

No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not done. This device is a diagnostic kit that quantitatively measures enzyme activity, not an interpretative imaging AI tool that assists human readers. Therefore, the concept of "human readers improve with AI vs without AI assistance" does not apply here.

6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) was done

Yes, the screening performance study essentially represents a standalone (algorithm only) performance for the NeoLSD MSMS kit. The device measures enzyme activity, and the "screening results" (positive, negative, invalid) are derived directly from these quantitative measurements compared against predefined cut-off values and a re-testing algorithm. While a human laboratory technician performs the assay, the interpretation of the results as "screen positive" or "screen negative" is determined by the device's output and the established algorithm, without human interpretative judgment affecting the individual sample classification.

7. The Type of Ground Truth Used

The ground truth used was primarily:

Outcomes data/Clinical Confirmation: For the 4011 routine samples, "Subject's survival at 4 years of age without LSD diagnosis or clinical signs suggestive of an LSD was used as clinical confirmation of an unaffected newborn." This is a form of clinical outcome data.
Pathology/Confirmed Diagnosis: For the 30 enriched samples, they were "confirmed LSD positive" specimens, indicating a definitive medical diagnosis.

8. The Sample Size for the Training Set

The document describes studies for establishing reference ranges and calibration, but it does not explicitly describe a "training set" in the context of machine learning model development. The development of this assay likely involved extensive analytical validation (e.g., linearity, LoQ, interference) and establishing reference ranges using large sample sets, which might be considered analogous to a training or development phase for defining assay parameters and cut-offs.

Reference Range Establishment:
- EU site: 5041 newborn samples were tested to establish cut-off values. These were "retrospective routine newborn screening samples" from newborns 0-30 days of age.
- US Site A: 5251 newborn DBS specimens, newborns ≤ 4 days.
- US Site B: 5053 newborn DBS specimens, newborns ≤ 7 days.

These large cohorts were used to determine population distributions, medians, and percentiles to set initial and retest cut-off values. While not a "training set" for an AI algorithm, they serve a similar purpose in defining the operational parameters for the device's classification logic.

9. How the Ground Truth for the Training Set Was Established

Given that there isn't a "training set" for an AI model, the "ground truth" for establishing the reference ranges and cut-offs was based on:

Population Distribution: Statistical analysis of enzyme activity levels in large cohorts of presumably healthy newborns (5041 from EU, 5251 from US Site A, 5053 from US Site B).
Expert-defined Percentiles: The initial cut-off values were based conservatively on "0.1 - 0.3 percentile of enzyme activity distribution and converted to a percentage of population median activity," which reflects expert consensus on appropriate thresholds for screening. The "retest cut-off values were set 5% lower from the initial cut-off percentage."

This process is standard for establishing normal ranges and screening cut-offs for diagnostic assays and involves statistical methods and clinical expert judgment in setting initial thresholds.
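The cut-off derivation described above can be sketched as follows. This is a minimal illustration, not the manufacturer's procedure: the 0.2 percentile is just one value inside the quoted 0.1 - 0.3 range, the linear-interpolation percentile is an assumed method, "5% lower" is read here as 5 percentage points of the population median (a relative 5% reduction is another possible reading), and the activity values are synthetic.

```python
from statistics import median


def percentile(values: list[float], pct: float) -> float:
    """Percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = pct / 100 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def derive_cutoffs(activities: list[float], pct: float = 0.2,
                   retest_offset: float = 5.0) -> dict[str, float]:
    """Express a low percentile of the activity distribution as % of the median."""
    pop_median = median(activities)
    initial_pct = percentile(activities, pct) / pop_median * 100
    # Assumption: the retest cut-off sits 5 percentage points below the initial one.
    retest_pct = initial_pct - retest_offset
    return {
        "median": pop_median,
        "initial_pct_of_median": initial_pct,
        "retest_pct_of_median": retest_pct,
    }


# Synthetic enzyme activities (µmol/L/h), evenly spaced from 5.0 to 15.0.
activities = [5 + 10 * i / 1000 for i in range(1001)]
cutoffs = derive_cutoffs(activities)
print({name: round(value, 2) for name, value in cutoffs.items()})
```

Expressing the cut-off as a percentage of the population median, rather than as an absolute activity, lets each laboratory rescale it to its own median.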

K Number

K133652

Device Name

GSP NEONATAL TOTAL GALACTOSE KIT

Manufacturer

WALLAC OY, A SUBSIDIARY OF PERKINELMER, INC.

Date Cleared

2014-04-28

(152 days)

Product Code

Regulation Number

Type

Panel

Age Range

All

Reference & Predicate Devices

K090846,K071649

Predicate For

K190335

Intended Use

The GSP Neonatal Total Galactose kit is intended for the quantitative determination of total galactose (galactose and galactose-1-phosphate) concentrations in blood specimens dried on filter paper as an aid in screening newborns for galactosemia using the GSP® instrument.

Device Description

The GSP Neonatal Total Galactose kit contains sufficient reagents to perform 1152 assays. The GSP Neonatal Total Galactose test system measures total galactose, i.e. both galactose and galactose-1-phosphate, using a fluorescent galactose oxidase method. The fluorescence is measured using an excitation wavelength of 505 nm and an emission wavelength of 580 nm. The kit contains Neonatal Total Galactose Assay Reagent 1, Neonatal Total Galactose Assay Reagent 2, Neonatal Total Galactose Assay Buffer, Neonatal Total Galactose Assay Reconstitution Solution, and Neonatal Extraction Solution. Calibrators and Controls are also included.

AI/ML Overview

1. Table of Acceptance Criteria and Reported Device Performance

Performance Characteristic	Reported Device Performance
Precision (Total Variation)	Ranged from 9.3% to 14.1% CV.
Limit of Blank (LoB)	0.34 mg/dL
Limit of Detection (LoD)	0.97 mg/dL
Limit of Quantitation (LoQ)	1.15 mg/dL (defined as the lowest concentration with a total CV equal to or less than 20%).
Linearity	Demonstrated linear performance throughout the measuring range (from 1.15 mg/dL to 50 mg/dL).
Recovery	Average recovery of 109% for galactose, 117% for galactose-1-phosphate, and 103% for both combined from three contrived dried blood spot samples.
Interference	Acetaminophen: Concentrations above 2.75 mg/dL caused a significant decrease (>15%) in measured total galactose. Maximum tested (5.5 mg/dL) caused a decrease of ~20-22%. Conjugated Bilirubin: Concentrations above 16.6 mg/dL caused a significant decrease (>15%) in measured total galactose. At 24.9 mg/dL and above, the decrease was 100% at some total galactose concentrations. Intralipid: Concentrations above 250 mg/dL (at 5 and 10 mg/dL total galactose) or 375 mg/dL (at 15 mg/dL total galactose) caused a significant increase (>15%) in measured total galactose. Maximum tested (1500 mg/dL) caused an increase of ~52-77%. Hemoglobin (with Bilirubin): Hemoglobin levels at 198 g/L and above in combination with an elevated bilirubin level of 15 mg/dL caused a significant increase (>15%) in measured total galactose at certain total galactose concentrations. For example, at 5 mg/dL total galactose, 198 g/L Hb led to a 26.3% increase. Non-Interfering Substances: Unconjugated bilirubin (20 mg/dL), β-Nicotinamide adenine dinucleotide (100 µmol/L), Glutathione (3 mmol/L), Human Serum Albumin (30 mg/mL), Ascorbate (6 mg/dL), D-glucose (1000 mg/dL), D-mannose (100 mg/dL), D-fructose (18 mg/dL), Ampicillin (152 µmol/L), Lithium heparin (0.375 mg/mL), and Hematocrit levels from 30% to 66% (102-230 g/L Hemoglobin) were found not to interfere.
Screening Performance vs. Predicate (95th percentile)	Overall percent agreement = 96.0%; Positive percent agreement = 63.6%; Negative percent agreement = 97.9%
Screening Performance vs. Predicate (99th percentile)	Overall percent agreement = 98.8%; Positive percent agreement = 53.3%; Negative percent agreement = 99.4%
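The three agreement figures in the last two rows are computed from a 2x2 table of new-device versus predicate screening results. The document does not give the underlying counts, so the sketch below uses purely hypothetical counts to show the arithmetic; the function and parameter names are illustrative.

```python
def percent_agreement(both_pos: int, new_only_pos: int,
                      predicate_only_pos: int, both_neg: int) -> dict[str, float]:
    """Overall, positive and negative percent agreement against a predicate."""
    total = both_pos + new_only_pos + predicate_only_pos + both_neg
    predicate_pos = both_pos + predicate_only_pos  # all predicate-positive samples
    predicate_neg = both_neg + new_only_pos        # all predicate-negative samples
    return {
        "overall": 100 * (both_pos + both_neg) / total,
        "positive": 100 * both_pos / predicate_pos,
        "negative": 100 * both_neg / predicate_neg,
    }


# Hypothetical counts for illustration only (not taken from the submission).
result = percent_agreement(both_pos=14, new_only_pos=10,
                           predicate_only_pos=6, both_neg=970)
print({name: round(value, 1) for name, value in result.items()})
```

Because predicate-positive samples are rare in a screening population, positive percent agreement rests on a small denominator and moves sharply with each discordant sample, while overall agreement is dominated by the negatives.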

2. Sample Size Used for the Test Set and Data Provenance

Sample Size for Precision Study: 7 samples, with 216 total measurements per sample (4 replicates per sample, in 27 runs over 21 days using 3 kit lots and 3 GSP instruments).
Sample Size for LoD: 216 determinations of 4 low-level samples.
Sample Size for LoB: 150 blank samples.
Sample Size for Recovery: 3 contrived dried blood spot samples.
Sample Size for Interference Studies: Not explicitly stated for each concentration, but involved various concentrations of interfering substances at three total galactose concentrations (5, 10, and 15 mg/dL).
Sample Size for Internal Method Comparison: 141 routine screening and spiked blood spot specimens.
Sample Size for Screening Performance Study: 2320 samples (6 confirmed positive samples and 2314 routine samples).
Data Provenance: The screening performance study was conducted at "one newborn screening laboratory in the United States." Other non-clinical studies (precision, linearity, LoB/LoD/LoQ, recovery, interference) appear to be internal laboratory studies without specific geographic provenance mentioned, but presumably also conducted in the US or Finland (Wallac Oy headquarters). The studies were retrospective, using banked samples and contrived samples.

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

The document does not mention the use of experts to establish ground truth for the test set. For the screening performance study, "6 confirmed positive samples" are mentioned, implying prior clinical diagnosis as the ground truth. The qualifications of who confirmed these positive cases or how the "routine samples" were classified as normal are not specified.

4. Adjudication Method for the Test Set

Not applicable. The document does not describe any expert adjudication process for the test set. Ground truth for confirmed positive samples likely came from clinical diagnosis.

5. If a Multi Reader Multi Case (MRMC) Comparative Effectiveness Study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance

Not applicable. This device is an in-vitro diagnostic test kit (laboratory assay) for quantitative determination, not an imaging device or AI-assisted diagnostic tool that would involve human readers interpreting results in an MRMC study.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done

This refers to the performance of the assay itself. The entire submission details the standalone performance of the GSP Neonatal Total Galactose kit (assay only) without human-in-the-loop interpretation beyond standard laboratory procedures and reporting. The "GSP® instrument" is automated, as stated in the comparison chart ("GSP instrument, automated").

7. The Type of Ground Truth Used

For Analytical Performance (Precision, LoB, LoD, LoQ, Linearity, Recovery, Interference): Ground truth was established by preparing samples with known concentrations of total galactose or specific interfering substances. For example, for recovery, the "recovery of galactose, galactose-1-phosphate, and both combined was determined from three contrived dried blood spot samples," meaning these samples were prepared with known amounts.
For Screening Performance Study: The ground truth for the 6 positive samples was "confirmed positive." This implies a clinical diagnosis of galactosemia, likely through follow-up diagnostic testing. The other 2314 samples are referred to as "routine samples" and classified as "normal" in the context of screening performance, likely based on either their negative predicate device result or their actual clinical status. The document also compares the new device's results against the predicate device's results as a reference for "Manual result."

8. The Sample Size for the Training Set

Not applicable in the conventional sense of machine learning training sets. This is a chemical assay, not an AI/ML device that requires a distinct training set for model development. The development and optimization of the assay would involve various experiments, but these are not referred to as "training sets" in this context.

9. How the Ground Truth for the Training Set Was Established

Not applicable, as there is no "training set" in the context of an AI/ML model for this chemical assay. The development of the calibrators and controls (which are prepared with known concentrations of galactose and galactose-1-phosphate) serves an analogous function in ensuring accuracy and consistency of the assay.

K Number

K110274

Device Name

AUTODELFIA NEONATAL IRT KIT

Manufacturer

WALLAC OY, A SUBSIDIARY OF PERKINELMER, INC.

Date Cleared

2011-06-10

(130 days)

Product Code

Regulation Number

Type

Panel

Age Range

All

Reference & Predicate Devices

N/A

Predicate For

N/A

Intended Use

The AutoDELFIA Neonatal IRT kit is intended for the quantitative determination of human immunoreactive trypsin(ogen) (IRT) in blood specimens dried on filter paper as an aid in screening newborns for cystic fibrosis using the 1235 AutoDELFIA® automatic immunoassay system.

Device Description

The AutoDELFIA Neonatal IRT assay is a solid phase, two-site fluoroimmunometric assay based on the direct sandwich technique in which two monoclonal antibodies (derived from mice) are directed against two separate antigenic determinants on the IRT molecule. Calibrators, controls and test specimens containing IRT are reacted simultaneously with immobilized monoclonal antibodies directed against a specific antigenic site on the IRT molecule and europium-labeled monoclonal antibodies (directed against a different antigenic site) in assay buffer. The assay buffer elutes IRT from the dried blood on filter paper disks. The complete assay requires only one incubation step. Enhancement Solution dissociates europium ions from the labeled antibody into solution where they form highly fluorescent chelates with components of the Enhancement Solution. The fluorescence in each well is then measured. The fluorescence of each sample is proportional to the concentration of IRT in the sample.

AI/ML Overview

The provided text describes a 510(k) premarket notification for the AutoDELFIA Neonatal IRT kit. This submission focuses on demonstrating substantial equivalence to a predicate device, rather than providing a detailed study report with specific acceptance criteria and performance data in the format typically used for AI/ML device evaluations. As such, many of the requested fields for AI/ML device studies are not directly applicable or explicitly stated in this document.

However, I can extract information related to the device's analytical performance characteristics, which serve as a form of acceptance criteria for this type of in-vitro diagnostic device.

Here's an attempt to populate the table and answer the questions based on the provided text, indicating where information is not available.

1. Table of Acceptance Criteria and the Reported Device Performance

For this in-vitro diagnostic device, "acceptance criteria" are generally established by demonstrating performance characteristics that are comparable to or better than a legally marketed predicate device, and that meet the required analytical performance for its intended use.

Characteristic (Feature)	Acceptance Criteria (from Predicate Device)	Reported Device Performance (New Device: B005-212/B005-204)
Measuring Range	4 (as defined by LoB) to 500 (as defined by upper calibrator) ng/mL blood	16 to 480 ng/mL blood
Linearity Range	No claims for linearity in labeling.	16 to 480 ng/mL blood
Analytical Sensitivity / Limit of Blank (LoB)	< 4 ng/mL blood	0.53 ng/mL blood
Limit of Detection (LoD)	Not explicitly stated, implied to be around 4 ng/mL blood (from LoB)	2.9 ng/mL blood
Antibody Cross-Reactions	α2-macroglobulin < 4 ng/ml blood, α1-antitrypsin < 4 ng/ml blood, Phospholipase A2 < 4 ng/ml blood, Chymotrypsin < 4 ng/ml blood, Human IgG < 4 ng/ml blood, Uropepsinogen < 4 ng/ml blood	α2-macroglobulin < 4 ng/ml blood, α1-antitrypsin < 4 ng/ml blood, Phospholipase A2 < 4 ng/ml blood, Chymotrypsin < 4 ng/ml blood, Human IgG < 4 ng/ml blood, Uropepsinogen < 4 ng/ml blood (All "Same" as predicate, which explicitly lists these values)
Hook effect	No hook effect has been found with IRT concentrations up to 40,000 ng/mL	No hook effect has been found with IRT concentrations up to 40,000 ng/mL
Precision (Total Variation CV%)	42.6 ng/mL blood CV% 9.3, 98.8 ng/mL blood CV% 10.0, 266 ng/mL blood CV% 9.6	16.7 ng/mL blood CV% 8.7, 22.5 ng/mL blood CV% 9.6, 48.0 ng/mL blood CV% 9.1, 104 ng/mL blood CV% 8.0, 247 ng/mL blood CV% 8.3, 401 ng/mL blood CV% 8.4, 449 ng/mL blood CV% 9.4

Note on "Acceptance Criteria": For this 510(k) submission, the "acceptance criteria" are implied to be achieving analytical performance characteristics that are comparable to or improved from the predicate device, thereby demonstrating substantial equivalence. The table shows that the new device generally performs comparably or better (e.g., lower LoB, explicit linearity claim, more detailed precision data, and a wider range of concentrations with good precision).

Regarding the study proving the device meets the acceptance criteria:

The document describes the submission as a 510(k) for an in-vitro diagnostic kit. The "study" here refers to the analytical performance evaluation conducted by the manufacturer to demonstrate substantial equivalence to the predicate device. The information provided is a summary of the device's analytical characteristics.

2. Sample size used for the test set and the data provenance:

Sample Size: Not explicitly stated in terms of number of individual patient samples. The precision data lists several concentration levels (e.g., 16.7 ng/mL, 22.5 ng/mL, etc.), implying multiple measurements were taken at each level. The cross-reactivity and hook effect studies would have involved specific spiked samples.
Data Provenance: Not explicitly stated (e.g., country of origin). It's an in-house analytical validation, likely conducted at the manufacturer's facility. It is a retrospective analysis of laboratory-prepared samples or collected blood spots rather than a prospective clinical study involving external patient recruitment.

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts:

This question is more applicable to AI/ML devices that rely on expert interpretation for ground truth. For this in-vitro diagnostic assay, the "ground truth" for reported values (e.g., IRT concentration) is established by the analytical method itself and calibration against known standards. There's no mention of external expert consensus for establishing ground truth for the analytical performance data.

4. Adjudication method (e.g. 2+1, 3+1, none) for the test set:

Not applicable for this type of analytical performance study of an in-vitro diagnostic kit. Adjudication methods like 2+1 or 3+1 are typically used in clinical studies where multiple human readers interpret medical images or clinical data, and a disagreement resolution process is needed to establish a definitive ground truth.

5. If a multi reader multi case (MRMC) comparative effectiveness study was done, If so, what was the effect size of how much human readers improve with AI vs without AI assistance:

Not applicable. This is not an AI/ML device designed to assist human readers. It's an automated immunoassay system for quantitative measurement.

6. If a standalone (i.e. algorithm only without human-in-the-loop performance) was done:

No, this is not an AI/ML algorithm. It is an automated immunoassay kit where the "algorithm" is the biochemical reaction and the instrument's measurement and calculation of IRT concentration. The device operates in a standalone analytical capacity to measure IRT.

7. The type of ground truth used (expert consensus, pathology, outcomes data, etc):

The ground truth for the analytical performance characteristics (such as concentration, linearity, limit of blank, limit of detection, cross-reactivity, hook effect, and precision) would be established by:
- Reference materials/known standards: For calibration, linearity, and determining accurate concentrations.
- Spiked samples: For cross-reactivity and hook effect studies where known interferents or high concentrations are added.
- Repeated measurements: For precision studies.

8. The sample size for the training set:

Not explicitly stated, and the concept of a "training set" as understood in AI/ML is not directly applicable. For this type of device, development involves optimizing the assay components and conditions, which is an iterative process using various samples (e.g., patient samples, spiked samples, controls) but not typically referred to as a discrete "training set" in the AI/ML context.

9. How the ground truth for the training set was established:

As above, the concept of a "training set" with established ground truth in the AI/ML sense is not relevant here. Ground truth for internal development and optimization would be based on the known biochemical properties of the reagents, reference standards, and performance evaluation criteria.

K Number

K103484

Device Name

GSP NEONATAL THYROXINE (T4)

Manufacturer

WALLAC OY, A SUBSIDIARY OF PERKINELMER, INC.

Date Cleared

2011-04-22

(147 days)

Product Code

Regulation Number

Type

Panel

Age Range

All

Reference & Predicate Devices

K943416

Predicate For

N/A

Intended Use

The GSP Neonatal Thyroxine (T4) kit is intended for the quantitative determination of human thyroxine (T4) in blood specimens dried on filter paper as an aid in screening newborns for congenital (neonatal) hypothyroidism using the GSP instrument.

Device Description

The GSP Neonatal T4 assay is a solid phase time-resolved fluoroimmunoassay based on the competitive reaction between europium-labeled T4 and sample T4 for a limited amount of binding sites on T4 specific monoclonal antibodies (derived from mice). The use of 8-anilino-1-naphthalenesulfonic acid (ANS) and salicylate in the T4 Assay Buffer facilitates the release of T4 from the binding proteins. Thus the assay measures the total amount of T4 in the test specimen. A second antibody, directed against mouse IgG, is coated to the solid phase, and binds the IgG-thyroxine complex, giving convenient separation of the antibody-bound and free antigen. DELFIA Inducer dissociates europium ions from the labeled antibody into solution where they form highly fluorescent chelates with components of DELFIA Inducer. The fluorescence in each well is then measured. The fluorescence of each sample is inversely proportional to the concentration of T4 in the sample.

AI/ML Overview

The provided text describes a 510(k) premarket notification for an in vitro diagnostic device, the GSP Neonatal Thyroxine (T4) kit. This type of submission focuses on demonstrating substantial equivalence to a legally marketed predicate device rather than conducting a full clinical study with specific acceptance criteria and ground truth for disease diagnosis in the same way an AI/ML powered device might.

Therefore, the requested information regarding "acceptance criteria" for an AI device, "sample size for the test set," "number of experts," "adjudication method," "MRMC study," "standalone performance," and "ground truth for training/testing" in the context of an AI/ML study does not directly apply to this submission.

However, I can extract the closest analogous information available within this document, focusing on the performance characteristics presented to demonstrate equivalence.

Here's an attempt to answer the questions based on the provided document, interpreting "acceptance criteria" as performance metrics for this diagnostic kit.

1. Table of Acceptance Criteria and Reported Device Performance

For an in-vitro diagnostic kit like this, "acceptance criteria" are typically defined by demonstrating that the new device performs comparably to or within acceptable ranges relative to a predicate device and established analytical performance specifications. The document provides a comparison of various features and performance characteristics between the new GSP Neonatal T4 kit and its predicate device, AutoDELFIA Neonatal T4 Kit.

Performance Characteristic	Predicate Device (AutoDELFIA T4) Performance (Analogous to "Acceptance Criteria" for comparison)	GSP Neonatal T4 Kit Reported Performance (Analogous to "Device Performance")
Precision (CVs)	Control 1; 3.95 µg/dL serum - Intra-assay variation 14.9 % - Inter-assay variation 10.0 % - Total variation 18.0 % Control 2; 8.08 µg/dL serum - Intra-assay variation 10.6 % - Inter-assay variation 7.1 % - Total variation 12.7 % Control 3; 18.2 µg/dL serum - Intra-assay variation 8.2% - Inter-assay variation 4.3% - Total variation 9.3 %	Sample 1; 2.0 µg/dL - Within run 1.0% - Within lot 15.5% - Total variation 15.8% Sample 2; 4.8 µg/dL - Within run 7.3% - Within lot 10.7% - Total variation 11.4% Sample 3; 7.5 µg/dL - Within run 6.5% - Within lot 8.4% - Total variation 8.6% Sample 4; 16.6 µg/dL - Within run 4.5% - Within lot 7.8% - Total variation 8.5% Sample 5; 19.8 µg/dL - Within run 7.2% - Within lot 9.9% - Total variation 10.3% Sample 6; 21.4 µg/dL - Within run 7.1% - Within lot 9.8% - Total variation 10.1%
Measuring Range	1.5 µg/dL to the highest level calibrator	1.6 to 30 µg/dL serum
Limit of Blank (LoB)	< 1.5 µg/dL	0.457 µg/dL
Limit of Detection (LoD)	Not available	0.99 µg/dL
Limit of Quantitation (LoQ)	Not available	1.61 µg/dL
Interference	Bilirubin at 20 mg/dL has no significant effect.	Icteric (unconjugated bilirubin ≤ 342 µmol/L, equivalent to 20 mg/dL in serum, and conjugated bilirubin ≤ 237 µmol/L, equivalent to 20 mg/dL in serum), Lipemic (Intralipid¹ ≤ 15 mg/mL in serum), and Hemoglobin up to 15 g/L samples do not interfere. [¹Intralipid is a registered trademark of Fresenius Kabi AB.]
Cross-reactivity	LT3: 0.89%; 3,3',5-Triiodoacetic acid: 0.45%; 3,5-Diiodo-L-thyronine: < 0.1%; 3,5-Diiodotyrosine (DIT): < 0.1%; 5,5-Diphenylhydantoin: < 0.1%; 3-iodo-L-tyrosine (MIT): < 0.1%; Phenylbutazone: < 0.1%; 6-n-Propyl-2-thiouracil: < 0.1%; Methimazole: < 0.1%; L-Tyrosine: < 0.1%; Acetylsalicylic acid: < 0.1%	LT3: 1.67%; 3,3',5-Triiodothyroacetic acid: 0.14%; 3,5-Diiodo-L-thyronine: < 0.1%; 3,5-Diiodotyrosine (DIT) dihydrate: < 0.1%; 5,5-Diphenylhydantoin: < 0.1%; 3-iodo-L-tyrosine (MIT): < 0.1%; Phenylbutazone: < 0.1%; 6-n-Propyl-2-thiouracil: < 0.1%; Methimazole: < 0.1%; L-Tyrosine: < 0.1%; Acetylsalicylic acid: < 0.01%
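
The predicate's precision figures in the table are consistent with combining the intra-assay and inter-assay CVs in quadrature, a common way of reporting total imprecision. The document does not state how total variation was calculated, so the following is only a consistency check under that assumption; the function name `total_cv` is illustrative and not taken from the submission. The GSP figures do not follow the same pattern, which suggests that its "within lot" and "total" components are defined differently.

```python
import math


def total_cv(intra_cv: float, inter_cv: float) -> float:
    """Combine two independent CV components (in %) in quadrature.

    Assumption: the components are independent, so their variances add.
    """
    return math.sqrt(intra_cv ** 2 + inter_cv ** 2)


# (intra-assay CV, inter-assay CV, reported total CV), all in %,
# copied from the predicate column of the precision row.
predicate_controls = {
    "Control 1": (14.9, 10.0, 18.0),
    "Control 2": (10.6, 7.1, 12.7),
    "Control 3": (8.2, 4.3, 9.3),
}

for name, (intra, inter, reported) in predicate_controls.items():
    computed = total_cv(intra, inter)
    print(f"{name}: computed {computed:.2f}%, reported {reported:.1f}%")
```

Each computed value falls within 0.1 percentage points of the reported total, which is the level of agreement expected from rounding.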

2. Sample size used for the test set and the data provenance

The document does not specify a separate "test set" sample size in the context of an AI/ML algorithm validation. Instead, it describes analytical performance studies.

Precision study: The precision data (Within run, Within lot, Total variation) is presented for six different samples at various concentration levels (2.0, 4.8, 7.5, 16.6, 19.8, 21.4 µg/dL). The number of replicates or runs for each sample is not explicitly stated.
Interference study: The number of samples is not stated; the document reports only the concentrations of bilirubin, Intralipid, and hemoglobin that were tested.
Cross-reactivity study: The number of samples is not stated; the document lists the substances tested and their cross-reactivity percentages.
Data Provenance: The document does not specify the country of origin or whether the data was retrospective or prospective. Given the nature of a premarket submission for an IVD kit, these studies are typically conducted by the manufacturer as part of the validation process.
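
The bilirubin interference limits are quoted in both molar and mass units, and the two can be cross-checked with a simple conversion. The sketch below assumes the standard molar mass of unconjugated bilirubin (584.66 g/mol); the document does not identify which conjugated bilirubin preparation was used, so for that entry the code only back-calculates the molar mass implied by the stated equivalence. The function names are illustrative.

```python
# Molar mass of unconjugated bilirubin, C33H36N4O6 (g/mol).
UNCONJUGATED_BILIRUBIN_G_PER_MOL = 584.66


def umol_per_l_to_mg_per_dl(conc_umol_l: float, molar_mass: float) -> float:
    """Convert µmol/L to mg/dL.

    µmol/L × g/mol gives µg/L, and 1 mg/dL equals 10,000 µg/L.
    """
    return conc_umol_l * molar_mass / 10_000


def implied_molar_mass(conc_umol_l: float, conc_mg_dl: float) -> float:
    """Molar mass (g/mol) implied by a stated µmol/L to mg/dL equivalence."""
    return conc_mg_dl * 10_000 / conc_umol_l


# Unconjugated bilirubin: 342 µmol/L is stated as equivalent to 20 mg/dL.
unconjugated_mg_dl = umol_per_l_to_mg_per_dl(342, UNCONJUGATED_BILIRUBIN_G_PER_MOL)
print(f"Unconjugated: {unconjugated_mg_dl:.1f} mg/dL")

# Conjugated bilirubin: 237 µmol/L is stated as equivalent to 20 mg/dL.
conjugated_mass = implied_molar_mass(237, 20)
print(f"Conjugated: implied molar mass {conjugated_mass:.0f} g/mol")
```

The unconjugated value reproduces the stated 20 mg/dL, and the conjugated entry implies a molar mass of roughly 844 g/mol for the material used.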

3. Number of experts used to establish the ground truth for the test set and the qualifications of those experts

This question is not applicable. For an immunoassay kit like this, the "ground truth" is established by the direct measurement of T4 concentration using the device itself, calibrated against known standards. There are no human "experts" establishing ground truth through image review or clinical assessment in the way an AI/ML device for diagnosis would require. The "qualification" of personnel pertains to "adequately trained laboratory personnel" running the assay.

4. Adjudication method for the test set

This question is not applicable. Adjudication methods are relevant for studies where multiple independent human readers or algorithms are assessing the same data, and their results need to be reconciled (e.g., in medical image interpretation). This is an analytical performance study of an immunoassay.

5. Whether a multi-reader multi-case (MRMC) comparative effectiveness study was done and, if so, the effect size of how much human readers improve with AI vs. without AI assistance

This question is not applicable. This is an immunoassay kit, not an AI/ML-powered device intended for human-in-the-loop assistance in clinical decision-making or image interpretation. Therefore, no MRMC study was performed.

6. Whether a standalone (i.e., algorithm-only, without human-in-the-loop) performance study was done

This question is not applicable. The device itself is a standalone diagnostic kit (reagents and instrument) that produces quantitative T4 measurements. It is not an AI/ML algorithm.

7. The type of ground truth used

For an immunoassay, the "ground truth" for evaluating the device's performance is typically established through:

Known Calibrator Concentrations: The device's measurements are compared against a standard curve generated from calibrators with precisely known T4 concentrations.
Reference Methods: If available, comparison to a gold standard reference method for T4 measurement would be part of validation (though not explicitly detailed as "ground truth" in this summary).
Control Materials: Use of quality control materials with target T4 concentrations.

The document discusses "calibrators" and "controls" which are used to establish and verify the accuracy of the measurements.
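
As a rough illustration of how a standard curve turns an instrument signal into a concentration, the sketch below interpolates linearly between calibrator points. Everything in it is an assumption made for illustration: the calibrator concentrations and signal counts are invented, the signal is assumed to fall as T4 rises (as in a competitive immunoassay), and real instruments typically fit a smoother curve rather than interpolating linearly. It does not describe the curve-fitting method used by the GSP instrument.

```python
from bisect import bisect_left

# Hypothetical calibrators as (T4 in µg/dL, signal in counts).
# These values are invented; they are not from the submission.
CALIBRATORS = [
    (1.5, 90000.0),
    (4.0, 60000.0),
    (8.0, 38000.0),
    (16.0, 20000.0),
    (30.0, 9000.0),
]


def concentration_from_signal(signal: float, calibrators=CALIBRATORS) -> float:
    """Estimate concentration by linear interpolation on the standard curve."""
    points = sorted(calibrators, key=lambda p: p[1])  # ascending signal
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal is outside the calibrated range")

    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]  # exactly on a calibrator point

    (c_lo, s_lo), (c_hi, s_hi) = points[i - 1], points[i]
    fraction = (signal - s_lo) / (s_hi - s_lo)
    return c_lo + fraction * (c_hi - c_lo)


# A signal midway between the 8.0 and 4.0 µg/dL calibrators.
print(concentration_from_signal(49000.0))
```

Signals outside the calibrated range raise an error rather than being extrapolated, which mirrors the idea of a defined measuring range.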

8. The sample size for the training set

This question is not applicable. This is not an AI/ML device that requires a "training set" for model development. The performance data presented relates to the validation of the manufactured kit through analytical studies.

9. How the ground truth for the training set was established

This question is not applicable as there is no "training set" in the context of an AI/ML model for this type of medical device submission.
