K Number
K063868
Manufacturer
Date Cleared
2007-05-25 (147 days)

Product Code
Regulation Number
862.1678
Panel
TX
Reference & Predicate Devices
Intended Use

The Waters MassTrak Immunosuppressants Kit is indicated for the quantification of the immunosuppressive drug Tacrolimus (FK506; Prograf) in liver and kidney transplant patient whole blood samples for the purposes of monitoring drug levels to direct subsequent patient dosing.

Device Description

The MassTrak Immunosuppressants Kit for Tacrolimus is an in vitro medical device intended to facilitate quantitative determination of Tacrolimus in human whole blood as an aid in the management of kidney and liver transplant patients receiving Tacrolimus drug therapy. The components of the kit are intended for use with an LC/MS/MS system. The kit materials - calibrators, quality control materials, internal standards, and neat solutions, as well as a MassTrak™ 2.1 x 10 mm C18 cartridge column - have been optimized for use with the Waters Quattro micro and Alliance 2795 System, but can be used with any LC/MS/MS configuration optimized for quantification.

AI/ML Overview

The provided text describes the Waters MassTrak™ Immunosuppressants Kit, an in vitro diagnostic device used for quantifying Tacrolimus in human whole blood. The submission outlines performance data and comparison studies to establish substantial equivalence to predicate devices.

1. Table of Acceptance Criteria and Reported Device Performance

The document describes various performance studies and their acceptance criteria. The table below synthesizes this information for the MassTrak™ Immunosuppressants Kit.

| Performance Characteristic | Acceptance Criteria | Reported Device Performance |
| --- | --- | --- |
| Precision/Reproducibility | For within-run precision and total precision, a coefficient of variation (%CV) ≤ 10% is deemed acceptable, based on CLSI EP5-A2. | Precision studies were conducted by assaying three levels of spiked whole blood pools. The results of these studies are not explicitly stated in the summary but were "presented in Section 20: Design Testing Summary Report of the 510(k) submission" and presumably met the ≤ 10% CV criterion. |
| Linearity/Assay Reportable Range | The assay is deemed to be linear if the coefficients for the second-order and third-order terms are not significantly different from zero at the 95% confidence level, based on CLSI EP6-A. | Linearity was assessed using nine test samples prepared from patient specimens. The summary states that these results are "presented in Section 20: Design Testing Summary Report" and implies they met the linearity criteria. |
| Patient Sample Stability | Conditions that did not cause a statistically significant change from the initial Tacrolimus concentration (t-test, p ≥ 0.05) or caused a change of ≤ 10% from the initial Tacrolimus concentration were acceptable. (Freeze-thaw study: three cycles). | Patient samples were analyzed before and after storage under defined conditions, including a three-cycle freeze-thaw study. The results are referred to Section 20 and inferred to have met the acceptance criteria. |
| Sample Dilution | Results were considered acceptable if, for each sample, the Tacrolimus concentration measured for the undiluted sample and the Tacrolimus concentration calculated from the 1:1 diluted sample varied by ≤ 10% of the initial concentration. | A minimum of ten patient samples with Tacrolimus concentrations > 15 ng/mL were analyzed before and after 1:1 dilution with drug-free whole blood and with MassTrak Immunosuppressants Kit Calibrator 1. Results are presented in Section 20 and inferred to have met the acceptance criteria. |
| Spike & Recovery | Recovery is considered acceptable if the overall mean recovery for each concentration is in the range 90% - 110%. | Recovery performance was assessed using patient samples supplemented with Tacrolimus (5, 10, or 20 ng/mL) and drug-free whole blood spiked with Tacrolimus (0.5 - 30 ng/mL). The summary states these results are in Section 20 and implies they met the 90%-110% recovery range. |
| Interference Studies | Any interference that causes a change in measured Tacrolimus concentration of > 10% is considered to have a significant effect and must be investigated further to determine the maximum concentration at which no interference is observed. (95% confidence, 95% power). | Potential interferences (exogenous/endogenous materials, anticoagulants, hematocrit) were evaluated according to CLSI EP7-A and FDA guidance. The results are referred to Section 20. The implication is that the concentrations evaluated did not cause a change > 10% or were appropriately investigated. |
| Accuracy | The results for all samples were considered acceptable by the Tacrolimus International Proficiency Testing Scheme (±3 SD of method mean). | Accuracy was established by measuring Tacrolimus concentrations in 44 samples provided by the Tacrolimus International Proficiency Testing Scheme (www.bioanalytics.co.uk). The summary states that "The results for all samples were considered acceptable by the Scheme (±3 SD of method mean)." |
| Method Comparison | The results of the Test Method for each tissue type are considered acceptable if the predicted bias at both the lower (5 ng/mL) and upper (15 ng/mL) limits of the therapeutic range for Tacrolimus is ≤ 10%, based on Deming Regression analysis (CLSI EP9-A2). (Minimum of 50 samples for each transplant type compared at each site). | Method comparison studies were conducted against the test laboratory's current methodology, comparing a minimum of 50 samples for each transplant type at each site. The predicted bias was calculated using Deming Regression analysis. The summary implies these criteria were met, with results "presented in Section 20: Design Testing Summary Report". Additionally, results from International Proficiency Testing Samples were compared to EMIT 2000 Tacrolimus device data. |
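
The Method Comparison criterion above (predicted bias ≤ 10% at 5 ng/mL and 15 ng/mL from a Deming regression) can be illustrated with a minimal sketch. This is not the submission's analysis: the function names are hypothetical, the error-variance ratio `delta = 1.0` is an assumption (the summary does not state the value used), and any data passed in would be illustrative.

```python
from math import sqrt
from statistics import mean


def deming_fit(x, y, delta=1.0):
    """Return (slope, intercept) of a Deming regression of y on x.

    delta is the ASSUMED ratio of y-error variance to x-error variance;
    the 510(k) summary does not state which value was used.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    diff = syy - delta * sxx
    slope = (diff + sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, y_bar - slope * x_bar


def predicted_bias_percent(slope, intercept, level):
    """Predicted bias of the test method at a decision level, in percent."""
    return 100.0 * (intercept + slope * level - level) / level


def method_comparison_acceptable(x, y, levels=(5.0, 15.0), limit=10.0):
    """True if |predicted bias| <= limit (%) at every decision level."""
    slope, intercept = deming_fit(x, y)
    return all(
        abs(predicted_bias_percent(slope, intercept, lv)) <= limit
        for lv in levels
    )
```

Here `x` would hold the comparative-method results and `y` the test-method results for the same samples, both in ng/mL, evaluated separately for each transplant type and site.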

2. Sample Sizes Used for the Test Set and Data Provenance

  • Sample Size for Test Set:
    • Precision/Reproducibility: The total is not stated explicitly. The summary describes "three levels of spiked whole blood pools" and notes that "Specimens at each level were analyzed in duplicate twice per day for 20 days," which implies 3 levels × 2 replicates × 2 runs/day × 20 days = 240 measurements.
    • Linearity: "nine test samples."
    • Sample Dilution: "A minimum of ten patient samples."
    • Accuracy: "series of 44 samples provided by the Tacrolimus International Proficiency Testing Scheme."
    • Method Comparison: "A minimum of 50 samples for each transplant type were compared at each site." The types mentioned are kidney and liver, so at least 100 samples across the two types if only one site, or more if multiple sites were used.
  • Data Provenance:
    • The studies were conducted "at the clinical sites using Alliance HT 2795 High Performance Liquid Chromatography Systems coupled to Quattro micro triple quadrupole mass spectrometers." This suggests a clinical laboratory setting.
    • The country of origin is not stated. The accuracy samples came from the "Tacrolimus International Proficiency Testing Scheme (www.bioanalytics.co.uk)," a UK-based scheme, which implies international data sources for accuracy and potentially for method comparison.
    • The linearity, sample dilution, and method comparison studies used "patient specimens" or "patient samples," indicating the data is from relevant patient populations.
    • The nature of "proficiency testing scheme" samples means they represent retrospective, collected samples used for external quality assessment.
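    • The precision study used "spiked whole blood pools," and the spike & recovery study used "drug-free whole blood" spiked with Tacrolimus alongside supplemented patient samples; the spiked pools and spiked drug-free blood are laboratory-prepared samples. The stability study used patient samples. The sample matrix for the interference studies is not stated in the summary as described here.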
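
The precision design described above can be sketched as follows. This is a simplified illustration, not the CLSI EP5-A2 variance-components analysis: the function names are hypothetical, the within-run estimate pools duplicate differences, and the overall %CV is a naive standard deviation over all values at one level.

```python
from math import sqrt
from statistics import mean, stdev

# Design stated in the summary: 3 levels, duplicates, 2 runs/day, 20 days.
LEVELS, DAYS, RUNS_PER_DAY, REPLICATES = 3, 20, 2, 2


def total_measurements(levels=LEVELS, days=DAYS,
                       runs=RUNS_PER_DAY, reps=REPLICATES):
    """Number of individual results implied by the design."""
    return levels * days * runs * reps


def within_run_cv(pairs):
    """Pooled within-run %CV from duplicate pairs (one pair per run)."""
    sd = sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))
    grand_mean = mean(v for pair in pairs for v in pair)
    return 100.0 * sd / grand_mean


def overall_cv(pairs):
    """Naive %CV over all values; a stand-in for EP5-A2 total precision."""
    values = [v for pair in pairs for v in pair]
    return 100.0 * stdev(values) / mean(values)


def precision_acceptable(pairs, limit=10.0):
    """True if both %CV estimates are at or below the limit."""
    return within_run_cv(pairs) <= limit and overall_cv(pairs) <= limit
```

For one concentration level, `pairs` would contain 40 duplicate pairs (2 runs per day over 20 days).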

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications of Those Experts

  • The concept of "ground truth" established by experts, as typically understood in AI/ML contexts for image interpretation or diagnosis, does not directly apply to this device. This device is an analytical instrument for quantitative determination of a drug concentration.
  • For Accuracy and Method Comparison: The "ground truth" or reference values are derived from a proficiency testing scheme and from an established comparative method, respectively.
    • For the Accuracy study, the reference was "the Tacrolimus International Proficiency Testing Scheme (www.bioanalytics.co.uk)." The acceptable range was defined as "±3 SD of method mean," indicating a consensus or calculated mean from multiple laboratories/methods participating in the scheme, rather than a single expert.
    • For Method Comparison, the reference was the "test laboratory's current methodology (the 'Comparative Method')," against which the device was compared. This comparative method would be a previously validated analytical technique for Tacrolimus quantification.
    • No information is provided about individual "experts" establishing a single ground truth; rather, it hinges on established analytical reference systems or comparative methods.
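
The "±3 SD of method mean" acceptance rule used for the Accuracy study can be sketched as below. The function names and the tuple layout are hypothetical; the scheme's actual scoring procedure is not described in the summary.

```python
def within_scheme_limits(result, method_mean, method_sd, k=3.0):
    """True if a result lies within k SDs of the scheme's method mean."""
    return abs(result - method_mean) <= k * method_sd


def all_acceptable(results, k=3.0):
    """results: iterable of (measured, method_mean, method_sd) tuples,
    one per proficiency sample, all in ng/mL."""
    return all(within_scheme_limits(r, m, s, k) for r, m, s in results)
```

Applied to the 44 proficiency samples, `all_acceptable` would return True only if every measured value fell inside its sample's ±3 SD window.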

4. Adjudication Method for the Test Set

  • Adjudication methods (like 2+1, 3+1) are typically used in studies where human readers independently assess data and then resolve discrepancies, often in image interpretation.
  • This is an in vitro diagnostic device for quantitative drug measurement. The "adjudication" (if one can call it that) is inherent in the analytical process:
    • Measurements are performed in replicate (e.g., "duplicate twice per day for 20 days" for precision, "replicate determinations" for linearity, "triplicate determinations" for spike & recovery).
    • Statistical methods (e.g., %CV, 95% confidence level, Deming Regression, t-test, ±3 SD of method mean) are used to assess the agreement, consistency, and accuracy of these quantitative measurements against predefined criteria or reference values.
    • Thus, there is no human adjudication process described in the conventional sense.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve With AI vs Without AI Assistance

  • No, an MRMC comparative effectiveness study involving human readers and AI assistance was not done.
  • This device is an in vitro diagnostic (IVD) kit for quantifying drug levels, not an AI-powered diagnostic tool for interpretation by human readers. Its primary function is to provide a quantitative measurement.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

  • Yes, this entire submission describes the standalone performance of the analytical device (MassTrak™ Immunosuppressants Kit) without human-in-the-loop performance in the sense of interpretative AI systems.
  • The device is the kit; its reagents, run on an LC/MS/MS system, perform the quantification. While a human operates the instrument and interprets the numerical output, the "performance" described relates to the accuracy, precision, linearity, etc., of the instrument and reagents in generating those numbers.
  • The acceptance criteria and reported performances directly reflect the standalone analytical capabilities of the kit.

7. The Type of Ground Truth Used (Expert Consensus, Pathology, Outcomes Data, etc.)

For this quantitative diagnostic device, the "ground truth" is established by:

  • Reference Methods/Materials: For accuracy testing, the Tacrolimus International Proficiency Testing Scheme provides reference values (implied consensus/method mean). For method comparison, another established and validated laboratory methodology serves as the comparator.
  • Calculated or Spiked Concentrations: For studies like linearity, precision, spike & recovery, and interference, the "ground truth" concentrations are known either because they were precisely spiked into a matrix or calculated from known mixtures.
  • Statistical Definitions: The acceptance criteria do not establish ground truth themselves; they set the statistical limits (e.g., %CV ≤ 10%, bias ≤ 10%, p ≥ 0.05) within which results must agree with the reference or known values.
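
Two of the criteria that rely on known or calculated concentrations, spike recovery (90% - 110%) and 1:1 dilution agreement (≤ 10%), can be sketched as follows. The function names are hypothetical. The recovery formula (measured minus baseline, divided by the amount spiked) and the dilution factor of 2 for a 1:1 dilution are assumptions; the summary does not give the exact calculations.

```python
from statistics import mean


def mean_recovery_percent(measured, baseline, spiked):
    """Mean recovery (%) of a spike across replicate measurements.

    ASSUMED definition: (measured - baseline) / spiked, per replicate.
    """
    return 100.0 * mean((m - baseline) / spiked for m in measured)


def recovery_acceptable(measured, baseline, spiked, low=90.0, high=110.0):
    """True if mean recovery falls within the 90%-110% window."""
    return low <= mean_recovery_percent(measured, baseline, spiked) <= high


def dilution_acceptable(undiluted, diluted_reading,
                        dilution_factor=2.0, limit=10.0):
    """True if the back-calculated concentration from the diluted sample
    differs from the undiluted result by no more than limit percent.

    dilution_factor=2.0 ASSUMES 1:1 means equal parts sample and diluent.
    """
    back_calculated = diluted_reading * dilution_factor
    return abs(back_calculated - undiluted) / undiluted * 100.0 <= limit
```

All concentrations are in ng/mL; `baseline` is the concentration before spiking (zero for drug-free whole blood).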

8. The Sample Size for the Training Set

  • The concept of "training set" is not directly applicable in the context of this 510(k) submission for a non-AI-based IVD device.
  • This device is an analytical chemistry kit, not a machine learning algorithm that is "trained" on a dataset.
  • The manufacturer would have performed internal R&D, method development, and optimization studies, which could be considered analogous to "training," but these activities do not typically involve a formally defined "training set" with established ground truth in the same way an AI model does.

9. How the Ground Truth for the Training Set Was Established

  • As stated above, a formal "training set" as understood in AI/ML is not applicable.
  • The "ground truth" during method development would have been established through a combination of:
    • Reference standards: Using highly pure Tacrolimus standards at known concentrations to calibrate and verify the analytical method.
    • Internal validation: Running samples with known "spiked-in" concentrations or comparing with existing, well-established (often more laborious) reference methods to optimize the LC/MS/MS parameters, reagent formulations, and analytical protocols.
    • Iterative refinement: Adjusting parameters until desired performance characteristics (e.g., sensitivity, specificity, linearity) are achieved.

§ 862.1678 Tacrolimus test system.

(a) Identification. A tacrolimus test system is a device intended to quantitatively determine tacrolimus concentrations as an aid in the management of transplant patients receiving therapy with this drug. This generic type of device includes immunoassays and chromatographic assays for tacrolimus.

(b) Classification. Class II (special controls). The special control is “Class II Special Controls Guidance Document: Cyclosporine and Tacrolimus Assays; Guidance for Industry and FDA.” See § 862.1(d) for the availability of this guidance document.