Intended Use

The Chemtrue® Multi-Panel Drug Screen Cup Tests are rapid lateral flow immunoassays for the qualitative detection of Amphetamine, Barbiturates, Benzodiazepines, Buprenorphine, Cocaine, Marijuana, Methamphetamine, Morphine, Phencyclidine, Ecstasy, Methadone, Propoxyphene and Tricyclic Antidepressants (TCA) drugs in human urine. The test cut-off concentrations and the compounds the tests are calibrated to are as follows:

Analyte | Abbreviation | Calibrator | Cutoff Concentration (ng/mL)
Amphetamine | AMP | d-Amphetamine | 300
Amphetamine | AMP | d-Amphetamine | 500
Amphetamine | AMP | d-Amphetamine | 1000
Barbiturates | BAR | Secobarbital/Pentobarbital | 200
Barbiturates | BAR | Secobarbital/Pentobarbital | 300
Benzodiazepines | BZO | Oxazepam | 200
Benzodiazepines | BZO | Oxazepam | 300
Buprenorphine | BUP | Buprenorphine | 10
Cocaine | COC | Benzoylecgonine | 150
Cocaine | COC | Benzoylecgonine | 300
Ecstasy | MDMA | d,l-Methylenedioxymethamphetamine | 500
Methamphetamine | MAMP | d-Methamphetamine | 300
Methamphetamine | MAMP | d-Methamphetamine | 500
Methamphetamine | MAMP | d-Methamphetamine | 1000
Marijuana | THC | 11-nor-Δ9-THC-9-COOH | 50
Methadone | MTD | Methadone | 300
Opiates | OPI | Morphine | 2000
Oxycodone | OXY | Oxycodone | 100
Phencyclidine | PCP | Phencyclidine | 25
Propoxyphene | PPX | Propoxyphene | 300
Tricyclic Antidepressants | TCA | Nortriptyline | 1000

The multi-test panels can consist of up to fourteen (14) of the above listed analytes in any combination. Only one cutoff concentration will be included per analyte per device. The tests are intended for prescription and Over-The-Counter (OTC) use.
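
The two panel rules stated above (at most fourteen analytes, and only one cutoff per analyte per device) lend themselves to a mechanical check. The following is a minimal sketch and not part of the 510(k) summary: the `CUTOFFS_NG_ML` mapping is transcribed from the table above, while the function name `validate_panel` and the representation of a panel as (abbreviation, cutoff) pairs are assumptions made for illustration.

```python
# Cutoff concentrations (ng/mL) transcribed from the table above.
CUTOFFS_NG_ML = {
    "AMP": {300, 500, 1000},
    "BAR": {200, 300},
    "BZO": {200, 300},
    "BUP": {10},
    "COC": {150, 300},
    "MDMA": {500},
    "MAMP": {300, 500, 1000},
    "THC": {50},
    "MTD": {300},
    "OPI": {2000},
    "OXY": {100},
    "PCP": {25},
    "PPX": {300},
    "TCA": {1000},
}

MAX_ANALYTES = 14


def validate_panel(panel):
    """Return a list of rule violations for a hypothetical panel.

    `panel` is a list of (abbreviation, cutoff_ng_ml) pairs.
    An empty list means the panel satisfies the stated rules.
    """
    problems = []
    if len(panel) > MAX_ANALYTES:
        problems.append(f"more than {MAX_ANALYTES} analytes")
    seen = set()
    for abbrev, cutoff in panel:
        if abbrev not in CUTOFFS_NG_ML:
            problems.append(f"unknown analyte {abbrev}")
        elif cutoff not in CUTOFFS_NG_ML[abbrev]:
            problems.append(f"{abbrev}: {cutoff} ng/mL is not a listed cutoff")
        if abbrev in seen:
            problems.append(f"{abbrev}: more than one cutoff on the device")
        seen.add(abbrev)
    return problems
```

For example, `validate_panel([("AMP", 500), ("COC", 150)])` returns an empty list, whereas a panel that lists AMP at both 300 and 500 ng/mL is reported as violating the one-cutoff-per-analyte rule.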

The tests provide only a preliminary result. A more specific alternative chemical method must be used in order to obtain a confirmed assay result. Gas Chromatography / Mass Spectrometry (GC/MS) or Liquid Chromatography / Mass Spectrometry (LC/MS) are the preferred confirmatory methods. Clinical consideration and professional judgment should be applied to any drugs of abuse test result, particularly when preliminary positive results are indicated.

The tests are not intended to differentiate between drugs of abuse and prescription use of Benzodiazepines, Barbiturates, Buprenorphine, Oxycodone, Propoxyphene and Tricyclic Antidepressants. There are no uniformly recognized cut-off concentration levels for these drugs in urine.

Device Description

The Chemtrue® Drug Screen Tests are colloidal gold based lateral flow immunoassays for the rapid, qualitative detection of drugs of abuse in human urine. The tests are single-use, in vitro diagnostic devices, which come in Dip Card or Cup formats, as indicated by the test name.

AI/ML Overview

Acceptance Criteria and Device Performance for Chemtrue Multi-Panel Drug Screen Dip Card/Cup Tests

This document describes the acceptance criteria and the study that demonstrates the Chemtrue Multi-Panel Drug Screen Dip Card/Cup Tests meet these criteria. The device is a rapid lateral flow immunoassay for the qualitative detection of various drugs of abuse in human urine, intended for prescription and Over-The-Counter (OTC) use.


1. Table of Acceptance Criteria and Reported Device Performance

The acceptance criteria for the device are implicitly demonstrated through the performance characteristics presented in the 510(k) summary. These include:

  • Precision (Reproducibility): Consistent results at and around the cut-off concentrations across different operators and lots. The exact acceptance percentage for concordance at cut-off is not explicitly stated as a numerical criterion but is demonstrated by the reported distribution of positive/negative results.
  • Specificity (Cross-Reactivity): Detect the target analytes and related compounds at specified concentrations, while exhibiting minimal cross-reactivity with non-target substances. Acceptance is qualitative based on the reported cross-reactivity percentages.
  • Interference: No interference from common endogenous compounds or non-structurally related compounds. Acceptance is qualitative, showing "no interference" for a wide range of substances at tested concentrations.
  • Effect of Urine pH and Specific Gravity: Test performance should not be affected by variations in urine pH and specific gravity within specified ranges. Acceptance is demonstrated by stating these factors "do not affect the test performance."
  • Stability: Maintain performance over a specified shelf-life. Acceptance is a 2-year shelf life at 2°C to 30°C.
  • Accuracy (Method Comparison): High agreement with a confirmed reference method (GC/MS). The acceptance criterion for accuracy is ≥ 99% agreement between the Chemtrue® Drug Screen test device and GC/MS values.
  • Lay-user Accuracy and Usability (for OTC): High accuracy when performed by lay-users and ease of understanding instructions. The acceptance criterion is demonstrated by ≥ 99% agreement with GC/MS values for lay-user studies and ≥ 96% of lay users easily following instructions.
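
The ≥ 99% figures above are percent-agreement values between the device's qualitative calls and the GC/MS reference. A minimal sketch of how such a figure is commonly computed follows. The summary does not say whether agreement was computed overall or separately for positives and negatives, so the sketch returns all three; the function name `percent_agreement` and the counts in the example are invented for illustration and are not data from the submission.

```python
def percent_agreement(device_calls, gcms_calls):
    """Compare paired qualitative calls (True = positive, False = negative).

    Returns (overall, positive, negative) percent agreement, where the
    positive and negative figures are conditioned on the GC/MS call.
    """
    if len(device_calls) != len(gcms_calls) or not device_calls:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(device_calls, gcms_calls))
    agree = sum(d == g for d, g in pairs)
    pos = [d for d, g in pairs if g]        # device calls on GC/MS positives
    neg = [d for d, g in pairs if not g]    # device calls on GC/MS negatives
    overall = 100.0 * agree / len(pairs)
    positive = 100.0 * sum(pos) / len(pos) if pos else float("nan")
    negative = 100.0 * sum(not d for d in neg) / len(neg) if neg else float("nan")
    return overall, positive, negative


# Invented example: 40 GC/MS positives (one missed by the device)
# and 45 GC/MS negatives (all called negative).
gcms = [True] * 40 + [False] * 45
device = [True] * 39 + [False] * 46
overall, positive, negative = percent_agreement(device, gcms)
print(round(overall, 1), round(positive, 1), round(negative, 1))
```

In this invented case a single missed positive among 85 specimens already pulls overall agreement below 99%, which illustrates how strict the criterion is at sample sizes of this order.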

Below is a summary of the key performance criteria and the reported device performance. Note that for precision and cross-reactivity, specific numerical acceptance thresholds were not explicitly stated as distinct criteria, but the reported performance data in the tables indicate successful demonstration of these characteristics.

Table 1: Acceptance Criteria and Reported Device Performance

Performance Characteristic | Acceptance Criterion (Implicit where not numerical) | Reported Device Performance
Precision (Reproducibility) | Consistent qualitative results (+/-) at negative, 50%, 75%, cut-off, 125%, and 150% of cut-off across operators and lots. | At Cutoff (e.g., AMP 300 ng/mL Dip Card): 16 positive, 14 negative out of 30. At 125% & 150% of Cutoff: 30 positive out of 30. At 50% & 75% of Cutoff & Negative: 0 positive, 30 negative out of 30. Similar distributions for other analytes and formats, indicating consistent qualitative detection around the cutoff.
Specificity (Cross-Reactivity) | Detect target analytes and related compounds; minimal cross-reactivity with non-target substances. | Reported cross-reactivity percentages for various related compounds (e.g., d-Amphetamine 100% at 500 ng/mL, d,l-Amphetamine 62.5% at 800 ng/mL for AMP 500). Non-structurally related compounds found "not to cross-react" at 100 µg/mL.
Interference | No interference from 103 specified potential interferents (endogenous and non-structurally related compounds) at tested concentrations. | "It was found not to cross-react when tested at concentrations of 100 µg/mL at ±25% of the drug cut-off concentrations." Listed 103 compounds explicitly stated as "do not interfere."
Effect of Urine pH | Test performance unaffected within pH range of 2.0 to 9.0 at ±25% of the drug cut-off concentrations. | "The testing results demonstrate that the urine pH ranges from 2.0 to 9.0 at ±25% of the drug cut-off concentrations do not affect the test performance."
Effect of Specific Gravity | Test performance unaffected at SG values of 1.001, 1.015, 1.020, 1.025, and 1.030 at ±25% of the drug cut-off concentrations. | "The specific gravity (SG) ranges of 1.001, 1.015, 1.020, 1.025 and 1.030 at ±25% of the drug cut-off concentrations do not affect the test results."
Stability (Shelf Life) | Support a 2-year shelf-life at 2℃ to 30℃. | "The stability study results support two (2) years shelf-life of the products at (2℃ to 30℃). The real time stability study is still on going."
Accuracy (Method Comparison) | ≥ 99% agreement with GC/MS reference method. | Reported: ≥ 97.6% to 100% agreement with GC/MS for individual analytes in Method Comparison Study (Tables 6a & 6b). "The results demonstrate that the agreement between the Chemtrue® Drug Screen test device and GC/MS values is ≥ 99%." (This combines all analytes for both Dip Card and Cup).
OTC Lay-user Accuracy | ≥ 99% agreement with GC/MS reference method when performed by lay-users. | Reported: 99% to 100% agreement with GC/MS for individual analytes in OTC Accuracy studies (Tables 7a & 7b). "The results demonstrate that the agreement between the Chemtrue® Drug Screen test device and GC/MS values is ≥ 99%." (This combines all analytes for both Dip Card and Cup).
OTC Lay-user Usability | ≥ 96% of lay users can easily follow instructions to perform the test and interpret results, with a Flesch-Kincaid reading level of 7th grade. | "The results demonstrate that ≥ 96% of the lay users can easily follow the instructions to perform the test and interpret the results. A Flesch-Kincaid reading analysis supports a 7th grade reading level."
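
The cross-reactivity percentages quoted in the table (100% at 500 ng/mL and 62.5% at 800 ng/mL for the AMP 500 test) are consistent with the usual convention of dividing the assay cutoff by the lowest concentration of the related compound that gives a positive result. The summary does not state the formula, so the sketch below assumes that convention; the function name is hypothetical.

```python
def cross_reactivity_percent(cutoff_ng_ml, lowest_positive_ng_ml):
    """Percent cross-reactivity under the assumed convention:
    cutoff / lowest concentration giving a positive result, times 100."""
    if cutoff_ng_ml <= 0 or lowest_positive_ng_ml <= 0:
        raise ValueError("concentrations must be positive")
    return 100.0 * cutoff_ng_ml / lowest_positive_ng_ml


# Values reported for the AMP 500 test:
print(cross_reactivity_percent(500, 500))  # d-Amphetamine   -> 100.0
print(cross_reactivity_percent(500, 800))  # d,l-Amphetamine -> 62.5
```

Under this convention, a lower percentage means a higher concentration of the related compound is needed before the test reads positive.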

2. Sample Size Used for the Test Set and Data Provenance

The document does not explicitly delineate a "test set" in the context of the training/testing split common in AI/ML studies, as this device is a lateral flow immunoassay. Instead, it refers to various validation studies.

  • Precision (Reproducibility) Studies:
    • Sample Size: 30 samples per concentration level (Negative, 50% of cutoff, 75% of cutoff, Cutoff, 125% of cutoff, 150% of cutoff) for each analyte and device format. This equates to 180 samples per analyte/format combination (e.g., AMP Dip Card Test: Cutoff: 300 ng/mL used 180 samples).
    • Data Provenance: Not specified, but the samples were "GC/MS confirmed drug spiked urine controls." This implies controlled laboratory generation rather than real clinical samples. No country of origin is mentioned. These are prospective, controlled experiments due to the spiked nature.
  • Specificity Studies:
    • Sample Size: Not explicitly stated as a number of "samples." Various related compounds were tested for cross-reactivity at different concentrations, and 103 potential interferents were tested at a concentration of 100 µg/mL at ±25% of the drug cut-off concentrations.
    • Data Provenance: Not specified, likely laboratory-generated samples with controlled spiking of compounds. Prospective.
  • Interference Studies:
    • Sample Size: 103 potential interferents tested. Each was tested with "one lot each of the test device format." The implied sample count would be (number of interferents) * (number of device formats).
    • Data Provenance: Laboratory-generated, spiked samples. Prospective.
  • Effect of Urine pH and Specific Gravity Studies:
    • Sample Size: Not explicitly stated, but tests were performed within specified pH and SG ranges at ±25% of the drug cut-off concentrations.
    • Data Provenance: Laboratory-generated, controlled urine samples. Prospective.
  • Accuracy (Method Comparison) Studies:
    • Sample Size: "On average of 85 clinical specimens for each drug test and a total of 685 samples were tested" for the analytes specifically included in this submission (AMP300/500, BAR200/BZO200, COC150, MET300/500 and PPX).
    • Data Provenance: "blind-labeled clinical specimen correlation study." This indicates real human urine samples, but the country of origin is not specified. These are retrospective analyses as they compare the device's results to pre-existing or independently confirmed GC/MS values.
  • OTC Lay-user Accuracy Studies:
    • Sample Size: "One hundred (130) intended lay-users participated in the evaluation for each of the device format (Dip Card and Cup)." The number of tests performed is much higher, as each lay-user was given "up to two (2) random blind labeled samples" with concentrations covering negative, 50%, 75%, 125%, 150%, and 200% of the cutoff.
    • Data Provenance: "GC/MS confirmed urine samples," which were "spiking drugs into drug-free urine pool." This suggests laboratory-controlled, spiked samples rather than true "clinical" samples from drug users. Prospective experiment.
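
The precision sample count quoted above follows from six concentration levels at 30 replicates each. The sketch below reproduces that arithmetic and the target concentrations for a given cutoff; the constant and function names are illustrative and do not come from the submission.

```python
# Concentration levels used in the precision studies, as fractions of the cutoff.
PRECISION_LEVELS = [0.0, 0.50, 0.75, 1.00, 1.25, 1.50]
REPLICATES_PER_LEVEL = 30


def precision_plan(cutoff_ng_ml):
    """Map each target concentration (ng/mL) to its number of replicates."""
    return {level * cutoff_ng_ml: REPLICATES_PER_LEVEL for level in PRECISION_LEVELS}


plan = precision_plan(300)   # AMP Dip Card Test, 300 ng/mL cutoff
print(sorted(plan))          # [0.0, 150.0, 225.0, 300.0, 375.0, 450.0]
print(sum(plan.values()))    # 180
```

The 75% and 125% levels (225 and 375 ng/mL for this cutoff) are the same ±25% concentrations at which the interference, pH, and specific gravity studies were run.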

3. Number of Experts Used to Establish the Ground Truth for the Test Set and Qualifications

  • For Precision, Specificity, Interference, pH/SG, OTC Lay-user Studies: The ground truth was established by GC/MS confirmation of drug-spiked urine samples. This method is considered the gold standard for drug detection and quantification in urine. Therefore, the "experts" are the technicians/analysts operating the GC/MS equipment. Their qualifications are not explicitly stated, but they are presumably trained laboratory personnel (e.g., clinical chemists, toxicologists).
  • For Accuracy (Method Comparison) Studies: The ground truth was also established by GC/MS values. These were "blind-labeled clinical specimen" samples. Again, the experts are the GC/MS laboratory personnel. The document states, "Three operators performed the testing," but this refers to the Chemtrue device operators, not necessarily the GC/MS operators who established ground truth.

4. Adjudication Method for the Test Set

  • For Precision Studies: The ground truth for the spiked samples was known. Qualitative results (+/-) from the device were compared against the expected positive/negative based on the known concentration relative to the cutoff. Discordant results are implicitly handled by the quantitative reporting in the tables (e.g., at cutoff, some were positive, some negative, reflecting the expected variability around the cutoff).
  • For Accuracy (Method Comparison) Studies: The ground truth was the GC/MS value. This method serves as the definitive reference, so no adjudication among multiple experts for ground truth was performed; it was a direct comparison to the GC/MS result. The results describe "discordant results" where the device's reading didn't match the GC/MS, and these were "confirmed at the drug cutoff level with the GC/MS concentrations," which is a further verification against the gold standard.
  • For OTC Lay-user Accuracy Studies: Similar to the accuracy studies, the ground truth was the GC/MS values from the spiked samples.
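
For spiked samples, the comparison described above reduces to checking the device call against the call expected from the known concentration. A minimal sketch follows. Treating a sample exactly at the cutoff as having no single expected call is an assumption, chosen because the summary reports a mix of positive and negative results at that level; both function names are hypothetical.

```python
def expected_call(concentration_ng_ml, cutoff_ng_ml):
    """Expected qualitative result for a spiked sample of known concentration.

    Assumption: a sample exactly at the cutoff has no single expected call,
    since the summary reports both positives and negatives there.
    """
    if concentration_ng_ml > cutoff_ng_ml:
        return "positive"
    if concentration_ng_ml < cutoff_ng_ml:
        return "negative"
    return "either"


def is_discordant(device_call, concentration_ng_ml, cutoff_ng_ml):
    """True when the device call contradicts the expected call."""
    expected = expected_call(concentration_ng_ml, cutoff_ng_ml)
    return expected != "either" and device_call != expected
```

With a 300 ng/mL cutoff, a negative reading on a 375 ng/mL sample would count as discordant, while either reading on a 300 ng/mL sample would not.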

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was done

No, a Multi-Reader Multi-Case (MRMC) comparative effectiveness study was not conducted. MRMC studies are typically used to assess the improvement in human reader performance (e.g., radiologists interpreting images) with and without AI assistance. The Chemtrue device is a rapid diagnostic test (lateral flow immunoassay) that provides a direct qualitative result, not an AI system designed to assist human interpretation of complex data. Therefore, the concept of "human readers improving with AI vs. without AI assistance" does not apply here.


6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) was done

Yes, the device essentially operates in a standalone manner. The Chemtrue Multi-Panel Drug Screen Dip Card/Cup Tests are standalone devices that generate a qualitative (positive/negative) result without human intervention in the result generation process itself. Humans are "in the loop" for running the test and interpreting the visual lines, but the device's chemical reactions provide the "algorithm's" output.

The validation studies (precision, specificity, interference, pH/SG, and the majority of the accuracy studies) demonstrate this standalone performance. The "OTC Lay-user Accuracy Studies" did involve human interpretation of the visual lines by lay-users, evaluating their ability to correctly use and interpret the standalone device.


7. The Type of Ground Truth Used

The primary ground truth used throughout the studies is Gas Chromatography / Mass Spectrometry (GC/MS). For all accuracy studies, spiked samples and clinical specimens were confirmed by GC/MS. This is explicitly stated: "GC/MS or Liquid Chromatography / Mass Spectrometry (LC/MS) are the preferred confirmatory methods." Given the context of drug screening devices, GC/MS is widely considered the definitive reference method for identifying and quantifying drugs and their metabolites in biological matrices.


8. The Sample Size for the Training Set

The concept of a "training set" is not applicable to traditional lateral flow immunoassay devices like the Chemtrue Multi-Panel Drug Screen tests. These are not machine learning or AI models that require training data to develop their "algorithm." Their mechanism is based on specific antigen-antibody reactions. Therefore, there is no "training set" in the computational sense. The device's components (antibodies, drug-protein conjugates) are developed and optimized through laboratory research and development, but this is a different process from training a model on a dataset.


9. How the Ground Truth for the Training Set was Established

As explained in point 8, there is no "training set" in the context of this device. The development of the assay relies on established biochemical principles and extensive R&D to select and optimize antibodies and reagents. The "ground truth" for the development and optimization of the assay components would involve controlled laboratory experiments using known concentrations of analytes and interferents, confirmed by reference methods like GC/MS, to ensure the assay's sensitivity and specificity are appropriate to achieve the desired cut-offs. This is part of the extensive biochemical and quality control processes inherent in developing such diagnostic tests.

§ 862.3650 Opiate test system.

(a) Identification. An opiate test system is a device intended to measure any of the addictive narcotic pain-relieving opiate drugs in blood, serum, urine, gastric contents, and saliva. An opiate is any natural or synthetic drug that has morphine-like pharmacological actions. The opiates include drugs such as morphine, morphine glucuronide, heroin, codeine, nalorphine, and meperidine. Measurements obtained by this device are used in the diagnosis and treatment of opiate use or overdose and in monitoring the levels of opiate administration to ensure appropriate therapy.

(b) Classification. Class II (special controls). An opiate test system is not exempt if it is intended for any use other than employment or insurance testing or is intended for Federal drug testing programs. The device is exempt from the premarket notification procedures in subpart E of part 807 of this chapter subject to the limitations in § 862.9, provided the test system is intended for employment and insurance testing and includes a statement in the labeling that the device is intended solely for use in employment and insurance testing, and does not include devices intended for Federal drug testing programs (e.g., programs run by the Substance Abuse and Mental Health Services Administration (SAMHSA), the Department of Transportation (DOT), and the U.S. military).