K Number
K221801
Date Cleared
2023-06-02

(346 days)

Product Code
Regulation Number
862.1092
Panel
TX
Reference & Predicate Devices
Intended Use

The ADVIA Centaur® Anti-Müllerian Hormone (AMH) assay is for in vitro diagnostic use in the quantitative determination of anti-Müllerian hormone (AMH) in human serum and plasma (lithium heparin) using the ADVIA Centaur® XP system.

The measurement of AMH is used as an aid in the assessment of the ovarian reserve in women presenting to fertility clinics. This assay is intended to distinguish between women with AFC (antral follicle count) values > 15 (high ovarian reserve) and women with AFC values ≤ 15 (normal or diminished ovarian reserve).

This assay is intended to be used in conjunction with other clinical and laboratory findings, such as AFC, before starting fertility therapy. This assay is not intended to be used for monitoring women undergoing controlled ovarian stimulation in an Assisted Reproduction Technology program.

Device Description

The ADVIA Centaur AMH assay is a sandwich immunoassay using direct acridinium ester-based chemiluminometric technology. Two monoclonal anti-AMH antibodies are employed in the assay. The first antibody in the Lite Reagent is a mouse monoclonal anti-AMH antibody labeled with acridinium ester. The second antibody is a biotinylated mouse monoclonal anti-AMH antibody coupled to streptavidin-coated magnetic particles in the Solid Phase.

A direct relationship exists between the amount of AMH present in the patient sample and the amount of relative light units detected by the system. Dose concentration results (ng/mL) are calculated based on a 2-point calibration from a pre-defined master curve.
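Results are reported in ng/mL, and the performance data in this summary also quote pmol/L equivalents for several limits. The sketch below checks those reported pairs against a factor of 7.14 pmol/L per ng/mL. That factor is an assumption inferred from the reported pairs; the summary does not state it, and the function name is illustrative only.

```python
# Assumed conversion factor: 1 ng/mL AMH ~= 7.14 pmol/L.
# Not stated in the summary; inferred from the value pairs it reports.
NG_ML_TO_PMOL_L = 7.14


def ng_ml_to_pmol_l(ng_ml: float) -> float:
    """Convert an AMH concentration from ng/mL to pmol/L (assumed factor)."""
    return ng_ml * NG_ML_TO_PMOL_L


# (ng/mL, pmol/L) pairs as reported in the summary.
reported_pairs = [
    (0.010, 0.071),    # Limit of Blank
    (0.020, 0.143),    # Limit of Detection
    (0.043, 0.307),    # Limit of Quantitation
    (24.0, 171.0),     # upper end of the measuring interval
    (1151.0, 8218.0),  # highest concentration in the hook-effect study
]

for ng_ml, pmol_l in reported_pairs:
    converted = ng_ml_to_pmol_l(ng_ml)
    print(f"{ng_ml} ng/mL -> {converted:.3f} pmol/L (reported: {pmol_l})")
```

Each converted value agrees with the reported one to within rounding, which suggests the two unit systems in the summary are consistent with each other.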

Materials include:
ADVIA Centaur AMH ReadyPack® primary reagent pack: Solid Phase (Streptavidin-coated paramagnetic microparticles with biotinylated mouse monoclonal anti-human AMH antibody in buffer; sodium azide (

AI/ML Overview

The acceptance criteria and study information below are drawn from the FDA 510(k) summary for the ADVIA Centaur® Anti-Müllerian Hormone (AMH) assay.

1. Table of Acceptance Criteria and Reported Device Performance

| Performance Characteristic | Acceptance Criteria (Design Goal) | Reported Device Performance |
| --- | --- | --- |
| Detection Capability: Limit of Blank (LoB) | (Implicitly compared to predicate LoB of ≤ 0.01 ng/mL) | 0.010 ng/mL (0.071 pmol/L) |
| Detection Capability: Limit of Detection (LoD) | (Implicitly compared to predicate LoD of ≤ 0.02 ng/mL) | 0.020 ng/mL (0.143 pmol/L) |
| Detection Capability: Limit of Quantitation (LoQ) | (Implicitly compared to predicate LoQ of ≤ 0.08 ng/mL) | 0.043 ng/mL (0.307 pmol/L) |
| Precision (Total CV) | ≤ 10% CV for concentration ≥ 0.100 ng/mL | Ranges from 2.5% to 4.4% for samples at various AMH concentrations (0.112 ng/mL to 16.4 ng/mL). For controls, ranges from 2.9% to 3.3% (0.955 ng/mL to 14.1 ng/mL). All reported total CVs are ≤ 4.4% at or above 0.112 ng/mL, meeting the criterion. |
| Reproducibility (Total CV) | (Implicitly compared to predicate Total CV of ≤ 10% for concentration ≥ 0.16 ng/mL) | Ranges from 2.1% to 3.1% for samples at various AMH concentrations (0.199 ng/mL to 17.0 ng/mL). For controls, ranges from 2.6% to 2.9% (1.01 ng/mL to 14.4 ng/mL). All reported reproducibility CVs are ≤ 3.1% at or above 0.199 ng/mL, meeting the likely implied criterion. |
| Linearity | Linear for the measuring interval of 0.043-24.0 ng/mL | Linear for the measuring interval of 0.043-24.0 ng/mL (0.307-171 pmol/L). |
| Assay Comparison | Correlation coefficient ≥ 0.950, slope of 1.00 ± 0.10, intercept of ± 0.035 ng/mL (vs. commercial AMH assay) | Serum: correlation coefficient (r) = 0.994; regression equation y = 1.04x - 0.032 ng/mL (slope 1.04, intercept -0.032 ng/mL). Meets the criteria for correlation, slope (within 1.00 ± 0.10), and intercept (within ± 0.035 ng/mL). |
| Specimen Equivalence | Correlation coefficient ≥ 0.950, slope of 0.90-1.10, intercept of ± 0.035 ng/mL (vs. serum) | Gel-barrier tube (serum) vs. serum: r = 0.997; regression equation y = 1.00x + 0.003 ng/mL (slope 1.00, intercept +0.003 ng/mL). Meets criteria. Plasma, lithium heparin vs. serum: r = 0.997; regression equation y = 1.08x - 0.004 ng/mL (slope 1.08, intercept -0.004 ng/mL). Meets criteria. |
| Interferences (HIL) | Bias due to substances not to exceed 10% at specified AMH concentrations | Hemoglobin: no interference (1000 mg/dL). Bilirubin, conjugated: no interference (66.0 mg/dL). Bilirubin, unconjugated: no interference up to 39.0 mg/dL; however, >10% bias observed at ≥ 40 mg/dL (10.6% bias at 6.79 ng/mL AMH, 11.4% bias at 0.936 ng/mL AMH). Lipemia (Intralipid): no interference (2000 mg/dL). |
| Interferences (Other Substances) | Bias due to substances not to exceed 10% at specified AMH concentrations | Acetaminophen, Acetylcysteine, Acetylsalicylic Acid, Ampicillin sodium, L-Ascorbic acid, Biotin, Cefoxitin sodium salt, Cholesterol, Cyclosporine, Doxycycline hyclate, Folic acid, Gonapeptyl, Heparin, Human IgA, Human IgG, Human IgM, Ibuprofen, Levodopa, Levothyroxine, Metformin hydrochloride, Methyldopa, Metronidazole, Phenylbutazone, Rheumatoid Factor, Rifampicin, Theophylline, Total Protein, Uric acid: all showed no interference (bias ≤ 10%) at tested concentrations. |
| Cross-Reactivity | Bias does not exceed 10% | Activin A, Activin B, Activin AB, Inhibin A, Inhibin B, TGF b-1: ≤ 0.1% cross-reactivity. Follicle stimulating hormone (FSH) at 500 mIU/mL: Not Detectable, 0.2% bias. Luteinizing hormone (LH) at 500 mIU/mL: Not Detectable, 2.9% bias. All considered insignificant. |
| Stability (On-board Reagents) | Reagents stable for 70 days | Determined to be 70 days. |
| Stability (Calibrators) | Calibrators stable at 2-8°C and ≤ -20°C for 90 days after reconstitution | Determined to be stable at 2-8°C and ≤ -20°C for 90 days after reconstitution. |
| High Dose Hook | (No explicit criterion given, but predicate states no hook effect up to 1000 ng/mL) | No hook effect observed up to 1151 ng/mL (8218 pmol/L). This exceeds the predicate. |
| Clinical Performance (Overall) | (Aid in distinguishing AFC > 15 vs ≤ 15 in fertility clinics) | Sensitivity: 90.5% (256/283) (95% CI: 86.47, 93.36). Specificity: 52.0% (130/250) (95% CI: 45.82, 58.12). PPV: 68.1% (256/376) (95% CI: 63.21, 72.59). NPV: 82.8% (130/157) (95% CI: 76.13, 87.90). |
| Clinical Performance (Age < 35) | (Aid in distinguishing AFC > 15 vs ≤ 15 in fertility clinics for this age group) | Prevalence (AFC > 15): 67.4%. PPV: 73.6% (67.47, 78.88). NPV: 65.1% (50.17, 77.58). |
| Clinical Performance (Age ≥ 35) | (Aid in distinguishing AFC > 15 vs ≤ 15 in fertility clinics for this age group) | Prevalence (AFC > 15): 38.4%. PPV: 59.7% (51.71, 67.27). NPV: 89.5% (82.50, 93.88). |
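The pass/fail judgments for the regression-based rows (assay comparison and specimen equivalence) can be reproduced with simple bound checks. This is a minimal sketch, assuming the criteria are applied as inclusive limits on the reported point estimates; the summary does not describe the exact decision rule, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegressionResult:
    label: str
    r: float          # correlation coefficient
    slope: float
    intercept: float  # ng/mL


def meets_criteria(res: RegressionResult,
                   min_r: float = 0.950,
                   slope_lo: float = 0.90,
                   slope_hi: float = 1.10,
                   max_abs_intercept: float = 0.035) -> bool:
    """Apply the stated design goals as inclusive bounds (an assumption)."""
    return (res.r >= min_r
            and slope_lo <= res.slope <= slope_hi
            and abs(res.intercept) <= max_abs_intercept)


# Point estimates as reported in the summary.
results = [
    RegressionResult("Assay comparison (serum)", 0.994, 1.04, -0.032),
    RegressionResult("Gel-barrier tube (serum) vs. serum", 0.997, 1.00, 0.003),
    RegressionResult("Plasma, lithium heparin vs. serum", 0.997, 1.08, -0.004),
]

for res in results:
    verdict = "meets criteria" if meets_criteria(res) else "fails criteria"
    print(f"{res.label}: {verdict}")
```

The assay comparison intercept (-0.032 ng/mL) and the lithium heparin slope (1.08) sit closest to their limits, so those are the values most worth confirming against the original submission.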


2. Sample Size Used for the Test Set and the Data Provenance

  • Detection Capability (LoB, LoD, LoQ): Not explicitly stated, but determined as described in CLSI protocol EP17-A2. These are typically derived from analytical studies involving numerous replicates of blank, low-concentration, and relevant samples.
  • Precision and Reproducibility:
    • Precision: 480 measurements per sample/control (replicates of 2, 2 runs/day, 20-day protocol). This was across 2 instruments and 3 reagent lots.
    • Reproducibility: 90 measurements per sample/control (triplicate, 2 runs/day, 5 days) across 3 sites and 1 reagent lot.
  • Assay Comparison: 120 samples (serum) vs. a commercial AMH assay.
  • Specimen Equivalence: 88 samples for Gel-barrier tube (serum) vs. Serum, and 88 samples for Plasma (lithium heparin) vs. Serum.
  • Interferences: The number of samples tested for each substance is not specified, but the testing was performed in accordance with CLSI Document EP07-ed3 and EP37-ed1.
  • Cross-Reactivity: Number of samples not specified, performed in accordance with CLSI Document EP07-ed3.
  • Stability: Not sample-based but rather experimental conditions and time points.
  • Expected Values (Reference Intervals):
    • Females (18–25 years): 209 samples
    • Females (26–30 years): 122 samples
    • Females (31–35 years): 123 samples
    • Females (36–40 years): 126 samples
    • Females (41–45 years): 152 samples
    • Females (46–50 years): 121 samples
    • Females (51 years and older): 139 samples
  • Clinical Sensitivity and Specificity: 533 women.
  • Data Provenance:
    • Clinical Study: Prospectively collected from women presenting to fertility clinics for evaluation.
    • Country of Origin: 11 sites across the United States.
    • Expected Values: Samples came from "apparently healthy subjects." The summary does not state whether collection was retrospective or prospective, where it took place, or whether it was linked to the clinical study.
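The per-sample measurement counts quoted for the precision and reproducibility studies follow from multiplying out the study designs. The sketch below assumes each design is fully crossed (every instrument, lot, or site runs the full protocol); the summary reports the totals but does not spell out that structure, and the dictionary keys are illustrative.

```python
from math import prod

# Design factors as described in the summary; full crossing is an assumption.
precision_design = {
    "replicates_per_run": 2,
    "runs_per_day": 2,
    "days": 20,
    "instruments": 2,
    "reagent_lots": 3,
}

reproducibility_design = {
    "replicates_per_run": 3,
    "runs_per_day": 2,
    "days": 5,
    "sites": 3,
    "reagent_lots": 1,
}


def total_measurements(design: dict) -> int:
    """Number of measurements per sample for a fully crossed design."""
    return prod(design.values())


print("Precision:", total_measurements(precision_design))              # 480
print("Reproducibility:", total_measurements(reproducibility_design))  # 90
```

Both products match the reported totals of 480 and 90 measurements per sample or control.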

3. Number of Experts Used to Establish the Ground Truth for the Test Set and the Qualifications of Those Experts

For the clinical performance study (Clinical Sensitivity and Specificity), the ground truth for ovarian reserve was established by Antral Follicle Count (AFC) values determined by transvaginal ultrasound.

  • The document does not specify the number of experts (e.g., sonographers, radiologists) who performed or interpreted these ultrasounds.
  • It also does not specify the individual qualifications (e.g., years of experience, board certification) of these experts. It only states that the AFC result was determined by transvaginal ultrasound.

4. Adjudication Method for the Test Set

The document does not explicitly describe an adjudication method (such as 2+1, 3+1) for establishing the ground truth (AFC values). The AFC results were simply stated as being "determined by transvaginal ultrasound." This suggests that the individual AFC measurements were taken as the singular truth, without a multi-reader review or adjudication process outlined to resolve discrepancies, if any.

5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study Was Done, If So, What Was the Effect Size of How Much Human Readers Improve with AI vs. Without AI Assistance

No, an MRMC comparative effectiveness study involving human readers and AI assistance was not conducted. This device is an in vitro diagnostic (IVD) assay that quantitatively measures a biomarker (AMH) in serum/plasma. It does not involve interpretation of medical images or other data by human readers, and thus, AI assistance in the context of human reading is not applicable here.

6. If a Standalone (i.e., algorithm only without human-in-the-loop performance) Was Done

Yes. The reported studies are standalone performance evaluations of the ADVIA Centaur® Anti-Müllerian Hormone (AMH) assay itself. The clinical performance study evaluates the assay's ability to distinguish high from normal or diminished ovarian reserve based on AMH measurements, with AFC as the ground truth. This is the performance of the assay as a test method, with no human-in-the-loop interpretation step.

7. The Type of Ground Truth Used

  • Analytical Studies (Detection Capability, Precision, Linearity, Interference, Cross-reactivity, Stability, Hook Effect): The ground truth for these studies is typically derived from highly characterized reference materials, spiked samples with known concentrations, or established analytical methods.
  • Expected Values (Reference Intervals): Established by collecting samples from "apparently healthy subjects" and calculating statistical percentiles (90th and 95th reference intervals). The ground truth here is the statistical distribution of AMH levels in a healthy population defined by age.
  • Clinical Sensitivity and Specificity: The ground truth for ovarian reserve was Antral Follicle Count (AFC) values, as measured by transvaginal ultrasound. This is a clinical measure widely accepted in fertility assessment.
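With AFC as the reference, the overall clinical metrics reduce to a 2×2 table. The sketch below rebuilds that table from the reported fractions (256/283, 130/250, 256/376, 130/157) and recomputes each metric with a Wilson score interval. The summary does not name its interval method; Wilson is an assumption, chosen because it reproduces the reported bounds.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts derived from the reported fractions; AFC > 15 is the positive class.
tp, fn = 256, 27    # 283 women with AFC > 15
tn, fp = 130, 120   # 250 women with AFC <= 15

metrics = {
    "Sensitivity": (tp, tp + fn),
    "Specificity": (tn, tn + fp),
    "PPV": (tp, tp + fp),
    "NPV": (tn, tn + fn),
}

for name, (k, n) in metrics.items():
    lo, hi = wilson_interval(k, n)
    print(f"{name}: {100 * k / n:.1f}% ({k}/{n}), "
          f"95% CI {100 * lo:.2f} to {100 * hi:.2f}")
```

The four cells sum to the 533 women in the clinical study. The high sensitivity paired with a specificity near 52% indicates a cutoff tuned to avoid missing women with high ovarian reserve, at the cost of more false positives.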

8. The Sample Size for the Training Set

The document does not specify a separate "training set" in the context of a machine learning or AI algorithm development. This is an in vitro diagnostic assay, where performance is typically established through analytical validation and clinical correlation studies, not through AI model training. The data used for most performance characteristics are considered validation data.

  • The "Expected Values" data set (totaling 209+122+123+126+152+121+139 = 992 samples) could be seen as reference data used to establish norms, but not a "training set" for an algorithm in the AI sense.
  • The "Clinical Sensitivity and Specificity" study of 533 women served as a clinical validation dataset.

9. How the Ground Truth for the Training Set Was Established

As no "training set" in the AI sense is explicitly mentioned for algorithm development, there's no described method for establishing ground truth for such a set. For the validation data described:

  • Clinical Study Ground Truth: The ground truth was Antral Follicle Count (AFC) values, determined by transvaginal ultrasound by unspecified qualified personnel at 11 sites across the US.
  • Expected Values Ground Truth: These are based on AMH measurements from "apparently healthy subjects," where their health status (absence of relevant pathologies) constitutes the ground truth for establishing normal ranges.

§ 862.1092 Anti-mullerian hormone test system.

(a) Identification. An anti-mullerian hormone test system is an in vitro diagnostic device intended to measure anti-mullerian hormone in human serum and plasma. An anti-mullerian hormone test system is intended to be used for assessing ovarian reserve in women.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Design verification and validation must include:
(i) An adequate traceability plan to minimize the risk of drift in anti-mullerian hormone test system results over time.
(ii) Detailed documentation of a prospective clinical study to demonstrate clinical performance or, if appropriate, results from an equivalent sample set. This detailed documentation must include the following information:
(A) Results must demonstrate adequate clinical performance relative to a well-accepted comparator.
(B) Clinical sample results must demonstrate consistency of device output throughout the device measuring range that is appropriate for the intended use population.
(C) Clinical study documentation must include the original study protocol (including predefined statistical analysis plan), study report documenting support for the proposed indications for use(s), and results of all statistical analyses.
(iii) Reference intervals generated by testing an adequate number of samples from apparently healthy normal individuals in the intended use population.
(2) The labeling required under § 809.10(b) of this chapter must include a warning statement that the device is intended to be used for assessing the ovarian reserve in conjunction with other clinical and laboratory findings before starting any fertility therapy, and that the device should be used in conjunction with the antral follicle count.