(269 days)
The ipsogen JAK2 RGQ PCR Kit is a qualitative in vitro diagnostic test for the detection of the JAK2 V617F/G1849T allele in genomic DNA extracted from EDTA whole blood. The ipsogen JAK2 RGQ PCR Kit is a real time PCR test performed on the QIAGEN Rotor-Gene Q MDx instrument. The test is intended for use as an adjunct to evaluation of suspected Polycythemia Vera, in conjunction with other clinicopathological factors.
This test does not detect less common mutations associated with Polycythemia Vera including mutations in exon 12 and is not intended for stand-alone diagnosis of Polycythemia Vera.
The ipsogen® JAK2 RGQ PCR Kit (hereafter also referred to as the JAK2 Kit or JAK2 assay) employs allele-specific, quantitative, polymerase chain reaction (PCR) using an amplification refractory mutation system (ARMS). DNA is extracted from K2-EDTA anticoagulated whole blood using the QIAsymphony instrument (QSSP) and QIAsymphony® DSP DNA Mini Kit. Purified DNA must be diluted to 10 ng/µL using the TE buffer provided in the JAK2 Kit. Each PCR reaction of the Rotor-Gene Q MDx is optimized for 50 ng of purified gDNA diluted in a final volume of 5 µL. A total of 100 ng per tested sample (50 ng for each reaction) is needed. The Kit contains sufficient reagents to test 24 reactions.
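The input arithmetic above (dilute to 10 ng/µL, 5 µL per reaction, two reactions per sample) can be sketched as follows. This is an illustrative calculation only, not part of the kit's instructions for use; the function name `dilution_volumes`, the 40 ng/µL stock, and the 20 µL final volume are hypothetical values chosen for the example.

```python
TARGET_NG_PER_UL = 10.0     # working concentration stated in the text
REACTION_VOLUME_UL = 5.0    # gDNA volume per PCR reaction stated in the text
REACTIONS_PER_SAMPLE = 2    # 100 ng total / 50 ng per reaction


def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=TARGET_NG_PER_UL):
    """Return (stock_ul, te_ul) from C1 * V1 = C2 * V2.

    Hypothetical helper: stock_ul of purified DNA plus te_ul of TE buffer
    gives final_volume_ul at the target concentration.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


ng_per_reaction = TARGET_NG_PER_UL * REACTION_VOLUME_UL   # 50 ng
ng_per_sample = ng_per_reaction * REACTIONS_PER_SAMPLE    # 100 ng

# Example: an assumed 40 ng/µL eluate diluted to 20 µL of working solution.
stock_ul, te_ul = dilution_volumes(stock_ng_per_ul=40.0, final_volume_ul=20.0)
print(f"{stock_ul:.1f} µL DNA + {te_ul:.1f} µL TE; "
      f"{ng_per_reaction:.0f} ng per reaction, {ng_per_sample:.0f} ng per sample")
```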
The following summarizes the performance evaluation of the ipsogen® JAK2 RGQ PCR Kit: the acceptance criteria and the studies showing that the device meets them.
Acceptance Criteria and Reported Device Performance
Table 1: Acceptance Criteria and Reported Device Performance
| Acceptance Criteria Category | Specific Criteria & Description | Reported Device Performance |
|---|---|---|
| Analytical Performance: Precision | Within-laboratory Precision: Acceptable imprecision over a range of mutation levels (0% to 70%), with imprecision at the clinical decision point (1%) ranging from 0 to 27%. | Imprecision at the clinical decision point (1%) ranged from 0 to 27%. The assay has acceptable precision for its intended use. (Table 2) |
| | Between-instrument and Between Lot Precision: Acceptable reproducibility between three instruments and three reagent kit lots. | Between instrument and between lot reproducibility demonstrated to be acceptable. (Table 3) |
| | Site-to-Site Reproducibility (Study 1 - Cell Lines): A260/280 OD within 1.7-2.0 and gDNA concentrations $\ge$ 10 ng/µL with a 90% pass rate; variability between sites must be less than that of within-laboratory study. | Study 1 met pre-specified acceptance criteria. Variability was acceptable. For the 1.02% JAK2 MUT mean, 25/92 tests were positive. For 0% JAK2 MUT mean, 0/92 tests were positive. (Tables 4, 5) |
| | Site-to-Site Reproducibility (Study 2 - Clinical Samples): All calls (positive/negative) must be correctly identified with 100% concordance (95% CI, 90.3%, 100%). Observed variability to be lower than in the precision study for comparable mutation levels. | For all 6 clinical specimens, 100% of calls were correctly positive or negative (95% CI, 90.3%, 100%). Observed variability was lower than in the precision study. (Table 6) |
| | Precision of Controls: Acceptable repeatability and reproducibility including variations between operators, kits, instruments, and days for mutant and wild-type controls, and internal controls (HEX channel Ct values 25 to 37.79). | Mutant Control: Mean 99.98% JAK2 MT, SD 0.02, %CV 0.02. Wild Type Control: Mean 0.00% JAK2 MT, SD 0.00, %CV 0.00. HEX Mutant Ct: Mean 33.09, Total SD 0.70, %CV 2.11. HEX Wild Type Ct: Mean 32.90, Total SD 0.83, %CV 2.53. Controls were acceptable. (Tables 7, 8) |
| | DNA Extraction Method Reproducibility: No evidence of cross-contamination greater than the clinical cut-off (1%). | One sample in the control extraction run produced 0.012% MUT and one in the test run 0.001% MUT; both below the 1% cut-off. The study established the lack of cross-contamination. |
| Analytical Performance: Linearity/Reportable Range and DNA Input | Linearity: Assay must be linear for tested DNA input levels across the measuring interval (0-70% mutation). At 1% cutoff, degree of linearity must not be statistically different from 0. | Regression analysis demonstrated linearity at DNA inputs of 5, 10, and 20 ng/µL. The assay is not linear at 2 or 30 ng/µL. Significant effect when DNA input was 2 ng/µL and %MUT was … |
| | … LoB >90% of the time. | LoB: All results below LoD, supporting that LoB is below LoD and not detectable. LoD: Verified to be 0.042% MUT. The 10th percentile was calculated and determined to be greater than the LoB >90% of the time. |
| Analytical Performance: Analytical Specificity | Interfering Substances: No interference with assay performance (except Proteinase K, which was further examined). Key acceptance for Proteinase K: control and control+TE not different, internal control Ct > 38.51, A260/280 1.7-2.0, gDNA $\ge$ 10 ng/µL, %MUT comparison p-value … | … |
| | … 1%. | All testing met acceptance criteria for reagent stability, transport conditions, and freeze-thaw cycles. Supports a shelf life of months (specific duration redacted) and freeze-thaw cycles for the JAK2 Assay kit with the study ongoing. (Tables 13, 14, 20) |
| | Specimen Stability (Whole Blood): Stability for up to 4 days (96 hrs) after collection in K2 EDTA tubes at 2-8°C and room temperature. | Study designed to support 96 hours stability, consistent with literature. (Specific results not fully detailed, but implies acceptance criteria were met). |
| | Specimen Stability (Genomic DNA): Stability of extracted DNA at -15 to -30°C and after 4 freeze/thaw cycles for 15 months (studies ongoing to extend to 24 months). | Testing parameters within specifications to 18 months, with studies ongoing to 25 months. The extracted gDNA can be stored up to 15 months with studies ongoing to extend this claim. |
| Comparison Studies: Accuracy Method Comparison | Agreement with Bi-directional Sanger Sequencing: High positive percent agreement (PPA) and negative percent agreement (NPA). | PPA: 100% (71/71 subjects; 95% CI: 94.4%, 100%). NPA: 99.5% (204/205 subjects; 95% CI: 97.3%, 100%). One discordant case with Sanger negative but JAK2 Kit positive (5.6% MUT, below Sanger LoD). (Table 15) |
| Clinical Performance | Sensitivity: High sensitivity (expected to detect PV in vast majority of subjects with disease). Specificity: High specificity (expected to rule out PV in vast majority of subjects without PV). | Sensitivity: 94.6% (53/56 subjects; 95% CI: 85.1%, 98.8%). Specificity: 98.1% (157/160 subjects; 95% CI: 94.6%, 99.6%). (Tables 18, 19) |
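The agreement and clinical-performance rows in Table 1 are proportions with 95% confidence intervals. The sketch below recomputes the point estimates from the reported counts and attaches an exact (Clopper-Pearson) interval. The interval method is an assumption: the summary does not say which method was used, so the last digit of a bound may differ from the table. For the 36/36 site-to-site calls, the exact lower bound is 0.025^(1/36) ≈ 0.903, which is consistent with the reported 90.3%.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, target, iters=100):
    """Solve f(p) = target on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: P(X >= k | p) = alpha/2, i.e. cdf(k - 1; p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Counts as reported in Table 1.
counts = {
    "Site-to-site calls (Study 2)": (36, 36),
    "PPA vs. Sanger": (71, 71),
    "NPA vs. Sanger": (204, 205),
    "Clinical sensitivity": (53, 56),
    "Clinical specificity": (157, 160),
}

for name, (k, n) in counts.items():
    low, high = clopper_pearson(k, n)
    print(f"{name}: {100 * k / n:.1f}% ({k}/{n}), "
          f"95% CI {100 * low:.1f}% to {100 * high:.1f}%")
```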
Study Details
2. Sample Size and Data Provenance
- Test Set Sample Size:
  - Analytical Performance (Precision):
    - Within-laboratory: 11 samples (0% to 70% mutation), 108 measurements per sample.
    - Site-to-Site Reproducibility Study 1 (cell lines): 8 samples (50% to 0% MUT), 96 data points per mutation level.
    - Site-to-Site Reproducibility Study 2 (clinical samples): 6 clinical samples (4 positive, 2 negative), 36 measurements per sample.
  - Analytical Performance (Linearity): 11 different mutation percentages (0-70%) at 5 DNA input levels (30, 20, 10, 5, 2 ng/µL), 4 data points per %MUT per DNA input.
  - Analytical Performance (LoD): First study: 3 positive + 3 negative clinical samples, 6-point dilution series, 20 replicates each in 3 kit lots (180 total measurements). Second study: 2 samples, 30 replicates each (60 data points).
  - Analytical Performance (Interfering Substances): Samples at LoD, with various interfering substances. Proteinase K study used pooled healthy and PV whole blood, spiked, 4 replicates per sample.
  - Accuracy Method Comparison: 276 clinical specimens.
  - Clinical Performance: 216 evaluable subjects (from 286 consented subjects).
- Data Provenance:
  - Country of Origin: Clinical samples for the primary clinical study were collected from 9 study sites in the United States (7 of which enrolled subjects), 12 study sites in France (all 12 of which enrolled subjects), and 9 study sites in Italy (5 of which enrolled subjects).
  - Retrospective/Prospective: The clinical performance study was a multicenter, international, prospective, interventional study. Analytical studies primarily used prepared or pooled samples, along with some clinical samples, so testing was largely prospective for each test scenario.
3. Number of Experts and Qualifications for Ground Truth
- Number of Experts: Not explicitly stated as a number of individual experts.
- Qualifications: Ground truth for the clinical performance study was established by independent assessment of patient status at the clinical site based on the 2008 WHO diagnostic criteria, combined with bidirectional Sanger sequencing and, in some cases, bone marrow biopsy with histologic and cytogenetic analysis. This implies a consensus among clinical professionals (hematologists, pathologists) using established diagnostic guidelines. For the accuracy method comparison, bidirectional Sanger sequencing served as the comparator method.
4. Adjudication Method for the Test Set
- The text describes a process for establishing "clinical truth" in the clinical performance study. The reference for JAK2 status determination was independent assessment of patient status at the clinical site based on the 2008 WHO diagnostic criteria. If initial blood tests by bidirectional sequencing (BDS) were negative, bone marrow from patients was tested by bidirectional sequencing. This suggests an adjudicated process based on a diagnostic algorithm and multiple testing modalities to arrive at a final clinical diagnosis for ground truth. There is no explicit mention of an external, blinded adjudication panel in an "X+Y" format; the text describes a structured clinical diagnostic pathway instead.
5. Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study
- No, an MRMC comparative effectiveness study was not performed as described. This is an automated IVD kit, not an imaging device requiring human reader interpretation. The performance of the device (ipsogen JAK2 RGQ PCR Kit) was compared against a reference standard (bidirectional Sanger sequencing for analytical accuracy and WHO diagnostic criteria combined with BDS for clinical performance), not against human readers' performance with and without AI assistance.
6. Standalone Performance
- Yes, standalone performance was evaluated. The entire performance evaluation described focuses on the analytical and clinical performance of the ipsogen JAK2 RGQ PCR Kit as a standalone diagnostic test (without human-in-the-loop assistance for interpretation beyond standard laboratory procedures). The device outputs a qualitative result ("Mutation detected," "No Mutation Detected," or "Invalid") and a quantitative %MUT value, which are then used by clinicians as an adjunct to diagnosis. The "Rotor-Gene AssayManager (RGAM) software associated to the JAK2 Plug-in" automates data analysis, so no manual analysis is required.
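The mapping from a quantitative %MUT value to the three qualitative outputs can be sketched as below. This is a hypothetical illustration, not the RGAM/JAK2 Plug-in logic: the function name `qualitative_call`, the `run_valid` flag (standing in for whatever control and internal-control checks the software applies), and the treatment of a value exactly at the 1% clinical cut-off as positive are all assumptions.

```python
CLINICAL_CUTOFF_PCT = 1.0  # clinical cut-off (1% MUT) cited in the summary


def qualitative_call(pct_mut, run_valid, cutoff_pct=CLINICAL_CUTOFF_PCT):
    """Map a %MUT value to one of the three reported result strings.

    Assumptions: validity is decided elsewhere and passed in as a boolean,
    and a value equal to the cut-off is called positive.
    """
    if not run_valid:
        return "Invalid"
    if pct_mut >= cutoff_pct:
        return "Mutation detected"
    return "No Mutation Detected"


# Values taken from the summary: the discordant method-comparison case (5.6% MUT)
# and the cross-contamination check (0.012% MUT).
print(qualitative_call(5.6, run_valid=True))
print(qualitative_call(0.012, run_valid=True))
print(qualitative_call(5.6, run_valid=False))
```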
7. Type of Ground Truth Used
- Analytical Studies:
  - Known concentrations/dilutions: For precision, linearity, LoD, and stability studies, samples were prepared to contain known percentages of mutated DNA or were derived from characterized cell lines (MUTZ-8, HEL, K-562).
  - Orthogonal methods: For traceability/calibration of standards, an independent orthogonal method was used to confirm values.
  - Bi-directional Sanger sequencing: Used as the comparator method to establish accuracy.
- Clinical Performance Study:
  - The primary ground truth for the clinical study was the 2008 World Health Organization (WHO) diagnostic criteria for myeloproliferative neoplasms, with specific emphasis on PV. This standard incorporates various clinicopathological factors, including hemoglobin levels, hematocrit, EPO levels, and JAK2 mutation status (determined by bidirectional Sanger sequencing). Bone marrow biopsy and cytogenetic analysis were performed when required by the algorithm to confirm diagnosis.
8. Sample Size for Training Set
- The document describes a commercial in vitro diagnostic kit, not an AI/ML model that undergoes "training" in the traditional sense with a distinct training dataset. The development and validation of such a kit typically involve optimization studies and design verification using various samples (cell lines, pooled DNA, clinical samples) which inform the final design and performance characteristics, but these are not referred to as a "training set" in the context of machine learning. Therefore, a specific "training set sample size" as one would expect for an AI model is not applicable/not provided in this document.
9. How Ground Truth for Training Set was Established
- As above, given this is an IVD kit and not an AI/ML model, the concept of a "training set" with ground truth in the AI context does not directly apply. The kit's design and analytical parameters were likely developed and optimized using:
  - Quantified DNA samples: Known concentrations of mutant and wild-type DNA, often from cell lines (like MUTZ-8, HEL, K-562) with characterized mutational status.
  - Clinical samples: Utilized during development and analytical validation to ensure the test performs as expected with real-world variability.
  - Established molecular biology techniques: Such as quantitative PCR, and knowledge of the JAK2 V617F mutation's characteristics.
The "ground truth" during this development (rather than "training") would be based on these highly characterized materials and established genomic methods.
§ 866.6070 Mutation detection test for myeloproliferative neoplasms.
(a) Identification. A mutation detection test for myeloproliferative neoplasms is an in vitro diagnostic device intended for the detection of the JAK2 V617F/G1849T allele in genomic DNA extracted from whole blood. The test is intended for use as an adjunct to evaluation of suspected polycythemia vera in conjunction with other clinicopathological factors.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Design verification and validation must include:
(i) A detailed description of all components in the test, including the following:
(A) A detailed description, including illustrations or photographs of non-standard equipment or methods, of the test components, including all required reagents, instrumentation, and equipment.
(B) Detailed documentation of the device software, including standalone software applications and hardware-based devices that incorporate software.
(C) A detailed description of methodology and assay procedures including appropriate internal and external quality controls that are recommended or provided. The description must identify those control elements that are incorporated into the testing procedure.
(D) A detailed specification for sample collection, processing, and storage.
(E) A description of the criteria for test result interpretation and reporting including result outputs, analytical sensitivity of the assay, and the values that will be reported.
(ii) Information that demonstrates the performance characteristics of the test, including:
(A) For indications for use based on a threshold established in a predicate device of this generic type, device performance data from either a method comparison study to the predicate device or through a clinical study demonstrating clinical validity using well-characterized prospectively or retrospectively obtained clinical specimens, as appropriate, representative of the intended use population.
(B) For indications for use based on a threshold not established in a predicate device of this generic type, device performance data from a clinical study demonstrating clinical validity using well-characterized prospectively or retrospectively obtained clinical specimens, as appropriate, representative of the intended use population.
(C) Device reproducibility data generated, using a minimum of three sites, of which at least two sites must be external sites, with two operators at each site. Each site must conduct a study that includes at least two operators per site, two runs per operator per day over a minimum of three non-consecutive days evaluating a sample panel that contains allelic frequencies that span the claimed measuring range, and include the clinical threshold allelic frequency. Pre-specified acceptance criteria must be provided and followed.
(D) Information on device traceability and a description of the value assignment process for calibrators and controls.
(E) Device precision data using clinical samples and controls to evaluate the within-lot, between-lot, within-run, between-run, and total variation.
(F) Device linearity data generated from samples covering the device measuring range and for any standards used in the quantitation of allelic frequencies.
(G) Device analytic sensitivity data, including limit of blank and limit of detection.
(H) Device specificity data, including interference and cross-contamination.
(I) Device and clinical specimen stability data, including real-time stability (long-term storage and in-use stability) and stability evaluating various storage times, temperatures, and freeze-thaw conditions, as appropriate.
(iii) Identification of risk mitigation elements used by the device, including a detailed description of all additional procedures, methods, and practices incorporated into the instructions for use that mitigate risks associated with testing using the device.
(2) The labeling required under § 809.10(b) of this chapter must include:
(i) An intended use statement, including an indication for use that includes the variant(s) for which the assay was designed and validated, for example, JAK2 G1849T.
(ii) A detailed description of the performance studies conducted to comply with paragraph (b)(1)(ii) of this section and a summary of the results.
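As an illustration of the minimum reproducibility design in paragraph (b)(1)(ii)(C) (three sites, two operators per site, two runs per operator per day, three non-consecutive days), the sketch below enumerates the runs and counts them. The site and operator labels, the chosen day numbers, and the helper `is_non_consecutive` are hypothetical; a real study plan may use more of each factor.

```python
from itertools import product

sites = ["site_1", "site_2", "site_3"]   # at least two must be external sites
operators = ["operator_A", "operator_B"]  # two operators at each site
days = [1, 3, 5]                          # example of three non-consecutive days
runs_per_operator_per_day = 2


def is_non_consecutive(day_numbers):
    """True when no two testing days are adjacent calendar days."""
    ordered = sorted(day_numbers)
    return all(later - earlier > 1 for earlier, later in zip(ordered, ordered[1:]))


# One entry per run: (site, operator, day, run index within that day).
schedule = list(product(sites, operators, days,
                        range(1, runs_per_operator_per_day + 1)))

print(f"non-consecutive days: {is_non_consecutive(days)}")
print(f"minimum number of runs: {len(schedule)}")  # 3 * 2 * 3 * 2 = 36
```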