(51 days)
The MSK-IMPACT assay is a qualitative in vitro diagnostic test that uses targeted next generation sequencing of formalin-fixed paraffin-embedded tumor tissue matched with normal specimens from patients with solid malignant neoplasms to detect tumor gene alterations in a broad multi gene panel. The test is intended to provide information on somatic mutations (point mutations and small insertions and deletions) and microsatellite instability for use by qualified health care professionals in accordance with professional guidelines, and is not conclusive or prescriptive for labeled use of any specific therapeutic product. MSK-IMPACT is a single-site assay performed at Memorial Sloan Kettering Cancer Center.
A description of required equipment, software, reagents, vendors, and storage conditions was provided and appears in the product labeling (MSK-IMPACT manual). MSK assumes responsibility for the device.
The following summarizes the acceptance criteria and the studies showing that the device meets them, based on the provided text for the MSK-IMPACT assay.
Device Name: MSK-IMPACT (Integrated Mutation Profiling of Actionable Cancer Targets)
Type of Test: Next generation sequencing tumor profiling test
Purpose: Qualitative in vitro diagnostic test for detecting somatic mutations (point mutations, small insertions and deletions) and microsatellite instability (MSI) in formalin-fixed paraffin-embedded (FFPE) tumor tissue matched with normal specimens from patients with solid malignant neoplasms.
1. Table of Acceptance Criteria and Reported Device Performance
The acceptance criteria are generally embedded within the "Performance" section and the "Reporting" section's "Table 3. Sample Level Quality Control Metrics." The reported performance is found throughout the "Performance" section.
Acceptance Criteria (from text) | Reported Device Performance (from text) |
---|---|
Specimen Requirements: | |
Minimum Tumor Proportion: >10% of tumor cells; >20% viable tumor preferred; >25% for MSI testing. | The minimum tumor proportion required for the MSI assay was established as 25%. In CRC specimens, the assay and score were qualitatively reproducible down to 8% tumor proportion, although the quantitative score showed a decreasing trend (Table 1). The DNA extraction method was validated with historic data from >10,000 specimens, demonstrating invalid rates of 7.2% to 18.4%, supporting performance across FFPE tumor types (Table 5). |
Quality Control Metrics (Table 3): | |
Average target coverage: > 200X | For normal samples, mean coverage across all targeted exons was 571X (SD = 373X). Analysis of normal samples showed that with mean sample coverage of 571X, 98% of exons are sequenced with coverage greater than 306X (or normalized coverage >0.54), leading to a conservative threshold of 200X mean sample coverage. In silico downsampling to 203X coverage detected 94% of mutations with 10% VAF (Performance L.1.b and Table 3). |
Coverage Uniformity: ≥ 98% target exons above 100X coverage | 99.5% of exons were sequenced to a depth of 100X or greater, and 98.6% to 250X or greater. It’s expected that 98% of exons will be sequenced to >100X coverage when mean sample coverage is 185X. (Performance L.1.b) |
Base Quality: > 80% of bases with QS above Q30 | Not explicitly detailed in the performance section but stated as a QC metric in Table 3. Implicitly met if overall performance is approved. |
% Cluster passing filter (Cluster PF): > 80% | Not explicitly detailed in the performance section but stated as a QC metric in Table 3. Implicitly met if overall performance is approved. |
% Reads passing filter (Reads PF): > 80% | Not explicitly detailed in the performance section but stated as a QC metric in Table 3. Implicitly met if overall performance is approved. |
Hotspot Mutation calling threshold: DP ≥ 20, AD ≥ 8, VF ≥ 2% | Filtering scheme designed to reject false positives while maintaining detection capability. Example: pre-filter SNVs (hotspot) had 1 false positive, post-filter 0 (Table 4). LoD confirmation: 5 replicates for 6 SNVs at 5% MAF showed 100% positive call rates, except one replicate failing on PTEN exon 6 due to low read depth below 5% (Performance L.2.b.ii and Table 11). |
Non-hotspot Mutation threshold: DP > 20, AD ≥ 10, VF ≥ 5% | Filtering scheme designed to reject false positives while maintaining detection capability. Example: pre-filter SNVs (non-hotspot) had 342 false positives, post-filter 0 (Table 4). LoD study showed most mutations detected at low VAFs (e.g., 2-9% in Tables 10A-J). Confirmed LoD study (Part 2) for various mutations showed 100% positive call rates for variant types except one discordant case (PTEN exon 8 deletion) at 3.6-7.9% VF (Table 11). |
Indels: Fewer than 20% of samples in an established 'standard normal' database (This appears to be a filtering criterion for indels, not a reporting metric.) | Indels had 40,793 pre-filter false positives, reduced to 8 post-filter (Rejection Rate 0.999) (Table 4). LoD confirmation: 5 replicates for 3 deletions and 4 insertions at 5% MAF showed 100% positive call rates, except one deletion (PTEN exon 6), which also failed read depth (Performance L.2.b.ii and Table 11). |
Positive Run Control: The difference between the observed and expected frequencies for the known mutations should be within 5%. | Mixed positive control sample with expected VFs: Results reviewed to confirm known mutations called and observed frequencies match expected values within 5% (Controls, b). |
Negative Run Control: The correlation between expected and observed mutation frequencies should be 0.9 or higher. | Pooled negative control: Observed mutation frequencies compared against expected for 862 common SNPs; correlation expected to be 0.9 or higher (Controls, c). Figure 2 shows correlation of 0.975 (with slope 0.971 and intercept -0.004) for observed vs. expected variant frequency, establishing consistent correlation >0.9. |
Sample-Mix up QC: Flagged if pairs of samples from the same patient with > 5% discordance and from different patients with 5% for same patient ("unexpected mismatches") or 10,000 samples mentioned in pre-analytical performance context). Table 13 presents data by DNA input amounts but not sample count for each bin. |
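The sample-level QC thresholds and the run-control criteria in the table above can be read as simple pass/fail checks. The following is a minimal Python sketch of that reading, not the assay's actual software. The names `SampleQC`, `failed_qc_metrics`, `positive_control_ok`, and `negative_control_ok` are hypothetical. It assumes that "within 5%" means an absolute difference of 0.05 in variant frequency and that the negative-control "correlation" is a Pearson coefficient; the source states neither explicitly.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SampleQC:
    """Hypothetical container for the sample-level metrics listed in Table 3."""
    mean_target_coverage: float   # average depth over targeted exons, in X
    frac_exons_above_100x: float  # fraction of target exons above 100X
    frac_bases_above_q30: float   # fraction of bases with quality score above Q30
    frac_clusters_pf: float       # fraction of clusters passing filter
    frac_reads_pf: float          # fraction of reads passing filter


def failed_qc_metrics(qc: SampleQC) -> list[str]:
    """Return the names of the Table 3 criteria that the sample does not meet."""
    checks = {
        "average target coverage > 200X": qc.mean_target_coverage > 200,
        "coverage uniformity >= 98% of exons above 100X": qc.frac_exons_above_100x >= 0.98,
        "base quality > 80% of bases above Q30": qc.frac_bases_above_q30 > 0.80,
        "cluster PF > 80%": qc.frac_clusters_pf > 0.80,
        "reads PF > 80%": qc.frac_reads_pf > 0.80,
    }
    return [name for name, ok in checks.items() if not ok]


def positive_control_ok(expected: dict[str, float],
                        observed: dict[str, float],
                        tolerance: float = 0.05) -> bool:
    """Every known mutation must be called, with observed frequency within
    `tolerance` of the expected one (assumed to be an absolute difference)."""
    return all(
        m in observed and abs(observed[m] - expected[m]) <= tolerance
        for m in expected
    )


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient (assumed form of the 'correlation')."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def negative_control_ok(expected: list[float],
                        observed: list[float],
                        min_r: float = 0.9) -> bool:
    """Observed vs. expected SNP frequencies must correlate at 0.9 or higher."""
    return pearson_r(expected, observed) >= min_r


# Coverage and uniformity values are the ones reported for normal samples;
# the remaining three fractions are illustrative placeholders.
qc = SampleQC(571, 0.995, 0.92, 0.90, 0.95)
print(failed_qc_metrics(qc))  # → []
```

Returning the list of failed criteria, rather than a single boolean, keeps the reason for a QC failure visible to the reviewer.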
* **Accuracy (Method Comparison):**
  * 267 unique mutations in **433 FFPE tumor specimens** for the main comparison (Table 14).
  * **95 specimens** for the supplemental wildtype calls study.
  * **138 colorectal cancer (CRC) and 40 endometrial carcinoma (EC) specimens** (training set) for MSI cutoff establishment.
  * **135 CRC patients** (66 with both MSK-IMPACT and IHC) for MSI cutoff validation.
  * **119 unique non-CRC and non-EC tumor-normal pair samples** for MSI comparison in other cancer types.
- Data Provenance:
  - General: The device is performed at Memorial Sloan Kettering Cancer Center (MSK), indicating the data likely originates from their patient population.
  - Retrospective/Prospective:
    - The pre-analytical performance (specimen invalid rates) used historical data from >10,000 specimens, implying a retrospective chart review.
    - The MSI validation study (CRC patients) was a retrospective-prospective chart review.
    - The clinical performance section mentions a large-scale, prospective clinical sequencing initiative using MSK-IMPACT involving >10,000 patients, whose data are publicly accessible. This cohort likely informed the broader context and understanding of the device but was not explicitly stated as the test set for the analytical validation.
    - The analytical performance studies (precision, LoD, accuracy) used clinical samples/specimens, which could have been collected retrospectively or prospectively for the purpose of the study. The text does not state which applies to each study.
3. Number of Experts Used to Establish the Ground Truth for the Test Set and the Qualifications of Those Experts
The text does not specify the number of experts used to establish the ground truth for the test set, nor their specific qualifications (e.g., "radiologist with 10 years of experience").
However, it does indicate:
- For the accuracy studies, results were compared to "original results obtained with the validated orthogonal methods." This implies that the ground truth was established by these validated orthogonal methods, which are presumably performed and interpreted by qualified personnel using established clinical diagnostics.
- For MSI, the MSIsensor results were compared to "a validated MSI-PCR or MMR IHC test," a "commercially available PCR assay," or a "validated IHC panel (MLH1, MSH2, MSH6 and PMS2)." Again, this suggests ground truth from established, clinical laboratory methods.
- The "Clinical Evidence Curation" section mentions that "OncoKB undergoes periodic updates through the review of new information by a panel of experts," which informs the clinical interpretation of detected mutations. This expert panel contributes to the broader clinical context of the mutations, but not directly the ground truth for the analytical test set itself.
4. Adjudication Method (e.g., 2+1, 3+1, none) for the Test Set
The text does not describe a formal adjudication method (like 2+1 or 3+1 consensus with experts) for establishing the ground truth of the test set cases. Instead, the ground truth was derived from "validated orthogonal methods."
For example, in the accuracy study, the MSK-IMPACT results were "compared to the original results obtained with the validated orthogonal methods." This indicates that the results from the comparison methods served as the reference standard, rather than requiring an additional expert adjudication process on top of those existing validated methods.
5. If a Multi-Reader Multi-Case (MRMC) Comparative Effectiveness Study was Done
No, an MRMC comparative effectiveness study was not described. This document pertains to the analytical validation of a genetic sequencing assay, which inherently does not involve human readers interpreting images in a multi-reader, multi-case setup. Therefore, a comparative effectiveness study measuring human reader improvement with AI assistance (which is typical for imaging AI) is not applicable here.
6. If a Standalone (Algorithm Only Without Human-in-the-Loop Performance) Was Done
Yes, the analytical performance studies (precision, analytical sensitivity, and analytical accuracy) described are all measures of the standalone performance of the MSK-IMPACT assay, which relies on its sequencing and bioinformatics pipeline without direct human-in-the-loop diagnostic interpretation to produce the raw mutation calls.
- The "Mutation calling SNVs and Indels" section and "Summary of mutation filtering scheme" (Figure 1) describe the automated pipeline for identifying mutations.
- The "Performance" section details how characteristics like precision, LoD, and accuracy were determined for the assay itself by comparing its outputs to known or established results from other validated methods. These do not involve a human interpreting the device's output to make a diagnosis within the performance evaluation but rather assess the accuracy of the device's genomic calls directly.
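The calling thresholds listed in the acceptance-criteria table (hotspot: DP ≥ 20, AD ≥ 8, VF ≥ 2%; non-hotspot: DP > 20, AD ≥ 10, VF ≥ 5%) give a sense of what one automated step looks like. The sketch below is a hypothetical Python illustration of that threshold step only. `VariantCall` and `passes_threshold` are invented names, DP, AD, and VF are assumed to mean total read depth, alternate-allele read depth, and variant frequency, and the other filters summarized in Figure 1 (such as the 'standard normal' database check for indels) are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical minimal record for one candidate mutation."""
    depth: int        # DP: total reads covering the position (assumed meaning)
    alt_depth: int    # AD: reads supporting the variant allele (assumed meaning)
    is_hotspot: bool  # whether the site is on the hotspot list

    @property
    def variant_fraction(self) -> float:
        """VF: share of reads supporting the variant (0.0 if uncovered)."""
        return self.alt_depth / self.depth if self.depth else 0.0


def passes_threshold(call: VariantCall) -> bool:
    """Apply the thresholds exactly as quoted in the table. Note that the
    table gives DP >= 20 for hotspots but DP > 20 for non-hotspots."""
    vf = call.variant_fraction
    if call.is_hotspot:
        return call.depth >= 20 and call.alt_depth >= 8 and vf >= 0.02
    return call.depth > 20 and call.alt_depth >= 10 and vf >= 0.05


# The same evidence (12 of 500 reads, VF = 2.4%) passes at a hotspot site
# but not at a non-hotspot site.
print(passes_threshold(VariantCall(500, 12, True)))   # → True
print(passes_threshold(VariantCall(500, 12, False)))  # → False
```

Keeping the two threshold sets in one function makes the looser hotspot criteria easy to compare against the stricter non-hotspot ones.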
7. The Type of Ground Truth Used
The primary type of ground truth used was:
- Orthogonal Methods / Comparator Assays: For the accuracy studies, the MSK-IMPACT results were compared against "original results obtained with the validated orthogonal methods." This included comparison to:
  - Validated orthogonal methods for SNVs and indels.
  - Established MSI-PCR or MMR IHC tests for Microsatellite Instability status.
- Known Reference Material: For precision, a "well characterized reference standard (HapMap cell line NA20810)" was used, with reference genotypes obtained from the 1000 Genomes database.
- Expected Values/Dilution Series: For Limit of Detection studies, serial dilutions of patient samples with "known mutations" and "expected frequencies" were used.
Therefore, the ground truth is a combination of established methods, known reference materials, and empirically derived expected values.
8. The Sample Size for the Training Set
The document explicitly mentions training data primarily in the context of the MSI cutoff:
- MSI Cutoff Training: A "training specimen dataset consisting of 138 colorectal cancer (CRC) and 40 endometrial carcinoma (EC) specimens with matched normal and having MSI status results from a validated MSI-PCR or MMR IHC test."
For the mutation calling pipeline (SNVs and indels), the text refers to:
- Optimization of thresholds: "The threshold values for the filtering criteria were established based on paired-sample mutation analysis on replicates of normal FFPE samples, and optimized to reject all false positive SNVs and almost all false positive indel calls from the reference dataset." The size of this "reference dataset" or "replicates of normal FFPE samples" used for training/optimization of filtering thresholds is not explicitly stated as a defined "training set sample size" for the SNV/indel calling. It implies an internal dataset used during development.
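The effect of that threshold optimization can be expressed as a false-positive rejection rate using the pre-filter and post-filter counts cited from Table 4. The sketch below assumes the rate is defined as 1 − (post-filter false positives / pre-filter false positives); the source does not give the formula, and `rejection_rate` is a hypothetical helper name.

```python
def rejection_rate(pre_filter_fp: int, post_filter_fp: int) -> float:
    """Fraction of pre-filter false positives removed by the filters
    (assumed definition: 1 - post / pre)."""
    if pre_filter_fp <= 0:
        raise ValueError("need at least one pre-filter false positive")
    return 1 - post_filter_fp / pre_filter_fp


# False-positive counts as cited from Table 4: (pre-filter, post-filter).
table4 = {
    "SNV (hotspot)": (1, 0),
    "SNV (non-hotspot)": (342, 0),
    "Indel": (40_793, 8),
}

for variant_type, (pre, post) in table4.items():
    print(f"{variant_type}: {rejection_rate(pre, post):.4f}")
```

Under this definition the indel rate is about 0.9998, which agrees with the reported 0.999 if that figure was truncated to three decimals rather than rounded.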
9. How the Ground Truth for the Training Set Was Established
For the MSI cutoff training set:
- The ground truth was established by "validated MSI-PCR or MMR IHC test" results. These are existing, established clinical diagnostic methods for determining MSI status.
For the SNV/indel pipeline optimization/threshold establishment:
- The ground truth for optimizing filtering thresholds was based on "paired-sample mutation analysis on replicates of normal FFPE samples" and "reference dataset." This suggests that the "true" status of these calls (i.e., whether they were true positives, false positives, etc.) would have been known or definitively determined through external means (e.g., highly confident calls from a different (perhaps more laborious or deeply sequenced) method, or a known characteristic of the "normal FFPE samples"). However, the specific method for establishing this ground truth for the filtering optimization is not explicitly detailed beyond being from a "reference dataset."
§ 866.6080 Next generation sequencing based tumor profiling test.
(a) Identification. A next generation sequencing (NGS) based tumor profiling test is a qualitative in vitro diagnostic test intended for NGS analysis of tissue specimens from malignant solid neoplasms to detect somatic mutations in a broad panel of targeted genes to aid in the management of previously diagnosed cancer patients by qualified health care professionals.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) Premarket notification submissions must include the following information:
(i) A detailed description of all somatic mutations that are intended to be detected by the test and that are adequately supported in accordance with paragraph (b)(1)(v) of this section and reported in the test results in accordance with paragraph (b)(2)(iv) of this section, including:
(A) A listing of mutations that are cancer mutations with evidence of clinical significance.
(B) As appropriate, a listing of mutations that are cancer mutations with potential clinical significance.
(ii) The indications for use must specify the following:
(A) The test is indicated for previously diagnosed cancer patients.
(B) The intended specimen type(s) and matrix (e.g., formalin-fixed, paraffin-embedded tumor tissue).
(C) The mutation types (e.g., single nucleotide variant, insertion, deletion, copy number variation or gene rearrangement) for which validation data has been provided.
(D) The name of the testing facility or facilities, as applicable.
(iii) A detailed device description including the following:
(A) A description of the test in terms of genomic coverage, as follows:
(1) Tabulated summary of all mutations reported, grouped according to gene and target region within each gene, along with the specific cDNA and amino acid positions for each mutation.
(2) A description of any within-gene targeted regions that cannot be reported and the data behind such conclusion.
(B) Specifications for specimen requirements including any specimen collection devices and preservatives, specimen volume, minimum tumor content, specimen handling, DNA extraction, and criteria for DNA quality and quantity metrics that are prerequisite to performing the assay.
(C) A detailed description of all test components, reagents, instrumentation, and software required. Detailed documentation of the device software including but not limited to, software applications and hardware-based devices that incorporate software.
(D) A detailed description of the methodology and protocols for each step of the test, including description of the quality metrics, thresholds, and filters at each step of the test that are implemented for final result reporting and a description of the metrics for run-failures, specimen-failures, invalids, as applicable.
(E) A list of links provided by the device to the user or accessed by the device for internal or external information (e.g., decision rules or databases) supporting clinical significance of test results for the panel or its elements in accordance with paragraphs (b)(1)(v) and (b)(2)(vi) of this section.
(F) A description of internal and external controls that are recommended or provided and control procedures. The description must identify those control elements that are incorporated into the testing procedure.
(iv) Information demonstrating analytical validity of the device according to analytical performance characteristics, evaluated either specifically for each gene/mutation or, when clinically and practically justified, using a representative approach based on other mutations of the same type, including:
(A) Data that adequately supports the intended specimen type (e.g., formalin-fixed, paraffin-embedded tumor tissue), specimen handling protocol, and nucleic acid purification for specific tumor types or for a pan-tumor claim.
(B) A summary of the empirical evidence obtained to demonstrate how the analytical quality metrics and thresholds were optimized.
(C) Device precision data using clinical samples to adequately evaluate intra-run, inter-run, and total variability. The samples must cover all mutation types tested (both positive and negative samples) and include samples near the limit of detection of the device. Precision must be assessed by agreement within replicates on the assay final result for each representative mutation, as applicable, and also supported by sequencing quality metrics for targeted regions across the panel.
(D) Description of the protocols and/or data adequately demonstrating the interchangeability of reagent lots and multiplexing barcodes.
(E) A description of the nucleic acid assay input concentration range and the evidence to adequately support the range.
(F) A description of the data adequately supporting the limit of detection of the device.
(G) A description of the data to adequately support device accuracy using clinical specimens representing the intended specimen type and range of tumor types, as applicable.
(1) Clinical specimens tested to support device accuracy must adequately represent the list of cancer mutations with evidence of clinical significance to be detected by the device.
(2) For mutations that are designated as cancer mutations with evidence of clinical significance and that are based on evidence established in the intended specimen type (e.g., tumor tissues) but for a different analyte type (e.g., protein, RNA) and/or a measurement (e.g., incorporating a score or copy number) and/or with an alternative technology (e.g., IHC, RT-qPCR, FISH), evidence of accuracy must include clinically adequate concordance between results for the mutation and the medically established biomarker test (e.g., evidence generated from an appropriately sized method comparison study using clinical specimens from the target population).
(3) For qualitative DNA mutations not described in paragraph (b)(1)(iv)(G)(2) of this section, accuracy studies must include both mutation-positive and wild-type results.
(H) Adequate device stability information.
(v) Information that adequately supports the clinical significance of the panel must include:
(A) Criteria established on what types and levels of evidence will clinically validate a mutation as a cancer mutation with evidence of clinical significance versus a cancer mutation with potential clinical significance.
(B) For representative mutations of those designated as cancer mutations with evidence of clinical significance, a description of the clinical evidence associated with such mutations, such as clinical evidence presented in professional guidelines, as appropriate, with method comparison performance data as described in paragraph (b)(1)(iv)(G) of this section.
(C) For all other mutations designated as cancer mutations with potential clinical significance, a description of the rationale for reporting.
(2) The 21 CFR 809.10 compliant labeling and any product information and test report generated, must include the following, as applicable:
(i) The intended use statement must specify the following:
(A) The test is indicated for previously diagnosed cancer patients.
(B) The intended specimen type(s) and matrix (e.g., formalin-fixed, paraffin-embedded tumor tissue).
(C) The mutation types (e.g., single nucleotide variant, insertion, deletion, copy number variation or gene rearrangement) for which validation data has been provided.
(D) The name of the testing facility or facilities, as applicable.
(ii) A description of the device and summary of the results of the performance studies performed in accordance with paragraphs (b)(1)(iii), (b)(1)(iv), and (b)(1)(v) of this section.
(iii) A description of applicable test limitations, including, for device specific mutations validated with method comparison data to a medically established test in the same intended specimen type, appropriate description of the level of evidence and/or the differences between next generation sequencing results and results from the medically established test (e.g., as described in professional guidelines).
(iv) A listing of all somatic mutations that are intended to be detected by the device and that are reported in the test results under the following two categories or equivalent designations, as appropriate: “cancer mutations panel with evidence of clinical significance” or “cancer mutations panel with potential clinical significance.”
(v) For mutations reported under the category of “cancer mutations panel with potential clinical significance,” a limiting statement that states “For the mutations listed in [cancer mutations panel with potential clinical significance or equivalent designation], the clinical significance has not been demonstrated [with adequate clinical evidence (e.g., by professional guidelines) in accordance with paragraph (b)(1)(v) of this section] or with this test.”
(vi) For mutations under the category of “cancer mutations panel with evidence of clinical significance,” or equivalent designation, link(s) for physicians to access internal or external information concerning decision rules or conclusions about the level of evidence for clinical significance that is associated with the marker in accordance with paragraph (b)(1)(v) of this section.