K Number
DEN190035
Device Name
Helix Laboratory Platform
Manufacturer
Helix OpCo, LLC
Date Cleared
2020-12-23

(509 days)

Product Code
QNC
Regulation Number
866.6000
AI/ML | SaMD | IVD (In Vitro Diagnostic) | Therapeutic | Diagnostic | PCCP Authorized
Intended Use
The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples collected with Oragene® Dx OGD-610. The Helix Laboratory Platform is only intended for use with other devices that are germline assays authorized by FDA for use with this device. The device is performed at the Helix laboratory in San Diego, CA.
Device Description
The Helix Laboratory Platform (HLP) is a high throughput DNA sequencing platform for targeted sequencing of an individual's whole exome. It is intended for use with a genetic test application. Genetic test applications may be third party partner ("Partner") genetic test applications or a Helix genetic test application such as the Helix Genetic Health Risk App (HRA; K192073). The DNA sequence generated by this device is intended as input to clinical germline DNA assays intended for use with this device that have FDA marketing authorization. A brief overview of the commercialized workflow is shown in Figure 1 (refer to the section titled "Test Principle" for more specific information regarding the commercialized workflow). HLP consists of a HiSeq sequencing instrument, cBot system, library preparation reagents, sequencing reagents, and data analysis software. The Helix Laboratory Platform also interacts with the Helix Laboratory Automation Systems and Content Mapping Systems, which serve as repositories for the data and do not perform data analysis. The test detects single nucleotide variants (SNVs) and insertions and deletions (indels) up to 20 base pairs (bp) and is limited to making high-confidence variant calls that meet prespecified quality metrics (i.e., the analytical range) within the reportable range. Sequencing is performed at the Helix clinical laboratory in San Diego, CA.
More Information

No
The document describes a DNA sequencing platform and its performance studies. While it mentions data analysis software and image processing, there is no mention of AI or ML being used in the analysis or any other part of the workflow. The focus is on standard sequencing and variant calling techniques.

No.
This device is an in vitro diagnostic device used for exome sequencing and detection of genetic variants, not for treating or preventing diseases.

Yes

The "Intended Use / Indications for Use" section explicitly states, "The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA..." The term "in vitro diagnostic device" directly indicates its diagnostic purpose.

No

The device description explicitly states that the Helix Laboratory Platform (HLP) consists of hardware components including a HiSeq sequencing instrument, cBot system, library preparation reagents, and sequencing reagents, in addition to data analysis software.

Based on the provided text, the device is indeed an IVD (In Vitro Diagnostic).

Here's why:

  • Explicit Statement in Intended Use: The very first sentence of the "Intended Use / Indications for Use" section clearly states: "The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing..."
  • Intended Use aligns with IVD definition: The device is intended for "detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples". This involves testing biological specimens (saliva/DNA) outside of the body to provide information about a person's health or condition, which is the core of an IVD.
  • Use in a Clinical Laboratory: The device is performed at the "Helix laboratory in San Diego, CA", which is a clinical laboratory setting.
  • Intended for use with FDA-authorized germline assays: The device is intended to be used with other devices that are "germline assays authorized by FDA for use with this device". This indicates its role in a clinical diagnostic workflow.
  • Performance Studies: The document details extensive performance studies (Accuracy, Precision, Reproducibility, Analytical Specificity, Interfering Substances, Specimen Stability) which are typical for the validation of an IVD.
  • Key Metrics: The document lists key metrics like PPA, NPA, TPPV, PPV, and NPV, which are standard performance measures for diagnostic tests.

Therefore, the description clearly identifies the Helix Laboratory Platform as an In Vitro Diagnostic device.

N/A

Intended Use / Indications for Use

The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples collected with Oragene® Dx OGD-610. The Helix Laboratory Platform is only intended for use with other devices that are germline assays authorized by FDA for use with this device. The device is performed at the Helix laboratory in San Diego, CA.

Product codes

QNC

Mentions image processing

Yes

Mentions AI, DNN, or ML

Not Found.

Input Imaging Modality

Not Found.

Anatomical Site

Human genomic DNA extracted from saliva samples.

Indicated Patient Age Range

Not Found.

Intended User / Care Setting

Not Found.

Description of the training set, sample size, data source, and annotation protocol

Reference samples were analyzed with [redacted] of the bioinformatics pipeline representing different conditions relative to the quality metric criteria (see the quality metrics section above). The control condition retained all of the criteria above and the other [redacted] were each removed. Figure 7 shows the PPA, PPV, and NRC performance metrics for SNV, insertion, and deletion variant types in each of the conditions compared to control. The results in Figure 7 show that the removal of any one of the criteria results in reduced accuracy and that retaining all criteria (the control condition) results in the highest accuracy. The [redacted] has the greatest effect on PPA and NRC across all variant types, while the DP threshold has the greatest effect on PPV. (Refer to the section below for definitions of PPA, PPV, and NRC.)

Description of the test set, sample size, data source, and annotation protocol

A study was conducted to evaluate the accuracy of the HLP with DNA isolated from saliva specimens. Specimens were selected for the presence of clinically relevant variants, using a specimen selection protocol designed to minimize bias, as follows. Each of the samples was sequenced in the Helix lab between July 2017 and April 2018. A list of clinically relevant variants was identified from all variants in ClinVar and filtered for high confidence interpretation. The list was further refined to remove variants outside of the Helix reportable range (86 variants removed) and variants on the Helix blacklist (7 variants removed). The final ClinVar variant list contained 29,492 variants, which were screened against the Helix database to determine availability of each variant within the Helix specimen collection. A total of 6,427 unique variants were identified. The variants for the study were then randomly chosen from this list. Samples with inadequate saliva volume were removed from analysis. Two (2) samples failed to meet secondary sequencing metrics QC for evaluability based on callability and coverage metrics. Three (3) samples failed to meet secondary sequencing metrics QC based on the freemix metric. One (1) sample failed to yield sufficient DNA at QC after library prep (failed at Quant QC - Quant 2). A total of 1,002 clinical samples met secondary sequencing metrics QC for evaluability and were used in the final data analyses below.

A validated Sanger sequencing method was used to confirm the accuracy of 1,061 clinical variants in 1,002 clinical samples and 90 unique variants in 96 cell lines. The comparator method was assigned to sequence a region of ~125 bp in length, which included the variant of interest. The additional sequenced DNA flanking the variant of interest was used to assess the accuracy of the HLP for reference (wild-type) sequence. As a result of this additional sequencing, a total of 103,339 reference calls were extracted as truth for reference sequence at clinical variant sites and sites flanking the variants of interest. The variants are composed of single nucleotide variants, small insertions, or small deletions (less than or equal to 20 bp) in clinically relevant genes.

Additionally, 125 clinical variants and reference calls in 96 unique cell line samples were also evaluated. Cell lines were compared to the truth genotypes taken from published variant information. Some of the samples have more than one variant and some of the variants were tested multiple times in different samples.

The study was done according to protocol and results were required to meet quality metric reporting thresholds as described previously.

Summary of Performance Studies

  1. Precision Study - Reference Samples (Cell lines)

    • Study Type: Precision
    • Sample Size: 6 reference cell line samples, each with 72 replicates.
    • Key Results: Passed all acceptance criteria for mean PPA, NPA, and TPPV.
      • SNVs: PPA 99.91%, TPPV 99.93% (all > 99.5%)
      • Insertions: PPA 99.52%, TPPV 99.29% (all > 99.0%)
      • Deletions: PPA 99.59%, TPPV 99.18% (all > 99.0%)
      • PPA for indels > 6bp could be 65% are excluded from reporting.
  2. DNA Input Study

    • Study Type: Limit of Detection / DNA Input
    • Sample Size: 20 samples with known variants tested at 35ng, 50ng, 70ng, and 100ng DNA input, each in triplicate (total 240 samples initially, 219 evaluable).
    • Key Results: Met all acceptance criteria for mean PPA, NPA, and TPPV. Concordant test results for all samples with a DNA input range between 35ng to 70ng. Additional specimen is required when there is less than 50ng.
  3. Index Swapping - Barcoding

    • Study Type: Analytical Specificity
    • Sample Size: 48 saliva samples with known variants, run in triplicate (total 160 libraries, 157 evaluable).
    • Key Results: Mean NRC across the analytical range was 0.999 and for clinical variants it was 1.0.
  4. Index Swapping - Carry-over Study 1 & 2

    • Study Type: Analytical Specificity (Carry-over and Cross-Contamination)
    • Sample Size: Study 1: 2 plates of cell line samples (NA12877, NA12878, NA24143). Study 2: 2 cell line DNAs, 36 replicates per sample (total 72 samples).
    • Key Results: Carryover study 1: Barcode carryover was not observed between runs. Carryover study 2: Intra-run carryover did not impact performance, meeting acceptance criteria.
  5. Endogenous Substances Interference Study

    • Study Type: Analytical Specificity (Interfering Substances)
    • Sample Size: 60 donors, 180 no-treatment libraries, 120 treatment libraries (299 evaluable).
    • Key Results: All conditions passed acceptance criteria, suggesting that normal levels of albumin, amylase, hemoglobin, and IgA in saliva do not interfere with HLP performance.
  6. Exogenous Substances Interference Study

    • Study Type: Analytical Specificity (Interfering Substances)
    • Sample Size: 22 donors, 198 samples processed.
    • Key Results: Drink, chewing gum, and mouthwash groups passed all acceptance criteria. "Immediately after food" group failed to meet acceptance criteria for mean NPA and TPPV due to one sample (manufacturer recommends collecting samples at least 30 minutes after food consumption).
  7. Microbial Interference Study

    • Study Type: Analytical Specificity (Interfering Substances - Bacteria and Yeast)
    • Sample Size: 6 cell line DNA samples tested across 5 conditions (0%, 10%, 20%, 30%, 50% bacterial content), each in triplicate (total 90 samples, 81 evaluable). Also, 3 fresh saliva donors tested with baseline, bacteria spiked-in, and yeast spiked-in, in triplicate (total 27 samples).
    • Key Results: Met all acceptance criteria for mean PPA, NPA, and TPPV. HLP accuracy is not impacted by microbial interference at the levels tested.
  8. Smoking Interference Study

    • Study Type: Analytical Specificity (Interfering Substances)
    • Sample Size: 5 saliva donors, 3 samples each for 3 conditions (total 45 libraries).
    • Key Results: Met all acceptance criteria for mean PPA, NPA, and TPPV, suggesting that smoking does not interfere with performance regardless of time of collection relative to smoking.
  9. Specimen Transport Stability Study

    • Study Type: Specimen Stability (Temperature)
    • Sample Size: 5 saliva donors, 4 samples each (total 20 samples).
    • Key Results: All conditions passed acceptance criteria, suggesting performance is not negatively affected by transport at extreme summer (50 ± 5 °C) and winter (-20 ± 5 °C) temperatures.
  10. Freeze/Thaw Specimen Stability Study

    • Study Type: Specimen Stability (Freeze/Thaw)
    • Sample Size: 17 saliva samples.
    • Key Results: 100% concordance of known variants, indicating HLP performance is not negatively affected by repeat freeze/thaw under the conditions tested.
  11. Accuracy Study 1: Accuracy with Reference Cell Lines

    • Study Type: Accuracy
    • Sample Size: 6 well-characterized cell lines (NA12878, NA24385, NA24149, NA24143, NA24631, NA12877).
    • Key Results: All regions met predefined acceptance criteria with the exception of insertions with size [...] rate (>25%) and PPA [...] 6bp are to be independently validated.
  12. Accuracy Study 2: Clinical Specimens

    • Study Type: Accuracy (Sanger Sequencing Comparison)
    • Sample Size: 1,002 clinical samples and 96 unique cell line samples.
    • Key Results: High degree of accuracy for variants evaluated. For 104,398 possible variant calls in clinical samples, there were 2 incorrect calls (false negatives, heterozygous SNVs called as homozygous reference). For cell lines, 2 homozygous calls were incorrectly called heterozygous. The no-call rates ranged from 0.5% (ClinVar Subset, Priority) to 3.0% (Exome).
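
The library counts quoted in several of the study summaries above follow from multiplying the design factors (samples × conditions × replicates). The following is a minimal Python sketch of that arithmetic, using only figures stated above. The `STUDIES` table layout and the `check_designs` helper are illustrative names chosen here, not part of the decision summary, and only studies whose design factors are fully stated are included.

```python
from math import prod

# (study, design factors, total reported, evaluable reported or None)
# All numbers are taken from the study summaries above.
STUDIES = [
    ("DNA input", (20, 4, 3), 240, 219),                # samples x inputs x replicates
    ("Microbial (cell line DNA)", (6, 5, 3), 90, 81),   # samples x conditions x replicates
    ("Microbial (fresh saliva)", (3, 3, 3), 27, None),  # donors x conditions x replicates
    ("Smoking", (5, 3, 3), 45, None),                   # donors x samples x conditions
    ("Transport stability", (5, 4), 20, None),          # donors x samples
]


def check_designs(studies):
    """Return (name, expected_total, matches_reported, evaluable_fraction) per study."""
    rows = []
    for name, factors, reported, evaluable in studies:
        expected = prod(factors)
        fraction = evaluable / reported if evaluable is not None else None
        rows.append((name, expected, expected == reported, fraction))
    return rows


if __name__ == "__main__":
    for name, expected, ok, fraction in check_designs(STUDIES):
        note = "" if fraction is None else f", evaluable {fraction:.1%}"
        print(f"{name}: expected {expected}, consistent={ok}{note}")
```

The evaluable fraction is simply the reported evaluable count divided by the reported total; it is shown only where the summary gives both numbers.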

Key Metrics

  • Non-Reference Concordance (NRC): Agreement when ignoring reference calls. TP/(TP+FP+FN)
  • Positive Percent Agreement (PPA): Ability of the test to correctly identify variants that are present in a sample. TP/(TP+FN)
  • Negative Percent Agreement (NPA): Ability of the test to correctly identify wild-type bases (the probability that the test will not call a variant). TN/(TN+FP)
  • Technical Positive Predictive Value (TPPV): Defined as the probability of a variant call being a true positive. TP/(TP+FP)
  • Concordance assessment for precision: The degree of agreement between test samples defined as calculating the accuracy for a sample to the reference dataset or gold standard dataset or majority call rate of the same sample. (TP+TN)/(TP+TN+FP+FN)
  • Negative Predictive Value (NPV): Defined as the probability of a variant call being a true negative. TN/(TN +FN)
  • Positive Predictive Value (PPV): Defined as the probability of a variant call being a true positive. TP/(TP+FP)
  • Adjusted Negative Percent Agreement (aNPA): The adjusted Negative Percent Agreement (aNPA) is defined as the ability of the test to correctly identify wild-type bases (the probability that the test will not call a variant). $aNPA = (NPV*(1-p)) / (NPV*(1-p) + (1-PPV)*p)$ where p is the sampling prevalence, i.e. p = percentage of people positive for at least one variant in the set of tested variants. This value was calculated to adjust for ascertainment bias given clinical samples are previously ascertained using HLP.
  • Adjusted Positive Percent Agreement (aPPA): The adjusted Positive Percent Agreement (aPPA) is defined as the ability of the test to correctly identify polymorphic variants that are present in a sample. $aPPA = (PPV*p) / (PPV*p + (1-NPV)*(1-p))$ where p is the sampling prevalence, i.e. p = percentage of people positive for at least one variant in the set of tested variants. This value was calculated to adjust for ascertainment bias given clinical samples are previously ascertained using HLP.
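
The metric definitions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sponsor's implementation: the `Counts` class, the function names, and the example counts are all hypothetical. The adjusted metrics take PPV, NPV, and the sampling prevalence p exactly as written in the formulas above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Call counts against the comparator (hypothetical container)."""
    tp: int  # variant present, variant called
    fp: int  # reference base, variant called
    tn: int  # reference base, reference called
    fn: int  # variant present, reference called


def ppa(c): return c.tp / (c.tp + c.fn)          # Positive Percent Agreement
def npa(c): return c.tn / (c.tn + c.fp)          # Negative Percent Agreement
def tppv(c): return c.tp / (c.tp + c.fp)         # same formula as PPV above
def npv(c): return c.tn / (c.tn + c.fn)          # Negative Predictive Value
def nrc(c): return c.tp / (c.tp + c.fp + c.fn)   # Non-Reference Concordance


def concordance(c):
    """Overall agreement: (TP+TN)/(TP+TN+FP+FN)."""
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


def adjusted_ppa(ppv, npv_, p):
    """aPPA = PPV*p / (PPV*p + (1-NPV)*(1-p))."""
    return (ppv * p) / (ppv * p + (1 - npv_) * (1 - p))


def adjusted_npa(ppv, npv_, p):
    """aNPA = NPV*(1-p) / (NPV*(1-p) + (1-PPV)*p)."""
    return (npv_ * (1 - p)) / (npv_ * (1 - p) + (1 - ppv) * p)


# Hypothetical counts, for illustration only (not study data).
example = Counts(tp=990, fp=5, tn=99000, fn=10)
# Assumption for this sanity check: p is set to the fraction of positive calls.
p_example = (example.tp + example.fp) / (
    example.tp + example.fp + example.tn + example.fn
)

if __name__ == "__main__":
    print(f"PPA={ppa(example):.4f} NPA={npa(example):.6f} NRC={nrc(example):.4f}")
    print(f"aPPA={adjusted_ppa(tppv(example), npv(example), p_example):.4f}")
```

As a sanity check on the formulas, when p equals the fraction of positive calls in the same data set, aPPA reduces to the unadjusted PPA and aNPA to the unadjusted NPA. The adjustment only changes the result when p reflects a different prevalence from the one in the sampled specimens.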

Predicate Device(s)

K192920

Reference Device(s)

K192073

Predetermined Change Control Plan (PCCP) - All Relevant Information

Change Protocols: Change protocols specify the verification and validation activities that will be performed for anticipated bioinformatic software modifications to reevaluate performance claims or performance specifications that were reviewed and determined acceptable by FDA. The protocol included a list of specific changes, the specimens (type, number and variant representation), analytical methods, statistical analysis methods, and acceptance criteria, for determination that the modifications met the performance specifications of the HLP. Assessment of the risk and the impact on results was also provided and included the processes for communicating to developers of downstream clinical genetic tests, the impact of the bioinformatics software change on the whole exome sequencing constituent system genetic data output.

§ 866.6000 Whole exome sequencing constituent device.

(a) Identification. A whole exome sequencing constituent device is for germline whole exome sequencing of genomic deoxyribonucleic acid (DNA) isolated from human specimens. The DNA sequence generated by this device is intended as input for clinical germline DNA assays that have FDA marketing authorization and are intended for use with this device.
(b) Classification. Class II (special controls). The special controls for this device are:
(1) The intended use on the device's label and labeling required under § 809.10 of this chapter must include:
(i) The indicated variant types for which acceptable, as determined by FDA, validation data has been provided. Distinct variant types are considered as single nucleotide variant, insertion, deletion, tandem repeats, copy number variants, or gene rearrangements, and validated for specific sizes and lengths, as applicable.
(ii) The indicated specimen type(s) for which acceptable, as determined by FDA, validation data has been provided.
(2) The labeling required under § 809.10(b) of this chapter must include:
(i) The identification of, or the specifications for, the collection device or devices to be used for sample collection, as applicable.
(ii) A description of the reportable range, which is the region of the genome for which the assay is intended to provide results, as well as a description of the targeted regions of the genome that have enhanced coverage. This must include a description of any genomic regions that are excluded from the reportable region due to unacceptable risk of erroneous results, or for other reasons. A description of the clinically relevant genes excluded from the reportable range must also be included, if applicable.
(iii) A description of the design features and control elements, including the quality metrics and thresholds which are used for reporting the analytical range (the genomic DNA in the reportable range that passed the quality metrics in the run required for reporting to the user) that are incorporated into the testing procedure, that mitigate the risk of incorrect clinical results. The following metrics are considered applicable in the generation of high confidence data and the established thresholds for these metrics for reporting must be described and be determined to be acceptable by FDA: cluster density and percent of cluster pass quality filter, percent of bases meeting the minimum base quality score, average coverage of reads, percent of reads mapped on target, percent of reportable region with coverage meeting the minimum requirement, percent of unassigned read indices, percent of reads for non-human DNA, allele fraction, and strand bias. Any alternate metrics used must be described and an acceptable, as determined by FDA, rationale for applicability must be provided.
(iv) A representative sample of the device output report(s) provided to users, which must include any relevant limitations of the device, as determined applicable by FDA.
(3) Design verification and validation must include:
(i) A detailed description of the impact of any software, including software applications and hardware-based devices that incorporate software, on the device's function.
(ii) Acceptable data, as determined by FDA, demonstrating how the key quality metrics and quality metric thresholds in the list in paragraph (b)(2)(iii) of this section for reporting were established and optimized for accuracy using appropriate DNA standards with established reference genomic sequence. Data must include, as applicable, base quality score, allele fraction for heterozygosity and coverage, and other applicable metrics.
(iii) Data demonstrating acceptable, as determined by FDA, analytical device performance using patient specimens representing the full spectrum of expected variant types reported across the genome and in genomic regions that are difficult to sequence. The number of specimens tested must be sufficient to obtain estimates of device performance that are representative of the device performance that can be expected for the reportable region and clinically relevant subsets of the reportable region, as applicable. For each study, data must include a summary of the key quality metric data; the number and percentage of true positives (TP), false positives (FP), and false negatives (FN); number and percentage of no-calls; positive percent agreement (PPA); negative percent agreement (NPA); positive predictive value (PPV); technical positive percent value (TPPV); and non-reference concordance (NRC). These data must be provided per sample and stratified by variant type. The variant data must also be further stratified by size and zygosity (homozygous common allele, heterozygous, homozygous rare allele). Data demonstrating the accuracy assay based on guanine and cytosine (GC) content, pseudogenes, and proximity to short tandem repeats must also be presented. The data must be presented for the entire exome and also for clinically relevant subsets of the reportable region. For each study, the number of run failures and repeat/requeued specimens must be summarized.
(iv) Documentation of acceptance criteria that are applied to analytical and clinical validation studies, which must be justified based on the estimated risk of erroneous results on clinically significant genes and variants and must be clinically acceptable, as determined by FDA. The acceptance criteria must be pre-specified prior to clinical and analytical validation studies, and all validation testing results must be documented with respect to those acceptance criteria.
(v) Analytical validation must be demonstrated by conducting studies that provide:
(A) Data demonstrating acceptable, as determined by FDA, accuracy based on agreement with an acceptable, as determined by FDA, comparator method(s) that has been validated to have high accuracy and reproducibility. Accuracy of the test shall be evaluated with reference standards and clinical specimens for each indicated specimen type of a number determined acceptable by FDA, collected and processed in a manner consistent with the test's instructions for use.
(B) Data demonstrating acceptable, as determined by FDA, precision from a precision study using clinical samples to adequately evaluate intra-run, inter-run, and total variability across operator, instrument, lot, day, and site, as applicable. The samples must include the indicated range of DNA input. Precision, including repeatability and reproducibility, must be assessed by agreement between replicates, and also supported by sequencing quality metrics for targeted regions across the panel. Precision must be demonstrated per specimen and in aggregate. Precision data must be calculated and presented with and without no calls/invalid results.
(C) Data demonstrating acceptable, as determined by FDA, accuracy in the presence of clinically relevant levels of potential interfering substances that are present in the specimen type and intended use population, including, for example, endogenous substances, exogenous substances, and microbes, as applicable.
(D) Data demonstrating the absence of sample cross contamination due to index swapping (misassignment).
(E) Data demonstrating that the pre-analytical steps such as DNA extraction are robust such that sources of variability in these steps and procedures do not diminish the accuracy and precision of the device.
(F) Data demonstrating that acceptable, as determined by FDA, device performance is maintained across the range of claimed DNA input concentrations for the assay.
(vi) Design verification and validation for software within the whole exome sequencing constituent device must include the following:
(A) Detailed description of the software, including specifications and requirements for the format of data input and output, such that users can determine if the device conforms to user needs and intended uses.
(B) Device design must include a detailed strategy to ensure cybersecurity risks that could lead to loss of genetic data security, are adequately addressed and mitigated (including device interface specifications and how safe reporting of the genetic test is maintained when software is updated). Verification and validation must include security testing to demonstrate effectiveness of the associated controls.
(C) Device design must ensure that a record of critical events, including a record of all genetic test orders using the whole exome sequencing constituent device, device malfunctions, and associated acknowledgments, is stored and accessible for an adequate period to allow for auditing of communications between the whole exome sequencing constituent device and downstream clinical genetic tests, and to facilitate the sharing of pertinent information with the responsible parties for those devices.
(vii) A protocol reviewed and determined acceptable by FDA, that specifies the verification and validation activities that will be performed for anticipated bioinformatic software modifications to reevaluate performance claims or performance specifications. This protocol must include a process for assessing whether a modification to the bioinformatics software could significantly affect the safety or effectiveness of the device. The protocol must include assessment metrics, acceptance criteria, and analytical methods for the performance testing of changes, as applicable. The protocol must also include the process for communicating to developers of downstream clinical genetic tests the impact of the bioinformatics software change on the whole exome sequencing constituent system genetic data output so they may implement appropriate corresponding actions.

EVALUATION OF AUTOMATIC CLASS III DESIGNATION FOR Helix Laboratory Platform DECISION SUMMARY

A. DEN Number:

DEN190035

B. Purpose for Submission:

De Novo request for evaluation of automatic class III designation for the Helix Laboratory Platform

C. Measurands:

Single nucleotide variants, insertions, and deletions in whole exome sequence in human genomic DNA

D. Type of Test:

Qualitative whole exome sequencing

E. Applicant:

Helix OpCo, LLC

F. Proprietary and Established Names:

Helix Laboratory Platform

G. Regulatory Information:

    1. Regulation section:
      21 CFR 866.6000
    2. Classification:
      Class II
    3. Product code(s):
      QNC
    4. Panel:
      88 - Pathology

H. Indications for use:

    1. Indications for use:
      The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples collected with Oragene® Dx OGD-610. The Helix Laboratory Platform is only intended for use with other devices that are germline assays authorized by FDA for use with this device. The device is performed at the Helix laboratory in San Diego, CA.
    2. Special conditions for use statement(s):
      For prescription use
      For in vitro diagnostic use
    3. Special instrument requirements:
      Illumina HiSeq X Sequencer (qualified by Helix)
      Oragene® Dx OGD-610 (DNA Genotek, Inc; K192920)

I. Device Description:

The Helix Laboratory Platform (HLP) is a high throughput DNA sequencing platform for targeted sequencing of an individual's whole exome. It is intended for use with a genetic test application. Genetic test applications may be third party partner ("Partner") genetic test applications or a Helix genetic test application such as the Helix Genetic Health Risk App (HRA; K192073). The DNA sequence generated by this device is intended as input to clinical germline DNA assays intended for use with this device that have FDA marketing authorization. A brief overview of the commercialized workflow is shown in Figure 1 (refer to the section titled "Test Principle" for more specific information regarding the commercialized workflow). HLP consists of a HiSeq sequencing instrument, cBot system, library preparation reagents, sequencing reagents, and data analysis software. The Helix Laboratory Platform also interacts with the Helix Laboratory Automation Systems and Content Mapping Systems, which serve as repositories for the data and do not perform data analysis. The test detects single nucleotide variants (SNVs) and insertions and deletions (indels) up to 20 base pairs (bp) and is limited to making high-confidence variant calls that meet prespecified quality metrics (i.e., the analytical range) within the reportable range. Sequencing is performed at the Helix clinical laboratory in San Diego, CA.

Image /page/2/Figure/0 description: This image shows the steps of a process. The first step is product selection, where the customer is presented with a set of products to choose from on the Helix website. The second step is the order and checkout process, where the customer completes the ordering and checkout process on Helix.com, receives an order confirmation email, and Helix ships a saliva collection kit to the customer. The third step is account creation, where the customer creates an account with Helix and, if selecting a partner product, also creates an account with the partner. The fourth step is kit registration and sample collection, where Helix provides step-by-step instructions for the customer to register the kit online and collect a saliva sample, the customer mails their sample to the Helix lab, and the customer receives sample sequencing status updates via email. The fifth step is sequencing and data upload, where Helix sequences the sample and, if a partner product, VCF data is uploaded to the partner account. The sixth step is report generation and delivery, where the owner of the genetic test interprets variant data and generates a report, the customer receives notification that results are available, and the customer logs in to view the report.

Figure 1. Workflow Relationship of Helix and "Partner" Genetic Tests Overview

The HLP instruments, reagents and software are qualified by Helix and are comprised of the following:

1. Specimen Collection and DNA Preparation

Saliva is collected in the Oragene® Dx OGD-610 collection device. At least 1 mL of saliva must be collected. DNA is extracted using Maxwell HT Saliva DNA Prep. The extracted DNA concentration must be equal to or greater than 3.5 ng/µL. A total of 50 to 70 ng of DNA input is used for library preparation. DNA may be stored at -20 °C for up to two weeks.
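The input requirements above can be expressed as a small acceptance check. The sketch below is illustrative only: the function name `input_volume_ul` and the default target of 60 ng are assumptions, not part of the Helix procedure; only the 3.5 ng/µL minimum and the 50 to 70 ng input window come from the text.

```python
# Limits stated in the text above.
MIN_CONCENTRATION_NG_PER_UL = 3.5
MIN_INPUT_NG = 50.0
MAX_INPUT_NG = 70.0


def input_volume_ul(concentration_ng_per_ul, target_input_ng=60.0):
    """Return the extract volume (µL) that delivers the target DNA mass.

    target_input_ng is a hypothetical default chosen inside the stated
    50 to 70 ng window; it is not a value given in the source.
    """
    if concentration_ng_per_ul < MIN_CONCENTRATION_NG_PER_UL:
        raise ValueError("DNA concentration is below 3.5 ng/µL")
    if not MIN_INPUT_NG <= target_input_ng <= MAX_INPUT_NG:
        raise ValueError("target input must be between 50 and 70 ng")
    return target_input_ng / concentration_ng_per_ul
```

For example, an extract at 6.0 ng/µL would need 10 µL to reach a 60 ng input, while an extract at 3.0 ng/µL would be rejected.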

2. Library Preparation

Library preparation consists of five key steps: fragmentation, adapter addition by polymerase chain reaction (PCR), hybridization capture, PCR amplification, and library normalization. In the first step, purified genomic DNA is broken into small pieces by enzymatic fragmentation, generating overlapping small DNA fragments tiling the entire human genome. Library fragments are size-selected to enable paired-end sequencing with between read pairs. Sequencing adapters are then added by PCR. DNA samples are barcoded using a dual-indexing strategy with barcode sequences that require multiple sequencing errors to become ambiguous. Quality Control (QC) is performed on the purified libraries by DNA quantitation. Purified DNA libraries are combined into a multiplex for enrichment by hybrid capture. This process includes enrichment using probes to capture the genomic regions of interest. The DNA concentration of the enriched libraries is determined and normalized to allow equal loading of libraries on the flow cell along with clustering reagents into a cBot instrument for cluster generation. The enriched libraries are then clustered on the sequencing flow cell and loaded into the DNA sequencing instrument. Sample concentration of equal to or greater than will pass quality control. The cBot System is an automated system that creates clonal clusters from single-molecule DNA templates, preparing them for sequencing by synthesis (SBS) on the HiSeq instrument. The cBot isothermally amplifies cDNA fragments that have been captured by complementary adapter oligonucleotides covalently bound to the surface of flow cells (this is called the Cluster Generation Process).

De novo Summary (DEN190035)

Different sets of reagents are used for this process. HiSeq DNA sequencing reagents, cluster generation reagents, and a flow cell are obtained from Illumina and qualified by Helix. The Helix Exome+® reagents include library preparation reagents, sample indexing reagents, and the capture probe reagents that target the whole exome.

3. Sequencing and Data Analysis

When cluster generation is complete, the flow cell is inserted with SBS reagents into a HiSeq instrument to perform paired-end DNA sequencing with (b)(4) read lengths. SBS technology uses four fluorescently labeled nucleotides to sequence the (b)(4) of clusters on the flow cell surface in parallel. During sequencing, images are captured from the flow cell by the HiSeq instrument and processed through primary analysis after each sequencing cycle. Primary analysis is performed by the RTA (Real Time Analysis) software without user intervention. This analysis consists of base calling of each cluster at each cycle. Reads are filtered to require a minimum rate of high-quality base calls. Flow cells that do not meet a minimum yield of filtered reads will undergo additional sequencing. The bcl2fastq software de-multiplexes the data and generates sample-specific FASTQ files.

Next-generation sequencing (NGS) secondary analysis is performed by the Helix bioinformatics pipeline to process base calls into genomic variant calls, starting with FASTQ files. The Helix bioinformatics pipeline is hosted in the cloud and analyzes the targeted human genome sequencing data. The pipeline consists of software analytical tools for short read alignment (Aligner), variant calling (Variant Caller), variant refinement (Variant Refinement), and quality control (Quality Control). The Helix bioinformatics pipeline is a suite of both proprietary and open-source software programs for high-throughput processing and analysis of sequence data. Individual major components of the Helix bioinformatics pipeline are described in Figure 2. Quality control metrics of varying depth and resolution are incorporated into several checkpoints along the data processing and analysis pipeline. The Helix bioinformatics inputs are as follows:

  • Metadata: De-identified sample identifier, self-reported sex and age, and the molecular barcode assigned to the library
  • Sequencing reads: The set of sequencing reads generated for the sample by the Helix Laboratory Platform
  • Annotation data: BED files specifying the reportable range, short tandem repeats (STRs), and known variant truth data for control sample concordance analysis
  • Reference genome: The "Helix Reference Genome" based on the Human Reference Genome Consortium Build 38 assembly with additional alternative contigs as described in detail in the Helix Reference Genome section in the Helix Laboratory Platform User Manual.

Figure 2. High-level modular and functional overview of the bioinformatics pipeline.

Image /page/4/Picture/1 description: The image is a solid gray color. There is a small text string at the top of the image that says '(b)(4)'. The image is a rectangle shape. The gray color fills the entire image.

The analysis steps include read alignment, variant calling, variant refinement, and quality control.

  • a) Alignment: Alignment software aligns short-read nucleotide sequences from FASTQ files to the Helix Reference Genome to generate aligned nucleotide sequences and associated mapping quality data. The nucleotide sequences in the FASTQ files are aligned to the human reference genome GRCh38 as (b)(4)
    (b)(4)

The human reference genome, GRCh38, is a human genome reference (b)(4) generated by the Genome Reference Consortium. Reference for the human oral microbiome is obtained from the Human Oral Microbiome Database (HOMD) and (b)(4) not present within HOMD, from the ATCC® MSA-1002™ panel genomes. Aligned reads are then processed to flag any reads that may be polymerase chain reaction (PCR) or optical (hardware limitation of the sequencer) duplicates. Duplicate marking is necessary to ensure that duplicates are not treated as independent evidence of a genomic sequence. The aligned sequence reads, with the associated base and mapping quality information, are stored in a standard file format, such as a binary alignment map (BAM) file or the compressed columnar file format CRAM. These formats are community-accepted formats for storing the aligned nucleotide sequences and their corresponding quality scores. They also serve as input files for the Variant Caller.
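The idea behind duplicate marking can be sketched as follows. This is not the Helix implementation, which is not described in the summary; it assumes a common convention in which read pairs sharing the same chromosome, start, strand, and mate start are treated as duplicates and the copy with the highest summed base quality is kept. The dictionary keys and the function name `mark_duplicates` are hypothetical.

```python
def mark_duplicates(reads):
    """Return one boolean per read: True if the read is flagged as a duplicate.

    Each read is a dict with hypothetical keys:
    chrom, start, strand, mate_start, quality_sum.
    """
    best_index = {}
    for i, read in enumerate(reads):
        # Reads with identical alignment signatures are assumed to be
        # copies of the same original DNA fragment.
        key = (read["chrom"], read["start"], read["strand"], read["mate_start"])
        current = best_index.get(key)
        if current is None or read["quality_sum"] > reads[current]["quality_sum"]:
            best_index[key] = i
    keep = set(best_index.values())
    # Every read that is not the best representative of its group is flagged.
    return [i not in keep for i in range(len(reads))]
```

Flagged reads stay in the alignment file but are not counted as independent evidence by downstream variant calling.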

  • b) Variant Calling: The variant calling software uses existing off-the-shelf (OTS) software. The Variant Caller is able to detect single nucleotide variants (SNVs) and small (≤ 20 nucleotides) insertion or deletion variants (indels). Genotype calls are stored in an industry-standard format such as variant call format (VCF), its binary format BCF, or genomic VCF (gVCF).

The VCF includes the following fields:

  • Chromosome
  • Start Coordinate
  • End Coordinate, if not equal to start + length of reference allele - 1 (1-based inclusive numbering, following the gVCF standard)
  • Reference Allele: Reference base
  • Alternate Allele: Non-reference base(s) called in the sample, if any
  • Genotype: This field indicates the alleles carried by the sample.
  • Additional information:
    • Genotype Likelihood: The likelihood of each of the possible genotypes given observed data
    • Genotype Quality: Difference in genotype likelihood between the most likely and second-most likely genotypes. Higher numbers represent higher-confidence genotype calls.
    • Read Depth: The number of reads at a position
    • Allele Depth: The number of reads observed for each allele
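As a rough illustration of the fields listed above, a call record might be modeled as below. The class name `CallRecord`, its attribute names, and the `end_override` attribute are hypothetical; the only rule taken from the text is that an explicit end coordinate is needed only when it differs from start + length of reference allele - 1 under 1-based inclusive numbering.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CallRecord:
    chromosome: str
    start: int                          # 1-based, inclusive
    reference_allele: str
    alternate_alleles: Tuple[str, ...]
    genotype: str                       # e.g. "0/1"
    genotype_quality: Optional[int] = None
    read_depth: Optional[int] = None
    allele_depths: Tuple[int, ...] = ()
    end_override: Optional[int] = None  # set only for multi-base reference blocks

    @property
    def end(self) -> int:
        """End coordinate, 1-based inclusive."""
        if self.end_override is not None:
            return self.end_override
        return self.start + len(self.reference_allele) - 1
```

A 4-base reference allele starting at position 100 therefore ends at position 103, while a reference block can span an arbitrary range by carrying an explicit end.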

Each genomic locus will have a genotype call with an associated genotype likelihood. The genotype likelihood is a standard probability-based score used to evaluate the likelihood of a genotype call at a given locus conditioned on observed data. The observed data includes the aligned sequence data with associated quality data. More specifically, at each locus, the algorithm will count the number of occurrences of each distinct nucleotide from the aligned nucleotide sequences. The number of aligned sequencing reads represents the "coverage" at a given locus. This coverage data is combined with associated mapping quality, base call quality, and other prior information to generate a genotype call with an associated genotype likelihood. The genotype likelihood can be used to determine if there is insufficient information for a confident call (resulting in a no-call). Two separate genotype call files are created by the variant caller. The first file provides reference calls and the second file provides finalized posterior genotype likelihoods.
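The relationship between genotype likelihoods, genotype quality, and no-calls can be illustrated with a minimal sketch. It assumes likelihoods are Phred-scaled in the style of the VCF PL field (0 for the most likely genotype, larger values for less likely ones). The function name `call_genotype` and the `min_gq` threshold of 20 are hypothetical; the summary does not state the threshold used by the device.

```python
def call_genotype(phred_likelihoods, min_gq=20):
    """Pick the most likely genotype and report its genotype quality.

    phred_likelihoods maps genotype strings (e.g. "0/1") to Phred-scaled
    likelihoods, where a lower value means a more likely genotype.
    Returns (genotype, gq); genotype is "./." (no-call) when gq < min_gq.
    """
    if len(phred_likelihoods) < 2:
        raise ValueError("need at least two candidate genotypes")
    ranked = sorted(phred_likelihoods.items(), key=lambda item: item[1])
    (best_genotype, best_pl), (_, second_pl) = ranked[0], ranked[1]
    # Genotype quality: gap between the best and second-best genotype.
    gq = second_pl - best_pl
    if gq < min_gq:
        return "./.", gq
    return best_genotype, gq
```

With likelihoods of 45, 0, and 60 for 0/0, 0/1, and 1/1, the call is heterozygous with a genotype quality of 45; when the two best genotypes differ by only 5, the locus becomes a no-call under the assumed threshold.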

  • c) Variant Refinement: This software analytic tool uses existing open-source libraries and internal code to perform additional processing on variant and reference calls produced during the Variant Calling step to generate the final observed variant call output. Variant refinement first merges the two variant data files generated by the Variant Caller for the sample into a single variant data file by merging records in the file that represent adjacent reference calls and records in the file that represent overlapping variant calls. This is followed by a sex inference using a statistical model based on the counts of fragments mapping to chromosomes X and Y compared to fragments mapping to the autosomes. This model also accounts for possible chromosome Y loss using the self-reported age range of the individual when it is available. Furthermore, it performs ploidy correction as the Variant Caller assumes that all contigs are diploid. The ploidy correction process will retain variant calls that are inconsistent with the expected ploidy with sufficiently high quality.
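The merging of adjacent reference calls described above can be sketched as an interval merge. This is a simplified illustration, not the Helix code: it assumes each reference call is a `(chromosome, start, end)` tuple with 1-based inclusive coordinates and ignores per-record quality values, which a real implementation would likely need to reconcile. The function name is hypothetical.

```python
def merge_adjacent_reference_blocks(blocks):
    """Merge reference-call blocks that touch or overlap on the same chromosome.

    blocks is an iterable of (chromosome, start, end) tuples using
    1-based inclusive coordinates.
    """
    merged = []
    for chrom, start, end in sorted(blocks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            # Extend the previous block when this one is adjacent or overlapping.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged
```

Blocks covering positions 1 to 10 and 11 to 20 on the same chromosome collapse into a single block covering 1 to 20, while a block separated by a gap, or on another chromosome, is left on its own.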

  • d) Variant Scrubbing: Metadata for individual files are collapsed. Additional curation is performed against empirically determined "blacklist" and "whitelist" genomic regions with intrinsically poor mapping quality, known polymorphism (e.g., HLA loci), and significant violation of Hardy-Weinberg equilibrium (i.e., non-neutral selective pressures). Haplotype (i.e., phase set) information is calculated.

4. Sequencing Quality Control methods:

Quality Control (QC) methods are incorporated into the Helix bioinformatics pipeline at every step. Quality control methods include sample-level QC for each region. In addition, the metrics are monitored and used for root cause analysis if samples have a QC failure. These include: (b)(4)

A set of QC metrics is applied to distinct regions of the reportable range. These are collectively referred to as "callability". Callability, defined as 1 - no-call rate, is calculated base by base and ensures that the no-call rate is acceptable. The sample must pass all thresholds listed in Table 1. For example, the Priority white list has a callability metric of (b)(4), which is a no-call rate of (b)(4). If any of the thresholds is not met, the sample(s) are routed back to the appropriate restarting point (library preparation, pooling for enrichment, or cluster generation) to be re-tested. Samples may be retested up to times.
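The callability calculation can be written out in a few lines. The sketch below assumes each base in a region is represented by a boolean (True when a genotype call was made); the function names and any threshold values passed in are hypothetical, because the actual thresholds in Table 1 are redacted.

```python
def callability(called_flags):
    """Callability of a region: 1 minus the per-base no-call rate."""
    flags = list(called_flags)
    if not flags:
        raise ValueError("region has no bases")
    no_call_rate = flags.count(False) / len(flags)
    return 1.0 - no_call_rate


def sample_passes(callability_by_region, thresholds):
    """A sample passes only if every region meets its callability threshold."""
    return all(
        callability_by_region[region] >= threshold
        for region, threshold in thresholds.items()
    )
```

A region with one no-call in four bases has a callability of 0.75. A sample that misses the threshold in any one region fails as a whole and would be routed back for re-testing.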

Table 1. Secondary Analysis Quality Control (QC) Metrics for Sample Evaluation

| Metric | Threshold |
|--------|-----------|
| (b)(4) | |

5. Content Mapping Systems

  • a) The Helix Interpretation Module (HIM) is a mapping tool that does not perform data analytics. Rather, it maps the end user's variant data against the reporting rules to identify and match the correct variant call data with the correct clinical result for that customer (also known as the user). In the case of the Helix Genetic Health Risk (K192073), (b)(4)

The Helix Laboratory Platform provides data to the partner in one of two ways: (1) variant call data (VCD) on the genetic variants for a predefined set of genetic coordinates for their genetic test indication. The VCD includes information that is typically contained in a Variant Call Format (VCF or BCF) file but is transferred securely to the partner via an API. The partner app will review the VCD, make the interpretation, and generate the report; or (2) the Helix Laboratory Platform provides the final genotype (b)(4)

The partner generates their final report using the content (b)(4) from the JSON file.

In both scenarios, Helix provides data only to the owner of the genetic test. As the sequencing laboratory, the Helix Laboratory Platform will provide data to partners as described under scenario 2.

Image /page/7/Figure/3 description: The image contains two gray rectangles with red borders. The top rectangle is labeled with the letter 'b)' on the left and '(b)(4)' in the center. The bottom rectangle is labeled with the letter 'c)' on the left and '(b)(4)' in the center. The rectangles are similar in size and shape.

6. Software (Laboratory Automation Sub Systems and User Interfaces)

The Helix Laboratory Automation sub-systems are used to automate physical sample data tracking. The Helix Laboratory Automation system's purpose is to track samples from the point at which they enter the clinical laboratory, through sample processing and sequencing workflow steps, and to results delivery (either to a customer for the Helix Genetic Health Risk App or to a Helix third-party partner). The various sub-systems assess the quality of the runs and provide information about sample queuing to laboratory personnel and partners. The sub-systems include the Accessioning sub-system, Laboratory Information Management System (LIMS), Pre- and Post-review sub-systems, Data Delivery Review sub-system, Helix Partner API sub-system, and Genomic Data Service sub-system.

7. Controls

A Process Control is defined as DNA reference material with known sequence that can be used to determine the success of a sequencing run. The NA12878 process controls used by HLP are cell lines from the Coriell Institute for Medical Research. These two cell lines are required in the Helix workflow and are introduced prior to library prep and carried through enrichment and sequencing. The Process Control workflow is as follows:

Image /page/8/Figure/1 description: The image shows three redacted sections with the text "(b)(4)" appearing in each. Below the redacted sections is the text "Table 2. Secondary analysis metrics for process control evaluation". The image appears to be a table with sensitive information that has been redacted.

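Table 2. Secondary analysis metrics for process control evaluation

| Metric | Threshold |
|--------|-----------|
| (b)(4) | |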
8. Definitions of the Genomic Regions Reported by the HLP:

  • a) Reportable Range: Helix has defined the Reportable Range as all coding exons ("coding region", also referred to as "white list") minus prespecified regions that are not reported (referred to as "black list"), which is defined as a list of regions in the HLP that are either not covered by the reagents or are excluded from reporting because they have empirically observed elevated false positive rates as defined by deviations from Hardy-Weinberg equilibrium. The reportable range excludes the following:

  • Regions outside the coding regions.
  • Regions that are difficult to map or have poor mapping quality, defined as regions where fewer than 20% of reads have a mapping quality greater than 20.
  • Regions with highly polymorphic gene clusters (portions of the HLA loci, immunoglobulin-like receptors on chr19, and Golgin family on chr15)
  • Sites with variants called that deviate significantly from Hardy-Weinberg equilibrium
  • Regions with > 25% of bases with quality score 99.5%, and PPA 99.52% and TPPV 99.29% for insertions and PPA 99.59% and TPPV 99.18% for deletions when considered in aggregate, however PPA for indels > 6bp could be

| Sample Name | Mean Callability Coding | Mean Callability Mendeliome | Mean Callability Priority | Mean Percent ≥20x Coding | Mean Percent ≥20x Mendeliome | Mean Percent ≥20x Priority | Average Coverage Coding | Average Coverage Mendeliome | Average Coverage Priority |
|---------------|---|---|---|---|---|---|---|---|---|
| NA12877 | | | | | (b)(4) | | | | |
| NA12878 | | | | | | | | | |
| NA24143 | | | | | | | | | |
| NA24149 | | | | | | | | | |
| NA24385 | | | | | | | | | |
| NA24631 | | | | | | | | | |
| SF-1926-57155 | | | | | | | | | |
| SF-2911-23958 | | | | | | | | | |
| SF-3016-50811 | | | | | | | | | |
| SF-3171-12777 | | | | | | | | | |
| SF-3724-39960 | | | | | (b)(4) | | | | |
| SF-3762-53211 | | | | | | | | | |
| SF-4908-78821 | | | | | | | | | |
| SF-5309-54434 | | | | | | | | | |
| SF-5727-76022 | | | | | | | | | |
| SF-6482-77281 | | | | | | | | | |
| SF-7909-95368 | | | | | | | | | |
| SF-8249-10361 | | | | | | | | | |
| SF-8270-35552 | | | | | | | | | |
| SF-8472-70262 | | | | | | | | | |
| SF-8478-76128 | | | | | | | | | |
| SF-8568-55453 | | | | | | | | | |
| SF-9559-80394 | | | | | | | | | |
| SF-9984-59520 | | | | | | | | | |

For each library prep lot, enrichment lot, clustering lot, and sequencing lot, and for select ClinVar variants, the mean PPA, mean TPPV, and mean NPA were calculated together with their lower bounds; the mean NRC for the analytical range and the mean NRC for clinical variants were also calculated. All pre-defined study acceptance criteria were met for SNVs, insertions, and deletions combined. Acceptance criteria were not established for data sorted into size ranges; however, as highlighted in the table and consistent with the precision studies, insertion calling (primarily) (b)(4) in some samples was less accurate (Tables 19-21).
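The agreement statistics reported in the tables that follow are derived from the true positive (TP), false positive (FP), and false negative (FN) counts. The sketch below uses the standard definitions, PPA = TP / (TP + FN) and TPPV = TP / (TP + FP); it is an illustration with hypothetical function names, and it does not reproduce how the study handled no-calls or computed lower confidence bounds.

```python
def ppa(tp, fn):
    """Positive percent agreement: fraction of expected variants that were detected."""
    if tp + fn == 0:
        raise ValueError("no expected variants")
    return tp / (tp + fn)


def tppv(tp, fp):
    """Technical positive predictive value: fraction of variant calls that were correct."""
    if tp + fp == 0:
        raise ValueError("no variant calls")
    return tp / (tp + fp)
```

With 90 true positives and 10 false negatives, PPA is 0.9; with 80 true positives and 20 false positives, TPPV is 0.8.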

Table 19. Between-Lot Reproducibility - Coding Region; Cell Lines and Clinical Specimens

| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV |
|---------------|-----------------|-------------------|--------------------------------------|---------------------------|--------|--------|----|-----|------|
| NA12877 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | (b)(4) | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| NA12878 | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| NA24143 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| NA24149 | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| NA24385 | | 1-2 | | | | | | | |
| | Insertion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| NA24631 | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-1926-57155 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-2911-23958 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-3016-50811 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Insertion | 6-20 | | | | | | | |
| | | All | | | | (b)(4) | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-3171-12777 | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-3724-39960 | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| SF-4908-78821 | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Insertion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-4908-78821 | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-4908-78821 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-5309-54434 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-5309-54434 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-5727-76022 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-5727-76022 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-5727-76022 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | All | | | | (b)(4) | | | |
| SF-6482-77281 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-7909-95368 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-8249-10361 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-8270-35552 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | (b)(4) | | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Deletion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Insertion | 6-20 | | | | | | | |
| SF-8472-70262 | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-8478-76128 | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-8568-55453 | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Deletion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-9559-80394 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | (b)(4) | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-9984-59520 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |

Table 20. Between-Lot Reproducibility - Mendeliome; Cell Lines and Clinical Specimens

| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV | |
|---------------|-----------------|-------------------|--------------------------------------|---------------------------|--------|--------|----|-----|--------|--|
| NA12877 | SNV (het) | All | | | | (b)(4) | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| NA12878 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | Insertion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | Deletion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| NA24143 | | 1-2 | | | | | | | | |
| | Insertion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| NA24149 | | 1-2 | | | | | | | | |
| | Insertion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| NA24385 | | 1-2 | | | | | | | | |
| | Insertion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| NA24631 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| SF-1926-57155 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| SF-2911-23958 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | All | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| SF-3016-50811 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | All | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-3171-12777 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-3724-39960 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-3762-53211 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | | | (b)(4) | | | | | | |
| | | | | | | | | | | |
| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV | |
| SF-4908-78821 | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-5309-54434 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-5727-76022 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-6482-77281 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | | | | | | | | | |
| b | | | | | | | | | | |
| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV | |
| | | 3-5 | | | | (b)(4) | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | Deletion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| SF-7909-95368 | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| SF-8249-10361 | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| SF-8270-35552 | Insertion | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | Deletion | 3-5 | | | | | | | | |
| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV | |
| SF-8472-70262 | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Insertion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Deletion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-8478-76128 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Insertion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Deletion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| SF-8568-55453 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Insertion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Deletion | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| SF-9559-80394 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | SNV (hemi) | All | | | | | | | | |
| | | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | Insertion | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | Deletion | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | (b)(4) | |
| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV | |
| | | 3-5 | | | (b)(4) | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| SF-9984-59520 | SNV (het) | All | | | | | | | | |
| | SNV (hom) | All | | | | | | | | |
| | Insertion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |
| | Deletion | 1-2 | | | | | | | | |
| | | 3-5 | | | | | | | | |
| | | 6-20 | | | | | | | | |
| | | All | | | | | | | | |

De novo Summary (DEN190035)

Table 21: Between-Lot Reproducibility - Priority; Cell Lines and Clinical Specimens

| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV |
|-------------|-----------------|-------------------|--------------------------------------|---------------------------|----|--------|----|-----|------|
| NA12877 | SNV (het) | All | | | | (b)(4) | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| NA12878 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Deletion | All | | | | | | | |
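| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| NA24143 | Insertion | 6-20 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| NA24149 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| NA24385 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| NA24631 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| SF-1926-57155 | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| SF-2911-23958 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| SF-3016-50811 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| SF-3171-12777 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| SF-3724-39960 | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Deletion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| SF-3762-53211 | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| SF-4908-78821 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-5309-54434 | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| SF-5727-76022 | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| SF-6482-77281 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | Insertion | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| SF-7909-95368 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-8249-10361 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-8270-35552 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| SF-8472-70262 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | Deletion | All | | | | | | | |
| | | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-8478-76128 | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| SF-8568-55453 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | 6-20 | | | | | | | |
| | | All | | | | | | | |
| SF-9559-80394 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | SNV (hemi) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| SF-9984-59520 | Deletion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |
| | SNV (het) | All | | | | | | | |
| | SNV (hom) | All | | | | | | | |
| | Insertion | 1-2 | | | | | | | |
| | | 3-5 | | | | | | | |
| | | All | | | | | | | |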

iv. Precision - GC content

The performance of variant calling as a function of the GC content of the sequence context in coding regions was examined across the following GC content ranges: 0-15%, >15-40%, >40-65%, >65-90%, and >90-100%, using two data sets from the Precision and Between-Lot Reproducibility studies. The mean PPA and mean TPPV for each sample are shown below (Tables 22 and 23). All data passed the specifications with the exception of indel calling in some samples; however, this was due to indel size and not to GC content. The suboptimal performance of insertion calling in the >15-40% GC content range was confounded with indel size (6-20 bp) (Figure 12). Nonetheless, indels in regions with GC content >65% are excluded from reporting, as these specimens were excluded from the data analyses. Acceptance criteria were the same as for all the studies, and data below the overall acceptance criteria are shown in bold below.
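The stratification described above can be illustrated with a minimal sketch. It assumes the conventional definitions PPA = TP / (TP + FN) and TPPV = TP / (TP + FP), which this section does not spell out, and it assumes that GC content is computed over a sequence window around each variant; the window definition, the treatment of no-calls (simply left out of the counts here), and the names `gc_percent`, `gc_bin`, and `VariantCounts` are hypothetical and are not taken from the submission.

```python
from dataclasses import dataclass
from typing import Optional

# Inclusive upper bounds and labels for the five GC-content ranges named in the text.
GC_BINS = [
    (15.0, "0-15"),
    (40.0, ">15-40"),
    (65.0, ">40-65"),
    (90.0, ">65-90"),
    (100.0, ">90-100"),
]


def gc_percent(sequence: str) -> float:
    """Return the percentage of G/C bases in a sequence window (assumed helper)."""
    if not sequence:
        raise ValueError("sequence window must not be empty")
    seq = sequence.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc_bin(percent: float) -> str:
    """Map a GC percentage to one of the five ranges."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"GC percentage out of range: {percent}")
    for upper, label in GC_BINS:
        if percent <= upper:
            return label
    raise AssertionError("unreachable")


@dataclass
class VariantCounts:
    """TP/FP/FN tallies for one sample, GC range, and variant type."""
    tp: int = 0
    fp: int = 0
    fn: int = 0

    def ppa(self) -> Optional[float]:
        # Positive percent agreement (assumed formula): TP / (TP + FN).
        denom = self.tp + self.fn
        return self.tp / denom if denom else None

    def tppv(self) -> Optional[float]:
        # Technical positive predictive value (assumed formula): TP / (TP + FP).
        denom = self.tp + self.fp
        return self.tp / denom if denom else None


if __name__ == "__main__":
    # Illustrative counts only; these are not values from the redacted tables.
    counts = VariantCounts(tp=98, fp=1, fn=2)
    print(gc_bin(gc_percent("ATGCGCGTTA")), counts.ppa(), counts.tppv())
```

Both metrics return `None` when a stratum holds no variants, so an empty range is reported as missing rather than as a perfect score.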

| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
|-------------------|------------------------|--------------|-----------------------------------------------------|---------------------------------------------------------------------------|-------------|--------------|
| NA12877 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| NA12878 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| NA24143 | >90-100 | SNV (hom) | | (b)(4) | | |
| | >90-100 | SNV (het) | | | | |
| | >90-100 | SNV (hom) | | | | |
| | 0-15 | DELETION | | | | |
| | 0-15 | INSERTION | | | | |
| | 0-15 | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | >15-40 | SNV (het) | | | | |
| | >15-40 | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | >40-65 | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | >65-90 | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | >90-100 | SNV (hom) | | | | |
| NA24149 | 0-15 | DELETION | | | | |
| | 0-15 | INSERTION | | | | |
| | 0-15 | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | >15-40 | SNV (het) | | | | |
| | >15-40 | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | >40-65 | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | >65-90 | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | >90-100 | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| NA24385 | 0-15 | DELETION | | (b)(4) | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| NA24631 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-1070-81401 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| | >15-40 | DELETION | | (b)(4) | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-1901-15675 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-2711-11665 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| | >40-65 | SNV (hom) | | (b)(4) | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | >65-90 | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| SF-2757-45105 | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| SF-2856-29005 | >15-40 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-3002-89172 | >65-90 | SNV (het) | | (b)(4) | | |
| | | SNV (hom) | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-3312-24549 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-3607-46036 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-5044-81848 | >40-65 | SNV (hom) | | (b)(4) | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-5221-11824 | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-5730-56578 | | SNV (hemi) | | (b)(4) | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-5744-44790 | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-7430-13589 | >65-90 | SNV (hemi) | | (b)(4) | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-7804-33749 | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-8038-20902 | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | (b)(4) | | |
| | | SNV (hom) | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| SF-8085-16718 | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | | INSERTION | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Percentage of Variants in Reference Data with No-Call | Mean PPA | Mean TPPV |
| SF-8763-29117 | >15-40 | SNV (hemi) | | (b)(4) | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |

Table 22: Variant calling Performance Based on GC Content in the Precision studies

Table 23: Variant calling Performance Based on GC Content in the Between-Lot Reproducibility Study

| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
|-------------------|------------------------|-----------------|--------------------------------------------------|-----------------------------------------|-------------|--------------|
| NA12877 | 0-15 | INSERTION | | (b)(4) | | |
| | 0-15 | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | 0-15 | DELETION | | | | |
| NA12877 | >15-40 | INSERTION | | | | |
| | >15-40 | SNV (hemi) | | | | |
| | >15-40 | SNV (het) | | | | |
| | >15-40 | SNV (hom) | | | | |
| NA12877 | >40-65 | DELETION | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| NA12878 | >65-90 | INSERTION | | (b)(4) | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| NA24143 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| NA24149 | >65-90 | SNV (het) | | (b)(4) | | |
| | >90-100 | SNV (hom) | | | | |
| | 0-15 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| NA24385 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| NA24631 | >90-100 | SNV (hom) | | (b)(4) | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-1926-57155 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| SF-2911-23958 | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| | >15-40 | DELETION | | (b)(4) | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | >65-90 | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | 0-15 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| SF-3016-50811 | >15-40 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| SF-3171-12777 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| SF-3724-39960 | >15-40 | SNV (hemi) | | (b)(4) | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| SF-3762-53211 | >15-40 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| | | SNV (hom) | | (b)(4) | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-4908-78821 | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | INSERTION | | | | |
| SF-5309-54434 | 0-15 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| SF-5727-76022 | >40-65 | INSERTION | | (b)(4) | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-6482-77281 | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (hom) | | | | |
| | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >15-40 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| SF-7909-95368 | >40-65 | DELETION | | (b)(4) | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | INSERTION | | | | |
| | 0-15 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| SF-8249-10361 | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | >40-65 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| SF-8270-35552 | >65-90 | SNV (hom) | | (b)(4) | | |
| | | SNV (hemi) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | >15-40 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (het) | | | | |
| | >65-90 | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| SF-8472-70262 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >40-65 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | 0-15 | INSERTION | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| SF-8478-76128 | >15-40 | SNV (het) | | (b)(4) | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >90-100 | SNV (het) | | | | |
| SF-8568-55453 | 0-15 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | >65-90 | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| | >90-100 | SNV (het) | | | | |
| SF-9559-80394 | 0-15 | INSERTION | | | | |
| | | SNV (hemi) | | | | |
| Sample Name | GC content range | Variant Type | Number of Variants in Reference Data | Mean Number of False Negatives | Mean PPA | Mean TPPV |
| | >15-40 | SNV (het) | | (b)(4) | | |
| | | SNV (hom) | | | | |
| | | DELETION | | | | |
| | | INSERTION | | | | |
| | >40-65 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >65-90 | SNV (hemi) | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | | SNV (hemi) | | | | |
| SF-9984-59520 | 0-15 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | | SNV (hom) | | | | |
| | >15-40 | DELETION | | | | |
| | | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >40-65 | SNV (hom) | | | | |
| | | DELETION | | | | |
| | >65-90 | INSERTION | | | | |
| | | SNV (het) | | | | |
| | >90-100 | SNV (hom) | | | | |
| | | SNV (het) | | | | |

Figure 12. Performance based on the GC content and indel size in the Precision and Between-Lot Reproducibility Studies

(b)(4)
  • b. Linearity/assay Reportable Range:
    Not applicable.

  • c. Detection Limit

    • i. Limit of Detection

Not applicable. The HLP has a requirement for a minimum of 50ng of DNA.

  • ii. DNA Input
    To assess the impact of different levels of DNA input on the HLP, 20 samples with known variants (single nucleotide variants, small insertions or small deletions) in medically related genes were tested across 4 input conditions spanning 10-fold below and 2-fold above the input amount of 50ng. In total, 20 samples were tested at 35ng, 50ng, 70ng and 100ng DNA input, each in triplicate, for a total of 240 samples (20 unique samples x 4 conditions x 3 replicates/condition = 240 samples). Of these 240 samples, 21 failed to meet secondary sequencing metrics QC for evaluability, leaving 219 samples for use in the final data analyses. Replicates of the 50ng and 70ng input samples (the optimal range of DNA input) were used to determine the majority call at each position in the reportable range in order to generate a reference sequence for each sample. Sensitivity (PPA), specificity (NPA) and precision (TPPV) were calculated for each sample and the mean of these values for all samples per condition was determined (Table 24).
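As an illustration of how per-sample agreement metrics and their per-condition means might be computed, the sketch below assumes each replicate has already been reduced to counts of true positives, false positives, false negatives and true negatives relative to its majority-call reference. The `Counts` type, the function names and the formulas (the conventional definitions PPA = TP/(TP+FN), NPA = TN/(TN+FP), TPPV = TP/(TP+FP)) are assumptions for illustration; the definitions in Table 6 of this summary govern the actual calculation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Counts:
    """Hypothetical per-replicate comparison counts against the reference."""
    tp: int  # variant calls that match the reference
    fp: int  # variant calls not present in the reference
    fn: int  # reference variants that were not called
    tn: int  # reference (non-variant) positions called as reference


def ppa(c: Counts) -> float:
    # Positive percent agreement (sensitivity)
    return c.tp / (c.tp + c.fn)


def npa(c: Counts) -> float:
    # Negative percent agreement (specificity)
    return c.tn / (c.tn + c.fp)


def tppv(c: Counts) -> float:
    # Technical positive predictive value (precision)
    return c.tp / (c.tp + c.fp)


def condition_means(replicates: list[Counts]) -> dict[str, float]:
    """Average each metric over all evaluable replicates of one condition."""
    return {
        "mean_ppa": mean(ppa(c) for c in replicates),
        "mean_npa": mean(npa(c) for c in replicates),
        "mean_tppv": mean(tppv(c) for c in replicates),
    }
```

Each metric is computed per replicate and then averaged, which matches the "mean of these values for all samples per condition" wording rather than pooling counts across replicates.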

This study met all acceptance criteria for mean PPA, NPA and TPPV. The study yielded concordant test results for all samples with a DNA input range of 35 to 70 ng. Additional specimen is required when there is less than 50ng of DNA.

| Group Name | Number of Replicates | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|------------|----------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| 100 ng | 55 | 0.999797 | 0.999704 | 1.000000 | 0.999999 | 0.999721 | 0.999615 |
| 35 ng | 54 | 0.999756 | 0.999658 | 1.000000 | 0.999999 | 0.999712 | 0.999608 |

Table 24: Comparison to Majority Call PPA (SNV and Indels)

d. Traceability

The Helix Reference Genome is based on the Human Reference Genome Consortium Build 38 assembly with additional alternative contigs, including decoy sequences.

e. Analytical Specificity

i. Index Swapping - Barcoding

Sample index primers are used in the HLP to assign a unique barcode combination to each sample DNA during library preparation, allowing multiple samples to be pooled downstream for enrichment and sequencing. The HLP uses 20 unique index primers which are combined into 96 unique combinations (12x8) for sample assignment. Index swapping (also referred to as index hopping or index misassignment) can occur between samples and cause the reporting of incorrect DNA sequence data between patients. The percentage of index swapping occurring with the Helix platform and its impact on sequence accuracy was investigated.
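The 12x8 combination scheme can be sketched as a Cartesian product of two primer sets. The primer labels below (`COL_01`, `ROW_A`, and so on) are hypothetical placeholders, since the summary does not name the primers; the point is only that 12 primers of one kind paired with 8 of another give 96 distinct pairs, and that samples sharing a row or column of the plate share one index of the pair.

```python
from itertools import product

# Hypothetical labels: the summary does not give the real primer names.
COLUMN_PRIMERS = [f"COL_{n:02d}" for n in range(1, 13)]  # 12 primers
ROW_PRIMERS = [f"ROW_{letter}" for letter in "ABCDEFGH"]  # 8 primers


def index_pairs(column_primers, row_primers):
    """Return every (column primer, row primer) combination."""
    return list(product(column_primers, row_primers))


def shares_one_index(pair_a, pair_b):
    """True when two distinct pairs overlap in exactly one index,
    the situation in which a single swapped index could misassign reads."""
    same_col = pair_a[0] == pair_b[0]
    same_row = pair_a[1] == pair_b[1]
    return same_col != same_row


PAIRS = index_pairs(COLUMN_PRIMERS, ROW_PRIMERS)
```

`shares_one_index` mirrors the study layout in which overlapping indices were tested across a row and down a column.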

The study evaluated agreement between replicates of DNA isolated from 48 saliva samples with known variants (40 deletions, 9 insertions and 81 SNVs) and across the analytical range as a measure of whether index swapping affected accuracy. In total, 48 samples were run in triplicate, each down three columns of a library plate, testing overlapping indices across a row. Eight (8) of those samples were tested another two times, varying the position of the sample by row to test overlapping indices down a column. A total of 160 libraries (40x3 plus 8x5) were generated. Performance was evaluated by a replicate pairwise comparison. Due to sample failure, one index comparison could not be made. A total of 163 pairwise comparisons met acceptance criteria for NRC. Only data passing the quality metrics (callability) and other contamination evaluation metrics that are applied prior to reporting were included. Of 160 total samples, 157 were used in the analyses; 2 samples failed to meet callability and coverage metrics and a third sample was flagged during a software-based error check. The mean NRC across the analytical range was 0.999 and for clinical variants it was 1.0; one SNV was a no call.

ii. Index Swapping - Carry-Over and Cross-Contamination

The Helix Laboratory Platform follows a high-throughput workflow that processes increasing numbers of multiplexed samples in each subsequent step of the HLP workflow. As enriched library pools undergo flow cell clustering and sequencing on the cBot2 and HiSeq instruments, respectively, barcoded DNAs may be carried over when subsequent samples are run on the same instrument. The HLP uses software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals) and checks whether the reads are contaminated as a mixture of two or more samples, in order to estimate possible contamination between samples and exclude samples containing a mixture of DNA from multiple individuals. The quality metric for this evaluation is that they must be less than (b)(4)

  • (1) Carryover study 1: This study evaluated the potential for inter-run carryover DNA introduced at the clustering and sequencing steps of the HLP workflow.
    A total of four cell line DNAs with known reference sequences were used to create two plates of samples for library prep and enrichment. Plate 1 consisted of 36 replicates each of NA12877 and NA12878 arranged in a checkerboard pattern, with alternating High DNA input (100 ng) into library prep or Low DNA input into library prep in adjacent wells. Plate 2 contained 48 replicates of NA24143 at standard production input of 70ng in rows C-F, and 24 replicates of NA24695 in rows G-H. Sample NA24695 is a female sample and was included on Plate 2 as filler to allow a full pool of 72 samples contributing to the sequencing pool from this plate. NA24695 was not included in any analyses for this protocol.

Each plate of samples produced one sequencing pool. The sequencing pool of samples from Plate 1 was clustered on a single flow cell and sequenced on a single HiSeq. The sequencing pool from Plate 2 was clustered on a single flow cell using the same cBot as for Plate 1, immediately after the Plate 1 clustering event was complete. The Plate 2 flow cell was sequenced immediately after the Plate 1 flow cell sequencing was complete, using the same HiSeq as the Plate 1 flow cell.

There were 48 index pairs used on both Plates 1 and 2. If inter-run carryover occurred, it is possible that the reads from Plate 1 could be assigned to a sample on Plate 2 that has the same index pair. The samples used in this study (Plate 1 = NA12878, NA12877; Plate 2 = NA24143) have known reference sequences. Plate 2 samples (NA24143) were evaluated for non-reference concordance (NRC) compared to the known reference data for variant sites that differ between the Plate 1 and Plate 2 samples. The 95% lower bound for all results shows that all samples had NRC >99.5% (data not shown). Samples on Plate 1 use 24 index pairs that are not used on Plate 2. Barcode carryover between runs was determined by calculating the amount, if any, of the 24 index pairs from Plate 1 that may be present in Plate 2 data. Barcode carryover was not observed between runs.
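The barcode carryover calculation described above can be sketched as counting how many reads in the second run carry an index pair that was used only in the first run. This is a minimal illustration under stated assumptions: the function name, the representation of reads as `(index1, index2)` tuples and the example labels are hypothetical, and the summary does not describe the actual software or its threshold.

```python
from collections import Counter


def carryover_fraction(observed_pairs, plate1_only_pairs):
    """Fraction of reads in a later run whose index pair was used
    only in the preceding run (hypothetical helper, not the HLP software).

    observed_pairs: iterable of (index1, index2) tuples, one per read
    plate1_only_pairs: set of index pairs absent from the later run's design
    """
    counts = Counter(observed_pairs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    carried = sum(n for pair, n in counts.items() if pair in plate1_only_pairs)
    return carried / total
```

A result of `0.0` corresponds to the reported outcome that no barcode carryover was observed between runs.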

  • (2) Carryover study 2: This study evaluated the potential for intra-run carryover DNA and its impact on the performance of the Helix Laboratory Platform.
    Two cell line DNAs with known variants were used to create 36 replicates per sample, for a total of 72 samples tested (2 cell lines x 36 replicates per cell line = 72 libraries). DNAs were used to create a checkerboard pattern with DNA input amounts of either High DNA input (100ng) or Low DNA input (50ng) in adjacent wells, so that any spillover from high to low will be detected as contamination. Each sample was indexed with its own unique barcode. The analysis was conducted to detect whether DNA fragments from High DNA input are carried over to the Low DNA input, and to determine if carryover would affect performance of the Helix Laboratory Platform. Three analyses were performed.

(b)(4)

All 36 female samples met secondary sequencing metric QC for evaluability. Performance was evaluated by calculating the amount of chromosome Y carryover into female samples and by calculating contamination using the Freemix value. Intra-run carryover protocol samples processed through the HLP workflow met the acceptance criteria for this study, indicating that intra-run carryover does not impact the performance of the Helix Laboratory Platform.

iii. Interfering Substances

(1) Endogenous Substances

A study was conducted to evaluate the potential impact of four endogenous substances common to saliva specimens (alpha-amylase, hemoglobin, IgA and total protein (albumin)) as potential sources of interference. The performance of the HLP was evaluated on donor saliva samples after the addition of each of the four endogenous substances, as compared to no treatment samples from the same donors.

Sixty (60) donors submitted two saliva samples each for this study which were then combined and aliquoted into 3 separate tubes. Donors were separated into 2 groups; one group received amylase and hemoglobin and the other group received IgA and albumin. For each group, a donor aliquot was processed without the addition of a substance (no treatment) and the two additional aliquots each had different endogenous substances added (amylase and hemoglobin, or IgA and albumin). Each of the no treatment samples had DNA extracted and were tested in triplicate. The treatment samples had DNA extracted and were tested as one replicate.

In total, 180 no treatment libraries (60 donors x 3 replicates/extracted DNA) and 120 treatment (amylase, hemoglobin, IgA and albumin) libraries (30 donors/group x 2 treatments/group x 2 groups) were generated for enrichment and sequencing on the Helix Laboratory Platform. Of the 300 samples in the study, 299 met secondary sequencing metrics QC for evaluability and were used in the final data analyses for this study. The one sample that failed QC metrics was not replaced and was not used in data analysis, as per protocol.

A reference for each saliva sample was generated by majority call comparison over all control replicates of a sample in this study. The number of SNVs, Insertions, Deletions, Reference Bases and Bases with No Majority was calculated for each sample. The majority call for each donor was used as a reference for comparison to the corresponding treated samples (albumin, amylase, hemoglobin, IgA).
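A majority-call reference of this kind can be sketched as follows. The code assumes each replicate is a mapping from genomic position to a genotype call (with `None` or a missing key meaning no call) and that "majority" means a call seen in strictly more than half of the replicates; both the data layout and the strict-majority rule are assumptions for illustration, not details given in this summary.

```python
from collections import Counter


def majority_call(calls):
    """Return the call observed in more than half of the replicates,
    or None when there is no majority (counted as 'no majority')."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return None
    call, n = Counter(observed).most_common(1)[0]
    return call if n > len(calls) / 2 else None


def build_reference(replicates):
    """Build a per-position reference from a list of replicate call dicts.

    replicates: list of {position: call} dicts, one per control replicate.
    """
    positions = set().union(*(r.keys() for r in replicates))
    return {
        pos: majority_call([r.get(pos) for r in replicates])
        for pos in sorted(positions)
    }
```

With three control replicates, a position called `"A/G"` in two of them and `"A/A"` in the third would take `"A/G"` as its reference call, while three different calls would be tallied as a base with no majority.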

Each sample in the albumin, amylase, hemoglobin and IgA groups was compared to its reference (majority call of all no treatment replicates for each sample) and concordance was calculated. Mean values of NPA, PPA and TPPV were calculated for all samples combined per condition. The NPA is calculated using only SNVs; PPA and TPPV are calculated using SNVs and indels combined. The acceptance criteria are: NPA ≥ 99.99%; PPA ≥ 99.5%; TPPV ≥ 99.5%; all with a 95% confidence interval lower bound of 99.0%.
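The acceptance check can be expressed compactly. The sketch below assumes that every threshold is inclusive (≥) and that the "lower bound of 99.0%" requirement means the reported 95% lower bound must be at least 0.990; the constant and function names are hypothetical.

```python
# Thresholds as stated in the text, expressed as proportions.
MEAN_THRESHOLDS = {"NPA": 0.9999, "PPA": 0.995, "TPPV": 0.995}
LOWER_BOUND_THRESHOLD = 0.990


def metric_passes(metric, mean_value, lower_bound):
    """True when both the mean and its 95% lower bound meet the criteria."""
    return (
        mean_value >= MEAN_THRESHOLDS[metric]
        and lower_bound >= LOWER_BOUND_THRESHOLD
    )


def condition_passes(results):
    """results: {metric: (mean_value, lower_bound)} for one condition."""
    return all(
        metric_passes(metric, mean_value, lower_bound)
        for metric, (mean_value, lower_bound) in results.items()
    )
```

For example, a condition reporting PPA 0.9997 (lower bound 0.9996), NPA 1.0000 (0.9999) and TPPV 0.9996 (0.9995) would pass under these assumptions.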

All conditions passed the acceptance criteria, suggesting that normally occurring levels of albumin, amylase, hemoglobin and IgA in saliva do not interfere with HLP performance (Table 25).

| Interferent | Number of Replicates | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|-------------|----------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| Albumin | 30 | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |
| Amylase | 30 | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |
| Hemoglobin | 29 | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |
| IgA | 30 | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |

Table 25. Endogenous Interferents

(2) Exogenous Substances

To assess the impact of 4 exogenous substances (eating, drinking, chewing gum and mouthwash) as interfering substances on the Helix Laboratory Platform, the performance of the HLP was evaluated on saliva samples collected prior to (control), immediately after or 30 minutes after consumption of an exogenous substance. The samples collected prior to consumption were used to construct a reference by majority call across all replicates of each donor's before-consumption samples. The performance of the HLP was evaluated by comparing the majority call reference to the data for samples collected immediately after consumption (food, drink, chewing gum, mouthwash) and 30 minutes after consumption (food, drink, chewing gum, mouthwash). Table 26 describes the number of biosamples at the start of the study and the number of evaluable biosamples that passed QC metrics.

| Condition Tested | Number of Biosamples | Number of Evaluable Biosamples |
|-----------------------------------|----------------------|--------------------------------|
| 30 minutes after drink | | |
| 30 minutes after food | | |
| 30 minutes after gum | | |
| 30 minutes after mouthwash | | |
| Before Consumption - drink | | |
| Before Consumption - food | | |
| Before Consumption - gum | | |
| Before Consumption - mouthwash | | |
| Immediately after drink | | |
| Immediately after food | | |
| Immediately after gum | | |
| Immediately after mouthwash | | (b)(4) |

Table 26. Post-Sequencing Sample Evaluability for Exogenous Substances

Twenty (20) initial donors (5 per treatment group) submitted three samples each for this study, for a total of 60 saliva samples. An additional two donors were added for the food group because two samples failed to yield enough purified DNA from the saliva sample. In sum, there were 22 donors and 66 saliva samples collected for this study. In total, 198 samples (22 donors x 3 treatments/donor x 3 replicates/extracted DNA) were processed on the Helix Laboratory Platform. Performance was evaluated by comparing the post-consumption samples to the corresponding donor majority call reference for concordance.

Sensitivity (PPA), specificity (NPA) and precision (TPPV) were calculated for each replicate and the mean of these values for all samples per condition was compared to the acceptance criteria. Three groups, drink, chewing gum and mouthwash, passed all acceptance criteria at all timepoints. In the 'immediately after food' group (Condition 2), mean NPA and TPPV failed to meet acceptance criteria. This failure is attributed to one poorly performing sample which failed acceptance criteria for NPA and TPPV. The 30 minutes after food sample from this donor passed acceptance criteria, and the mean values for all samples 30 min after food passed all acceptance criteria. If this sample is removed from the group calculation, the mean NPA and TPPV pass acceptance criteria (Table 27).

These results indicate that saliva samples should be collected at least 30 minutes after consuming food, as recommended by the manufacturer (DNA Genotek), and as per the collection device instructions.

| Condition Tested | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|--------------------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| 30 minutes after drink | 0.9997 | 0.9996 | 1.0000 | 1.0000 | 0.9997 | 0.9996 |
| 30 minutes after food | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9994 |
| 30 minutes after gum | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |
| 30 minutes after mouthwash | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9996 | 0.9995 |
| Immediately after drink | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9997 | 0.9995 |
| Immediately after food | 0.9988 | 0.9986 | 0.9999 | 0.9999 | 0.9362 | 0.9355 |
| Immediately after gum | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9997 | 0.9995 |
| Immediately after mouthwash | 0.9997 | 0.9996 | 1.0000 | 0.9999 | 0.9997 | 0.9995 |

Table 27. Exogenous Interference

(3) Microbial Interference

To assess the impact of bacteria and yeast as interfering substances on the Helix Laboratory Platform, bacterial DNA from a commercial source (American Type Culture Collection) was added in various amounts into 6 cell line DNA samples. This bacterial sample is comprised of 20 fully sequenced cultures that were selected by the vendor to encompass a variety of characteristics including different Gram stain, GC content, genome size and spore formation capabilities. In total, 6 cell lines were tested across 5 conditions (0%, 10%, 20%, 30% and 50% bacterial content), each in triplicate, for a total of 90 samples. Performance was evaluated by comparing test samples to the known truth sample (GIAB or Platinum Genomes) for all evaluable samples (N=81). This study met all acceptance criteria for mean PPA, NPA and TPPV (Table 28 and Table 29).

| % microbes | Number of replicates | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|------------|----------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| 0% | 18 | 0.999691 | 0.999601 | 0.9999 | 0.9999 | 0.999475 | 0.999358 |
| 10% | 18 | 0.999671 | 0.999571 | 0.9999 | 0.9999 | 0.999498 | 0.999374 |
| 20% | 18 | 0.999667 | 0.999558 | 0.9999 | 0.9999 | 0.999478 | 0.999342 |
| 30% | 16 | 0.999685 | 0.999568 | 0.9999 | 0.9999 | 0.999498 | 0.999352 |
| 50% | 11 | 0.999622 | 0.999466 | 0.9999 | 0.9999 | 0.999531 | 0.999357 |

Table 28. Accuracy of Cell line Under Varying Concentrations of Microbial Interference

The impact of interference on clinically relevant variants was evaluated by a ClinVar subset analysis of clinical variants found within the Helix reportable range. All likely-pathogenic and pathogenic variants reported in ClinVar that are present in the Helix reportable range after filtering, plus additional clinical variants ((b)(4) variant positions total), were assessed for concordance with the known reference data of each cell line (Table 29). The results demonstrated that the HLP accuracy is not impacted by microbial interference.

| % microbes | Number of Replicates | Mean NRC Analytical Range | Mean NRC Clinical Variants |
|------------|----------------------|---------------------------|----------------------------|
| 0% | 18 | 0.999166 | 1.000000 |
| 10% | 18 | 0.999170 | 1.000000 |
| 20% | 18 | 0.999146 | 1.000000 |
| 30% | 16 | 0.999183 | 1.000000 |
| 50% | 11 | 0.999153 | 1.000000 |

Table 29. ClinVar subset analysis

In addition, fresh saliva from three (3) donors was used to test three conditions each: 1) baseline with no microbe interferent added, 2) bacteria spiked-in, and 3) yeast spiked-in. Samples were tested in triplicate for a total of 27 samples. The actual amount of microbial contamination in the fresh saliva samples used in this study is estimated. Thirty percent (30%) bacterial DNA was added to the saliva DNA to mimic high levels of contamination, resulting in the estimated final proportion of bacterial DNA to be approximately 40%. Ten percent (10%) C. albicans DNA was added to the saliva DNA to mimic high levels of yeast contamination.

Replicates of the no treatment samples (no spike-in) were used to determine the majority call at each position in the reportable range in order to generate a reference genome sequence for each donor. Performance was evaluated by comparing the treatment samples (bacteria spike-in and yeast spike-in) to the corresponding donor majority call reference for concordance. The NPA was calculated using only SNVs; PPA and TPPV were calculated using SNVs and indels combined. The acceptance criteria are: NPA ≥ 99.99%; PPA ≥ 99.5%; TPPV ≥ 99.5%; all with a 95% confidence interval lower bound of 99.0%. All conditions passed the acceptance criteria, suggesting that bacterial and yeast DNA at the levels tested do not interfere with performance (Table 30).

| Spike-in | Number of replicates | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|----------|----------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| 10% yeast DNA | 9 | 0.9997 | 0.9996 | 0.9999 | 0.9999 | 0.9996 | 0.9995 |
| 30% bacterial DNA | 9 | 0.9997 | 0.9996 | 0.9999 | 0.9999 | 0.9996 | 0.9995 |

Table 30. Accuracy of Clinical Samples with Microbial and Yeast Spike-ins

(4) Smoking

To assess the impact of smoking as an interfering substance on the Helix Laboratory Platform, saliva samples were evaluated 60 minutes prior to (Condition 1, before smoking), immediately after (Condition 2) and 30 minutes after (Condition 3) smoking. Five saliva donors submitted three samples each for this study, for a total of 15 saliva samples. In total, 45 libraries (3 conditions x 5 donors x 3 replicates/extracted DNA) were generated for enrichment and sequencing on the Helix Laboratory Platform. The datasets from the Condition 2 (immediately after smoking) and Condition 3 (30 minutes after smoking) treatment groups were compared to the Condition 1 (before smoking) dataset to assess the impact of smoking on performance. Replicates of the baseline samples, before smoking, were used to determine the majority call at each position in the reportable range in order to generate a reference for each donor. Performance was evaluated by comparing the treatment samples (immediately after smoking, and 30 minutes after smoking) to the corresponding donor majority call reference for concordance. Sensitivity (PPA), specificity (NPA) and precision (TPPV) were calculated for each sample and the mean of these values for all samples per condition was determined (Table 31). This study met all acceptance criteria for mean PPA, NPA and TPPV, suggesting that smoking immediately before or 30 minutes before saliva collection does not interfere with performance.

Table 31. Smoking

| Condition | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|---------------------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| 30 minutes after smoking | 0.9997 | 0.9996 | 1.00 | 0.9999 | 0.9997 | 0.9996 |
| Immediately after smoking | 0.9997 | 0.9996 | 1.00 | 0.9999 | 0.9997 | 0.9995 |

f. Stability

Studies were conducted to evaluate reagent stability and specimen stability.

i. Real Time Kit Stability

Helix conducted studies to confirm the stability of storage and shelf life conditions of 13 critical reagents provided by the vendor. Seven of the 13 (53%) critical reagents were tested up to less than one week before the expiration date. Two of the 13 (15.3%) were tested up to 3 months or less before the expiration date. The remaining 4 of the 13 (30.7%) critical reagents, which have a long expiration date range (2 years), were assessed between 12 and 18 months before the expiration date. Additionally, lots are qualified across the proposed timeline through assessment of a cohort of samples and controls. The study design is shown in Figure 13. All reagents are stored at -25°C to -15°C.

Figure 13. Reagent Stability Study Schema. (a) A cohort of samples and controls is received at the Helix laboratory. (b) Samples and controls are processed through the HLP where critical reagents are used as part of routine testing. (c) The quality indicator performance metrics for each sample and control are grouped by the lot number of the critical reagent used. (d) Reagents lots are received, kept in inventory, and move to point of use on a first-in, first-out basis. (e) The quality indicator data collection takes place between the time a specific critical reagent lot commences use to the last date the lot is used.

ii. Specimen Stability

Two studies were conducted to support specimen stability: one on temperature and storage during transport, and one on freeze/thaw.

(1) Specimen Transport Stability

A study was conducted to evaluate the stability of specimens that are potentially subject to extreme transport conditions when shipped to the Helix laboratory. Specimens were shipped using the Oragene Dx OGD-610 collection device (K192920). A total of 5 saliva donors were consented for the study. Each donor collected and submitted 4 saliva samples (n = 20). For each donor, 2 samples were stored at room temperature (20 +/- 5°C) and served as a control group, one sample was stored at high temperature (50 +/- 5°C) to simulate extreme summer transport conditions and one sample was stored at low temperature (-20 +/- 5°C) to simulate extreme winter transport conditions. Samples were stored under the testing conditions for a total of 7 days to simulate the worst-case scenario of a delivery delay during transport.

A reference for each saliva sample was generated by majority call comparison over all room temperature (RT) replicates of a sample in this study. The number of SNVs, insertions, deletions, reference bases and bases with no majority was calculated for each sample. Each replicate of a sample per testing condition (high and low temperatures) was compared to the majority call reference sequence generated from the corresponding room temperature sample. The NPA is calculated using only SNVs; PPA and TPPV are calculated using SNVs and indels combined. The acceptance criteria are: NPA ≥ 99.99%; PPA ≥ 99.5%; TPPV ≥ 99.5%; all with a 95% confidence interval lower bound of 99.0%. All conditions passed the acceptance criteria, suggesting that performance is not negatively affected by transport of saliva in the OGD-610 collection kit at the extreme summer and winter temperatures tested (Table 32).

| Group Name | Mean PPA | Mean PPA Lower Bound | Mean NPA | Mean NPA Lower Bound | Mean TPPV | Mean TPPV Lower Bound |
|------------|----------|----------------------|----------|----------------------|-----------|-----------------------|
| -20°C | 0.99974 | 0.99963 | 1.00000 | 0.99999 | 0.99969 | 0.99956 |
| 50°C | 0.99978 | 0.99968 | 1.00000 | 0.99999 | 0.99974 | 0.99963 |

Table 32. Comparison to Majority Call PPA (SNV and Indels)

(2) Freeze/Thaw Specimen Stability

A study was conducted to evaluate the impact of repeated freeze-thaws on performance for DNA extracted from saliva samples in the Helix Laboratory Platform. A total of 17 saliva samples with known variants, i.e., single nucleotide variants, small insertions or small deletions in medically related genes, were tested after repeated DNA freeze-thaw cycles.

The test conditions included:

  • a) Baseline: DNA was extracted from saliva and subjected to NGS library preparation and sequencing prior to the first freezing event. The remaining DNA was stored at -20°C for 2 weeks.
  • b) Condition 1: DNA was subjected to one (1) freeze-thaw cycle. The DNA was frozen for 2 weeks followed by thawing. A portion of the thawed DNA was subjected to NGS library preparation and sequencing. The remaining DNA was stored at -20°C for 2 weeks.
  • c) Condition 2: DNA from Condition 1 was subjected to an additional 3 freeze-thaw cycles (2 weeks/cycle). Upon the final thaw, the DNA was used for NGS library preparation and sequencing.

Performance was evaluated by pairwise comparison of known variants in baseline samples to samples exposed to freeze/thaw conditions. Results for this study show 100% concordance of the known variants, indicating that HLP performance is not negatively affected by repeat freeze/thaw under the conditions tested (data not shown).

3. Comparison Studies - Accuracy

A representative approach to accuracy was conducted to evaluate the accuracy of the HLP for the indicated variant types and challenging genomic contexts. Two studies were performed in support of the accuracy of the HLP: (1) an assessment of accuracy across the coding region and subsets using reference cell lines, and (2) accuracy of specific clinically relevant variants in DNA isolated from saliva specimens and cell lines.

a. Accuracy Study 1: Accuracy with Reference Cell Lines

The accuracy of the Helix Laboratory Platform was assessed through comparison of sequence data obtained with the HLP to the publicly available reference datasets for six (6) well-characterized cell lines. Five cell lines (NA12878, NA24385, NA24149, NA24143, and NA24631) were characterized by the Genome in a Bottle (GIAB) consortium, and one cell line (NA12877) was sequenced as part of the Platinum Genomes project (see Table 7).

Consistent with the HLP protocol, DNA input was (b)(4) and samples failing QC metrics at any step in the process were restarted or re-queued. Samples and process controls were required to pass the secondary analysis metrics for sample evaluation as listed in Table 1. The quality metrics and thresholds were evaluated as evidence of robustness in accordance with the requirements for each subset. The PPA, NPA, and TPPV, along with the 95% lower limit of the confidence interval, were calculated as described in Table 6. The pre-specified acceptance criteria for overall performance in these studies are shown in Table 33.

| Variant Type | PPA Threshold | PPA Lower Bound Threshold* | NPA** Threshold | NPA Lower Bound Threshold* | TPPV Threshold | TPPV Lower Bound Threshold* |
|--------------|---------------|----------------------------|-----------------|----------------------------|----------------|-----------------------------|
| SNV | 0.995 | 0.990 | 0.999 | 0.990 | 0.995 | 0.990 |
| Deletion | 0.990 | 0.980 | n/a | n/a | 0.990 | 0.980 |
| Insertion | 0.990 | 0.980 | n/a | n/a | 0.990 | 0.980 |

Table 33. Acceptance Criteria for the Accuracy Studies.

*95% Lower Bound Confidence Interval (CI) calculated using a binomial model with a 95% CI estimated with Wilson's method

** The NPA cannot be determined for insertions and deletions.
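
The lower-bound thresholds in Table 33 can be illustrated with a short calculation. The sketch below is not the applicant's implementation: it assumes a two-sided 95% Wilson score interval (z of about 1.96), and the function names and the example counts are hypothetical. Only the PPA thresholds are taken from Table 33.

```python
import math

# PPA point-estimate and lower-bound thresholds copied from Table 33.
PPA_CRITERIA = {
    "SNV": (0.995, 0.990),
    "Deletion": (0.990, 0.980),
    "Insertion": (0.990, 0.980),
}


def wilson_lower_bound(successes: int, trials: int, z: float = 1.959964) -> float:
    """Lower limit of the Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a two-sided 95% interval (an assumption here).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    return (centre - margin) / (1 + z2 / trials)


def meets_ppa_criteria(variant_type: str, tp: int, fn: int) -> bool:
    """Check a PPA estimate and its Wilson lower bound against Table 33."""
    threshold, lower_threshold = PPA_CRITERIA[variant_type]
    ppa = tp / (tp + fn)
    return ppa >= threshold and wilson_lower_bound(tp, tp + fn) >= lower_threshold


# Hypothetical counts, for illustration only.
print(meets_ppa_criteria("SNV", tp=9990, fn=10))
```

Both conditions must hold: a point estimate above the threshold is not sufficient if the sample is too small for the lower bound to clear its own threshold.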

Accuracy was determined for each indicated variant type and for challenging genomic contexts, including the range of sizes for insertions and deletions, zygosity, GC content, pseudogenes, and detection near short tandem repeats (i.e., STRs: homopolymer stretches). Results were summed over all 6 samples in the study, and performance was calculated for the coding regions and the two subsets (Mendeliome and Priority). These three regions have different levels of coverage. Table 34 presents the total expected number of variant calls, the number of no-calls, the percentage of variants with no-calls, the number of true positives (TP), the number of false positives (FP), the number of false negatives (FN), the positive percent agreement (PPA), and the technical positive predictive value (TPPV) for all samples combined for each region. PPA, NPA, and TPPV values were calculated as described in Table 6. The data demonstrate that all regions met the predefined acceptance criteria with the exception of insertions with size [...] rate (>25%) and PPA 15-40% GC, >40-65% GC, >65-90% GC.

Table 37. Accuracy for Deletions and Insertions based on Size for All 6 Reference Samples Combined

(b)(4)

Variant Detection Near Short Tandem Repeats (STRs): Stretches of homopolymer regions can lead to unreliable detection of nearby variants. To determine the HLP accuracy for detection of variants near short stretches of tandem repeats, detection of variants near STRs was compared to the detection of variants not near STRs. The results are shown in Tables 38a and 38b below. The results demonstrated no difference in the performance of variant detection in the absence of STRs or near short STRs. All results met the prespecified acceptance criteria with the exception of insertions, which was expected given the overall results (Table 38a). (Note: The performance of the HLP for accuracy and precision of reporting short tandem repeats was not evaluated, and the HLP is not indicated for detection of STRs.)

(b)(4)

Table 38a. Variant Detection Near Short Tandem Repeats.

Table 38b. Variant Detection Near Short Tandem Repeats Stratified by Indel Size.


Genotype Quality (GQ) score: GQ score is one of several criteria used to control for variant calling accuracy. (b)(4) The number of false positives as a function of GQ score was evaluated for variants from Accuracy Study 1. The data demonstrated an (b)(4). Table 39 demonstrates that the majority of variants have high GQ scores. (b)(4)

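Table 39. Genotype Quality Scores for SNVs, Insertions and Deletions

| Variant Type | Classification | GQ [20, 30) | GQ [30, 40) | GQ [40, 50) | GQ [50, 60) | GQ [60, 70) | GQ [70, 80) | GQ [80, 90) | GQ [90, 99] |
|--------------|----------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| SNV | TP | (b)(4) | | | | | | | |
| SNV | FP | | | | | | | | |
| Insertion | TP | | | | | | | | |
| Insertion | FP | | | | | | | | |
| Deletion | TP | | | | | | | | |
| Deletion | FP | | | | | | | | |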

b. Accuracy Study 2: Clinical Specimens

A study was conducted to evaluate the accuracy of the HLP with DNA isolated from saliva specimens. Specimens were selected for the presence of clinically relevant variants. To minimize bias, specimens were selected according to the following protocol. Each sample had been sequenced in the Helix lab between July 2017 and April 2018. Clinically relevant variants were identified from all variants in ClinVar and filtered for high-confidence interpretation. The list was further refined to remove variants outside the Helix reportable range (86 variants removed) and variants on the Helix blacklist (7 variants removed). The final ClinVar variant list contained 29,492 variants, which were screened against the Helix database to determine the availability of each variant within the Helix specimen collection. A total of 6,427 unique variants were identified, and the variants for the study were randomly chosen from this list. Samples with inadequate saliva volume were removed from analysis. Two (2) samples failed to meet secondary sequencing metrics QC for evaluability based on callability and coverage metrics. Three (3) samples failed to meet secondary sequencing metrics QC based on the freemix metric. One (1) sample failed to yield sufficient DNA at QC after library prep (failed at Quant QC - Quant 2). A total of 1,002 clinical samples met secondary sequencing metrics QC for evaluability and were used in the final data analyses below.

A validated Sanger sequencing method was used to confirm the accuracy of 1,061 clinical variants in 1,002 clinical samples and 90 unique variants in 96 cell lines. The comparator method was designed to sequence a region of ~125 bp in length, which included the variant of interest. The additional sequenced DNA flanking the variant of interest was used to assess the accuracy of the HLP for reference (wild-type) sequence. As a result of this additional sequencing, a total of 103,339 reference calls were extracted as truth for reference sequence at clinical variant sites and sites flanking the variants of interest. The variants comprise single nucleotide variants, small insertions, and small deletions (less than or equal to 20 bp) in clinically relevant genes.

Additionally, 125 clinical variants and reference calls in 96 unique cell line samples were also evaluated. Cell lines were compared to the truth genotypes taken from published variant information. Some of the samples have more than one variant and some of the variants were tested multiple times in different samples.

The study was done according to protocol and results were required to meet quality metric reporting thresholds as described previously.

Because the accuracy is contingent on the HLP results, positive predictive value (PPV) and negative predictive value (NPV) are calculated. Truth genotypes for the variants in the cell lines were taken from publicly available information. The acceptance criterion for PPA was established at 99.5% with a lower bound of 99.0% using a binomial model with a 95% CI estimated with Wilson's method. The acceptance criterion for NPA was 99%. Adjusted PPA (aPPA) and adjusted NPA (aNPA) were also calculated to adjust for potential bias in the clinical sample selection. Definitions and methods of calculation for aPPA and aNPA are shown in Table 6. The sampling prevalence used for the calculation of aPPA and aNPA was 0.266096.


Data were calculated for all variant and reference calls combined (All) and for variant and reference calls located within three GC bins. For a total of 104,398 possible variant calls, there were 2 incorrect calls (Table 40). Two heterozygous SNVs (at chr1:152313368 and at chr3:120641660) were called incorrectly as reference and classified as false negatives (FN). For the variant at chr1:152313368, Sanger sequencing determined the genotype to be Het Alt (A/G) and the HLP called Hom Ref (A/A). For the variant at chr3:120641660, Sanger sequencing determined the genotype to be Het Alt (C/T) and the HLP called Hom Ref (C/C). The data demonstrate a high degree of accuracy for the variants evaluated.

| Number of Unique Clinical Samples | Variant Type | Number of Expected Calls¹ | Number of No-calls² | Number of Correct Calls³ | Number of Incorrect Calls⁴ | Percent Agreement Including No-calls | Percent Agreement Excluding No-calls⁵ |
|-----------------------------------|--------------|---------------------------|---------------------|--------------------------|----------------------------|--------------------------------------|---------------------------------------|
| 729 | SNV (het) | (b)(4) | | 759 | 2 | 99.086% | 99.737% |
| 81 | SNV (hom) | | | 86 | 0 | 100.000% | 100.000% |
| 164 | Deletion (All) | | | 163 | 0 | 99.390% | 100.000% |
| 107 | Deletion (1-2bp) | | | 106 | 0 | 100.000% | 100.000% |
| 28 | Deletion (3-5bp) | | | 28 | 0 | 100.000% | 100.000% |
| 29 | Deletion (6-20bp) | | | 29 | 0 | 100.000% | 100.000% |
| 45 | Insertion (All) | | | 45 | 0 | 100.000% | 100.000% |
| 33 | Insertion (1-2bp) | | | 33 | 0 | 100.000% | 100.000% |
| 2 | Insertion (3-5bp) | | | 2 | 0 | 100.000% | 100.000% |
| 10 | Insertion (6-20bp) | | | 10 | 0 | 100.000% | 100.000% |
| 641 | Reference (wild-type) | | | 102,813 | 0 | 99.492% | 100.000% |
| 1,002 | All | | | 103,866 | 2 | 99.490% | 99.998% |

Table 40. Concordance of NGS and Sanger Results for each Variant Category

1 Number of expected calls confirmed by Sanger sequencing (variant sites and flanking regions).

2 Subset of expected calls which received a no-call.

3 Number of calls agreeing with Sanger sequencing.

4 Number of calls disagreeing with Sanger sequencing.

5 Percent Agreement = Number of Correct Calls / (Number of Expected Calls - Number of No-calls)
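
The percent agreement formula in footnote 5 can be checked against the published counts. The sketch below is illustrative only; the function name and its `include_no_calls` switch are hypothetical. Because the expected-call and no-call columns of Table 40 are redacted, the evaluable count (expected calls minus no-calls) is assumed here to equal correct plus incorrect calls.

```python
def percent_agreement(correct, expected, no_calls=0, include_no_calls=False):
    """Percent agreement between platform calls and the comparator method.

    By default no-calls are excluded from the denominator, as in footnote 5.
    """
    denominator = expected if include_no_calls else expected - no_calls
    if denominator <= 0:
        raise ValueError("no evaluable calls")
    return 100.0 * correct / denominator


# Evaluable calls are taken as correct + incorrect (an assumption, since the
# expected-call and no-call counts are redacted in Table 40).
snv_het = percent_agreement(correct=759, expected=759 + 2)
all_calls = percent_agreement(correct=103_866, expected=103_866 + 2)

print(f"SNV (het): {snv_het:.3f}%")    # Table 40 reports 99.737%
print(f"All:       {all_calls:.3f}%")  # Table 40 reports 99.998%
```

Setting `include_no_calls=True` keeps no-calls in the denominator, which is how the "including no-calls" column differs from the "excluding no-calls" column.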

Additional variants of interest present in cell lines were also evaluated. One homozygous SNV was called heterozygous by the HLP (Table 41).

Table 41. Accuracy Study 2: Results for Cell Line Samples Reported by Variant Type.

| Number of Unique Cell Line Samples | Variant Type | Number of Expected Calls¹ | Number of No-calls² | Number of Correct Calls³ | Number of Incorrect Calls⁴ | Percent Agreement⁵ |
|------------------------------------|--------------|---------------------------|---------------------|--------------------------|----------------------------|--------------------|
| 52 | SNV (het) | 61 | 0 | 61 | 0 | 100.000% |
| 18 | SNV (hom) | 18 | 0 | 17 | 1⁶ | 94.444% |
| 27 | Deletion | 29 | 0 | 29 | 0 | 100.000% |
| 5 | Insertion | 5 | 0 | 5 | 0 | 100.000% |
| 8 | Reference | 8 | 0 | 8 | 0 | 100.000% |
| 96 | All | 125 | 0 | 124 | 1⁶ | 99.200% |

1 Number of known calls based on published data for each cell line.

2 Subset of expected calls which received a no-call.

3 Number of calls agreeing with published data for each cell line.

4 Number of calls disagreeing with published data for each cell line.

5 Percent Agreement = Number of Correct Calls / (Number of Expected Calls - Number of No-calls)

6 Homozygous SNV was called as heterozygous SNV.

Overall accuracy of the Saliva and Cell lines is summarized in Table 42. The no call rate for all data sets is shown for each region in Table 43.

| Sample | Number of Samples | TP¹ | FP² | TN³ | FN⁴ | NPV | TPPV | aPPA | aNPA |
|--------|-------------------|-----|-----|-----|-----|-----|------|------|------|
| Clinical samples | 1,002 | 1,053 | 0 | 102,813 | 2 | 99.998% | 100.000% | 99.995% | 100.000% |
| Cell lines | 96 | 116 | 0 | 8 | 1 | 88.889% | 100.000% | 99.145% (not adjusted) | 100.000% (not adjusted) |

Table 42. Accuracy Study 2: Saliva (Clinical) and Cell Line Data.

1 Number of Sanger sequencing confirmed variants that were called correctly.

2 Number of Sanger sequencing confirmed reference calls that were incorrectly called as variants.

3 Number of Sanger sequencing confirmed reference calls that were called correctly.

4 Number of Sanger sequencing confirmed variants that were incorrectly called as reference or with an incorrect genotype.
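
The unadjusted values in Table 42 follow from the four counts in each row. The sketch below assumes the conventional definitions (PPA = TP/(TP+FN), NPA = TN/(TN+FP), TPPV = TP/(TP+FP), NPV = TN/(TN+FN)); the authoritative definitions are in Table 6, which is not reproduced here, and the prevalence-adjusted aPPA and aNPA are deliberately not computed. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int
    fp: int
    tn: int
    fn: int


def _pct(numerator: int, denominator: int) -> float:
    """Percentage, or NaN when the denominator is empty."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def ppa(c: Counts) -> float:
    return _pct(c.tp, c.tp + c.fn)


def npa(c: Counts) -> float:
    return _pct(c.tn, c.tn + c.fp)


def tppv(c: Counts) -> float:
    return _pct(c.tp, c.tp + c.fp)


def npv(c: Counts) -> float:
    return _pct(c.tn, c.tn + c.fn)


# Counts copied from Table 42.
clinical = Counts(tp=1053, fp=0, tn=102_813, fn=2)
cell_lines = Counts(tp=116, fp=0, tn=8, fn=1)

for name, c in (("Clinical samples", clinical), ("Cell lines", cell_lines)):
    print(f"{name}: PPA={ppa(c):.3f}% NPA={npa(c):.3f}% "
          f"TPPV={tppv(c):.3f}% NPV={npv(c):.3f}%")
```

The low NPV for cell lines (88.889%) reflects the small number of reference calls in that set (8 true negatives, 1 false negative) rather than a large number of errors.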

No-Call Rates

The mean no-call rate of the HLP was averaged across all samples in the accuracy study and is shown for the reportable region and subsets (Table 43). Additionally, the no-call rate for variants in ClinVar was assessed (accuracy for ClinVar subset not shown). The no-call rate for the coding region considering the flanking regions in Accuracy Study 2 was (b)(4) (data not shown).

Table 43. Callability in the Accuracy datasets

| Reportable Window | Mean Callability | Mean No Call Rate (1 - Callability) |
|-----------------------------------|------------------|-------------------------------------|
| Exome (all coding region targets) | 97.0% | 3.0% |
| Mendeliome | 98.5% | 1.5% |
| Priority | 99.5% | 0.5% |
| ClinVar Subset | 99.5% | 0.5% |

Accuracy- Assessment of Metrics


To support the representative approach, data from reference samples with known sequence were used to demonstrate that the metrics used by the HLP establish high-confidence calls (see Section L). The metrics in the accuracy study were assessed to provide evidence supporting the conclusion that the overall sequencing data are accurate. Mean coverage, callability, and the percentage of variants with (b)(4) are summarized in Table 44 for all samples in the accuracy study (96 cell lines and 1,002 clinical samples).

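| Metric | Cell Lines | Clinical Samples |
|--------|------------|------------------|
| Mean Coding Callability | (b)(4) | |
| Range | | |
| Mean Mendeliome Callability | | |
| Range | | |
| Mean Priority Callability | | |
| Range | | |
| Mean Coding Percent >=20x | | |
| Range | | |
| Mean Mendeliome Percent >=20x | | |
| Range | | |
| Mean Priority Percent >=20x | | |
| Range | | |
| Mean Coding Average Coverage | | |
| Range | | |
| Mean Mendeliome Average Coverage | | |
| Range | | |
| Mean Priority Average Coverage | | |
| Range | | |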

Table 44. Quality Metrics for Cell Lines and Clinical Samples in the Accuracy Study

4. Clinical Performance

Not applicable. The HLP is a targeted whole exome sequencing test system intended to be used with genetic assays as part of a test system where such genetic assays are validated for specific clinical use. The HLP is being authorized in conjunction with the Helix Genetic Health Risk App test (K192073).

M. Instrument Name

Illumina HiSeq X instrument (qualified by Helix): The HiSeq sequencing instrument is a high throughput DNA analyzer that measures fluorescence signals of labeled nucleotides through the use of instrument specific reagents in conjunction with Helix proprietary reagents.

Illumina cBot system (qualified by Helix): The cBot System (Illumina Inc.) is an automated system that creates clonal clusters from single molecule DNA templates, preparing them for sequencing by synthesis (SBS) on the HiSeq instrument. The cBot isothermally amplifies cDNA fragments that have been captured by complementary adapter oligonucleotides covalently bound to the surface of Illumina flow cells (called the Cluster Generation Process).

N. System Descriptions

1. Modes of Operation:

Does the applicant's device contain the ability to transmit data to a computer, web server, or mobile device?

Yes X or No __

Does the applicant's device transmit data to a computer, web server, or mobile device using wireless transmission?

Yes X or No __

2. Software:

FDA has reviewed applicant's Hazard Analysis and software development processes for this line of product types:

Yes X or No __

    1. Instrument software:
    • HiSeq Control Software: The HiSeq Control Software (Illumina Inc.) controls the flow cell stage, fluidics system, and flow cell temperatures. It also captures images of clusters, generating image analysis, base calling, and base call quality automatically.
    • Real Time Analysis Software: Primary analysis is performed by the Real Time Analysis (RTA) software (Illumina Inc.) and consists of base calling of each cluster at each cycle. In addition to base calling, RTA assigns an analytical quality score (Q-score) to each base call. Calculations of Q-scores are based on the ratio of the signal intensity of the highest base in a given cluster during a given cycle to the signal intensity of the three other bases. The quality score Q is calculated as -10 log10 P, where P is the probability that the base call is incorrect.
    • Conversion Software: The bcl2fastq conversion software (Illumina Inc.) is used to process BCL (base call log) files generated by the HiSeq instrument into de-multiplexed FASTQ files. De-multiplexing is the process of using the index sequences to assign clusters to the sample from which they originated. FASTQ is a standard text-based file format that stores the nucleotide sequences and base quality scores for each read sequenced from a sample.
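
The relationship Q = -10 log10 P and the way quality scores are stored in a FASTQ record can be sketched as follows. This is a generic illustration, not code from RTA or bcl2fastq. The function names are hypothetical, and the ASCII offset of 33 (the common Phred+33 convention) is an assumption about the FASTQ files described above.

```python
import math


def phred_from_error_prob(p: float) -> float:
    """Q = -10 * log10(P), where P is the probability the base call is wrong."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -10.0 * math.log10(p)


def error_prob_from_phred(q: float) -> float:
    """Inverse of the relationship above: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def decode_fastq_quality(quality_line: str, offset: int = 33) -> list[int]:
    """Convert a FASTQ quality string to per-base Q-scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


print(phred_from_error_prob(0.001))   # an error rate of 1 in 1,000
print(error_prob_from_phred(20))      # a Q20 base call
print(decode_fastq_quality("I5!"))    # three bases of differing quality
```

Under this scale, each increase of 10 in Q corresponds to a tenfold decrease in the estimated probability of an incorrect base call.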

Yes X or No __

4. Specimen Identification:

Specimens are accessioned.

5. Specimen Sampling and Handling:

Automated


6. Calibration:

Not applicable

7. Quality Control:

External and internal controls

O. Other Supportive Instrument Performance Characteristics Data Not Covered In The "Performance Characteristics" Section above:

Analysis of Reads Coverage

Coverage varies according to genomic regions: primary clinical regions (b)(4), secondary clinical genes (b)(4), and other coding regions (b)(4). The mean reads coverage for each study was analyzed as a function of callability to determine that callability was appropriately assessed for each reportable range. Although overall mean reads coverage varied among studies, (b)(4) in every study, meeting reads coverage requirements for each reportable range (Figure 14a).

Average Mean Callability

Due to variation of mean read coverage in different region subsets of the reportable range, callability QC thresholds for Priority, Mendeliome, and Coding regions were set at (b)(4), respectively. Average mean callability in each sample from all analytical studies was assessed based on region subsets. As expected, all samples in all studies met these thresholds (Figure 14b).

Figure 14. Average mean reads coverage and callability in each analytical study by sample type. Data in each figure were aggregated based on sample type (cell lines vs. clinical samples) and reportable range: Priority (green), Mendeliome (blue), and Coding regions (orange). (a) Average mean reads coverage in each analytical study based on genomic regions. (b) Average mean callability in each analytical study based on genomic regions.


(b)(4)

Coverage threshold vs Callability per Reportable Ranges

As a minimum coverage of (b)(4) was used to define callability of region subsets of the reportable range, increasing the minimum coverage requirement would impact the callability of region subsets. Using different reads coverage thresholds, (b)(4), the callability of each region subset (Priority, Mendeliome, and Coding) was assessed. Increasing the read coverage requirement had the most impact on coding regions, as more than (b)(4) of regions are not callable (no call) in most samples when the threshold was set at (b)(4) (Figure 15).

Figure 15. Coverage threshold vs Callability per Region Subset of Reportable Range. Data were aggregated from the Accuracy-1, Precision, and Between-Lot Reproducibility studies. Y-axis: percent of regions that met coverage requirements.
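
The trade-off between coverage threshold and callability described above can be sketched with a simplified model. The code below assumes callability is the fraction of positions whose read depth meets a minimum coverage threshold; the HLP's actual callability definition and thresholds are redacted and may include additional criteria. The function names and the depth values are hypothetical.

```python
def callability(depths, min_depth):
    """Fraction of positions with read depth >= min_depth (simplified model)."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def callability_curve(depths, thresholds):
    """Callability at each candidate minimum-coverage threshold."""
    return {t: callability(depths, t) for t in thresholds}


# Hypothetical per-position read depths for one region subset.
example_depths = [12, 18, 25, 31, 40, 44, 52, 60, 75, 90]

for threshold, value in callability_curve(example_depths, [10, 20, 30, 50]).items():
    no_call_rate = 1.0 - value  # no-call rate = 1 - callability, as in Table 43
    print(f">= {threshold}x: callability {value:.0%}, no-call rate {no_call_rate:.0%}")
```

Because raising the threshold can only remove positions from the callable set, callability is non-increasing in the threshold, which matches the direction of the effect reported for the coding regions.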


Non-coding region: The HLP reagents include coverage of both coding and non-coding regions. The non-coding regions are not reported. For information purposes only, the non-coding regions covered by the HLP exhibited a (b)(4) no-call rate.

(b)(4)

Change Protocols: Change protocols specify the verification and validation activities that will be performed for anticipated bioinformatic software modifications to reevaluate performance claims or performance specifications that were reviewed and determined acceptable by FDA. The protocol included a list of specific changes, the specimens (type, number and variant representation), analytical methods, statistical analysis methods, and acceptance criteria, for determination that the modifications met the performance specifications of the HLP. Assessment of the risk and the impact on results was also provided and included the processes for communicating to developers of downstream clinical genetic tests, the impact of the bioinformatics software change on the whole exome sequencing constituent system genetic data output.

P. Proposed Labeling:

The labeling supports the decision to grant the De Novo request for this device.

Q. Identified Risks to Health and Identified Mitigations

| Identified Risks to Health | Mitigation Measures |
|----------------------------|---------------------|
| Inaccurate test results and failure to provide results | Certain design verification and validation, including certain analytical studies and clinical studies. Certain labeling information, including certain performance information and device limitations. |
| Incorrect application or interpretation of results | Certain design verification and validation, including certain clinical studies. Certain labeling information, including certain performance information and device limitations. |
| User error and improper use of the device | Certain design verification and validation, including certain analytical studies and clinical studies. Certain labeling information, including certain performance information and device limitations. |

R. Benefit/Risk Analysis:

Patient Perspectives

This submission did not include specific information on patient perspectives for this device.

Summary of the Assessment of Benefit For the Proposed Indications for Use

The Helix Laboratory Platform is a device that sequences the whole exome in an individual's genomic DNA obtained from saliva. The whole exome data includes SNVs, insertions, and deletions for genetic tests validated for use on this device. The benefit of this device is large in magnitude due to the potential to enable individuals to obtain genetic information through over the counter or prescription legally marketed genetic tests that use the HLP. Such results may positively influence lifestyle decisions or optimize patient management when used in conjunction with medical consultation.

Summary of the Assessment of Risk For the Proposed Indications for Use

Probable risks associated with the use of this device are mainly due to 1) false positives, false negatives, or failure to provide a result; 2) the incorrect application or interpretation of the data by genetic test developers that use the HLP; and 3) user error and improper use of the device.

The risks of false positives are that genetic tests that use the HLP will report false results to patients and/or clinicians, which will then potentially lead to inappropriate lifestyle changes, psychological harm, and unnecessary therapies and/or medications, with the attendant risk of side effects.

The risks of false negatives are that the patient may fail to initiate appropriate medical consultation, lifestyle changes, therapeutic options, or targeted surveillance. Also, the risks of a failure to provide results are that the patient may experience a delay in getting results, and may fail to initiate appropriate medical consultation, as necessary.


There is also the risk of incorrect application or interpretation of data by genetic test developers that use the HLP, as well as user error or improper use of the device; these will result in clinically inaccurate results (false positives or false negatives) presented to the patient and/or clinicians, with the attendant risks described above.

Summary of the Assessment of Benefit-Risk For the Proposed Indications for Use

The risks associated with providing patients incorrect genetic results may lead to delays in patient management or incorrect patient management, or failure to make lifestyle choices that may benefit an individual's well-being. For this reason, general controls are insufficient to mitigate the risks of the device. Comprehensive analytical validation studies, including accuracy studies and precision studies with reference standards and clinical specimens, were performed to support the probable benefit of this device and mitigate such risks. Additional risk mitigation factors taken into consideration were the delineation of the reportable range of the device and the analytical range of the device, which allowed for implementation of quality metric requirements for reporting, in order to mitigate the risk of incorrect results. Device design verification and validation, including appropriate descriptions of test limitations in the device labeling, support the appropriate and accurate use of the device. Finally, due to the comprehensive nature of the information, applicable protocols to support continued changes to the device were submitted and reviewed. Such change protocols enable the device to remain current with the practice of medicine and the optimization of this technology.

The probable clinical benefits outweigh the probable risks when appropriate mitigation of the risks is provided for through implementation of and adherence to the special controls. The combination of the general controls and established special controls supports the assertion that the probable benefits outweigh the probable risks.

S. Conclusion

The De Novo request is granted, and the device is classified under the following and subject to the special controls identified in the letter granting the De Novo request:

Product Code: QNC
Device Type: Whole exome sequencing constituent system
Class: II (special controls)
Regulation: 21 CFR 866.6000