Who is the manufacturer of Helix Laboratory Platform?

Helix Opco, LLC submitted the De Novo request for Helix Laboratory Platform (DEN190035) as a Direct submission.

What is the FDA product code for Helix Laboratory Platform?

QNC. It is classified under 21 CFR 866.6000 as a Class 2 device in the Medical Genetics panel.

What is the regulatory pathway for Helix Laboratory Platform?

De Novo classification (submission type: Direct).

Helix Laboratory Platform

DEN190035 · Helix Opco, LLC · QNC · Dec 23, 2020 · Medical Genetics

Device Facts

Record ID	DEN190035
Device Name	Helix Laboratory Platform
Applicant	Helix Opco, LLC
Product Code	QNC · Medical Genetics
Decision Date	Dec 23, 2020
Decision	DENG
Submission Type	Direct
Regulation	21 CFR 866.6000
Device Class	Class 2

Intended Use

The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples collected with Oragene® Dx OGD-610. The Helix Laboratory Platform is only intended for use with other devices that are germline assays authorized by FDA for use with this device. The device is performed at the Helix laboratory in San Diego, CA.

Device Story

Helix Laboratory Platform performs whole exome sequencing (WES) on human genomic DNA extracted from saliva samples; utilizes Oragene® Dx OGD-610 collection device. System processes DNA to identify SNVs and small indels; output serves as input for downstream FDA-authorized germline assays. Performed at Helix laboratory facility; operated by laboratory personnel. Provides high-confidence genomic data; enables clinical decision-making when integrated with authorized diagnostic assays. Benefits patients by facilitating standardized, accurate germline variant detection for clinical use.

Clinical Evidence

No clinical data provided in the summary letter. Device validation relies on analytical studies, including accuracy, precision, and robustness testing using DNA standards and clinical specimens to demonstrate performance across variant types, zygosity, and genomic regions.

Technological Characteristics

Whole exome sequencing constituent device; utilizes NGS technology. Requires specific quality metrics: cluster density, base quality scores, coverage depth, allele fraction, and strand bias. Includes bioinformatics software for variant calling. Cybersecurity controls required for data integrity and audit trails. Operates within a centralized laboratory environment.

Indications for Use

Indicated for qualitative whole exome sequencing and detection of SNVs and small indels (up to 20 bp) in human genomic DNA from saliva (Oragene® Dx OGD-610). Intended for use with FDA-authorized germline assays. Prescription use only.

Regulatory Classification

Identification

A whole exome sequencing constituent device is for germline whole exome sequencing of genomic deoxyribonucleic acid (DNA) isolated from human specimens. The DNA sequence generated by this device is intended as input for clinical germline DNA assays that have FDA marketing authorization and are intended for use with this device.

Special Controls

*Classification.* Class II (special controls). The special controls for this device are:

(1) The intended use on the device's label and labeling required under § 809.10 of this chapter must include:

(i) The indicated variant types for which acceptable, as determined by FDA, validation data has been provided. Distinct variant types are considered as single nucleotide variant, insertion, deletion, tandem repeats, copy number variants, or gene rearrangements, and validated for specific sizes and lengths, as applicable.

(ii) The indicated specimen type(s) for which acceptable, as determined by FDA, validation data has been provided.

(2) The labeling required under § 809.10(b) of this chapter must include:

(i) The identification of, or the specifications for, the collection device or devices to be used for sample collection, as applicable.

(ii) A description of the reportable range, which is the region of the genome for which the assay is intended to provide results, as well as a description of the targeted regions of the genome that have enhanced coverage. This must include a description of any genomic regions that are excluded from the reportable region due to unacceptable risk of erroneous results, or for other reasons. A description of the clinically relevant genes excluded from the reportable range must also be included, if applicable.

(iii) A description of the design features and control elements, including the quality metrics and thresholds which are used for reporting the analytical range (the genomic DNA in the reportable range that passed the quality metrics in the run required for reporting to the user) that are incorporated into the testing procedure, that mitigate the risk of incorrect clinical results. The following metrics are considered applicable in the generation of high confidence data and the established thresholds for these metrics for reporting must be described and be determined to be acceptable by FDA: cluster density and percent of cluster pass quality filter, percent of bases meeting the minimum base quality score, average coverage of reads, percent of reads mapped on target, percent of reportable region with coverage meeting the minimum requirement, percent of unassigned read indices, percent of reads for non-human DNA, allele fraction, and strand bias. Any alternate metrics used must be described and an acceptable, as determined by FDA, rationale for applicability must be provided.

(iv) A representative sample of the device output report(s) provided to users, which must include any relevant limitations of the device, as determined applicable by FDA.

(3) Design verification and validation must include:

(i) A detailed description of the impact of any software, including software applications and hardware-based devices that incorporate software, on the device's function.

(ii) Acceptable data, as determined by FDA, demonstrating how the key quality metrics and quality metric thresholds in the list in paragraph (b)(2)(iii) of this section for reporting were established and optimized for accuracy using appropriate DNA standards with established reference genomic sequence. Data must include, as applicable, base quality score, allele fraction for heterozygosity and coverage, and other applicable metrics.

(iii) Data demonstrating acceptable, as determined by FDA, analytical device performance using patient specimens representing the full spectrum of expected variant types reported across the genome and in genomic regions that are difficult to sequence. The number of specimens tested must be sufficient to obtain estimates of device performance that are representative of the device performance that can be expected for the reportable region and clinically relevant subsets of the reportable region, as applicable. For each study, data must include a summary of the key quality metric data; the number and percentage of true positives (TP), false positives (FP), and false negatives (FN); number and percentage of no-calls; positive percent agreement (PPA); negative percent agreement (NPA); positive predictive value (PPV); technical positive percent value (TPPV); and non-reference concordance (NRC). These data must be provided per sample and stratified by variant type. The variant data must also be further stratified by size and zygosity (homozygous common allele, heterozygous, homozygous rare allele). Data demonstrating the accuracy assay based on guanine and cytosine (GC) content, pseudogenes, and proximity to short tandem repeats must also be presented. The data must be presented for the entire exome and also for clinically relevant subsets of the reportable region. For each study, the number of run failures and repeat/requeued specimens must be summarized.

(iv) Documentation of acceptance criteria that are applied to analytical and clinical validation studies, which must be justified based on the estimated risk of erroneous results on clinically significant genes and variants and must be clinically acceptable, as determined by FDA. The acceptance criteria must be pre-specified prior to clinical and analytical validation studies, and all validation testing results must be documented with respect to those acceptance criteria.

(v) Analytical validation must be demonstrated by conducting studies that provide:

(A) Data demonstrating acceptable, as determined by FDA, accuracy based on agreement with an acceptable, as determined by FDA, comparator method(s) that has been validated to have high accuracy and reproducibility. Accuracy of the test shall be evaluated with reference standards and clinical specimens for each indicated specimen type of a number determined acceptable by FDA, collected and processed in a manner consistent with the test's instructions for use.

(B) Data demonstrating acceptable, as determined by FDA, precision from a precision study using clinical samples to adequately evaluate intra-run, inter-run, and total variability across operator, instrument, lot, day, and site, as applicable. The samples must include the indicated range of DNA input. Precision, including repeatability and reproducibility, must be assessed by agreement between replicates, and also supported by sequencing quality metrics for targeted regions across the panel. Precision must be demonstrated per specimen and in aggregate. Precision data must be calculated and presented with and without no calls/invalid results.

(C) Data demonstrating acceptable, as determined by FDA, accuracy in the presence of clinically relevant levels of potential interfering substances that are present in the specimen type and intended use population, including, for example, endogenous substances, exogenous substances, and microbes, as applicable.

(D) Data demonstrating the absence of sample cross contamination due to index swapping (misassignment).

(E) Data demonstrating that the pre-analytical steps such as DNA extraction are robust such that sources of variability in these steps and procedures do not diminish the accuracy and precision of the device.

(F) Data demonstrating that acceptable, as determined by FDA, device performance is maintained across the range of claimed DNA input concentrations for the assay.

(vi) Design verification and validation for software within the whole exome sequencing constituent device must include the following:

(A) Detailed description of the software, including specifications and requirements for the format of data input and output, such that users can determine if the device conforms to user needs and intended uses.

(B) Device design must include a detailed strategy to ensure cybersecurity risks that could lead to loss of genetic data security, are adequately addressed and mitigated (including device interface specifications and how safe reporting of the genetic test is maintained when software is updated). Verification and validation must include security testing to demonstrate effectiveness of the associated controls.

(C) Device design must ensure that a record of critical events, including a record of all genetic test orders using the whole exome sequencing constituent device, device malfunctions, and associated acknowledgments, is stored and accessible for an adequate period to allow for auditing of communications between the whole exome sequencing constituent device and downstream clinical genetic tests, and to facilitate the sharing of pertinent information with the responsible parties for those devices.

(vii) A protocol reviewed and determined acceptable by FDA, that specifies the verification and validation activities that will be performed for anticipated bioinformatic software modifications to reevaluate performance claims or performance specifications. This protocol must include a process for assessing whether a modification to the bioinformatics software could significantly affect the safety or effectiveness of the device. The protocol must include assessment metrics, acceptance criteria, and analytical methods for the performance testing of changes, as applicable. The protocol must also include the process for communicating to developers of downstream clinical genetic tests the impact of the bioinformatics software change on the whole exome sequencing constituent system genetic data output so they may implement appropriate corresponding actions.
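
The agreement statistics named in special control (3)(iii) follow from the TP, FP, and FN counts, stratified by variant type and zygosity. The sketch below is illustrative only. It assumes the conventional definitions PPA = TP / (TP + FN), NPA = TN / (TN + FP), and PPV = TP / (TP + FP), which the regulation text does not spell out. The `Counts` class, the `agreement` function, and the counts themselves are hypothetical, not Helix data. TPPV and NRC are left out because their definitions are not given in this record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Counts:
    """Comparator-agreement counts for one stratum (hypothetical container)."""
    tp: int  # variant called by the device and by the comparator
    fp: int  # variant called by the device only
    fn: int  # variant called by the comparator only
    tn: int  # reference call by both


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of dividing by zero when a stratum is empty.
    return numerator / denominator if denominator else None


def agreement(c: Counts) -> dict:
    """Compute PPA, NPA, and PPV under the conventional definitions."""
    return {
        "PPA": _ratio(c.tp, c.tp + c.fn),
        "NPA": _ratio(c.tn, c.tn + c.fp),
        "PPV": _ratio(c.tp, c.tp + c.fp),
    }


# Made-up counts, keyed by (variant type, zygosity) as the stratification requires.
strata = {
    ("SNV", "heterozygous"): Counts(tp=90, fp=5, fn=10, tn=895),
    ("deletion", "heterozygous"): Counts(tp=18, fp=2, fn=2, tn=78),
}

for (variant_type, zygosity), counts in strata.items():
    print(variant_type, zygosity, agreement(counts))
```

Keeping one `Counts` record per stratum makes it straightforward to report the same statistics per sample, per variant size bin, or per clinically relevant subset by changing only the dictionary key.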

Related Devices

DEN210011 — Invitae Common Hereditary Cancers Panel · Invitae Corporation · Sep 29, 2023
DEN130011 — ILLUMINA MISEQDX PLATFORM · Illumina, Inc. · Nov 19, 2013
K170299 — Ion PGM Dx System · Life Technologies Corporation · Jun 22, 2017

Submission Summary (Full Text)

# EVALUATION OF AUTOMATIC CLASS III DESIGNATION FOR Helix Laboratory Platform

DECISION SUMMARY

## A. DEN Number:

DEN190035

## B. Purpose for Submission:

De Novo request for evaluation of automatic class III designation for the Helix Laboratory Platform

## C. Measurands:

Single nucleotide variants, insertions, and deletions in whole exome sequence in human genomic DNA

## D. Type of Test:

Qualitative whole exome sequencing

## E. Applicant:

Helix OpCo, LLC

## F. Proprietary and Established Names:

Helix Laboratory Platform

## G. Regulatory Information:

1. Regulation section: 21 CFR 866.6000
2. Classification: Class II
3. Product code(s): QNC
4. Panel: 88- Pathology

## H. Indications for use:

1. Indications for use:

The Helix Laboratory Platform is a qualitative in vitro diagnostic device intended for exome sequencing and detection of single nucleotide variants (SNVs) and small insertions and deletions (indels) in human genomic DNA extracted from saliva samples collected with Oragene® Dx OGD-610. The Helix Laboratory Platform is only intended for use with other devices that are germline assays authorized by FDA for use with this device. The device is performed at the Helix laboratory in San Diego, CA.

2. Special conditions for use statement(s):

For prescription use

For in vitro diagnostic use

3. Special instrument requirements:

Illumina HiSeq X Sequencer (qualified by Helix)

Oragene® Dx OGD-610 (DNA Genotek, Inc; k192920)

## I. Device Description:

The Helix Laboratory Platform (HLP) is a high throughput DNA sequencing platform for targeted sequencing of an individual's whole exome. It is intended for targeted sequencing of an individual's whole exome for use with a genetic test application.
Genetic test applications may be third party partner ("Partner") genetic test applications or a Helix genetic test application such as the Helix Genetic Health Risk App (HRA; k192073). The DNA sequence generated by this device is intended as input to clinical germline DNA assays intended for use with this device that have FDA marketing authorization. A brief overview of the commercialized workflow is shown in Figure 1 (refer to the section titled "Test Principle" for more specific information regarding the commercialized workflow).

HLP consists of a HiSeq sequencing instrument, cBot system, library preparation reagents, sequencing reagents, and data analysis software. The Helix Laboratory Platform also interacts with the Helix Laboratory Automation Systems and Content Mapping Systems, which serve as repositories for the data and do not perform data analysis. The test detects single nucleotide variants (SNVs) and insertions and deletions (indels) up to 20 base pairs (bp) and is limited to making high-confidence variant calls that meet prespecified quality metrics (i.e., the analytical range) within the reportable range. Sequencing is performed at the Helix clinical laboratory in San Diego, CA.

Figure 1. Workflow Relationship of Helix and "Partner" Genetic Tests Overview

1. Product selection: the customer is presented with a set of products to choose from on the Helix website.
2. Order and checkout: the customer completes the ordering and checkout process on Helix.com and receives an order confirmation email; Helix ships a saliva collection kit to the customer.
3. Account creation: the customer creates an account with Helix and, if selecting a partner product, also creates an account with the partner.
4. Kit registration and sample collection: Helix provides step-by-step instructions for the customer to register the kit online and collect a saliva sample; the customer mails the sample to the Helix lab and receives sample sequencing status updates via email.
5. Sequencing and data upload: Helix sequences the sample and, if a partner product, VCF data is uploaded to the partner account.
6. Report generation and delivery: the owner of the genetic test interprets variant data and generates a report; the customer receives notification that results are available and logs in to view the report.

The HLP instruments, reagents and software are qualified by Helix and comprise the following:

### 1. Specimen Collection and DNA Preparation

Saliva is collected in the Oragene® Dx OGD-610 collection device. At least 1 mL of saliva must be collected. DNA is extracted using Maxwell HT Saliva DNA Prep. Extracted DNA concentration must be equal to or greater than 3.5 ng/µL. A total of 50 to 70 ng of DNA input is used for the library preparation. DNA may be stored at -20°C for up to two weeks.

### 2. Library Preparation

Library preparation consists of five key steps: fragmentation, adapter addition by polymerase chain reaction (PCR), hybridization capture, PCR amplification, and library normalization. In the first step, purified genomic DNA is broken into small pieces by enzymatic fragmentation, generating overlapping small DNA fragments tiling the entire human genome. Library fragments are size-selected to enable paired-end sequencing with between read pairs. Sequencing adapters are then added by PCR. DNA samples are barcoded using a dual-indexing strategy with barcode sequences that require multiple sequencing errors to become ambiguous.
Quality Control (QC) is performed on the purified libraries by DNA quantitation. Purified DNA libraries are combined into a multiplex for enrichment by hybrid-capture. This process includes enrichment using probes to capture the genomic regions of interest. The DNA concentration of the enriched libraries is determined and normalized to allow equal loading of libraries on the flow cell along with clustering reagents into a cBot instrument for cluster generation. The enriched libraries are then clustered on the sequencing flow cell and loaded into the DNA sequencing instrument. Sample concentration of equal or greater than will pass quality control.

The cBot System is an automated system that creates clonal clusters from single molecule DNA templates, preparing them for sequencing by synthesis (SBS) on the HiSeq instrument. The cBot isothermally amplifies cDNA fragments that have been captured by complementary adapter oligonucleotides covalently bound to the surface of flow cells (this is called the Cluster Generation Process). Different sets of reagents are used for this process. HiSeq DNA sequencing reagents, cluster generation reagents and a flow cell are obtained from Illumina and qualified by Helix. The Helix Exome+ reagents include library preparation reagents, sample indexing reagents, and the capture probe reagents that target the whole exome.

### 3. Sequencing and Data Analysis

When cluster generation is complete, the flow cell is inserted with SBS reagents into a HiSeq instrument to perform paired-end DNA sequencing with (b)(4) read lengths. SBS technology uses four fluorescently labeled nucleotides to sequence the (b)(4) of clusters on the flow cell surface in parallel. During sequencing, images are captured from the flow cell by the HiSeq instrument and processed through primary analysis after each sequencing cycle.
Primary analysis is performed by the RTA (Real Time Analysis) software without user intervention. This analysis consists of base calling of each cluster at each cycle. Reads are filtered to require a minimum rate of high-quality base calls. Flow cells that do not meet a minimum yield of filtered reads will undergo additional sequencing. The bcl2fastq software de-multiplexes the data and generates sample-specific FASTQ files.

Next-generation sequencing (NGS) secondary analysis is performed by the Helix bioinformatics pipeline to process base calls into genomic variant calls, starting with FASTQ files. The Helix bioinformatics pipeline is hosted in the cloud and analyzes the targeted human genome sequencing data. The pipeline consists of software analytical tools for short read alignment (Aligner), variant calling (Variant Caller), variant refinement (Variant Refinement), and quality control (Quality Control). The Helix bioinformatics pipeline is a suite of both proprietary and open-source software programs for high-throughput processing and analysis of sequence data. Individual major components of the Helix bioinformatics pipeline are described in Figure 2. Quality control metrics of varying depth and resolution are incorporated into several checkpoints along the data processing and analysis pipeline.

The Helix bioinformatics inputs are as follows:

- Metadata: De-identified sample identifier, self-reported sex and age, and the molecular barcode assigned to the library
- Sequencing reads: The set of sequencing reads generated for the sample by the Helix Laboratory Platform
- Annotation data: BED files specifying the reportable range, short tandem repeats (STRs), and known variant truth data for control sample concordance analysis
- Reference genome: The "Helix Reference Genome" based on the Genome Reference Consortium Human Build 38 assembly with additional alternative contigs as described in detail in the Helix Reference Genome section in the Helix Laboratory Platform User Manual.

Figure 2. High-level modular and functional overview of the bioinformatics pipeline.

[Figure 2 redacted: (b)(4)]

The analysis steps include read alignment, variant calling, variant refinement and quality control.

a) Alignment: A software aligns short-read nucleotide sequences from FASTQ files to the Helix Reference Genome to generate aligned nucleotide sequences and associated mapping quality data. The nucleotide sequences in the FASTQ files are aligned to the human reference genome GRCh38 as (b)(4). The human reference genome, GRCh38, is a human genome reference (b)(4) generated by the Genome Reference Consortium. Reference for the human oral microbiome is obtained from the Human Oral Microbiome Database (HOMD) and (b)(4) not present within HOMD from the ATCC® MSA-1002™ panel genomes.

Aligned reads are then processed to flag any reads that may be polymerase chain reaction (PCR) or optical (hardware limitation of the sequencer) duplicates. Duplicate marking is necessary to ensure that duplicates are not treated as independent evidence of a genomic sequence. The aligned sequence reads, with the associated base and mapping quality information, are stored in a standard file format, such as a binary alignment map (BAM) file or the compressed columnar file format CRAM. These are community-accepted formats for storing the aligned nucleotide sequences and their corresponding quality scores. They also serve as input files for the Variant Caller.
b) Variant Calling: The variant calling software uses existing off-the-shelf (OTS) software. The Variant Caller is able to detect single nucleotide variants (SNVs) and small (≤ 20 nucleotides) insertion or deletion variants (indels). Genotype calls are stored in an industry standard format such as variant call format (VCF) or its binary format BCF and genomic VCF (gVCF). The VCF includes the following fields:

- Chromosome
- Start Coordinate
- End Coordinate, if not equal to start + length of reference allele - 1 (1-based inclusive numbering, following the gVCF standard)
- Reference Allele: Reference base
- Alternate Allele: Non-reference base(s) called in the sample, if any
- Genotype: This field indicates the alleles carried by the sample.
- Additional information:
  - Genotype Likelihood: The likelihood of each of the possible genotypes given observed data
  - Genotype Quality: Difference in genotype likelihood between the most likely and second-most likely genotypes. Higher numbers represent higher-confidence genotype calls.
  - Read Depth: The number of reads at a position
  - Allele Depth: The number of reads observed for each allele

Each genomic locus will have a genotype call with an associated genotype likelihood. The genotype likelihood is a standard probability-based score used to evaluate the likelihood of a genotype call at a given locus conditioned on observed data. The observed data includes the aligned sequence data with associated quality data. More specifically, at each locus, the algorithm will count the number of occurrences of each distinct nucleotide from the aligned nucleotide sequences. The number of aligned sequencing reads represents the "coverage" at a given locus. This coverage data is combined with associated mapping quality, base call quality, and other prior information to generate a genotype call with an associated genotype likelihood. The genotype likelihood can be used to determine if there is insufficient information for a confident call (resulting in a no-call). Two separate genotype call files are created by the variant caller. The first file provides reference calls and the second file provides finalized posterior genotype likelihoods.

c) Variant Refinement: This software analytic tool uses existing open-source libraries and internal code to perform additional processing on variant and reference calls produced during the Variant Calling step to generate the final observed variant call output. Variant refinement first merges the two variant data files generated by the Variant Caller for the sample into a single variant data file by merging records in the file that represent adjacent reference calls and records in the file that represent overlapping variant calls. This is followed by a sex inference using a statistical model based on the counts of fragments mapping to chromosomes X and Y compared to fragments mapping to the autosomes. This model also accounts for possible chromosome Y loss using the self-reported age range of the individual when it is available. Furthermore, it performs ploidy correction, as the Variant Caller assumes that all contigs are diploid. The ploidy correction process will retain variant calls that are inconsistent with the expected ploidy with sufficiently high quality.

d) Variant Scrubbing: Metadata for individual files are collapsed. Additional curation is performed against empirically determined "blacklist" and "whitelist" genomic regions with intrinsically poor mapping quality, known polymorphism (e.g., HLA loci), and significant violation of Hardy-Weinberg equilibrium (i.e., non-neutral selective pressures). Haplotype (i.e., phase set) information is calculated.

### 4. Sequencing Quality Control methods

Quality Control (QC) methods are incorporated into the Helix bioinformatics pipeline at every step.
Quality control methods include sample-level QC for each region. In addition, the metrics are monitored and used for root cause analysis if samples have a QC failure. These include: (b)(4)

A set of QC metrics are applied to distinct regions of the reportable range. These are collectively referred to as "callability". Callability = 1 - no-call rate is calculated base by base and ensures that the no-call rate is acceptable. The sample must pass all thresholds listed in Table 1. For example, the Priority white list has a callability metric of (b)(4), which is a no-call rate of (b)(4). If any of the thresholds is not met, the sample(s) are routed back to the appropriate restarting point (library preparation, pooling for enrichment, or cluster generation) to be re-tested. Samples may be retested up to times.

Table 1. Secondary Analysis Quality Control (QC) Metrics for Sample Evaluation

| Metric | Threshold |
|--------|-----------|
| (b)(4) | |

### 5. Content Mapping Systems

a) The Helix Interpretation Module (HIM) is a mapping tool that does not perform data analytics. Rather, it maps the end user's variant data against the reporting rules to identify and match the correct variant call data with the correct clinical result for that customer (also known as the user). In the case of the Helix Genetic Health Risk (k192073), (b)(4)

The Helix Laboratory Platform provides data to the partner in one of two ways: (1) variant call data (VCD) on the genetic variants for a predefined set of genetic coordinates for their genetic test indication. The VCD includes information that is typically contained in a Variant Call Format (VCF or BCF) file but is transferred securely to the partner via an API. The partner app will review the VCD and make the interpretation and generate the report; or (2) the Helix Laboratory Platform provides the final genotype (b)(4) The partner generates their final report using the content (b)(4) from the JSON file. In both scenarios, Helix provides data only to the owner of the genetic test. As the sequencing laboratory, the Helix Laboratory Platform will provide data to partners as described under scenario 2.

b) (b)(4)

c) (b)(4)

### 6. Software (Laboratory Automation Sub Systems and User Interfaces)

The Helix Laboratory Automation sub-systems are used to automate physical sample data tracking. The Helix Laboratory Automation system's purpose is to track samples from the point at which they enter the clinical laboratory, through sample processing and sequencing workflow steps, and to results delivery (either to a customer for the Helix Genetic Health Risk App or to a Helix third party partner). The various sub-systems assess the quality of the runs and provide information about sample queuing to laboratory personnel and partners. The sub-systems include the Accessioning sub-system, Laboratory Information Management System (LIMS), Pre- and Post-review sub-systems, Data Delivery Review sub-system, Helix Partner API sub-system and Genomic Data Service sub-system.

### 7. Controls

A Process Control is defined as the DNA reference material with known sequence that can be used to determine the success of a sequencing run.
The NA12878 process controls used by HLP are cell lines from the Coriell Institute for Medical Research. These two cell lines are required in the Helix workflow and are introduced prior to library prep and carried through enrichment and sequencing. The Process Control workflow is as follows:

(b)(4)

Table 2. Secondary analysis metrics for process control evaluation

| Metric | Threshold |
|--------|-----------|
| | (b)(4) |

# 8. Definitions of the Genomic Regions Reported by the HLP

a) Reportable Range: Helix has defined the Reportable Range as all coding exons ("coding region", also referred to as the "white list") minus prespecified regions that are not reported (the "black list"). The black list is a list of regions in the HLP that are either not covered by the reagents or are excluded from reporting because they have empirically observed elevated false positive rates, as defined by deviations from Hardy-Weinberg equilibrium. The reportable range excludes the following:

- Regions outside the coding regions.
- Regions that are difficult to map or have poor mapping quality, defined as regions where fewer than 20% of reads have a mapping quality greater than 20.
- Regions with highly polymorphic gene clusters (portions of the HLA loci, immunoglobulin-like receptors on chr19 and Golgin family on chr15).
- Sites with variants called that deviate significantly from Hardy-Weinberg equilibrium.
- Regions with > 25% of bases with quality score < 30.
- Regions with high sequence similarity (i.e., pseudogenes).

DNA sequence is aligned to the Helix Reference Genome as previously described (reference genome GRCh38 and select contigs), and variant calling occurs using the Helix bioinformatics pipeline. The Helix Exome+ reagents are designed to cover the whole exome ("coding region"), with enhanced sequence coverage over clinically relevant genes in the human genome. In addition to the probes covering overall exon regions, more probes (gap-filling) targeting genomic locations of clinical significance are designed and added to the capture panel. As a result, coverage has been optimized for two subsets of the reportable range. The distinct reporting sets are as follows (Figure 3):

- i. Coding: All targets within the exome. The Helix Laboratory Platform reportable range is limited to coding regions minus the regions excluded as described above.
- ii. Mendeliome: A subset of the coding region consisting of (b)(4) genes that are associated with human disease, with enhanced coverage.
- iii. Priority: A subset of Mendeliome consisting of (b)(4) genes identified as clinically relevant, which has the highest coverage.

# Figure 3. Reportable ranges for Coding region and subsets

(b)(4)
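The "white list minus black list" definition of the reportable range amounts to an interval subtraction. The Python sketch below illustrates that idea only. The function name `subtract_intervals`, the half-open coordinate convention, and the example coordinates are hypothetical and are not taken from the Helix bioinformatics pipeline.

```python
from typing import List, Tuple

# Hypothetical convention: half-open [start, end) intervals on one chromosome.
Interval = Tuple[int, int]


def subtract_intervals(white: List[Interval], black: List[Interval]) -> List[Interval]:
    """Return the parts of the white-list intervals not covered by any black-list interval."""
    result: List[Interval] = []
    black_sorted = sorted(black)
    for start, end in sorted(white):
        cursor = start  # first position of this target not yet handled
        for b_start, b_end in black_sorted:
            if b_end <= cursor:
                continue  # excluded region lies entirely before the cursor
            if b_start >= end:
                break  # remaining excluded regions lie past this target
            if b_start > cursor:
                result.append((cursor, b_start))  # keep the gap before the exclusion
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


# Illustrative coordinates only (not real exome targets).
coding_targets = [(100, 200), (300, 400)]
excluded_regions = [(150, 160), (190, 310), (395, 500)]
reportable = subtract_intervals(coding_targets, excluded_regions)
reportable_size_bp = sum(end - start for start, end in reportable)
print(reportable, reportable_size_bp)
```

In this toy example the reportable portion is whatever remains of each coding target after the excluded regions are removed. The same subtraction would apply to the Mendeliome and Priority subsets.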
| Region | Region Size* (bp) | Percentage of the Reportable Range |
|--------|-------------------|------------------------------------|
| | (b)(4) | |

The reportable range excludes (b)(4) variants of clinical significance as curated in ClinVar. These variants are excluded because they are either not targeted by the HLP or reside in regions that are difficult to sequence or known to not meet performance criteria.

Analytical Range: All variants and reference records called within the genomic regions specified above are then filtered based on variant type, sequence context, and variant call quality metrics. The analytical range refers to single nucleotide variants and insertions and deletions up to 20 bp in length that pass quality metrics for reporting. Additionally, each variant record is evaluated based on its proximity to short tandem repeat (STR) elements. Specific rules based on the location of variants and the sizes of the STRs determine whether a variant is reportable. All variant calls are required to meet Helix's variant-level QC metrics, which are applied to every locus within the reportable range (Table 3).

# Table 3. Quality Metric Thresholds

| Metric | Threshold Required for Reporting |
|------------------------------------------|----------------------------------|
| Depth of Coverage (DP) | (b)(4) |
| Genotype quality (GQ) | |
| Allele fraction of heterozygous calls | |
| Fisher's strand bias score (FS) | |
| Root mean square of mapping quality (MQ) | |

Only loci that meet the above criteria are reported. Which loci are reportable varies from run to run.

Callability: Callability refers to the proportion of the coding region that is reportable, and is used to ensure that the no-call rate within the likely reporting windows is acceptable. Callability incorporates base quality, alignment/mapping quality, and minimum coverage levels, and determines the number (%) of base calls that pass the QC metrics for a genotype call. The callability thresholds for the reportable regions and subregions are reported in Table 1. The Helix Laboratory Platform is intended for use as part of a genetic test that has been validated for specific clinical indications. Each genetic test that uses the HLP will have indication-specific callability assessments prior to commercialization so that the expected no-call rate is acceptable for the specific indication.

# J. Standard/Guidance Document Referenced (if applicable):

- Guidance for Industry and FDA Staff: Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices; May 11, 2005
- General Principles of Software Validation; Final Guidance for Industry and FDA Staff; January 11, 2002
- Guidance for Industry: Cybersecurity for Networked Medical Devices Containing Off-the-Shelf (OTS) Software; January 2005
- Guidance for Industry, FDA Reviewers and Compliance on Off-the-Shelf Software Use in Medical Devices; September 9, 1999

# K. Test Principle:

The Helix Laboratory Platform (HLP) is a specimen-to-variant-call-file platform that is intended for use with sponsors of legally marketed germline genetic assays validated for use with the HLP. The HLP provides results only to sponsors and does not provide results to ordering physicians or patients (collectively referred to as "customers").

Customers interact with the Helix Laboratory Platform through a web-based application. The customer places an order for a genetic test (that has been cleared for use on the platform), and an FDA cleared specimen collection device, the Oragene® Dx OGD-610, is shipped to the ordering customer. The customer's saliva sample is returned by the customer to the Helix clinical laboratory for processing. Upon receipt by the Helix clinical laboratory, the sample is accessioned. Samples with an insufficient amount of saliva will be rejected, and a new sample collection kit will be sent to the ordering customer.

Samples that pass accessioning enter the extraction process, where purified genomic DNA is extracted from saliva samples. Sequencing libraries are prepared, pooled, captured and sequenced. Sequencing reads are aligned to a reference human genome. FASTQ files serve as input files for the Aligner (Helix bioinformatics pipeline), which aligns nucleotide sequences from the FASTQ file to the Helix Reference Genome to generate aligned nucleotide sequences and associated mapping quality data. Aligned sequence data is analyzed to identify samples that do not meet quality thresholds and minimum sequencing coverage on the target regions. Samples that do not pass quality thresholds will undergo additional sequencing. This data is typically stored in a compressed file format and serves as input for the Variant Caller.

As the sequencing lab,
Helix provides variant call data (VCD) on the genetic variants for the Genetic Health Risk App (or a future third party partner genetic test validated for use on the platform), but does not provide variant data directly to the ordering customer. The VCD includes information that is typically contained in a Variant Call Format (VCF) file but is transferred securely to the partner via an Application Programming Interface (API). For partner products with clinically relevant intended uses, the reporting lab will review the VCD, make the final decision on interpretation and labeling, and generate the final report.

The Helix Laboratory Platform interacts with the [genetic test app] in two primary ways, as shown in Figure 1 (Section I, device description). First, a [genetic test app] software module called the Helix Report Module (HRM) interacts directly with the HLP to obtain a customer's first name, last name, email address, and Helix User Id. This allows the [genetic test app] to input the customer's name in the final report and to use the email and Helix User Id for secondary confirmation purposes when addressing customer service inquiries. The second interaction is through the [genetic test app] software module called the Helix Interpretation Module (HIM), which maps the customer's genetic information to the associated clinical interpretation for the final report. (b)(4) The final end user results are displayed as a PDF on the product webpage (also called the Helix Report Module User Interface) upon secure end user login. (b)(4)

For third party partner genetic tests, the partner-defined genotype/interpretation rules will be reviewed by Helix through an internal process to ensure the variants and indications have been cleared by the FDA. The HIM associates the interpretations provided by the partner via the Helix Health Traits API for the specific application, and the future third party partner creates their final report.
In summary, the single clinical laboratory provides the following next generation sequencing (NGS) services to customers who order genetic tests validated for use on the Helix Laboratory Platform: (i) specimen collection, handling and storage; (ii) Exome+ sequencing; (iii) storage of the customer's genetic information in a secure cloud database; and (iv) an initial and subsequent in-silico query of the customer's Exome+ data for a predefined set of genetic variants as offered by genetic tests validated for use on the Helix Laboratory Platform. The ordering customer's Exome+ data will be generated with their initial order, and additional queries in the future will not require additional sequencing. Refer to Section I for more details on the device description.

## L. Performance Characteristics:

- 1. Device Optimization and Quality Metrics

- a) Optimization of variant read depth, allele fraction and callability thresholds: Parts of the HLP were optimized based on historical reference sample runs to develop per-variant QC metrics. Accuracy (positive predictive value; PPV) was assessed for optimal read depth (Figure 4). The data demonstrated that PPV met the pre-specified acceptance criteria at a minimum read depth of (b)(4), with no significant improvement above (b)(4). The minimum read coverage for variant calling was set at 20X.

# Figure 4. Optimization of Variant Read Depth.

Optimal accuracy was assessed for allele fraction (Figure 5). Based on the data, the allele fraction range was set between (b)(4).

# Figure 5. Optimization and Selection of Allele fraction

(b)(4)

Callability thresholds for (b)(4), respectively. The y-axis shows the number of samples with callability values within the ranges defined on the x-axis.

# Figure 6. Selection of Callability Thresholds

(b)(4)

# b) Establishment of filter and QC threshold for variant calling

- i. Genotype quality score (GQ) and filter lists: Reference samples were analyzed with (b)(4) of the bioinformatics pipeline representing different conditions relative to the quality metric criteria (see the quality metrics section above). The control condition retained all of the criteria above, and the other (b)(4) were each removed. Figure 7 shows the PPA, PPV, and NRC performance metrics for SNV, insertion, and deletion variant types in each of the conditions compared to control. The results in Figure 7 show that the removal of any one of the criteria results in reduced accuracy and that retaining all criteria (the control condition) results in the highest accuracy. The (b)(4) has the greatest effect on PPA and NRC across all variant types, while the DP threshold has the greatest effect on PPV. (Refer to the section below for definitions of PPA, PPV and NRC.)

# Figure 7. Accuracy compared to reference datasets under various filtering conditions

(b)(4)

- ii. Fisher strand bias score (FS): True positive (TP) and false positive (FP) variant calls from Accuracy study 1 were grouped based on FS score ranges. While calling accuracy (TPPV) for SNVs was 99.91% for FS (b)(4), more FPs than TPs were detected with FS (b)(4) (Table 4). Therefore, the FS threshold for SNV calling was set at (b)(4).

# Table 4. Performance of SNV calling in Accuracy study 1 based on Fisher strand bias score ranges

(b)(4)

- iii. Root mean square mapping quality score (MQ): Performance metrics (PPA) from samples in all analytical studies were grouped based on MQ score ranges, (b)(4) and (b)(4). The mean PPA in MQ (b)(4) was significantly lower than the mean PPA for variants with MQ (b)(4). Therefore, the evidence did not provide adequate support that MQ scores (b)(4) would be sufficient for variant calling with highly accurate results (Figure 8). The data supported high confidence results when the MQ score is (b)(4). The MQ threshold was set at (b)(4).

# Figure 8. Performance of variant calling (PPA) from all samples in analytical validation studies based on the range of MQ score

Mean PPA: (b)(4)

- iv. Indel size: Performance metrics (PPA) of indel calling from samples in all analytical studies were grouped based on size ranges: 1-2bp, 3-5bp, and 6-20bp. Although the median PPA for indel size of 6-20bp was achieved at (b)(4) (Figure 9), it was significantly lower than for sizes 1-2bp and 3-5bp (this conclusion was supported through FDA analysis of p-values, data not shown). Thus, caution should be taken for indel size 6-20bp.

Figure 9. Performance of indel calling (PPA) from all samples in analytical validation studies based on the range of size

(b)(4)

- v. Removal of sequencing regions with low quality bases: The cell line samples from Accuracy study 1 and one replicate of the unique clinical samples from the DNA Input, Endogenous, Exogenous, Smoking, Precision, Between-Lot Reproducibility, and Microbial studies were used to identify regions with a high percentage of bases with quality score (b)(4). Base quality scores within windows of (b)(4) base pairs were aggregated across all samples, and then the percentage of bases with quality scores (b)(4) within each window was calculated. Based on the empirical data provided in Table 5, this product is not intended to report variants in regions with (b)(4) of bases with quality score (b)(4). A total of (b)(4) in the coding region with high potential for sequencing errors were excluded. The data in the table below demonstrate that when greater than or equal to (b)(4) of the bases have base quality score (b)(4), the (b)(4). Therefore, this data supports the criterion that (b)(4) % of the bases need (b)(4) to have a base quality score (b)(4). With this criterion the risk of false positives is minimized.

Table 5. Summary of regions with base quality scores across 6 cell line samples and 102 clinical samples.

(b)(4)

# 2. Analytical performance:

The reportable range includes the coding range minus distinct regions that the HLP does not report due to unreliable resolution of the genetic sequence. Specific regions of the coding region have increased targeted coverage and different quality metrics for reporting.
For this reason, performance was evaluated for each region based on the pre-specified quality metrics and thresholds described above. The definitions and formulas used for each study are described in Table 6. Performance data generated with cell lines were not overinflated with respect to saliva samples (data not shown).

| Term | Definition |
|------|------------|
| Number of Variants Expected | The total number of known truth variants in the intersection of the Helix Exome+ reportable range and the reference sequence high confidence regions. |
| Coding | Targets representing coding regions (the exome); the callability threshold for the coding region is (b)(4) |
| Mendeliome | Targets representing (b)(4) genes; (b)(4) |
| Priority | Coding regions of 615 genes (a subset of Mendeliome) which were tested to support k19206; (b)(4) |
| STR | Short tandem repeats identified that are nucleotide repeats (b)(4) long. This includes strings of homopolymer runs and di/tri nucleotide repeat regions. STRs are not reported. Variant detection was determined near STR, i.e., variants that are located within or immediately next to a STR. |
| Non-STR | All other variants that are not "near STR". |
| True Negative (TN) | Bases consistent with reference dataset (i.e., wild-type) which are expected variants that were correctly called with the expected genotype. |
| True Positive (TP) | Bases that differ from the reference dataset (i.e., variant) which are expected variants that were correctly called with the expected genotype. |
| False negative (FN) | Expected variants that did not receive a no-call and were not correctly called. |
| False positive (FP) | HLP called variants that did not match an expected variant genotype such as wild type |
| Non-Reference Concordance (NRC) | Agreement when ignoring reference calls. TP/(TP+FP+FN) |
| Positive Percent Agreement (PPA) | Ability of the test to correctly identify variants that are present in a sample. TP/(TP+FN) |
| Negative Percent Agreement (NPA) | Ability of the test to correctly identify wild-type bases (the probability that the test will not call a variant). TN/(TN+FP) |
| Technical Positive Predictive Value (TPPV) | Defined as the probability of a variant call being a true positive. TP/(TP+FP) |
| Concordance assessment for precision | The degree of agreement between test samples, defined as calculating the accuracy for a sample to the reference dataset or gold standard dataset or majority call rate of the same sample. (TP+TN)/(TP+TN+FP+FN) |
| Acceptance Criteria | SNVs (PPA and TPPV ≥ 99.5%, NPA ≥ 99.99%); indels (PPA and TPPV ≥ 99.0%) |
| Majority Call | The majority (consensus) calls are generated from at least two replicates where more than 50% of replicates have the same (as defined by the location and genotype) passing variant or reference record. |
| Negative Predictive Value (NPV) | Defined as the probability of a variant call being a true negative. TN/(TN+FN) |
| Positive Predictive Value (PPV) | Defined as the probability of a variant call being a true positive. TP/(TP+FP) |
| Adjusted Negative Percent Agreement (aNPA) | The adjusted Negative Percent Agreement (aNPA) is defined as the ability of the test to correctly identify wild-type bases (the probability that the test will not call a variant). $aNPA = (NPV*(1-p)) / (NPV*(1-p) + (1-PPV)*p)$ where p is the sampling prevalence, i.e. p = percentage of people positive for at least one variant in the set of tested variants. This value was calculated to adjust for ascertainment bias given clinical samples are previously ascertained using HLP. |
| Adjusted Positive Percent Agreement (aPPA) | The adjusted Positive Percent Agreement (aPPA) is defined as the ability of the test to correctly identify polymorphic variants that are present in a sample. $aPPA = (PPV*p) / (PPV*p + (1-NPV)*(1-p))$ where p is the sampling prevalence, i.e. p = percentage of people positive for at least one variant in the set of tested variants. This value was calculated to adjust for ascertainment bias given clinical samples are previously ascertained using HLP. |

#### Table 6. Definitions and Formulas

Reference Samples used in these studies were obtained from various sources as listed in Table 7:

| Sample Name (Source) | Reference Data | Ethnicity | Gender | Variants |
|----------------------|------------------|-----------------------------|-----------------|----------------------------------------------------------------------------------|
| NA12877 | Platinum Genomes | Northern European from Utah | Male | All high confidence variants present in the HLP reportable and analytical ranges |
| NA12878 | GIAB | Northern European from Utah | Female | |
| NA24385 | GIAB | Ashkenazim Jewish | Male (son) | |
| NA24149 | GIAB | Ashkenazim Jewish | Male (father) | |
| NA24143 | GIAB | Ashkenazim Jewish | Female (mother) | |
| NA24695 | GIAB | Asian Chinese | Female (mother) | |

Table 7. Reference Samples used in the studies.

# a. Precision

A study was conducted to assess the precision of the Helix Laboratory Platform.
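The agreement metrics defined in Table 6 can be written as short functions. The Python sketch below is illustrative only: the function names are hypothetical, and the prevalence, PPV and NPV passed to the adjusted metrics are made-up inputs chosen to exercise the formulas. The counts in the first example are the NA12877 heterozygous SNV row of Table 8.

```python
def ppa(tp: int, fn: int) -> float:
    """Positive Percent Agreement: TP/(TP+FN)."""
    return tp / (tp + fn)


def tppv(tp: int, fp: int) -> float:
    """Technical Positive Predictive Value: TP/(TP+FP)."""
    return tp / (tp + fp)


def nrc(tp: int, fp: int, fn: int) -> float:
    """Non-Reference Concordance: TP/(TP+FP+FN)."""
    return tp / (tp + fp + fn)


def npa(tn: int, fp: int) -> float:
    """Negative Percent Agreement: TN/(TN+FP)."""
    return tn / (tn + fp)


def adjusted_npa(npv: float, ppv: float, p: float) -> float:
    """aNPA = NPV*(1-p) / (NPV*(1-p) + (1-PPV)*p), with p the sampling prevalence."""
    return npv * (1 - p) / (npv * (1 - p) + (1 - ppv) * p)


def adjusted_ppa(ppv: float, npv: float, p: float) -> float:
    """aPPA = PPV*p / (PPV*p + (1-NPV)*(1-p)), with p the sampling prevalence."""
    return ppv * p / (ppv * p + (1 - npv) * (1 - p))


# NA12877, SNV (het), coding region: TP, FP, FN as listed in Table 8.
tp, fp, fn = 953346, 957, 4083
print(f"PPA  = {ppa(tp, fn):.2%}")
print(f"TPPV = {tppv(tp, fp):.2%}")
print(f"NRC  = {nrc(tp, fp, fn):.2%}")

# Hypothetical inputs for the adjusted metrics (not values from the summary).
print(f"aPPA = {adjusted_ppa(ppv=0.99, npv=0.999, p=0.10):.4f}")
print(f"aNPA = {adjusted_npa(npv=0.999, ppv=0.99, p=0.10):.4f}")
```

Recomputing PPA and TPPV from the tabulated TP, FP and FN counts is a quick consistency check on any row of the precision tables.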
A total of 24 unique samples (DNA from 6 cell lines with known variants and 18 saliva samples) were tested across 3 different operators, using 3 separate cBots at the clustering step and 3 different HiSeq instruments at the sequencing step. Each unique sample started with 9 replicates in library prep (3 replicates/sample x 3 plates of library prep = 9 replicates/sample). Each library prep was processed twice through independent enrichments, increasing the number of replicates to 18/sample. Each of the enriched libraries was sequenced on 4 independent runs of the cBot and HiSeq instruments, resulting in a total of 72 replicates/sample (18 enriched libraries/sample x 4 sequencing events = 72 replicates/sample). A total of 1728 sample replicates were included in this study design (24 unique samples x 72 replicates/sample = 1728).

Positive percent agreement (PPA), negative percent agreement (NPA) and technical positive predictive value (TPPV) were calculated for each sample replicate. The mean of these values was determined for each sample per condition and for all samples per condition. This study passed all acceptance criteria thresholds for mean PPA, NPA and TPPV at all conditions evaluated (library prep operator, enrichment operator, cBot operator, cBot instrument, HiSeq operator and HiSeq instrument - data not shown) when taken in aggregate. Mean quality metrics were evaluated for the coding regions and subsets (Mendeliome and Priority). Precision is described below for cell lines and clinical samples independently. The invalid rate in the precision studies was approximately (b)(4).

#### i. Precision - Reference Samples (Cell lines)

Precision using reference cell lines was evaluated.
Results of this study using six reference cell line samples, each with 72 replicates processed on the Helix Laboratory Platform, passed all acceptance criteria when considered in aggregate: PPA 99.91% and TPPV 99.93% for SNVs (both > 99.5%), PPA 99.52% and TPPV 99.29% for insertions, and PPA 99.59% and TPPV 99.18% for deletions. However, PPA for indels > 6bp could be < 99% (Figure 11). No-call rates for each variant type are shown and ranged from 5% for SNVs to 29% for indels (Figure 10).

Calling performance based on variant type and size in each sample was broken down by Coding regions (Table 8), Mendeliome regions (Table 9), and Priority regions (Table 10). SNVs were further stratified by zygosity (heterozygous, homozygous, hemizygous) and indels were stratified by size (1-2bp, 3-5bp and 6-20bp). Acceptance criteria were not applied to the stratified data; however, data below the acceptance criteria are shaded and shown in bold in the tables. The instructions for use indicate that insertions ≥ 6bp will be validated independently. For all tables below (cell lines and clinical specimens), the data is shown for all calls combined across the 72 replicates. The mean of per sample values was also assessed and showed slightly improved performance (data not shown).

(b)(4)

Figure 10. Percentage of No Call Rate for SNV and Indels in Six Reference Cell Lines

Figure 11. Concordance of Variant Calling for SNV and Indels in Six Reference Cell lines
Figure 11 is a box plot of mean positive percent agreement (PPA) by variant type (All SNVs, All Indels, Indels 1-2bp, Indels 3-5bp, Indels 6-20bp), with the y-axis ranging from 95.00% to 100.00%. All SNVs have the highest mean PPA, while indels of 6-20bp have the lowest mean PPA and the largest variance.

## Table 8: Precision Using Reference Cell lines - Coding Region

| Sample Name | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV |
|---|---|---|---|---|---|---|---|---|---|
| NA12877 | SNV (het) | All | (b)(4) | | 953346 | 957 | 4083 | 99.57% | 99.90% |
| | SNV (hom) | All | | | 571420 | 68 | 137 | 99.98% | 99.99% |
| | SNV (hemi) | All | | | 18512 | 0 | 0 | 100.00% | 100.00% |
| | Insertion | 1-2 | | | 5579 | 28 | 1 | 99.98% | 99.50% |
| | | 3-5 | | | 3641 | 17 | 91 | 97.56% | 99.54% |
| | | 6-20 | | | 1359 | 88 | 17 | 98.76% | 93.92% |
| | | All | | | 10579 | 133 | 109 | 98.98% | 98.76% |
| | Deletion | 1-2 | | | 6634 | 41 | 0 | 100.00% | 99.39% |
| | | 3-5 | | | 4933 | 8 | 1 | 99.98% | 99.84% |
| | | 6-20 | | | 1300 | 1 | 2 | 99.85% | 99.92% |
| | | All | | | 12867 | 50 | 3 | 99.98% | 99.61% |
| NA12878 | SNV (het) | All | | | 783200 | 532 | 369 | 99.95% | 99.93% |
| | SNV (hom) | All | | | 487707 | 3 | 140 | 99.97% | 100.00% |
| | Insertion | 1-2 | | | 4295 | 8 | 1 | 99.98% | 99.81% |
| | | 3-5 | | | 3128 | 13 | 8 | 99.74% | 99.59% |
| | | 6-20 | | | 842 | 74 | 72 | 92.12% | 91.92% |
| | | All | | | 8265 | 95 | 81 | 99.03% | 98.86% |
| | Deletion | 1-2 | | | 5621 | 86 | 78 | 98.63% | 98.49% |
| | | 3-5 | | | 3598 | 87 | 2 | 99.94% | 97.64% |
| | | 6-20 | | | 1070 | 0 | 0 | 100.00% | 100.00% |
| | | All | | | 10289 | 173 | 80 | 99.23% | 98.35% |
| NA24143 | SNV (het) | All | | | 768755 | 538 | 64 | 99.99% | 99.93% |
| | SNV (hom) | All | | | 459671 | 4 | 17 | 100.00% | 100.00% |
| | Insertion | 1-2 | | | 4381 | 14 | 1 | 99.98% | 99.68% |
| | | 3-5 | | | 2577 | 5 | 4 | 99.85% | 99.81% |
| | | 6-20 | | | 905 | 18 | 27 | 97.10% | 98.05% |
| | | All | | | 7863 | 37 | 32 | 99.59% | 99.53% |
| | Deletion | 1-2 | | | 5485 | 26 | 4 | 99.93% | 99.53% |
| | | 3-5 | | | 3624 | 18 | 2 | 99.94% | 99.51% |
| | | 6-20 | | | 1082 | 0 | 0 | 100.00% | 100.00% |
| | | All | | | 10191 | 44 | 6 | 99.94% | 99.57% |
| NA24149 | SNV (het) | All | | | 768869 | 2008 | 570 | 99.93% | 99.74% |
| | SNV (hom) | All | | | 459159 | 4 | 18 | 100.00% | 100.00% |
| | Insertion | 1-2 | | | 5173 | 17 | 4 | 99.92% | 99.67% |
| | | 3-5 | | | 2714 | 18 | 0 | 100.00% | 99.34% |
| | | 6-20 | | | 991 | 14 | 2 | 99.80% | 98.61% |
| | | All | | | 8878 | 49 | 6 | 99.93% | 99.45% |
| | Deletion | 1-2 | | | 5109 | 147 | 3 | 99.94% | 97.20% |
| | | 3-5 | | | 3469 | 10 | 1 | 99.97% | 99.71% |
| | | 6-20 | | | 836 | 3 | 1 | 99.88% | 99.64% |
| | | All | | | 9414 | 160 | 5 | 99.95% | 98.33% |
| NA24385 | SNV (het) | All | | | 801423 | 506 | 815 | 99.90% | 99.94% |
| | SNV (hom) | All | | | 477860 | 4 | 13 | 100.00% | 100.00% |
| | Insertion | 1-2 | | | 4669 | 9 | 2 | 99.96% | 99.81% |
| | | 3-5 | | | 2919 | 4 | 1 | 99.97% | 99.86% |
| | | 6-20 | | | 850 | 13 | 0 | 100.00% | 98.49% |
| | | All | | | 8438 | 26 | 3 | 99.96% | 99.69% |
| | Deletion | 1-2 | | | 5135 | 21 | 3 | 99.94% | 99.59% |
| | | 3-5 | | | 3658 | 12 | 72 | 98.07% | 99.67% |
| | | 6-20 | | | 869 | 0 | 0 | 100.00% | 100.00% |
| | | All | | | 9662 | 33 | 75 | 99.23% | 99.66% |
| NA24631 | SNV (het) | All | | | 744482 | 872 | 561 | 99.92% | 99.88% |
| | SNV (hom) | All | | | 518002 | 4 | 105 | 99.98% | 100.00% |
| | Insertion | 1-2 | | | 5429 | 14 | 4 | 99.93% | 99.74% |
| | | 3-5 | | | 2774 | 15 | 14 | 99.50% | |
| | | 6-20 | | | 969 | 13 | 7 | 99.28% | 98.68% |
| | | All | | | 9172 | 42 | 25 | 99.73% | 99.54% |
| | Deletion | 1-2 | | | 5143 | 40 | 15 | 99.71% | 99.23% |
| | | 3-5 | | | 3483 | 8 | 74 | 97.92% | 99.77% |
| | | 6-20 | | | 1140 | 4 | 0 | 100.00% | 99.65% |
| | | All | | | 9766 | 52 | 89 | 99.10% | 99.47% |

# Table 9. Precision Using Reference Cell Lines - Mendeliome

(b)(4)

# Table 10. Precision Using Reference Cell Lines - Priority

(b)(4)

#### ii. Precision - Clinical Specimens

The reproducibility of 18 saliva (clinical) specimens was evaluated as described above. Of the 18 saliva samples, one sample (SF-6203-80058) had all 9 replicates fail at the library preparation step; these were not processed further.
This failure resulted in all downstream replicates of the sample being removed from this study (9 library preparations x 2 enrichments/library preparation x 4 sequencing events/enrichment = 72 replicates not sequenced). The sample was not replaced, as per protocol. Two samples (SF-5744-44790, SF-8763-29117) each had one library preparation replicate, of 9 total per sample, fail at library preparation quantitation; these were not replaced, as per protocol. Failure of these two library preparations removed 16 sample replicates from the study (2 library preparations x 2 enrichments/library preparation x 4 sequencing events/enrichment = 16 replicates not sequenced). One sample (SF-2856-29005) had a total of 9 replicates fail; these were not replaced, as per protocol. Additionally, 21 sample replicates failed to pass the DNA sequencing QC metric thresholds and were not included in the analysis for this study, as per protocol. In summary, 118 sample replicates were not evaluable and were not included in the final data analyses presented below (9.1% of total replicates). The total number of replicates for the 17 saliva specimens is shown in Table 11.

#### Table 11. Post-Sequencing Clinical Sample Replicate Number

| Group Name | Number of Sample Replicates Available for Analysis |
|---------------|----------------------------------------------------------|
| SF-3312-24549 | 69 |
| SF-5744-44790 | 63 |
| SF-8763-29117 | 64 |
| SF-2856-29005 | 60 |
| SF-8085-16718 | 72 |
| SF-3607-46036 | 69 |
| SF-5221-11824 | 72 |
| SF-3002-89172 | 72 |
| SF-7804-33749 | 72 |
| SF-1901-15675 | 69 |
| SF-6203-80058 | 0 |
| SF-8038-20902 | 72 |
| SF-1070-81401 | 72 |
| SF-2757-45105 | 72 |
| SF-2711-11665 | 72 |
| SF-5730-56578 | 64 |
| SF-5044-81848 | 72 |
| SF-7430-13589 | 70 |
| Total | 1178* |

*1178 of 1296 possible replicates

Performance by variant type and size in each saliva sample was broken down by Coding regions (Table 12), Mendeliome regions (Table 13), and Priority regions (Table 14). The clinical samples, each with 72 replicates processed on the Helix Laboratory Platform, passed all acceptance criteria when considered in aggregate: PPA 99.98% and TPPV 99.96% for SNVs, PPA 99.83% and TPPV 99.12% for insertions, and PPA 99.93% and TPPV 99.73% for deletions. SNVs were further stratified by zygosity (heterozygous, homozygous, hemizygous) and indels were stratified by size (1-2 bp, 3-5 bp, and 6-20 bp). Acceptance criteria were not applied to the stratified data; however, values below the acceptance criteria are shaded and shown in bold in the tables.
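The replicate accounting given before Table 11 and the agreement metrics quoted above are simple arithmetic, and a short Python sketch makes them easy to re-check. The constants come from the text. The function and variable names are hypothetical, and the formulas PPA = TP / (TP + FN) and TPPV = TP / (TP + FP) are the conventional definitions, assumed here because this section does not spell them out. The example counts are taken from the SF-1070-81401 6-20 bp insertion row of Table 12.

```python
# Replicate design stated in the text:
# 9 library preparations x 2 enrichments x 4 sequencing events per sample.
LIBRARY_PREPS = 9
ENRICHMENTS_PER_PREP = 2
SEQ_EVENTS_PER_ENRICHMENT = 4
SAMPLES = 18


def replicates_for(preps: int) -> int:
    """Sequenced replicates produced by a given number of library preparations."""
    return preps * ENRICHMENTS_PER_PREP * SEQ_EVENTS_PER_ENRICHMENT


possible = SAMPLES * replicates_for(LIBRARY_PREPS)  # 18 x 72 = 1296

# Exclusions listed in the text (labels are illustrative only).
excluded = {
    "SF-6203-80058, all 9 library preps failed": replicates_for(9),      # 72
    "SF-5744-44790 and SF-8763-29117, one prep each": replicates_for(2),  # 16
    "SF-2856-29005, replicates failed": 9,
    "failed sequencing QC thresholds": 21,
}
not_evaluable = sum(excluded.values())
evaluable = possible - not_evaluable
pct_excluded = 100 * not_evaluable / possible


def ppa(tp: int, fn: int) -> float:
    """Positive percent agreement (assumed definition): TP / (TP + FN)."""
    return tp / (tp + fn)


def tppv(tp: int, fp: int) -> float:
    """Technical positive predictive value (assumed definition): TP / (TP + FP)."""
    return tp / (tp + fp)


print(not_evaluable, evaluable, f"{pct_excluded:.1f}%")  # 118 1178 9.1%
print(f"{100 * ppa(1289, 1):.2f}%")    # 99.92%
print(f"{100 * tppv(1289, 68):.2f}%")  # 94.99%
```

Both printed metrics match the PPA and TPPV reported for that row, which suggests the assumed formulas are the ones used in the tables.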
| Sample Name (number of replicates) | Variant Type | Variant Length | Number of Expected Variants | Number of No-calls | TP | FP | FN | PPA | TPPV |
|---|---|---|---|---|---|---|---|---|---|
| SF-1070-81401 (72) | SNV (het) | All | (b)(4) | | 973343 | 466 | 149 | 99.98% | 99.95% |
| | SNV (hom) | All | | | 564279 | 48 | 87 | 99.98% | 99.99% |
| | Insertion | 1-2 | | | 5184 | 22 | 4 | 99.92% | 99.58% |
| | | 3-5 | | | 3435 | 18 | 7 | 99.80% | 99.48% |
| | | 6-20 | | | 1289 | 68 | 1 | 99.92% | 94.99% |
| | | All | | | 9908 | 108 | 12 | 99.88% | 98.92% |
| | Deletion | 1-2 | | | 7226 | 18 | 10 | 99.86% | 99.75% |
| | | 3-5 | | | 4198 | 17 | 1 | 99.98% | 99.60% |
| | | 6-20 | | | 1371 | 0 | 0 | 100.00% | 100.00% |
| | | All | | | 12795 | 35 | 11 | 99.91% | 99.73% |
| SF-1901-15675 (69) | SNV (het) | All | | | 875072 | 322 | 136 | 99.98% | 99.96% |
| | SNV (hom) | All | | | 529558 | 81 | 121 | 99.98% | 99.98% |
| | SNV (hemi) | All | | | 16370 | 0 | 1 | 99.99% | 100.00% |
| | Insertion | 1-2 | | | 4790 | 18 | 3 | 99.94% | 99.63% |
| | | 3-5 | | | 3149 | 10 | 1 | 99.97% | 99.68% |
| | | 6-20 | | | 1133 | 35 | 27 | 97.67% | 97.00% |
| | | All | | | 9072 | 63 | 31 | 99.66% | 99.31% |
| | Deletion | 1-2 | | | 5780 | 14 | 12 | 99.79% | 99.76% |
| | | 3-5 | | | 4169 | 29 | 2 | 99.95% | 99.31% |
| | | 6-20 | | | 1747 | 1 | 6 | 99.66% | 99.94% |
|…