Resource Library

AACR19 Poster - Innovations in large scale liquid biopsy

Carrie Cibulskis1, Brendan Blumenstiel1, Matthew DeFelice1, Mark Fleharty1, Justin Abreu1, Viktor Adalsteinsson2, Laxmi Parida3, Susanna Hamilton1, Gad Getz4, Niall Lennon1

1Broad Institute, Cambridge, MA; 2Broad Institute, Koch Institute, Massachusetts General Hospital, Cambridge, MA; 3IBM, Cambridge, MA; 4Broad Institute, Harvard Medical School, Massachusetts General Hospital, Cambridge, MA

Broad Genomics offers a comprehensive liquid biopsy sequencing platform designed to provide flexibility for research studies across a broad range of applications, including biomarker discovery, treatment resistance monitoring, and clinical grade ctDNA profiling. By combining low-cost, low-coverage whole genome sequencing with dual unique molecular indexed (UMI) libraries, we offer a range of analyses that allow researchers to select the most appropriate samples for whole exome profiling or for deeper coverage, higher sensitivity targeted gene panels. To date, we have generated over 3000 liquid biopsy whole genome copy number profiles and purity estimates, and we are supporting driver projects including the Broad/IBM Cancer Resistance Project and Count Me In. The study design for the Broad/IBM effort takes advantage of the discovery potential of tissue-based sequencing combined with serial liquid biopsy analysis to elucidate resistance events by tracking clonal and subclonal populations in patient samples over time. Sourcing samples for this and similar efforts is a major undertaking, and because tumor fraction in blood fluctuates, a combination of methods for maximally broad and deep genomic profiling is required to assay patients throughout the course of care. In response to this need, and to other applications requiring increased sensitivity, we have developed a high throughput, automated workflow to efficiently assay cfDNA samples with lower tumor content. Benchmarking data from healthy donor pooled cfDNA samples indicate that our assay detects >90% of variants present at ~1% minor allele fraction, with fewer than 1 false positive variant called per megabase. This established laboratory and analytic process forms the basis of our 2 Mb, 400-gene CLIA targeted assay, which is currently undergoing validation.
Through this suite of products we hope to enable an expansion of cfDNA sequencing efforts in support of clinical and research applications. Early results from emerging studies utilizing this platform will be presented.
AACR19 Poster - Somatic Analysis In the Cloud - TAG

Junko Tsuji, Andrew Hollinger, Alyssa MacBeth, Brian R. Grander, Micah Rickles-Young, Tera Bowers, Carrie Cibulskis, Niall Lennon. Broad Institute of MIT and Harvard, Cambridge, MA 

With advances in high-throughput sequencing technologies and analytical tools, genomic analysis of tumors has led to the identification of important somatic mutations that shed light on the diagnosis, prevention, and treatment of cancer. However, detecting somatic variants is not a trivial task, both technically (e.g. filtering germline events and removing the various sources of noise in tumor samples) and in terms of the computational resources needed for large-scale cohort analysis. Studies that run over extended periods of time and need consistency and traceability, such as clinical trials, also require stable software versions and workflows. Here we introduce the Translational Analysis Group (TAG), a team that deploys, validates, and runs scalable analytical workflows in a secure, cloud-based environment. We maintain 29 well-tested workflows with best practice methods and ample resources for both somatic and germline analysis. Our cloud platform, FireCloud, enables us to run workflows at any scale. Since May 2017, our team has performed nearly 10,000 analyses for mutation detection (SNV, InDel, CNV, and SV) and cohort analysis on tumor samples and cell-free DNA samples. TAG offers a range of options for somatic and germline variant detection, from legacy pipelines to recently validated contemporary pipelines, to allow for continuity across long-running projects.
AACR19 Poster - Clinical Whole Genome Sequencing At Scale

Alyssa MacBeth, Tera Bowers, Betty Woolf, Maegan Harden, Niall Lennon, Stacey Gabriel. Broad Institute of MIT and Harvard, Cambridge, MA

Whole genome sequencing (WGS) offers the greatest potential to comprehensively and accurately identify all forms of human genetic variation. Cost reductions in recent years have made WGS data more accessible for diagnostic use, driving the need for a high quality, clinically validated offering. Our platform has completed an extensive benchmarking study of the Illumina PCR-free whole genome pipeline to establish clinical validity for the end-to-end laboratory, analytical, and computational processes. The study measured precision, robustness, limits of detection for variant classes, and contamination estimates. A cohort of well-characterized reference controls and clinical samples with previously identified pathogenic variants was used to establish the performance characteristics of our clinical WGS test, which at 30X coverage has >99% analytical sensitivity for SNVs and >98% analytical sensitivity for small insertions and deletions. Our platform operates at a scale that supports applications ranging from individual clinics (cancer and medical genetics related applications) to large-scale collaborative projects such as the All of Us program, which aims to generate clinical grade whole genome variant calls for 1 million healthy research participants.
AACR19 Poster - Somatic SNP and CNV calling with the Genome Analysis Toolkit (GATK)


Small somatic mutations (SNVs and indels) and copy number alterations are the two categories of mutations with the largest impact on cancer tumors. The Broad Institute has released somatic variant calling workflows for small mutations (M2) and copy number alterations (ModelSegments) based on the Genome Analysis Toolkit (GATK). The suite of workflows can call variants in capture or whole-genome sequencing data and includes functional annotation (Funcotator), such as protein change (for small variants) and impacted gene (for all variants). Common artifacts in sequencing data, such as those arising from oxidative DNA damage, FFPE/deamination, or mapping errors, are corrected automatically. Evaluation of the workflows is standardized and repeatable, which allows performance to be tracked across versions, both detection performance (e.g. sensitivity, precision) and runtime performance (e.g. CPU and RAM usage). A matched normal is not required for a given tumor sample, since the workflows can leverage pre-processed panels of normals (PoNs). The workflows are freely available, are portable (i.e. can be run on local, on-prem, or cloud compute), are optimized for cost reduction, and can be tuned to make the best use of available compute.
The measured sensitivity of M2 was at least 0.93 for somatic single-nucleotide variants (SNVs) and 0.83 for small insertions/deletions (indels) on the DREAM1, DREAM2, and DREAM3 challenges, and on a titrated mixture of germline samples (>=100x depth, AF = 0.2). The measured precision of M2 ranged from 0.91 to 0.98 on DREAM1, DREAM2, and DREAM3 for both SNVs and indels. The false positive rate (FPR) of M2 was between 0.03 and 0.21 FP/Mb for SNVs, and between 0.0 and 0.1 FP/Mb for indels, on twelve paired, replicate normal-normal samples. The M2 workflow cost about USD 1.15 for a pair of 35x WGS matched tumor-normal samples using Google Cloud Compute, and required about 32 hours of CPU time on a single core with 3GB RAM.
The measured sensitivity of ModelSegments was at least 0.91 for deletions and amplifications across three cohorts of TCGA whole-exome samples (Stomach adenocarcinoma N=39, Thyroid carcinoma N=50, and Lung adenocarcinoma N=60). The measured specificity for the same set of cohorts was at least 0.96 for both deletions and amplifications. All results reported here used the corresponding SNP array results as the truth set.
The ModelSegments workflow cost approximately USD 0.65 on a 30x WGS pair using Google Cloud Compute and required about 6 hours of CPU time on a single core. RAM usage was varied automatically within the workflow to minimize cost, and stayed in the range of 2-13GB.
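The detection metrics quoted in this abstract (sensitivity, precision, specificity, and false positives per megabase) all derive from counts of calls compared against a truth set. The sketch below is a minimal Python illustration using the standard definitions; it is not the evaluation code used for the poster, and the class name, function names, and example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallCounts:
    """Counts from comparing variant calls to a truth set (hypothetical container)."""
    tp: int      # true positives: truth variants that were called
    fp: int      # false positives: calls not present in the truth set
    fn: int      # false negatives: truth variants that were missed
    tn: int = 0  # true negatives: only meaningful where negatives are countable


def sensitivity(c: CallCounts) -> float:
    """Fraction of truth variants that were detected: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def precision(c: CallCounts) -> float:
    """Fraction of calls that are real: TP / (TP + FP)."""
    return c.tp / (c.tp + c.fp)


def specificity(c: CallCounts) -> float:
    """Fraction of true negatives correctly left uncalled: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def false_positives_per_mb(fp: int, territory_bp: int) -> float:
    """False positive calls per megabase of evaluated territory."""
    return fp / (territory_bp / 1_000_000)


# Illustrative counts only; these are NOT data from the poster.
example = CallCounts(tp=93, fp=5, fn=7, tn=96)
example_sensitivity = sensitivity(example)
example_precision = precision(example)
example_specificity = specificity(example)
# In a normal-normal comparison every call is a false positive, so the
# rate is the number of calls divided by the territory evaluated.
example_fp_rate = false_positives_per_mb(fp=6, territory_bp=30_000_000)
```

Reporting false positives per megabase rather than as a fraction is convenient for normal-normal replicates, where there are no true somatic variants and precision is therefore undefined.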
AGBT19 Poster - Providing large scale single-cell RNA-seq in the Genomics Platform at the Broad Institute

Corey Nolet, Cole Walsh, Brian Granger, Tim Desmet, Tera Bowers, Niall Lennon

Broad Institute of MIT & Harvard


Single-cell RNA-sequencing (scRNA-seq) is a powerful technique for studying gene expression and cellular heterogeneity and for delineating cell states. Interest in single-cell research continues to grow and gain momentum, as seen in projects such as the Human Cell Atlas.


Single-cell sample preparation has been around for almost a decade, but only recently has single-cell RNA-seq reached the tipping point into truly high scale. With more sample preparation methods available and sequencing costs lower, demand is increasing for reproducible, high quality methods. In response, Broad Genomics has expanded its portfolio to include single-cell services that use automated workflows and integrated sample tracking to support 10X Genomics Chromium Single-Cell 3’ and modified SMART-Seq2 mRNA library construction, sequencing, and analysis at scale. Throughout the development of SMART-Seq2 we had to deal with major design challenges, such as managing minimal input concentrations, high viscosity master mixes, and low volumes at full scale automation. We addressed these with a variety of automation liquid classes, labware, and protocol designs created in-house.

Data delivery and analysis for single-cell sequencing are made available through our cloud-based platform, FireCloud. Each cell is run through a best practices QC and analysis workflow that mirrors the methods being made publicly available via the Human Cell Atlas consortium: alignment with HISAT2, sequencing quality assessment with Picard tools, and determination of expression with RSEM. These data are then aggregated and run through a second workflow that visualizes the results at the plate level, allowing for troubleshooting of lab processes and identification of systematic biases.