Resource Library

AACR19 Poster - Somatic Analysis In the Cloud - TAG

Junko Tsuji, Andrew Hollinger, Alyssa MacBeth, Brian R. Grander, Micah Rickles-Young, Tera Bowers, Carrie Cibulskis, Niall Lennon. Broad Institute of MIT and Harvard, Cambridge, MA 

With advances in high-throughput sequencing technologies and analytical tools, genomic analysis of tumors has led to the identification of many important somatic mutations that shed light on the diagnosis, prevention, and treatment of cancer. However, detecting somatic variants is not trivial, both technically (e.g. filtering germline events and removing the various sources of noise in tumor samples) and in terms of the computational resources needed for large-scale cohort analysis. There is also a demand for maintaining stable software and workflow versions for studies that run over extended periods and need consistency and traceability, such as clinical trials. Here we introduce the Translational Analysis Group (TAG), a team that deploys, validates, and runs scalable analytical workflows in a secure, cloud-based environment. We maintain 29 well-tested workflows with best practice methods and ample resources for both somatic and germline analysis. Our cloud platform, FireCloud, enables us to run workflows at any scale. Since May 2017, our team has performed nearly 10,000 analyses for mutation detection (SNV, InDel, CNV, and SV) and cohort analysis on tumor samples and cell-free DNA samples. TAG offers a range of options for somatic and germline variant detection, from legacy pipelines through recently validated contemporary pipelines, to allow for continuity across long-running projects.

AACR19 Poster - Clinical Whole Genome Sequencing At Scale

Alyssa MacBeth, Tera Bowers, Betty Woolf, Maegan Harden, Niall Lennon, Stacey Gabriel. Broad Institute of MIT and Harvard, Cambridge, MA

Whole genome sequencing (WGS) offers the greatest potential to comprehensively and accurately identify all forms of human genetic variation. The accessibility of WGS data for diagnostic use has increased due to cost reductions in recent years, driving the need for a high-quality, clinically validated offering. Our platform has completed an extensive benchmarking study of the Illumina PCR-free whole genome pipeline to establish clinical validity for the end-to-end laboratory, analytical, and computational processes. The study measured precision, robustness, limits of detection for each variant class, and contamination estimates. A cohort of well-characterized reference controls and clinical samples with previously identified pathogenic variants was used to establish the performance characteristics of our clinical WGS test, which at 30X coverage has >99% analytical sensitivity for SNVs and >98% analytical sensitivity for small insertions and deletions. Our platform can operate at a scale that supports applications ranging from individual clinics (cancer and medical genetics applications) to large-scale, ambitious collaborative projects such as the All of Us program, which aims to generate clinical-grade whole genome variant calls for 1 million healthy research participants.

AACR19 Poster - Somatic SNP and CNV calling with the Genome Analysis Toolkit (GATK)


Somatic small mutations (SNVs and indels) and copy number alterations are the two categories of mutations with the largest impact on tumors. The Broad Institute has released somatic variant calling workflows for small mutations (Mutect2, abbreviated M2) and copy number alterations (ModelSegments) based on the Genome Analysis Toolkit (GATK). The suite of workflows can call variants in capture or whole-genome sequencing data and will include functional annotations (Funcotator), such as protein change (for small variants) and impacted gene (for all variants). Common artifacts in sequencing data, such as those arising from oxidative DNA damage, FFPE/deamination, or mapping errors, are corrected automatically. Evaluation of the workflows is standardized and repeatable, which allows performance to be tracked across versions, covering both detection performance (e.g. sensitivity, precision) and runtime performance (e.g. CPU and RAM usage). A matched normal is not required for a given tumor sample, since the workflows can leverage pre-processed panels of normals (PoNs). The workflows are freely available, are portable (i.e. can be run on local, on-prem, or cloud compute), are optimized for cost reduction, and can be tuned to make the best use of available compute.
The measured sensitivity of M2 was at least 0.93 for somatic single-nucleotide variants (SNVs) and 0.83 for small insertions/deletions (indels) on the DREAM1, DREAM2, and DREAM3 challenges, and on a titrated mixture of germline samples (>=100x depth, AF = 0.2). The measured precision of M2 ranged from 0.91 to 0.98 on DREAM1, DREAM2, and DREAM3 for both SNVs and indels. The false positive rate (FPR) of M2 was between 0.03 and 0.21 FP/Mb for SNVs, and between 0.0 and 0.1 FP/Mb for indels, on twelve paired, replicate normal-normal samples. The M2 workflow cost about USD 1.15 for a pair of 35x WGS matched tumor-normal samples using Google Cloud Compute, and required about 32 hours of CPU time on a single core with 3 GB RAM.
The measured sensitivity of ModelSegments was at least 0.91 for deletions and amplifications across three cohorts of TCGA whole-exome samples (Stomach adenocarcinoma N=39, Thyroid carcinoma N=50, and Lung adenocarcinoma N=60). The measured specificity for the same set of cohorts was at least 0.96 for both deletions and amplifications. All results reported here used the corresponding SNP array results as the truth set.
The ModelSegments workflow cost approximately USD 0.65 for a 30x WGS pair using Google Cloud Compute, and required about 6 hours of CPU time on a single core. RAM was adjusted automatically within the workflow to minimize cost and stayed in the range of 2-13 GB.
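
The detection metrics quoted in this abstract (sensitivity, precision, specificity, and false positives per megabase) follow standard definitions. The snippet below is a minimal sketch of those definitions only, not the evaluation code used for the poster; the `CallCounts` class, the function names, and the example counts are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CallCounts:
    """Hypothetical tally of variant calls compared against a truth set."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int = 0  # only meaningful when negatives are countable


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def sensitivity(c: CallCounts) -> float:
    """Fraction of true events that were called: TP / (TP + FN)."""
    return _ratio(c.true_positives, c.true_positives + c.false_negatives)


def precision(c: CallCounts) -> float:
    """Fraction of calls that are true events: TP / (TP + FP)."""
    return _ratio(c.true_positives, c.true_positives + c.false_positives)


def specificity(c: CallCounts) -> float:
    """Fraction of true negatives left uncalled: TN / (TN + FP)."""
    return _ratio(c.true_negatives, c.true_negatives + c.false_positives)


def false_positives_per_mb(false_positives: int, territory_bp: int) -> float:
    """False positive calls per megabase of evaluated territory."""
    return _ratio(false_positives, territory_bp / 1_000_000)


# Illustrative counts only; these are NOT taken from the poster.
example = CallCounts(true_positives=93, false_positives=5, false_negatives=7)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"precision   = {precision(example):.3f}")
print(f"FP/Mb       = {false_positives_per_mb(30, 1_000_000_000):.2f}")
```

In a normal-normal comparison such as the one described above, every call is by construction a false positive, so FP/Mb can be computed without a truth set; sensitivity and precision still require one.
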
AGBT19 Poster - Providing large scale single-cell RNA-seq in the Genomics Platform at the Broad Institute

Corey Nolet, Cole Walsh, Brian Granger, Tim Desmet, Tera Bowers, Niall Lennon

Broad Institute of MIT & Harvard


Single-cell RNA-sequencing (scRNA-seq) is a powerful technique to study gene expression, cellular heterogeneity, and delineation of cell states. Interest in single-cell research continues to grow and gain momentum, which can be seen in projects such as the Human Cell Atlas.


Single-cell sample preparation has been around for almost a decade; however, only recently has single-cell RNA-seq reached the tipping point into truly high scale. With more sample preparation methods available and lower sequencing costs, demand is increasing for reproducible, high-quality methods. In response, Broad Genomics has expanded its portfolio to include single-cell services that use automated workflows and integrated sample tracking to support 10X Genomics Chromium Single-Cell 3’ and modified SMART-Seq2 mRNA library construction, sequencing, and analysis at scale. Throughout the development of the SMART-Seq2 workflow we had to address major design challenges, such as handling minimal input concentrations, high-viscosity master mixes, and low volumes under full-scale automation. We accomplished this using a variety of in-house automation liquid classes, labware, and protocol designs.

Data delivery and analysis from single-cell sequencing are made available through our cloud-based platform, FireCloud. Each cell is run through a best-practices QC and analysis workflow that mirrors the methods being made publicly available via the Human Cell Atlas consortium and consists of alignment with HISAT2, sequencing quality assessment with Picard tools, and determination of expression with RSEM. These data are then aggregated and run through a second workflow that visualizes the results at the plate level, allowing for troubleshooting of lab processes and identification of systematic biases.
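
As a rough illustration of what plate-level aggregation can reveal, the sketch below groups a per-cell QC metric by plate row and flags rows whose median falls below a cutoff. It is not the workflow described above: the well naming scheme (`A1`, `B3`, ...), the choice of metric, the threshold, and both function names are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def summarize_by_row(cell_metrics: dict[str, float]) -> dict[str, float]:
    """Median of a per-cell QC metric for each plate row.

    `cell_metrics` maps a well ID such as "A1" (row letter followed by
    column number; an assumed naming scheme) to a metric value, for
    example the fraction of reads aligned for the cell in that well.
    """
    by_row: dict[str, list[float]] = defaultdict(list)
    for well, value in cell_metrics.items():
        by_row[well[0]].append(value)
    return {row: median(values) for row, values in sorted(by_row.items())}


def flag_low_rows(row_medians: dict[str, float], threshold: float) -> list[str]:
    """Rows whose median metric is below `threshold` (hypothetical cutoff)."""
    return [row for row, value in row_medians.items() if value < threshold]


# Made-up values for illustration: row B looks systematically worse.
metrics = {
    "A1": 0.90, "A2": 0.88, "A3": 0.92,
    "B1": 0.40, "B2": 0.45, "B3": 0.50,
}
row_medians = summarize_by_row(metrics)
print(row_medians)
print(flag_low_rows(row_medians, threshold=0.6))
```

A whole row or column underperforming points toward a lab-process cause (for example a dispensing channel) rather than biology, which is the kind of systematic bias a plate-level view is meant to expose.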

AGBT19 Poster - Utilizing a novel microfluidic technology to enable robust and rapid “hands-off” library preparation for human whole genomes

Maura Costello1, Tim Desmet1, Julia Yoo2, Severine Margeridon2, Nikolay Sergeev2, Yu-Hung Chen2, Mais Jebrail2, Fay Christodoulou2

1 Broad Institute Genomics Platform, Cambridge, MA 02141

2 Miroculus, Inc., San Francisco, CA 94107

As whole genome sequencing has become more widely adopted due to ever-decreasing sequencing costs, Broad Genomics has seen increased demand for our PCR-Free genome offerings in both the traditional and clinical research spaces. For clinical research in particular, rapid turnaround, complete chain-of-custody tracking, reproducibility, and tolerance for lower-input and lower-quality samples are all highly desirable features for a whole genome workflow. To address these needs, we are collaborating with Miroculus to develop sample preparation methods using their novel aeros™ microfluidic technology. Our first application will be PCR-Free whole genomes utilizing a version of the KAPA Hyper Prep workflow. All sample manipulation steps, including enzyme additions, incubations, and magnetic bead-based cleanups, are done on the Miroculus instrument, with the output being a sequencer-ready PCR-Free library. The technology allows for hands-off library preparation, reducing the chance of user error, cross-contamination, and sample swaps. Here, we demonstrate that libraries made using the Miroculus instrument are equivalent to or better than those from current manual or plate-based automated methods in terms of turnaround time, reproducibility, yield, and data quality. Additionally, the aeros™ microfluidic technology improves the stoichiometry of library preparation reactions by introducing active mixing steps during incubations and shrinking reaction volumes, leading to higher adapter ligation conversion rates. We demonstrate that this increased efficiency allows creation of robust PCR-Free libraries from a variety of sample types with much less starting DNA input than our current plate-based automated methods, enabling PCR-Free genomes from samples that previously did not meet our specifications for DNA quantity.
Going forward, we believe this technology has the potential to be applied to a wider variety of applications, including DNA extractions, blood biopsy targeted methods, and other workflows, and could enable ultra-rapid turnaround applications for clinical research.