Datasheet for the custom capture and targeted sequencing product
Micah Rickles-Young, Junko Tsuji, Alyssa Macbeth, Brian R. Granger, Tera Bowers, Carrie Cibulskis, Niall Lennon
The Broad Institute has a long history in genomic sequencing and in the development of tools for researchers to analyze these data. With improvements in technology and reductions in cost, the rate of sequence generation is increasing, which necessitates a platform that can scale the associated analyses. We also need to be able to apply our best-practice methods across a range of complex workflows to support the breadth of science among our users. This challenge spurred the creation of the Translational Analysis Group (TAG) within the Genomics Platform at the Broad Institute. Over the past two years, our group has developed and maintained over 30 validated, version-controlled workflows and has run over 20,000 analyses on Terra, the Broad Institute’s cloud-based analysis platform. Until recently, we have mainly focused on supporting germline and somatic variant analyses on whole genome and exome libraries; however, there is high demand to integrate RNA sequencing (RNA-seq) into these analyses. In this presentation, we introduce our new RNA-seq workflows for bulk and single-cell RNA experiments. Our suite of RNA-seq workflows starts by mapping RNA reads to a reference genome and then profiles gene and isoform expression. The bulk RNA-seq outputs can be used as inputs to downstream workflows that perform differential expression and RNA variant calling analyses. To evaluate the workflows, we benchmarked them with publicly available datasets such as GTEx, checking the expression profiles and comparing the RNA variant calls against the matched exome. The development of our RNA-seq analysis capabilities increases the scope of projects, both internal and external, for which TAG can provide analysis services with the reproducibility, scalable resources, and version control necessary for consistency in studies that extend over a long period of time, such as clinical trials.
Tom Howd, Peter Trefry, Samuel DeLuca, Marissa Gildea, Doug Gobron, Michael Nasuti, Shannon Adams, and Tim DeSmet
The utility and application of genomics to understand disease, and the continuing trend to utilize genomics in healthcare, result in an ever-increasing demand for greater sequence data generation. Despite the significant reductions in per-base sequencing cost over the last decade, the infrastructure, capital, and reagent costs are still relatively high. Top-of-the-line sequencers can cost over 1 million dollars per instrument, and sequencing run costs can still be tens of thousands of dollars. With such high fixed costs associated with genome data generation, it is important to maximize capacity utilization and reduce non-value-added and wasteful workflow process steps. We demonstrate the application of lean manufacturing methodologies and visual management techniques to the genomic sequencing workflow, which results in a sequencer utilization rate of around 90% while scaling our library preparation process threefold, to over 300,000 samples destined for exome and whole human genome sequencing annually.
By combining the sample preparation methods for both exome and whole genome sequencing into a unified, modularized workflow, sample and reagent supply chains can be optimized, resulting in more efficient and cost-effective processing. Additional benefits include reductions in work in process and overall cycle times. Here, we illustrate the methodologies that enable low cost-per-base sequence data generation, applicable to both large sequencing cores and modest-sized data generation groups.
Tera Bowers, Carrie Cibulskis, Justin Abreu, Andrew Hollinger, Cole Walsh, Maura Costello, Niall Lennon
In 2017, the Genomics Platform (GP) at the Broad Institute created a new set of roles (Portfolio Leads) to better support both the germline and somatic research communities. The creation of these roles has allowed the platform to roll out new products and improve existing products with a focus on the specific features required to serve the scientific questions our communities want to address. One of the key drivers for the germline portfolio is Broad’s Medical and Population Genetics (MPG) group. When we researched the requirements for the new exome, MPG investigators expressed a need for better mitochondrial coverage. After working closely with MPG, our R&D team, and Twist Bioscience, we achieved a 100-fold increase in mitochondrial genome coverage while still maintaining abundant and even coverage across the rest of the custom exome design. Another set of offerings that the portfolio has been instrumental in developing is the new single-cell RNA-seq products. A collaboration with Aviv Regev’s lab has successfully scaled the SmartSeq2M protocol with full automation from library construction to sequencing. The Platform now has the capacity to construct libraries for and sequence 16 plates a week, giving more researchers access to sequencing of either single cells or cell populations with full-length transcript capture methods. A suite of long-read sequencing products is also being developed. These aim to provide improved structural variation calling in human whole genome sequencing. The GP was an early access site for PacBio’s Sequel II instrument, with higher-yielding 8M SMRT cells and longer run times. In GP’s hands, the Sequel II has delivered raw average read lengths of ~50 kb with 50% of reads being >140 kb. These new offerings will continue to enable the science of our research community.
Implementation of our v6 germline exome product