Broad Institute enables new levels of research with Google engineering and Intel optimizations.
Innovation: The Broad Institute, a joint research organization of MIT and Harvard, has optimized its workflows for fast, cost-effective Google Cloud N1 and N2 instances through its collaboration with Intel and Google Cloud to accelerate genomic research. The collaboration produced an 85% reduction in data processing cost after optimization, compared with the initial deployment of the workloads on Google Cloud(1).
“We knew that the cloud would enable an entirely new level of data consolidation and collaboration, and to that end, we could work with others to create a cloud-based data ecosystem where researchers could combine their workflows with more of the data they had created with other datasets, turning them into richer, more powerful computational experiments.” –Geraldine Van der Auwera, director of Data Science Platform outreach and communications at the Broad Institute, a joint research organization of MIT and Harvard
How It Works: The Broad Institute migrated workloads to Google Cloud N2 instances to keep up with the dramatic increase in demand for genomic data generation and computational research. By modularizing pipeline workloads, sizing cloud instances to fit the needs of each workload, and optimizing for Intel® Xeon® Scalable processors, Broad Institute users can run genomics workflows 25% faster and at 34% lower cost on Google Cloud. To get these gains, they only need to deploy their workloads on N2 instances with Xeon Scalable processors.
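To make the idea of sizing instances to fit each modular pipeline step concrete, here is a minimal sketch in Python. It is not the Broad Institute's tooling. The stage names and their resource needs are hypothetical, the list covers only a small subset of N2 machine shapes (check them against current Google Cloud documentation), and ranking shapes by vCPU count and then memory is an assumed stand-in for price.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Shape:
    name: str
    vcpus: int
    memory_gb: int


# Small illustrative subset of N2 machine shapes (verify before relying on it).
SHAPES = [
    Shape("n2-highcpu-2", 2, 2),
    Shape("n2-standard-2", 2, 8),
    Shape("n2-highmem-2", 2, 16),
    Shape("n2-highcpu-4", 4, 4),
    Shape("n2-standard-4", 4, 16),
    Shape("n2-highmem-4", 4, 32),
    Shape("n2-standard-8", 8, 32),
]


def smallest_fit(vcpus_needed: int, memory_gb_needed: int,
                 shapes: Sequence[Shape] = SHAPES) -> Optional[Shape]:
    """Return the smallest shape that satisfies both requirements, or None.

    "Smallest" means fewest vCPUs, then least memory: an assumed proxy for
    cost, not a real price lookup.
    """
    candidates = [s for s in shapes
                  if s.vcpus >= vcpus_needed and s.memory_gb >= memory_gb_needed]
    return min(candidates, key=lambda s: (s.vcpus, s.memory_gb), default=None)


# Hypothetical pipeline stages: name -> (vCPUs needed, memory in GB needed).
STAGES = {
    "align": (8, 24),
    "mark_duplicates": (2, 12),
    "call_variants": (4, 14),
}

# Each stage gets its own shape instead of one large instance for everything.
plan = {stage: smallest_fit(cpus, mem) for stage, (cpus, mem) in STAGES.items()}

if __name__ == "__main__":
    for stage, shape in plan.items():
        print(stage, "->", shape.name if shape else "no fitting shape")
```

Because each stage is sized separately, a memory-heavy but lightly threaded step does not force every other step onto the largest machine, which is the intuition behind modularizing the pipeline first.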
Since 2017, Intel has partnered with the Broad Institute to help optimize the organization’s pipelines and the Genome Analysis Toolkit (GATK) with Intel libraries such as the Intel® Genomics Kernel Library. We also launched the Intel-Broad Center for Genomic Data Engineering, a project co-directed by Intel and the Broad Institute that enables researchers and software engineers around the world to create, optimize, and share new tools and infrastructure to help scientists integrate and process genomic data.
Intel helped the Broad Institute optimize pipelines on Google Cloud. For example, certain kernels in the Genome Analysis Toolkit are optimized for vector operations with Intel® Advanced Vector Extensions 512 (Intel® AVX-512). Some of the optimized storage functions use the Intel® Intelligent Storage Acceleration Library (Intel® ISA-L).
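Vectorized code paths like these only help when the underlying processor exposes the instruction set. As a rough illustration, the Python sketch below checks whether a Linux machine reports the AVX-512 Foundation flag (`avx512f`) in `/proc/cpuinfo`. The function names are hypothetical, and this is not how GATK or the Intel libraries perform their own detection.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Extract the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough for a quick check.
            return set(value.split())
    return set()


def has_avx512f(cpuinfo_text: str) -> bool:
    """True if the AVX-512 Foundation flag is present in the given text."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # Not Linux, or the file is unreadable: report False.
    print("AVX-512F reported:", has_avx512f(text))
```

Running a check like this on a candidate instance before scheduling work is one simple way to confirm that the machine type delivers the vector extensions the optimized kernels expect.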
To bring its vision of a broader life sciences ecosystem to life, the Broad Institute, Microsoft, and Verily jointly developed Terra, a scalable and secure platform where biomedical researchers around the world can access data, use analytics tools, and collaborate. Terra was designed on cloud infrastructure that allows the Broad Institute to scale easily and give the research community new capabilities for research into human disease.
Why It Matters: Genomics has changed the way the biological sciences are done. With the help of Intel and Google Cloud, the Broad Institute is at the forefront of innovation, helping to accelerate genomic research. By moving workloads to the cloud and optimizing them for Google Cloud instances, the Broad Institute has addressed its storage capacity and compute challenges in a scalable, forward-looking way. The co-created Terra platform lets the Broad Institute empower life scientists around the world alongside its own research teams, gives them access to optimized tools and pipelines, and creates a unified data ecosystem that opens up exciting new possibilities for biomedical research.