Unlocking the Value of the Human Genome
Two weeks ago, the National Institutes of Health (NIH) and Amazon Web Services jointly announced the launch of the world’s largest data set on human genetic variation through the Amazon cloud platform. This initiative is a by-product of the 1000 Genomes Project and serves as a testament to the US government’s desire to realize the value of the big data revolution in pursuit of scientific discovery and innovation. The goal of the public-private partnership is to build a comprehensive map of human genetic variation among 26 distinct population groups worldwide and to facilitate the study of genetic variations that may be implicated in many common illnesses. Currently, the genetic data collected for this initiative originates from 1700 individuals and amounts to nearly 200 terabytes. Adding genetic sequence information from the additional 900 individuals will only increase that volume, further justifying the decision to store the data on a cloud-based platform.
For researchers, the launch of human genetic sequences in the cloud means easy access to an extremely large data set and a platform on which to conduct efficient analysis. Of particular value to researchers will be the identification of molecular markers that characterize patient subpopulations within a broader disease segment. Once these unique molecular profiles are found, the idea is to design smart therapeutics that target the corresponding signatures of a disease and thereby improve the effectiveness and safety of new medicines. This approach to treating diseases is called personalized medicine (also known as stratified medicine).
For the pharmaceutical industry, the cloud launch of the 1000 Genomes Project data will be particularly important, not only as an opportunity to find new drug targets but also as a demonstration of the utility of the cloud computing platform. In clinical development programs ranging from oncology to the neurosciences, pharmaceutical companies already collect genetic samples from patients enrolled in clinical trials in the hope of identifying biomarkers that predict optimum drug response. The key driver behind this approach has been the recognition that the traditional practice of applying therapies broadly is not cost-effective and delivers widely varying efficacy. Thus, if large quantities of data can be stored and analyzed more efficiently, the drug development process could be far more successful. This is a particular priority for an industry characterized by a shortage of blockbuster drugs and an increasingly vigilant FDA. Further downstream, harnessing the power of cloud computing for analyses of human genomes may also lead to improved disease surveillance efforts.
To realize any benefits that cloud computing may afford in biomedical research, a key question needs to be addressed: Are we ready for Big Data initiatives? In other words, is the manpower available and are the analytical skills in place? A McKinsey & Company report titled “Big data: The next frontier for innovation, competition, and productivity” suggests that a talent gap exists. Thus, while the NIH-Amazon 1000 Genomes Project will spur further government-led investment in computational tools to analyze genomic data, efforts to mine the data in a meaningful manner may not yield immediate benefits. Moreover, while the market shifts to close the talent gap, the volume of data accumulated in biomedical research will continue to grow, further compounding the challenge of developing algorithms to compare and contrast data sets.
The launch of the 1000 Genomes Project on a cloud-based platform avoids the inefficiencies associated with having to download large amounts of data for analysis. For researchers, having multiple servers handle the analysis of the vast amount of genetic sequence information may lead to more effective drug discovery. However, while the technology may exist to support research in personalized medicine, the key limiting factor will be the availability of people with the skills to mine the available information successfully, since that mining is what is likely to spur new scientific advances and innovation.