The new era of personalised medicine is already here: the first CAR-T therapies and other types of genetic intervention are undergoing clinical testing. But a final step is still required to extract better predictive value from genetic data and use it in R&D to optimise drug development. A clear understanding of the link between the biological target of a drug and human disease is a fundamental prerequisite for new drug development strategies that can increase the proportion of drug candidates reaching the market. An estimated 90 percent of candidate medicines entering clinical trials currently fail to demonstrate the necessary efficacy and safety profile, which weighs heavily on the R&D costs borne by pharmaceutical companies. Genetic evidence is playing an increasing role in the pre-clinical validation of drug targets, even at the level of individual patients, thanks to the availability of personalised pools of biomarkers that can be used to diagnose a disease and monitor its evolution.

A new consortium to speed up research

Drug target identification, genetically validated clinical indications and genomic markers for pharmacogenetic applications are just some examples of what genetic big data analysis can deliver. This type of information is central to the new pre-competitive consortium that will fund exome sequencing of the biological samples obtained from the half a million volunteers enrolled in the UK Biobank.

The consortium brings together six major pharmaceutical companies: Regeneron (the lead company), AbbVie, Alnylam Pharmaceuticals, AstraZeneca, Biogen and Pfizer. Each participating company is committing $10 million to the project, with the goal of making exome sequencing data available by the end of 2019, together with the associated de-identified medical and health records of the volunteers. Big data analysis of this resource should then yield information that can speed up R&D activities. According to Regeneron, consortium members will have a limited period of exclusive access to the exome sequencing data and findings before these are made openly available to other researchers by the end of 2020.

The next step of the project, said the company, will be to sequence the entire genome of UK Biobank participants, an activity that will not be completed for several years after the current exome sequencing project ends.

Linking human genetic variations to human biology and disease

A first step of the project was carried out in 2017 with the support of Regeneron and GSK, when samples from an initial pool of 50,000 UK Biobank participants were sequenced; at that point, sequencing of all samples was expected to finish in 2022. The formation of the consortium has now considerably shortened that timeline. Sequencing of the samples will continue to be performed at the Regeneron Genetics Center (RGC) facility. The resulting genetic data will be paired with the detailed health information stored by the UK Biobank, which includes brain, heart and body imaging as well as behavioural and psychological information on participants (all in anonymised form). It will thus become possible to link the genetic variations identified by data analysis to human biology and disease.

“With mounting national and global health concerns due to widespread increases in obesity-related diseases like diabetes, and age-related diseases such as dementia, together with the ongoing threats of cardiovascular disease, cancer and infectious agents, it is a great statement that so many leading Life Sciences companies are willing to put aside their individual differences and come together to bring this unprecedented, pre-competitive ‘big data’ resource to the world. We all hope and believe this will greatly accelerate our collective efforts to make a profound impact on human health,” said George D. Yancopoulos, President and Chief Scientific Officer of Regeneron.

The main partners of the project

The UK Biobank has already completed a genotyping project on its 500,000 samples; those results were published in mid-2017. Genotyping yields less detailed genetic information, as it measures only specific “letters” in the DNA at selected locations across the genome. Exome sequencing goes into far more detail, as it records every letter in the DNA of the exome, i.e. the 1-2 percent of the genome coding for all known proteins. Expressed proteins are believed to be more relevant for therapeutic development and for the understanding of inherited disease, and they represent a common target for drug development.

The UK Biobank is a public institution founded by the Wellcome Trust and the UK Medical Research Council, and it represents the most comprehensive resource of its kind in the world. Approximately half a million volunteers, both healthy and ill, have provided the Biobank with information about their health, well-being and lifestyle, as well as blood and other biological samples for long-term storage and analysis. The project goes further still, as the volunteers are followed for many years to track how their health evolves. The data stored by the UK Biobank are anonymised and can be accessed by scientists for research intended to improve the prevention and treatment of a wide range of common disorders.

The Regeneron Genetics Center’s fully integrated genomics programme spans early gene and target discovery, functional genomics and genetics-guided drug development. Its role in the new pre-competitive consortium is just the latest of its more than 60 collaborations with leading human genetics researchers and biobanks around the world. The Center has already sequenced samples from more than 250,000 appropriately consented individuals, using its fully automated sample preparation and data processing equipment, cutting-edge cloud-based informatics and large-scale analytical capabilities.