To design an efficient and effective drug, researchers must first identify a target, the part of our biological machinery that the drug is going to lock onto and perturb. G protein-coupled receptors (GPCRs) are the largest family of membrane protein receptors in the human genome. Because these receptors control many functions of the human body, they are the target of one-third of currently approved drugs.

In the human genome, there are roughly 800 GPCRs. More than half of these belong to the odorant receptor family and are involved in our sense of smell. But other GPCRs are expressed throughout the human body, from the heart to the immune system. Although GPCRs are an important target, much remains to be learned about them, including why some drugs that target them succeed while others fail.

At St. Jude, M. Madan Babu, PhD, is leading a research laboratory that focuses on mining, integrating and analyzing biological data in innovative ways to glean new insights and knowledge that may lead to biomedical discoveries with clinical potential. Babu is a member of the Structural Biology Department, endowed chair in Biological Data Science and director of the Center of Excellence for Data-Driven Discovery. His recent work on GPCRs provides a key clue as to why drugs that target these receptors can fail to make the leap from the laboratory into the clinic.

A new understanding of GPCR variation

In multicellular organisms, the same gene can undergo a process called alternative splicing, which generates multiple variants of the gene's transcript. Rather than one gene giving rise to one protein, splicing and other mechanisms create slightly different variants, or isoforms, of the same protein. Babu and his team wanted to understand how extensive this variation is for GPCRs, and whether different human tissues express particular variants.

This understanding is critical for drug development, because current targeted therapies typically focus on one variant at a time. In a study recently published in Nature, the researchers found variation that may explain why side effects occur. This variation may also explain why drugs can work well against cell lines but then fail in mouse models or other organisms where the variants may differ.

“This tissue-specific variation represents an underappreciated source of diversity among target GPCRs,” Babu says. “These differences would make developing highly specific drugs a challenge. But now that we know to look for them, we have an opportunity to create drugs that are uniquely targetable, minimizing side effects.”

Digging into data to drive progress

For their study, the researchers analyzed mRNA sequencing data from more than 700 people across 30 different tissues. This cohort, part of the Genotype-Tissue Expression (GTEx) project, included age and gender diversity.

“The core of this project involved learning from multiple sources of big data, as we were integrating and analyzing genomics, transcriptomics, proteomics, and structural and pharmacological information,” says Maria Marti-Solano, PhD, of the MRC Laboratory of Molecular Biology, first author of the recent Nature paper and a Marie Sklodowska-Curie postdoctoral researcher in the Babu laboratory. “The fact that multiple receptor isoforms can coexist in particular tissues highlights the need for more careful evaluation of the cell, tissue or animal models that are used to study GPCR signaling and to screen for new drugs.”

Collaboration played an important role in the researchers’ discoveries, including working with scientists at the University of Michigan, University of Cambridge and the University of Glasgow who validated some of the findings. These collaborators also experimentally characterized the variants for some of the receptors identified through this study.

Like much of modern biomedical research, this effort generated a tremendous amount of data. To make this data accessible and foster a spirit of collaboration and openness, the researchers created a new section in the G protein-coupled receptor database (GPCRdb). This database is maintained by David Gloriam, PhD, of the University of Copenhagen, who is a co-author of the study. The new section allows scientists to share information about the variants, including which regions are affected, which parts of the proteins are the same, which receptors have multiple variants and where they are expressed.

Looking toward the future

This work reflects the approach to research that Babu brings to St. Jude, where doctors, scientists and medical researchers generate an abundance of valuable scientific data daily. Genomic sequencing provides data that has the potential to revolutionize life sciences research. But, by themselves, the sequences paint a limited picture. By placing them in the context of biochemistry, genetics, and molecular, evolutionary, structural and systems biology, scientists can better understand their implications.

Babu’s work combines fundamental biological questions with state-of-the-art computational approaches, including machine learning. This allows his lab to make discoveries by distilling enormous amounts of detail into simple and universal principles that have relevance throughout biology and biomedicine.

“We won’t be able to make better drugs until we fully understand the biology of our specific targets and the role they play in disease,” Babu says. “To do that we need to use all of the tools in our toolbox. This means building bridges between scientific disciplines to really listen to what the data are telling us.”