Combining Chemistry and Computers to Discover New IBD Biomarkers
From the Computer to the Clinic: 12-29-23
Introduction
Welcome back! We’ve just changed our name - from ‘The Data Reuse Digest’ to ‘From the Computer to the Clinic’ (we think this one will appeal to a broader audience).
In this newsletter, we explore how bioinformatics and computational biology research can drive clinical progress. By sharing success stories in one disease area, we aim to inspire the use of these approaches in other diseases as well.
If you haven’t already, you can subscribe to this newsletter, or share it with friends and colleagues.
Featured Study
A team of researchers from UC San Diego and several other universities has discovered new molecules, present in the human body, that may contribute to inflammatory bowel disease. Reducing the production of these molecules could help alleviate symptoms. It might be possible to do this by targeting the microbiome.
Featured Study: Reverse metabolomics for the discovery of chemical structures from humans (Gentry et al., Nature, 2023 [Early Access])
Inflammatory bowel disease – or IBD – is a term that encompasses two conditions: Crohn’s disease and ulcerative colitis. Both involve intestinal inflammation but differ in the extent and location of inflammation. For individuals with IBD, daily life is made difficult by a range of symptoms including abdominal pain, diarrhea, and fatigue.
According to a 2022 report, IBD diagnoses have been on the rise over the past several decades, and US health care spending for IBD has reached tens of millions of dollars annually. This trend is not unique to the US, and while increased IBD diagnosis is to some extent a function of better disease surveillance and treatment, factors like poorer diet (high fat, high sugar, low fiber) are also contributing.
On the molecular level, the causes of IBD are complex, with a variety of different human genes and biomolecules playing a role (A). The microbiome, too, appears to affect IBD risk and severity. One focus of the research community has been bile acids – molecules produced in the liver that help digest food. They are known to be modified by the microbiome, and their abnormal activity in IBD is thought to contribute to diarrhea and other symptoms.
These molecules are front and center in this edition’s featured study. The research team discovered new biomolecules (called bile acid conjugates), produced by the microbiome (via the addition of chemical groups to the bile acids produced by the liver), that could be contributing to IBD symptoms. How did they do it? And why have these bile acid conjugates not been noticed before?
The answer to both questions lies in the way that molecules in the human body are identified and measured. To figure out what molecules are present in the human body, you first have to gather patient samples. These may be blood samples, urine, or various kinds of extracted tissue. Once you have these samples, the most common way to figure out what molecules they contain is a technique called mass spectrometry or ‘mass spec’. Mass spec exposes the molecules in a sample to high energy, fragmenting them into smaller pieces that can be recognized by a detector. You can tell molecules apart because each has its own characteristic mass spectrum – a series of peaks that represent its molecular fragments and serve as a molecular barcode.
For some molecules, we know from prior study what this barcode should look like. Therefore, we can search for these patterns in the mass spec data of human samples and say whether or not they are present (B). There are many molecules, however, whose molecular barcode is unknown – their fragment peaks will be detected by the mass spectrometer, but we haven’t yet learned what molecule they represent. Some have referred to this large set of unidentified peaks as the metabolic ‘dark matter’ (C). Typically, researchers will look at mass spec data, quantify the molecules they know, and discard the unidentified peaks. The issue is that some of these discarded peaks may represent molecules that contribute to human disease.
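To make the ‘barcode’ idea concrete, here is a minimal Python sketch of how a known spectrum might be compared with a spectrum from a sample. It uses a simple binned cosine similarity, which is one common way to score spectral matches; it is not necessarily the scoring used in the featured study. The function names, the bin width, and the toy peak lists are all invented for illustration.

```python
import math

def bin_spectrum(peaks, bin_width=0.01):
    """Group (m/z, intensity) peaks into m/z bins of the given width."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Return a score from 0 (no shared peaks) to 1 (identical peak pattern)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy spectra (invented values, not real measurements).
reference = [(100.0, 10), (150.2, 50), (200.4, 100)]     # known barcode
sample_match = [(100.0, 12), (150.2, 48), (200.4, 95)]   # same peaks, noisy intensities
sample_other = [(90.1, 30), (120.3, 80), (310.7, 100)]   # unrelated molecule

print(cosine_similarity(reference, sample_match))  # close to 1
print(cosine_similarity(reference, sample_other))  # 0.0, no shared peaks
```

A score near 1 suggests the sample contains the known molecule. Peaks that match nothing in the reference library are the ones that usually end up discarded.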
To shed light on the metabolic dark matter, the authors of the featured study have brought together tools from chemistry and computational biology. In the lab, they synthesized a series of ‘bile amidates’ – these are bile acids that are combined with amino acids (the constituents of proteins). Some of the authors of the featured study had previously shown that the human microbiome produces these kinds of molecules, but the full range of combinations that the human microbiome can produce is unknown. To explore all the different chemical possibilities, the researchers created all possible combinations of eight distinct bile acids with 22 amino acids (176 molecules in total).
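The arithmetic behind the 176 molecules is a simple pairing of every bile acid with every amino acid. A short Python sketch shows the idea. The labels below are placeholders, since this newsletter does not list the actual compounds used.

```python
from itertools import product

# Placeholder labels only: the real bile acids and amino acids are not listed here.
bile_acids = [f"bile_acid_{i}" for i in range(1, 9)]      # 8 bile acids
amino_acids = [f"amino_acid_{j}" for j in range(1, 23)]   # 22 amino acids

def enumerate_conjugates(acids, aminos):
    """Pair every bile acid with every amino acid."""
    return [f"{acid}+{amino}" for acid, amino in product(acids, aminos)]

candidates = enumerate_conjugates(bile_acids, amino_acids)
print(len(candidates))  # 176
```

Each entry in the list stands for one compound to synthesize and then measure.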
After synthesizing these bile amidate compounds, the researchers performed mass spectrometry. Specifically, they used a technique called tandem mass spectrometry that allows for the mass spectrum of each individual molecule – its molecular barcode – to be determined. Then, with spectra in hand for each of these prospective biomolecules, the researchers turned to their computers and began searching for evidence of these molecules in two public databases.
The first database, MASST, contains published tandem mass spectrometry data from billions of samples, including many human samples. This search showed the researchers whether their newly synthesized molecules – identified by their mass spectra – are present in human samples. The researchers also drew data from a second database called ReDU. This one contains a variety of metadata (data about data) like the disease state, tissue type, donor sex, and other clinical data of the human samples. This allowed them to see whether any of their molecules were associated with IBD.
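Linking spectral matches to sample metadata is essentially a table join. The Python sketch below illustrates it with invented data: the sample IDs, compound names, and field names are hypothetical and do not reflect the actual MASST or ReDU formats.

```python
from collections import defaultdict

# Hypothetical search hits: compound -> IDs of samples whose spectra matched.
matches = {
    "conjugate_A": {"s1", "s2", "s3"},
    "conjugate_B": {"s4"},
}

# Hypothetical sample metadata, in the spirit of what a metadata database provides.
metadata = {
    "s1": {"disease": "IBD", "tissue": "stool"},
    "s2": {"disease": "IBD", "tissue": "stool"},
    "s3": {"disease": "healthy", "tissue": "stool"},
    "s4": {"disease": "healthy", "tissue": "stool"},
    "s5": {"disease": "IBD", "tissue": "stool"},
}

def detection_rate_by_group(matches, metadata, field="disease"):
    """For each compound, the fraction of samples in each group with a match."""
    group_sizes = defaultdict(int)
    for info in metadata.values():
        group_sizes[info[field]] += 1
    rates = {}
    for compound, sample_ids in matches.items():
        hits = defaultdict(int)
        for sample_id in sample_ids:
            if sample_id in metadata:  # skip hits that have no metadata
                hits[metadata[sample_id][field]] += 1
        rates[compound] = {group: hits[group] / size
                           for group, size in group_sizes.items()}
    return rates

rates = detection_rate_by_group(matches, metadata)
for compound, by_group in rates.items():
    print(compound, by_group)
```

In this toy data, conjugate_A is detected in two of three IBD samples and one of two healthy samples. A real analysis would need far more samples before drawing any conclusion.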
The results of this approach, which the researchers call ‘reverse metabolomics’, are impressive. Of the synthesized bile amidates, 145 were detected in samples from the MASST database. Some of these were more prevalent in samples from people with IBD than in samples from healthy individuals (in particular, bile amidates containing the amino acids Glu, Phe, and Trp). These results were then validated with a separate IBD dataset from the Human Microbiome Project – providing further confirmation that bile amidates are present in human samples and that some of them are elevated in individuals with IBD (D).
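How might one check that a molecule is really more prevalent in one group than another, rather than by chance? One standard option is Fisher’s exact test on a 2x2 table of detected versus not detected counts. The Python sketch below is an illustration only: this newsletter does not say which statistical test the authors used, and the counts are invented.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on a 2x2 table.

    a: IBD samples with the compound      b: IBD samples without it
    c: healthy samples with the compound  d: healthy samples without it

    Returns the probability of seeing at least `a` detections in the IBD
    group if the compound were unrelated to disease (margins held fixed).
    """
    ibd_total = a + b
    healthy_total = c + d
    detected_total = a + c
    all_tables = comb(ibd_total + healthy_total, detected_total)
    p_value = 0.0
    for k in range(a, min(ibd_total, detected_total) + 1):
        p_value += comb(ibd_total, k) * comb(healthy_total, detected_total - k) / all_tables
    return p_value

# Invented counts: compound detected in 8 of 10 IBD samples and 1 of 6 healthy samples.
print(fisher_exact_greater(8, 2, 1, 5))
```

A small p-value suggests the difference is unlikely to be a fluke, although it says nothing about whether the molecule causes disease.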
These findings identify a potential treatment target: reduce production of the bile amidates that are elevated in IBD. The researchers also hint at how this could be done. Knowing that bile amidates are produced by the microbiome, the study authors gathered over 200 microbial isolates that are known to inhabit the human body, grew them in the lab, and took measurements to figure out which microbes are capable of producing bile amidates. Another study has shown that it is possible to block the activity of specific microbial enzymes to therapeutic effect in the context of chronic kidney disease (E). This same approach could be applied here. But before going down this road, further analysis is necessary – to show that increased levels of bile amidate-producing microbes are consistently associated with disease, and to figure out which specific microbial enzymes are contributing to bile amidate production so they can serve as drug targets.
Broader Trends
(A) See this review on the molecular mechanisms of Crohn’s disease – and this one on the molecular mechanisms underlying ulcerative colitis.
(B) How do you figure out what a molecule’s mass spectrum looks like in the first place? The featured study demonstrates how to do it. You synthesize a specific molecule in the laboratory and produce a pure solution – then you subject it to mass spec to figure out what the molecule’s unique spectrum (its molecular barcode) looks like. Once you have this spectrum, you can search for its presence in biological samples.
(C) The phrase ‘dark matter’ comes from physics. It is used here to describe molecules that we can detect but can’t yet put a name to. It has also been used in other biological contexts – for example, the ‘dark matter’ of the microbiome refers to the large set of proteins produced by human microbes whose functional purpose we do not know. See, for example, ‘Unraveling the functional dark matter through global metagenomics’ (Pavlopoulos et al., 2023).
(D) Replication is key – especially in a study like this that claims to identify new physiologically relevant molecules. If the researchers can detect these new molecules in two distinct datasets, their work has a lot more weight in the eyes of the research community.
Replication is also common in studies where researchers have developed machine learning models to detect molecular signatures of disease. After these models are trained on the researchers’ own data to distinguish people with disease from healthy individuals, they are often tested on an external validation dataset to show that they can still distinguish diseased individuals from controls and are not ‘overfit’ to the training data. We have seen an example of this before in an earlier edition of the newsletter.
(E) Here is an example of enzyme inhibition (read this study for more details). People with chronic kidney disease (CKD) often accumulate a molecule called indoxyl sulfate (IS) because their kidneys don’t work well enough to excrete it. IS accumulation is associated with cardiovascular disease and other health problems (cardiac events actually account for 40-50% of deaths in people with advanced kidney disease). The production of IS depends on microbial enzymes – so it stands to reason that if you block these enzymes, you can reduce IS production and limit the risk of heart disease for people with CKD. As an added benefit, molecules called indoles (of which IS is an example) have been shown to promote the formation of bacterial biofilms – structures that can resist antibiotic treatment. This is especially relevant for patients in the hospital (people with advanced CKD are likely to be there often), where the risk of colonization with dangerous pathogens is high. Block the IS-producing enzymes and you could solve a lot of problems.
In the linked study, the researchers were able to identify many likely IS-producing microbial enzymes from a variety of different species in the human microbiome. For a selection of these enzymes, the researchers confirmed their ability to produce IS in the lab. They then developed a ‘pan-inhibitor’ that could block the activity of IS-producing enzymes in multiple microbial species at once.
Though there is a lot more research needed to test the efficacy of this ‘pan-inhibitor’ (most importantly, its ability to block IS production, without having toxic side effects, in human subjects), this same approach could theoretically be applied to block the production of bile amidates associated with IBD.