Introduction
Welcome back to the Data Reuse Digest and the latest edition of our series on computational research for type II diabetes. We are exploring how different kinds of computational studies can drive clinical progress. By sharing success stories in one disease area, we aim to inspire the use of these approaches in other diseases as well.
How Can Existing Drugs be Repurposed to Treat Type II Diabetes?
The starting point is often a laboratory experiment. You can take tissue samples from a group of people with type II diabetes, and a group of otherwise healthy individuals, and compare them. How can you compare them? Researchers often look at gene expression (A). Cells express the genes embedded in their DNA: genes are transcribed into mRNA, which is translated into proteins, which carry out cellular functions. This same story is true for all the cells in the body. Heart cells and muscle cells, kidney cells and retinal (eye) cells are different only because they express different genes to different extents. Each cell from a single individual has exactly the same DNA – the same toolbox of genetic capabilities. Different cells just use this toolbox differently (B).
While gene expression varies between different cells in the same individual, there are also gene expression differences in the same kind of cell between two individuals. Distinct individuals do in fact have differences in their DNA, and these differences affect gene expression. For example, the retinal cells of someone with diabetic retinopathy may exhibit a different pattern of gene expression (some genes are more actively expressed, others less actively), and therefore function differently, than the retinal cells of someone without this disorder.
Understanding differences in gene expression in the same population of cells between healthy and diseased individuals helps us form a more complete picture of human physiology. It is also useful from a drug development standpoint. Say you have a database that keeps track of known drug compounds and their effects on human gene expression (these databases do indeed exist, as we will soon see; they are built up from the results of many prior studies). Now say you also know from prior studies that a disease is associated with the elevated expression of certain genes. With these two types of information in hand, you can use the drug database to find compounds that suppress exactly those genes that are elevated in disease. The idea is that if you can suppress these disease genes, you may be able to treat the condition.
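To make the matching idea concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration: the gene names, the drug names, the numbers, and the simple averaging score are assumptions for the sake of the example, not the contents or the scoring method of any real database.

```python
# Toy illustration only: gene names, drug names, and values are hypothetical.

# Genes assumed to be elevated in the disease.
disease_up_genes = {"GENE_A", "GENE_B", "GENE_C"}

# Hypothetical drug-effect table: change in expression (log2 fold change)
# after treatment. Negative values mean the drug suppresses the gene.
drug_effects = {
    "drug_1": {"GENE_A": -1.5, "GENE_B": -0.8, "GENE_C": -1.1},
    "drug_2": {"GENE_A": 0.4, "GENE_B": -0.2, "GENE_C": 0.9},
    "drug_3": {"GENE_A": -0.3, "GENE_B": -2.0},  # no measurement for GENE_C
}


def reversal_score(effects, target_genes):
    """Mean effect on the target genes; more negative = stronger suppression.

    Genes with no measurement are counted as 0 (no known effect).
    """
    total = sum(effects.get(gene, 0.0) for gene in target_genes)
    return total / len(target_genes)


def rank_drugs(drug_effects, target_genes):
    """Return (drug, score) pairs, strongest suppressors first."""
    scores = {
        drug: reversal_score(effects, target_genes)
        for drug, effects in drug_effects.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


ranking = rank_drugs(drug_effects, disease_up_genes)
for drug, score in ranking:
    print(f"{drug}: {score:+.2f}")
```

In this toy table, drug_1 ranks first because it pushes all three disease genes down. Real screens use far larger signatures and more careful statistics, but the underlying question is the same: which compound most consistently moves expression in the opposite direction to the disease?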
Here is an example of this research approach in action. There are currently no effective preventative treatments for diabetic retinopathy – the loss of vision that afflicts many diabetics over time as their condition worsens. Researchers do know, however, that increased acellular capillary density in the retina is an early sign of retinopathy development. Acellular capillaries are tube-like structures that exist in the retina (the part of the eye that captures light so that this information can be transmitted, via the optic nerve, to the brain). If you can find a drug that slows or even halts this phenomenon, you may have an effective retinopathy treatment on your hands.
A team of researchers from UC Santa Barbara and several other American research institutions decided to identify genes that are more actively expressed in the retinas of Nile rats with high acellular capillary density. After comparing rats with high vs. low acellular capillary density, they keyed in on 14 genes that were associated with high density. The researchers labeled these genes the ‘transcriptomic clock’ (the transcriptome is the set of all genes expressed in a cell) because they could use the expression levels of these genes to predict how far along a rat is likely to be in the progression to retinopathy. Humans are not rats, of course. But it is plausible that a similar transcriptomic clock operates in human retinal cells. Future research is needed to confirm this (C).
Featured Study: Transcriptomic clock predicts vascular changes of prodromal diabetic retinopathy (Toh et al., 2023, Scientific Reports)
After establishing the transcriptomic clock, the researchers screened a large database of known drug compounds (LINCS L1000) for drugs that suppress the expression of the genes in the transcriptomic clock (D). They identified three candidate compounds, two of which had already shown promise in prior studies (also in animal models) for the treatment of two other eye disorders: retinitis pigmentosa and macular degeneration. This is a good sign that they may be effective for diabetic retinopathy as well. Further testing of these drug candidates in animal models and human trials is needed.
A similar kind of study has found potential drug candidates to treat diabetic foot ulcers. This team of scientists came together from research centers in Malaysia, Indonesia, and the United States. The team pulled two gene expression datasets from a large public database, the NCBI Gene Expression Omnibus, or GEO (E). Both datasets contained some tissue samples from healthy individuals and others from patients with diabetic foot ulcers. They compared these two groups to identify genes that are differentially expressed in the patients with foot ulcers.
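For readers curious what 'differentially expressed' means in practice, the comparison can be sketched in a few lines of Python. The expression values, gene names, and the two-fold cutoff below are all hypothetical, and this is not the pipeline used in the featured study; real analyses typically rely on dedicated statistical tools (such as limma or DESeq2) and correct for testing many genes at once.

```python
import math
from statistics import mean

# Hypothetical expression values (arbitrary units), one number per sample.
healthy = {
    "GENE_A": [10.0, 12.0, 11.0],
    "GENE_B": [50.0, 48.0, 52.0],
    "GENE_C": [8.0, 8.0, 8.0],
}
ulcer = {
    "GENE_A": [40.0, 44.0, 48.0],
    "GENE_B": [49.0, 51.0, 50.0],
    "GENE_C": [2.0, 2.0, 2.0],
}


def log2_fold_change(case_values, control_values):
    """log2 of (mean expression in cases / mean expression in controls)."""
    return math.log2(mean(case_values) / mean(control_values))


def differentially_expressed(case, control, threshold=1.0):
    """Keep genes whose expression changes at least 2-fold (|log2 FC| >= 1)."""
    hits = {}
    for gene in case:
        lfc = log2_fold_change(case[gene], control[gene])
        if abs(lfc) >= threshold:
            hits[gene] = lfc
    return hits


hits = differentially_expressed(ulcer, healthy)
for gene, lfc in hits.items():
    direction = "up" if lfc > 0 else "down"
    print(f"{gene}: log2 fold change {lfc:+.1f} ({direction} in ulcer samples)")
```

Here GENE_A comes out as elevated and GENE_C as suppressed in the ulcer samples, while GENE_B is essentially unchanged and is filtered out. A fold-change cutoff alone ignores sample-to-sample variability, which is why real studies pair it with a statistical test.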
Featured Study: Transcriptomics-driven drug repositioning for the treatment of diabetic foot ulcer (Adikusuma et al., 2023, Scientific Reports)
After identifying 31 genes as ‘risk genes’ for diabetic foot ulcers, the researchers screened a public database for drugs that are known to alter the expression of these risk genes, just as in the diabetic retinopathy study. In this study, they used the aptly named Drug-Gene Interaction Database (DGIdb) and found 31 candidate drugs. Two of these drugs (urokinase and lidocaine) are already under clinical investigation for diabetic foot ulcers. Another drug called Anakinra (Kineret), associated with the risk gene IL1R1, is an FDA-approved immunosuppressant commonly prescribed to treat rheumatoid arthritis.
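The lookup step can be pictured as a simple join between a list of risk genes and a table of known gene-drug pairs. The sketch below uses a small hand-written table rather than DGIdb itself; only the IL1R1 and anakinra pairing comes from the study described above, and every other gene and drug name is a placeholder invented for illustration.

```python
from collections import defaultdict

# Hand-written stand-in for a drug-gene interaction table: (gene, drug) pairs.
# Only IL1R1/anakinra comes from the text; the rest are hypothetical placeholders.
interactions = [
    ("IL1R1", "anakinra"),
    ("GENE_X", "drug_x1"),
    ("GENE_X", "drug_x2"),
    ("GENE_Y", "drug_y1"),
]

# Hypothetical list of risk genes from a differential expression analysis.
risk_genes = ["IL1R1", "GENE_X", "GENE_Z"]


def candidate_drugs(risk_genes, interactions):
    """Map each risk gene to the drugs known to interact with it.

    Risk genes with no recorded interaction are left out of the result.
    """
    by_gene = defaultdict(set)
    for gene, drug in interactions:
        by_gene[gene].add(drug)
    return {gene: sorted(by_gene[gene]) for gene in risk_genes if gene in by_gene}


candidates = candidate_drugs(risk_genes, interactions)
for gene, drugs in candidates.items():
    print(f"{gene}: {', '.join(drugs)}")
```

Note that GENE_Z drops out because the table holds no drug for it, and GENE_Y's drug is ignored because GENE_Y is not a risk gene. In a real analysis, each hit would then be reviewed by hand: an interaction record says only that a drug and a gene are linked, not that the drug pushes the gene in the helpful direction.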
Research Trends
The point of this section is to provide big-picture context: how are the featured studies shared in this edition representative of broader trends in computational research? These trends will sometimes cite information from past editions, additional research articles, and mainstream news stories.
(A) Since the 1970s, researchers have been measuring the quantity of RNA molecules in biological samples. Under the umbrella of ‘RNA molecules’ is the sub-class of mRNAs. These mRNA molecules are the physical manifestation of genes (mRNAs are transcribed from DNA and translated into proteins). Some of the techniques developed prior to the start of the 21st century, like RT-PCR, are still in use (COVID tests use RT-PCR to detect coronavirus RNA molecules). However, the modern gold standard for gene expression measurement is the use of ‘high throughput’ techniques. These aim to measure all the mRNA molecules in a sample at once – not just specific known mRNAs of interest. While the microarray was the most common high throughput approach in the early 21st century, a sequencing-based approach called RNA sequencing (or RNA-seq) has become dominant over the past decade. Over the past few years, a technique called single-cell RNA-seq has entered the fray. While traditional RNA-seq and other methods for measuring gene expression get at ‘bulk’ gene expression for a whole sample (which may include many different cell types), single-cell RNA-seq is able to measure gene expression in the individual cells of a sample. Even more recently, researchers have started using spatial transcriptomic methods – making it possible to see not only how genes are expressed in individual cells from a sample, but also where cells expressing different genes are located in physical space.
(B) Technically, people experience mutations in their DNA over their lifetime – referred to as ‘somatic mutations’ because they happen in the body’s somatic cells (i.e., non-reproductive cells). These mutations occur in specific cell types, meaning that different cell populations in our bodies actually do drift apart in their DNA sequence over time. This phenomenon is referred to in the literature as ‘somatic mosaicism’. Somatic mutations play a key role in cancer, aging, and other diseases.
(C) It is common to use so-called model organisms, like the Nile rat, to measure gene expression under different conditions. A big reason why a particular model is chosen is that it experiences physiological changes during disease that emulate the human condition. Nile rats were chosen for this study because they are known to experience high acellular capillary density just like humans with diabetic retinopathy. Once two groups of rats have been established in the lab (in this case, high vs. low acellular capillary density), researchers ‘sack’ (euthanize) the rats, extract retinal tissue, and measure gene expression. Obviously, you couldn’t do this procedure with humans – this is why you need a model organism. However, the limitation of model organism research is that the gene expression signatures seen in rats with high acellular capillary density may not accurately reflect the gene expression signatures of humans with this same condition. So, after the researchers establish a gene expression signature and use it to identify drug compounds that may alleviate symptoms of disease, they will ultimately need to test these compounds on the rats and other model organisms, then on human cells, and, if the compounds still look promising at that point, in human clinical trials.
(D) There are a lot of databases like this – too many to count. We’ve mentioned other studies in previous editions of this newsletter where researchers used public data to guide drug repurposing (for example, here and here). The research in this edition is a great example of how past biological data, put together in useful combinations, can fuel new discoveries. Often, useful data are scattered across multiple databases or even exist only as a supplementary dataset from an individual study. It is therefore a useful task for scientists to hunt for existing sources of data and combine them in useful ways – though not an easy task, given the many thousands of scientific databases that exist and the millions of scientific studies that are published each year.
(E) GEO is a very useful resource – existing datasets published in GEO are re-used in many thousands of new studies each year. We have seen plenty of previous examples in this newsletter (for example, here, here, here, and here).