As a researcher with extensive training in both mathematics and physics, I am well-versed in applying quantitative skills to analyze data from a variety of biological organisms and ecosystems. What I like most in my work is the process of learning how living organisms function, behave, and interact with each other. Throughout my studies I focused on mastering techniques to disentangle these processes from the data. I took part in a variety of projects studying different ecosystems and organisms, developed new methods for data analysis, and proposed new models describing the dynamics of microbial communities. Through these projects, I have cultivated a deep sense of how to use mathematical models alongside data to gain scientific insight into complex systems, and I plan to leverage this expertise in my future studies.

Metagenomic analysis of human gut microbiome

I started my research career by analyzing metagenomic sequencing data from various microbial ecosystems, with a particular focus on the human gut microbiome, and by developing methods to detect patterns in these data, e.g. trends in species and functional diversity. However, I found association and correlation analyses to be very limited in providing mechanistic insights into how these ecosystems truly behave and function. My efforts to explain the patterns observed in the data using only species composition led me to understand that the complexity of these ecosystems might be better explained not by the microbial species present in the community but rather by the environment, i.e. the metabolites that these species or the host consume and secrete. These metabolites often underlie interactions between species and thereby shape both the composition of microbial communities and their overall metabolic function.

Consumer-resource models of microbial ecosystems

This realization led me to study mechanistic models capturing different patterns of metabolite consumption in microbial communities during my Ph.D. Over the course of four years, I developed several game-theory-based and dynamic models, which allowed me to gain detailed insights into the principles governing microbial interactions, community assembly, stability, and resilience. In one of my most recent projects, I leveraged my expertise in handling multi-omics data to construct a model of the consumption and production of metabolites by various species in the human gut. The key idea behind this project is that the ecological dynamics of the human gut community are governed by a dense cross-feeding network, which is relatively general for human gut species and is at least partially unknown. By analyzing a large collection of paired metabolome-metagenome samples, it is possible to uncover some of the previously unknown metabolic links. One important practical use of this model is to reconstruct the metabolic output of the community from microbial species abundance data alone. My collaborators and I are now working on a new generation of our model using machine learning techniques.

Side projects and ML applications in biology

Throughout my Ph.D., I also contributed to a variety of projects working with other multi-omics data (metagenomics, metabolomics, RNA-seq, DAP-seq) and models of the mouse gut, machine learning for cancer tumors, neural cord architecture, and transcriptional regulatory networks in non-model yeast species. These projects helped me broaden my scientific scope and shape my current research interests. I got involved in these studies because of my interest in machine learning and its applications to omics data. Two directions here appeal to me the most. The first is to “go under the hood” of various machine learning techniques and try to decipher which modalities in the input data were the most important for correct outcome prediction (e.g. tumor development, overexpression of genes, etc.). Using this type of analysis, we might be able to gain new insights into unknown biological functions and mechanisms, which might be particularly crucial for cancer research and for the use of previously unstudied organisms (such as non-model species). The second direction is to study general principles of learning, information encoding, and, especially in the case of neural networks, the effects of a specific architecture. For example, in my work on neural cord architecture, my collaborators and I showed that it imposes a bottleneck on the transmission of information from the brain to motor neurons, and that living systems have to use some tricks, such as a modular structure of neuron connectivity, to overcome this limitation. Another prominent example of a machine learning application emerged in my current work for the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) on the reconstruction of gene regulatory networks (GRNs) for non-model yeast species.
Since these species became practically relevant for biofuel production and metabolic engineering only recently, there is a limited amount of experimental data collected and virtually no knowledge of the transcriptional regulation of their genes. I applied standard approaches for GRN reconstruction, including analysis of co-expression networks and creating maps of orthologous genes between these species and well-studied organisms; however, these approaches uncovered only a small and fuzzy part of the GRN. Training a machine learning model on the available data has the potential to overcome this barrier, as such models have been shown to be immensely powerful in capturing hidden dependencies in data. Thus, by interrogating the model, we might be able to extract a better version of the GRN. This research has extensive practical applications, as uncovering the transcriptional regulation of various genes will help other CABBI labs guide future genome editing experiments.

Future plans

For my future research, I am open to working on many new questions and projects. Currently, my main interest lies in microbial communities, and I intend to continue bridging the gap between experimental data and theoretical models. I believe that this work can significantly advance our understanding of microbes and address several sharp questions in this field: How do competitive and cooperative interactions shape the species composition of microbial ecosystems? How do these natural forces and various trade-offs affect the function and fate of bacterial populations? How are they encoded in the genomes of these organisms? Can we reliably manipulate and control complex microbial communities when most environmental changes affect many or all microbial species simultaneously? I anticipate that this work will result in mechanistic models that incorporate the right level of detail and clarify existing puzzles in microbial ecology. I hope that this research will aid in solving practical tasks such as designing new therapies for the human microbiome and developing industrial applications of microbial communities.