Predictive Biology Moves Towards Predictive Medicine

Edison T. Liu
The Jackson Laboratory

The major truism accepted by all is that biology is complex. Yet, since the 19th century, the major experimental focus in biology has been reductionist, which is to isolate component parts, divine their individual function, and then use this knowledge to extrapolate the behavior of the larger system. With the advent of comprehensive genomic, proteomic, and metabolomic technologies powered by advanced computational approaches, all possible genes have been mined, and each resultant mutant molecule or pathway can be inferred.

For the first time in the history of science, all the fundamental genic participants of a biological system have been discovered. This is the key precondition for predictive biology, which is the ability to accurately predict biological outcomes based on measurable data obtained prior to a cellular perturbation. Expand that to human physiology and this becomes predictive and precision medicine. The hope is that with this knowledge, we will be able to create accurate functional models of any cellular or disease process to not only predict future events but also divine hidden processes.

This may be considered grandiose. But, in fact, it can be done and has been done in other disciplines. Astronomy provided a schedule of the movement of the heavenly bodies based only on the knowledge of past movements. By incorporating the fundamental principles of physics, astrophysicists were able to create mathematical and computational models of the cosmos that can predict how stars evolve into supernovas and how black holes behave. Because the timeline of cosmological events precludes any prospective experiment on Earth, these models are the only way new cosmic occurrences can be foretold.

Some would assert that biologists and medical scientists have always been developing predictive models of organismal behavior. To be fair, the intention was always there, but the models were rudimentary, and the outcomes were narrowly defined with limited utility. These models could not be generalized to new situations, and certainly would never have generated new insights.

The predictive biology models we envisage here differ vastly from those of the past by virtue of the ability to integrate massive amounts of highly heterogeneous data (integrative); to respond to stimuli that were not programmed (dynamic); to identify novel processes or solutions (emergent); and to learn (adaptive). In the past two decades, three major advances have moved biology and biomedicine closer to achieving predictive biological models: the wealth of pan-omics, chemical, and clinical data that are computable; the advent of powerful and intelligent computation systems in data sciences and artificial intelligence (AI); and their integration through systems biology efforts. The remaining challenge to this goal of predictive biology is to understand how the nuanced variations of each component, their control mechanisms, and their interlocking networks are aggregated to create a complete dynamic system.

AI applications have already been shown to improve mammographic and pathological diagnoses and to provide robotic solutions for minimally invasive surgery. Stokes et al. used the chemical characteristics of 2,335 chemical entities to train a deep learning model for antibacterial effects. When the model was applied as a virtual screen of over 107 million compounds, it identified one, halicin, whose structure was a total surprise and which showed broad-spectrum efficacy against other pathogens, including antibiotic-resistant E. coli strains. This is the generative outcome of an intelligent system; that is, the generation of a finding that could not have been conceived using standard approaches. Though important, these are examples of data-driven solutions for limited technical problems in medicine and do not model the full extent of a disease or provide mechanistic biological insights.

Systems biology, however, addresses this by integrating basic biological information to derive useful models that seek to define higher order control mechanisms in disease. Such inferred regulatory maps uncover control nodes that identify potential new targets for therapeutics. Some of the most advanced applications have been in cancer, and similar approaches have been contemplated for diseases like Alzheimer’s, where the call was to create an “assembly manual of the Alzheimer’s cell.” For a biological process, the integrative systems approach is perhaps the most mature in immunology, where predictive models are being created at the micro- and macro-levels. In all cases, the quantified connectivity of the components, molecular and cellular, that can define hierarchical and dynamic relationships is the cornerstone of these models. The Human Cell Atlas is a prime example of an international effort to generate reference models of the cellular and genomic interactions of key organs. For dynamic models to be accurate, assessing the changes in all component parts after a challenge is key. In the Human Cell Atlas, some analyses follow organ-specific cells across age. In cancer, efforts like DepMap perturb genes and assess survival in cancer cells.

What may a future with predictive biology look like? A cancer patient will have her breast cancer sequenced (RNA and DNA) both in bulk and using single-cell genomic approaches. This will not only determine the intrinsic cancer characteristics but also the important cancer-stromal interactions. A metabolic profile of the cancer cells will be determined. Plasma-based cell-free DNA and RNA assessment over the course of her disease will infer the clonal fluxes of the tumor. The host genomic sequence will be obtained, and the immune profile of the individual assessed. This composite information will be used to create a virtual “avatar” of the patient that will simultaneously determine the molecular vulnerabilities of the tumor as well as the capability of the patient to mount an immune attack on her cancer. The output will be a projection of the combinations of treatments that will optimally control the cancer and even cure it. Through monitoring the tumor’s genomic response to therapy, perturbation analysis will plot the most rational next steps for treatment. This is adaptive, responsive, and personalized cancer treatment. Importantly, this information can provide the substrate for new target identification and potentially function as “crowd sourcing” for drug discovery, if shared.

It is a brave new world.