Commentary pubs.acs.org/accounts

Uniting Natural History with the Molecular Sciences. The Ultimate Multidisciplinarity

Published as part of the Accounts of Chemical Research special issue “Holy Grails in Chemistry”.

Steven Benner
Foundation for Applied Molecular Evolution, Firebird Biomolecular Sciences LLC and The Westheimer Institute for Science and Technology, Alachua, Florida 32615, United States

ABSTRACT: Life and the Earth have coevolved over the past four billion years to deliver a rich diversity of biological structure, from biomolecules to macrophysiology. One grand challenge seeks to interconnect these structures, in ways acceptable to both natural historians and physical scientists, to give an interconnecting web of models and experiments to create a planetary understanding of the phenomenon that we call “life”. The molecular scientist wants experiments; the natural historian wants reference to Darwinian fitness. Paleogenetics offers both.

Natural historians, as they explored the biosphere during the Age of Discovery, marveled at the diversity in the biology that they uncovered. Each new voyage yielded an increasingly exotic collection of macrobiota, animal and vegetable. New technology, such as the microscope, revealed an entirely new and unexpected range of microbial life forms. All seemed to be quite different from each other, and those differences were remarked upon.

Only centuries later did chemistry, through its analysis of natural products from these life forms, discover an opposite reality: All known terran life, even the most macroscopically exotic, was found to share biomolecular structures. These included a shared genetic biopolymer with shared nucleotide building blocks, a shared catalytic biopolymer with shared encoded amino acids, and a (nearly) common genetic code connecting the two. Likewise, all known biota on Earth were discovered to share a core metabolism based on common metabolites, including substrates and cofactors, many of which had odd bits of ribonucleic acid (RNA) attached to them (Figure 1).

These commonalities all point to common ancestry: all life forms on Earth are related to all others as descendants of a last universal common ancestor (LUCA) that lived perhaps 3 billion years ago. LUCA, it is inferred, also had these DNA, RNA, proteins, and RNA cofactors.¹ Also inferred is the view that any diversity in modern biomolecular structure reflects drift away from that common biochemistry, adaptation of these ancestral biomolecules within a particular species faced with a new environment, or invention of new molecules by individual lineages to meet the particular challenges of the environment or ecosystem that species in that lineage came to inhabit.

Genome sequencing has reinforced these conclusions. Indeed, the similarities in gene and protein sequences are often evident to the untrained eye and cannot be explained in any way other than by common ancestry. This is quite striking in the ribosome, the machine that makes proteins. Ribosomes are clearly homologous in all modern life forms and must have descended from a ribosome made by LUCA. Further, the RNA parts of the ribosome, not its protein parts, are evidently responsible for the biosynthesis of proteins. This suggests a historical statement: RNA came before encoded proteins. Further, extrapolating to the big picture, this and the presence of RNA cofactors (whose RNA bits make sense as handles for RNA catalysts like that found in the core of the ribosome) suggest an “RNA World”, an episode of natural history on Earth where RNA was the sole genetically encoded component of biological catalysis.

Faced with a large volume of sequence data, the librarians of biology have established grand catalogs to collect the limited number of gene families (compared to the number of species) in a form that can be searched. These catalogs differ from Linnaeus’ historic catalog of the macroscopic biosphere but only in their extent and method of archiving.

© 2017 American Chemical Society



Received: September 30, 2016. Published: March 21, 2017.
DOI: 10.1021/acs.accounts.6b00496. Acc. Chem. Res. 2017, 50, 498−502

UNIFYING THE PALEONTOLOGICAL, GEOLOGICAL, AND MOLECULAR RECORDS OF LIFE ON EARTH

This contrast between chemical commonality reflecting common ancestry and biological diversity reflecting the diversity of terran environments sets the stage for one of the grandest challenges in science: To unite the descriptions of life from the molecular sciences with the descriptions of life from natural history. The goal of this union is to create a planetary model that describes, rationalizes, and extracts principles from the one example of life that we can access. That planetary model would cross-reference what we can extract from natural historical records (geological, paleontological, ecological) with molecular records. But it must also include the molecular reactivity that is understood by physical organic chemists. These two broad intellectual traditions must work together for each to teach the other.

The Earth, the Solar System, and the surrounding cosmos have influenced the life that we know, including its molecular structures. Conversely, on Earth and soon to be true throughout the Solar System, life has influenced the structure of the planets. As just one example, life ∼2.5 billion years ago learned to split water to reduce carbon dioxide, extruding dioxygen into the planetary atmosphere as waste. By doing so, life transformed the oxidation state of Earth’s near surface, changing patterns of erosion, deposition, and tectonics. These changes then fed back into biology by creating new bioenergy sources, new ecological niches, and (eventually) the multicellularity that we call “advanced” life.

This back-and-forth interaction between the biosphere and geosphere drove physiological and molecular adaptation, subject to constraints of chemistry and time. If time was sufficient to search molecular structure space, the adapted biomolecules came to reflect optimal molecular solutions to the challenges presented by new environments. If time was insufficient in the face of rapid environmental change, the adapted biomolecules reflect only locally optimal solutions to those challenges. In some cases where structure was densely interwoven within biology, biomolecular structures did not change at all but rather retained their vestigial forms that reflect constraints of times past, including the time when Darwinism first emerged on an abiotic world. To make things even more complicated, when environments did not change, features of biomolecular structures that had no impact on fitness nonetheless changed, drifting unconstrained by natural selection. A planetary model would distinguish among these as the dominant explanation for every biological structure.

Figure 1. RNA cofactors with the RNA portion in orange and the functional portion in black, together (in blue) with a replacement that performs the same role without the RNA tag. The RNA cofactors are found universally in modern terran biology and therefore are inferred to have been present in the last universal common ancestor (LUCA); the blue replacements are not. Also present universally is biotin (at the right), which notably lacks an RNA portion.



A TROVE OF INFORMATION ABOUT MOLECULAR REACTIVITY

Such a model would be a tremendous resource for chemistry: more than three billion years of experiments in molecular structure and molecular behavior unconstrained by human hypothesis. The absence of this constraint is important. Hypotheses, when chosen by scientists, are conservatively constructed to achieve particular outcomes. Sensible scientists avoid hypotheses that challenge core paradigms; if they do not, their journal editors or funding agencies do. In contrast, natural history provides unscripted experiments for the biological chemist.

Further, the statement, “This molecular structure is optimal for solving this particular problem in molecular reactivity in this particular environment” is a remarkably strong statement about structure−function relations, one difficult to extract by hypothesis-directed research. Further, such statements distinguish interesting from uninteresting research problems in biomolecular chemistry. It makes no sense to devote the armamentarium of modern molecular science to study properties of biomolecules that are drifting neutrally; this is like studying Picasso with an electron microscope. Likewise, it makes no sense to view as optimal a biomolecular trait that is in fact poorly optimized because of the constraints of time and chemistry.

Indeed, many biomolecular structures seem to be bad solutions to the challenges of biology. Consider, for example, the enzyme that fixes carbon dioxide from the atmosphere (ribulose bisphosphate carboxylase). Perhaps because it arose in an atmosphere of CO2 lacking dioxygen, it still fails to prevent O2 from destroying half of its substrate. As one Darwinian hack to mitigate this problem, some plants pump CO2 to increase its local concentration over the concentration of O2. This hack illustrates the impotence of the Darwinian search strategy when it is obstructed by real chemistry. If we could understand this, we would understand another slice of molecular reactivity. However, chemists need to understand the historical constraints that shaped the chemistry that they observe; this is the only way that we can be taught by natural history.

THE DIFFICULTIES OF UNIFYING MOLECULAR SCIENCE WITH NATURAL HISTORY

The unification of natural history with the molecular sciences, however necessary, is a “grand” challenge not just because these two traditions are disparate in their terminology, techniques, and tactics. More seriously, they are also incongruent in their philosophies. Many physical scientists do not think that natural history is “science” at all. To them, natural history is descriptive (Rutherford used the phrase “stamp collecting”), irreproducible in the laboratory, and contingent rather than universal in its theories. Conversely, natural historians do not quite accept that molecular scientists are studying “life” when the first step of their research program kills the thing that is alive.

To get molecular scientists involved in natural history, research strategies are needed that bring “the” experimental method to bear on hypotheses from natural history. To get natural historians involved in molecular science, strategies are needed to connect the molecules to Darwinian fitness. Several approaches are available that do both. Let us consider just one.

This approach works backward in time, beginning with the structures of modern biomolecules in organisms living today. It then extracts rules about biomolecular change by analyzing those structures. It then applies these rules to infer the structures of ancestral biomolecules that were present in past, now extinct, life forms. At one level, this process is intuitive. For example, if ATP is present in all modern life forms (as, indeed, it is, at least for the ones we know about), a rule of “parsimony” implies that it was present in the last universal common ancestor (LUCA) of that life (Figure 2). The same can be said for the nicotinamide and riboflavin RNA cofactors, coenzyme A, and S-adenosylmethionine. It also appears to be true for non-RNA cofactors such as biotin.

But inference need not stop at molecular structure. Each of these cofactors has a particular chemical reactivity, such as phosphate group transfer, oxidation and reduction, Claisen carbon−carbon bond formation, methyl group transfer, and carbon dioxide management. If LUCA had these cofactors, we can infer that LUCA must also have had a metabolism that transferred phosphate groups, oxidized and reduced substrates, formed carbon−carbon bonds using Claisen condensations, transferred methyl groups, and managed carbon dioxide.

Figure 2. Presence of the RNA cofactor ATP in all three domains of life implies, under a rule of “maximum parsimony”, that it was present in the last universal common ancestor (LUCA). Its RNA handle, which is incidental to its fitness-useful reactivity, implies that it emerged in the RNA World as a handle for RNA catalysts there. Its presence in both LUCA and the RNA World implies that both had a metabolism that involves phosphate transfer.

Now, the pieces of RNA attached to RNA cofactors have been interpreted as vestiges of the RNA World.² Structurally, this may seem intuitive. However, adding reactivity deepens the interpretive model. For example, Cornelius Visser and Richard Kellogg pointed out in 1978 that while the reactivity of ATP, NADH, flavin, and the other RNA cofactors was easy to reproduce in the laboratory using biomimetic systems designed by chemists, the reactivity of biotin (not an RNA cofactor) was not.³ Further, consistent with a natural history of biotin as a cofactor that emerged only after RNA catalysts appeared, biotin generally works as a covalent adduct with translated proteins.

Visser and Kellogg then suggested that RNA catalysts were primitive compared to protein catalysts; they further suggested that mechanistic enzymologists attempting to mimic biological catalysts are likewise primitive. Thus, they suggested that we should expect that primitive entities (biomimetic chemists) could reproduce the catalysis achieved by primitive historical catalysts (RNA enzymes) with the historical RNA cofactors, but they could not reproduce the sophisticated reactivity of biotin. In their view, reproducing the reactivity of biotin required more sophisticated protein catalysts or, in time to come, more sophisticated biomimetic chemists.

Here is a historical model that defines the limits of molecular theory, itself identifying a “Holy Grail”: Build a molecule that extracts from biotin the reactivity that natural biology does. And remarkably, this analysis of reactivity persuaded biomimetic chemists of the sensibility of the “RNA World” a decade before molecular biologists came to create the term.⁴

But where is the experimental method needed to satisfy molecular scientists? Here, it comes from the magic of biotechnology. As noted by Linus Pauling and Emile Zuckerkandl a half-century ago,⁵ we can infer the sequences of ancestral proteins if we analyze the sequences of many of their descendants with an understanding of how those sequences change. Then, using total gene synthesis and heterologous expression (modern glosses placed atop the words of Pauling and Zuckerkandl, who knew of neither in 1963), the genes for these ancestral biomolecules can be created and the ancient enzymes can be brought back to life. An actual “Jurassic Park” experiment.

Resurrected proteins can then support experiments connecting molecular structure to natural history. For example, natural historians tell us that 40 million years ago, the Earth was much warmer than today. Then, natural historians note that the Earth entered a long-term cooling, leading to the Ice Ages of the last million years, and the relative cold of modern Earth. The plants adapted, with grasses emerging. The herbivores adapted, learning how to eat grasses with ruminant digestion. Bacteria adapted to live in the rumen. These are all contingent “stories”, which any well-trained physical scientist sees as lacking the general theories required by science. But with paleogenetics, the ruminant enzymes have been resurrected, both those that break open rumen bacteria cell walls⁶ and those that then digest the rumen bacterial rRNA.⁷ We can follow the adaptation of these proteins as they moved into the digestive tract to manage this new physiology required by planetary climate change. They became more stable to digestion. They changed their substrate specificity. This taught molecular scientists about how individual amino acids influenced these molecular phenotypes, for sure. However, they also connected those changes to that elusive Darwinian concept of “fitness”, the touchstone to evaluate the significance of any study of biomolecules.

Of course, this approach has practical value. For example, Dan Tawfik at the Weizmann Institute has used ancestral reconstructions to guide his work on protein engineering.⁸ At the same time, he uncovered information that alters our view of molecular replacement rules,⁹ and challenges the conclusions that have been drawn from paleogenetic studies that resurrected biomolecules from organisms that lived perhaps three billion years ago.¹⁰ Paleogenetics is being applied to develop improved protein pharmaceuticals and understand the interrelation between proteins and disease.¹¹ Nevertheless, this combination is transformative, as it makes the huge stretch from molecular behavior to biological fitness.

Paleogenetics has deepened our understanding of other specific cases. For example, Belinda Chang has resurrected ancient visual proteins, tying these to the evolving ecology of lizards, snakes, birds, and dinosaurs.¹² My laboratory has resurrected ancestral yeast and primate enzymes involved in the biosynthesis and metabolism of ethanol, showing that our primate ancestors started to imbibe ∼7 million years ago.¹³ This too was tied to our ancestors’ ecosystem changing at that time.

Much work remains to apply this approach broadly and deeply across the biosphere. On one hand, this seems as though it would involve a lot of “stamp collecting”. Indeed, it will. However, the archiving is easier if the database has a “natural organization”, one that reflects natural history.¹⁴ Much archiving is being done in web pages, such as the “Tree of Life”.¹⁵ However, being assembled by natural historians without the participation of chemists who understand reactivity, these databases are just a start. The goal must be to have paleogenetic analysis of all biomolecular systems set in the light of everything that natural historians can tell us about our past, interconnected from the molecule to the planet.

One feature of this grand challenge is that it has apparent boundaries. Earth has exactly one history. Thus, after we build a complete planetary history of Earth’s biogeochemicopaleosphere, our work will have been done. We need not do another until we conquer space travel to find another planet with life. Of course, the Earth’s planetary biology will then be an essential reference for our exploration of that discovered planet.
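The inference of an ancestral state from its living descendants, whether the presence of a cofactor such as ATP in LUCA or the residue at one site of an ancestral protein, can be illustrated with a minimal sketch of Fitch's small-parsimony rule. This is an illustration only and is not the procedure used in the paleogenetic studies cited in this Commentary, which rely on explicit models of sequence change. The function name `fitch_sets`, the rooting of the three-domain tree, and every observation in the dictionaries are assumptions made for the example.

```python
# Fitch small parsimony on a rooted binary tree (illustrative sketch).
# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).

def fitch_sets(tree, observed):
    """Return (candidate_states, min_changes) for the root of `tree`.

    candidate_states: states at this node that require the fewest changes.
    min_changes: the smallest number of state changes in the subtree.
    """
    if isinstance(tree, str):              # leaf: a living lineage
        return {observed[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_sets(left, observed)
    right_set, right_cost = fitch_sets(right, observed)
    shared = left_set & right_set
    if shared:                             # descendants agree: no extra change
        return shared, left_cost + right_cost
    # descendants disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


# Assumed rooting: Archaea and Eukarya as sister groups, Bacteria outside.
TREE = ("Bacteria", ("Archaea", "Eukarya"))

# Hypothetical observations, chosen only to show the three possible outcomes.
atp = {"Bacteria": "present", "Archaea": "present", "Eukarya": "present"}
replacement = {"Bacteria": "absent", "Archaea": "absent", "Eukarya": "present"}
site = {"Bacteria": "A", "Archaea": "G", "Eukarya": "G"}

for label, data in [("ATP", atp), ("replacement", replacement), ("site", site)]:
    states, changes = fitch_sets(TREE, data)
    print(label, sorted(states), changes)
```

A trait seen in every lineage is placed at the root with no changes required, a trait seen in only one lineage is inferred to be absent at the root, and a split observation leaves the root ambiguous. That ambiguity is one reason practical reconstructions sample many descendants and use explicit models of how sequences change.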



AUTHOR INFORMATION

ORCID

Steven Benner: 0000-0002-3318-9917

Notes

The author declares no competing financial interest.



REFERENCES

(1) Benner, S. A.; Ellington, A. D.; Tauer, A. Modern metabolism as a palimpsest of the RNA world. Proc. Natl. Acad. Sci. U. S. A. 1989, 86, 7054−7058.
(2) White, H. B., III Coenzymes as fossils of an earlier metabolic state. J. Mol. Evol. 1976, 7, 101−104.
(3) (a) Visser, C. M.; Kellogg, R. M. Bioorganic chemistry and the origin of life. J. Mol. Evol. 1978, 11, 163−170. (b) Visser, C. M.; Kellogg, R. M. Biotin. Its place in evolution. J. Mol. Evol. 1978, 11, 171−178.
(4) Gilbert, W. The RNA world. Nature 1986, 319, 618−618.
(5) Pauling, L.; Zuckerkandl, E.; et al. Chemical paleogenetics. Molecular restoration studies of extinct forms of life. Acta Chem. Scand. 1963, 17 (Suppl.), 9−16.
(6) Malcolm, B. A.; Wilson, K. P.; Matthews, B. W.; Kirsch, J. F.; Wilson, A. C. Ancestral lysozymes reconstructed, neutrality tested, and thermostability linked to hydrocarbon packing. Nature 1990, 345, 86−89.
(7) (a) Stackhouse, J.; Presnell, S. R.; McGeehan, G. M.; Nambiar, K. P.; Benner, S. A. The ribonuclease from an extinct bovid. FEBS Lett. 1990, 262, 104−106. (b) Jermann, T. M.; Opitz, J. G.; Stackhouse, J.; Benner, S. A. Reconstructing the evolutionary history of the artiodactyl ribonuclease superfamily. Nature 1995, 374, 57−59.
(8) Goldsmith, M.; Tawfik, D. S. Directed enzyme evolution. Beyond the low hanging fruit. Curr. Opin. Struct. Biol. 2012, 22, 406−412.
(9) Trudeau, D. L.; Kaltenbach, M.; Tawfik, D. S. On the potential origins of the high stability of reconstructed ancestral proteins. Mol. Biol. Evol. 2016, 33, 2633−2641.
(10) (a) Gaucher, E. A.; Thomson, J. M.; Burgan, M. F.; Benner, S. A. Inferring the paleoenvironment of ancient bacteria on the basis of resurrected proteins. Nature 2003, 425, 285−288. (b) Gaucher, E. A.; Govindarajan, S.; Ganesh, O. K. Palaeotemperature trend for Precambrian life inferred from resurrected proteins. Nature 2008, 451, 704−707.
(11) (a) Skovgaard, M.; Kodra, J. T.; Gram, D. X.; Knudsen, S. M.; Madsen, D.; Liberles, D. A. Using evolutionary information and ancestral sequences to understand the sequence−function relationship in GLP-1 agonists. J. Mol. Biol. 2006, 363, 977−988. (b) Tan, P. K.; Gaucher, E. A.; Miner, J. N. Coevolution of a uric acid transporter and uricase: implications for gout. Ann. Rheum. Dis. 2015, 74, 537.
(12) Bloch, N. I.; Morrow, J. M.; Chang, B. S. W.; Price, T. D. SWS2 visual pigment evolution as a test of historically contingent patterns of plumage color evolution in warblers. Evolution 2015, 69, 341−356.
(13) Carrigan, M. A.; Uryasev, O.; Frye, C. B.; Eckman, B. L.; Myers, C. R.; Hurley, T. D.; Benner, S. A. Hominids adapted to metabolize ethanol long before human-directed fermentation. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 458−463.
(14) Benner, S. A.; Chamberlin, S. G.; Liberles, D. A.; Govindarajan, S.; Knecht, L. Functional inferences from reconstructed evolutionary biology involving rectified databases. An evolutionarily-grounded approach to functional genomics. Res. Microbiol. 2000, 151, 97−106.
(15) Tree of Life Home Page. http://tolweb.org/tree/ (accessed Sept 29, 2016).