Doubling Our DNA Building Blocks Could Lead to New Life Forms
If you were to boil all of biology down to a simple equation, it would be that DNA makes RNA, which makes proteins, which are what make every living thing you can see, smell, touch, and taste (and a lot of things you can’t). This central dogma of biology, built on strings of Cs, Gs, As, and Ts, has prevailed since Francis Crick, James Watson, and Rosalind Franklin discovered DNA’s double helix 65 years ago. Now that’s changing as scientists expand the code of life beyond the four letters provided by nature.
On Thursday, researchers unveiled the latest feat in artificial DNA engineering: an eight-letter synthetic system called “hachimoji” DNA. From the Japanese hachi for eight and moji for letter, the system is made up of four natural nucleotides and four synthetic ones that all fit seamlessly into DNA’s helical structure, maintaining its natural shape. Moreover, sequences spelled with these new letters pair predictably, and can evolve just like natural DNA. The research appears in the new issue of Science.
Previously, scientists had expanded the genetic alphabet to six letters, but the latest addition doubles nature's four-letter alphabet, raising the amount of information each position in a DNA strand can encode and testing the limits of molecular information storage. That could have immediate impacts on the nascent DNA data storage industry and NASA’s search for life elsewhere in the solar system. It also represents a big step toward the far-off vision of creating alternative life forms: organisms that use a genetic language unlike the one used by every other creature that evolved here on Earth.
“Biology is optimized to do what it wants to do, not what you want to do,” says Steven A. Benner, a synthetic biologist at the Foundation for Applied Molecular Evolution in Gainesville, Florida, who led the work. For decades he’s been trying to create artificial Darwinian systems, to understand if the four chemical letters nature wound up with became the language of life by simple chance. Was it, as Crick famously posited, merely a “frozen accident”?
“This paper, for the first time, definitively answers that question,” says Floyd Romesberg, a synthetic biologist at Scripps Research Institute in La Jolla, California, who was not involved in the work but who has created an artificial genetic language of his own. “For a long time we’ve had hints that life evolved from G, A, T, C, not because they were exactly the right raw materials but because they were simply available. Steve’s four letters [S, P, Z, B] are, at least in terms of stability, in every way equivalent to nature’s four letters.”
So now the question becomes whether broadening that coincidental code could make DNA even better. Having more letters to work with theoretically allows for totally novel molecules that don’t exist in nature, any of which could be useful for making new materials, diagnosing diseases, or developing new medicines. A four-letter alphabet gives you 64 possible three-letter codons, which yield 20 amino acids, the building blocks of proteins. Six letters takes you up to 216 codons; eight makes it 512. But that’s mostly meaningless unless someone creates the cellular machinery capable of reading hachimoji and spitting out synthetic proteins with new functions.
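The codon arithmetic is simple counting: a codon is three letters long, so an alphabet of n letters allows n × n × n distinct codons. A minimal Python sketch of that count follows; the function name `codon_count` is hypothetical and chosen only for illustration, and the sketch assumes that codons stay three letters long in an expanded alphabet.

```python
def codon_count(alphabet_size: int, codon_length: int = 3) -> int:
    """Number of distinct codons for a given alphabet size.

    Assumes every position in the codon can hold any letter of the
    alphabet, and that codons keep the natural length of three.
    """
    return alphabet_size ** codon_length


# Natural DNA, a six-letter alphabet, and eight-letter hachimoji DNA.
for letters in (4, 6, 8):
    print(f"{letters} letters -> {codon_count(letters)} possible codons")
```

The count says nothing about how many amino acids those codons could specify. In nature, 64 codons map to only 20 amino acids, so any extra coding room is only potential until there is machinery to read it.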
In the same way that chemists in the middle decades of the 20th century took naturally occurring substances (say, penicillin from a petri dish of “mold juice,” or paclitaxel from the bark of the Pacific yew tree) and tinkered with them to make them work better in human bodies, biochemists are eager to do the same with proteins. With more building blocks and new techniques to direct evolution, whose inventors won last year’s Nobel Prize in chemistry, scientists could give proteins advantageous properties that the 20 amino acids in our bodies don’t make available. You can think of those 20 amino acids as mud bricks. They’re good for building two-story houses. But say you want to make a skyscraper? Good luck.
That’s where biochemists like Andrew Ellington come in. His lab at the University of Texas evolved enzymes that can turn hachimoji DNA into RNA, the first step toward making a protein. He and Benner and their colleagues used those enzymes to make a strand of synthetic RNA that resembles a sequence found in spinach and that glows green when bound to a small molecule, just like its natural counterpart. Benner says they have also made hachimoji RNA that can seek out and bind to liver tumors and breast cancer cells in a petri dish. In the long run, he hopes his hachimoji will prove useful in detecting cancers, viruses, or even environmental toxins.
Benner has more letters waiting in the wings (a K and an X to add to his S, P, Z, and B), and he says he plans to try out his hachimoji in living cells soon. That hurdle should be surmountable. Romesberg of Scripps says he has successfully tested his six-letter alphabet in both human and hamster cells in work supported by the startup he founded, called Synthorx. The early-stage company, which raised $131 million in its IPO in December, is exploring the possibility of tricking semisynthetic cells into evolving proteins or other molecules for fighting cancer and other diseases.
The real challenge, though, is in the massive amounts of biological infrastructure necessary to use new genetic languages. “The more unnatural the language becomes, the more you have to engineer the tools to use it,” according to Benner.
That’s why Ellington sees a more immediate use for the technology in the up-and-coming field of DNA data storage. Large tech firms and startups alike are evaluating whether nucleotides can beat out silicon when it comes to long-term, archival information storage. DNA is famously data-dense, and the arrival of hachimoji just doubled its alphabet, raising what each position in a strand can carry from two bits to three.
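One way to put a number on that capacity is to ask how many bits a single position in a strand can hold. The sketch below is a back-of-the-envelope calculation, not a model of any real storage system. It assumes every letter is equally likely and independent of its neighbors, and it ignores the error correction and sequence constraints a practical scheme would need. The function name `bits_per_base` is hypothetical.

```python
import math


def bits_per_base(alphabet_size: int) -> float:
    """Upper bound on information per position, in bits.

    Assumes letters are chosen uniformly and independently, so each
    position carries log2(alphabet_size) bits.
    """
    return math.log2(alphabet_size)


natural = bits_per_base(4)    # A, C, G, T
hachimoji = bits_per_base(8)  # natural letters plus S, P, Z, B

print(f"natural DNA:   {natural:.1f} bits per base")
print(f"hachimoji DNA: {hachimoji:.1f} bits per base")
print(f"ratio:         {hachimoji / natural:.2f}x")
```

Under those assumptions, doubling the alphabet adds one bit per position, so a strand of the same length holds one and a half times as much data.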
Ellington’s lab is engineering a library of enzymes to read and write not just natural DNA and hachimoji, but any of the variety of alphabets he foresees in the future. “We’re starting to think of it as ‘cryptogenetics,’” he says. The idea is to build the machinery necessary to read and write proprietary DNA languages. With cryptogenetics, IBM could have its own privileged genetic alphabet that no one else could translate. So could China. “An expanded alphabet gives you the opportunity to make bigger, better, stronger, faster things in general.”