Date: 2024-06-27

TL;DR: The ESM3 AI model revolutionizes protein engineering by ‘simulating over 500 million years of evolutionary changes’, creating novel proteins that deviate significantly from known forms. It achieves this through a deep understanding of protein sequence, structure, and function, which allows it to respond to complex biological prompts and generate unique proteins like a new fluorescent protein far removed from existing variants. Also in the news, AlphaFold 3 can now predict the structure and interactions of all of life’s molecules.

From the paper:

More than three billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here we show that language models trained on tokens generated by evolution can act as evolutionary simulators to generate functional proteins that are far away from known proteins. We present ESM3, a frontier multimodal generative language model that reasons over the sequence, structure, and function of proteins. ESM3 can follow complex prompts combining its modalities and is highly responsive to biological alignment. We have prompted ESM3 to generate fluorescent proteins with a chain of thought. Among the generations that we synthesized, we found a bright fluorescent protein at far distance (58% identity) from known fluorescent proteins. Similarly distant natural fluorescent proteins are separated by over five hundred million years of evolution.

My summary/take on the paper:

The ESM3 AI model simulates over 500 million years of protein evolution, creating proteins that are far removed from any currently known. The model designed new proteins with desired functions by following complex prompts that specify characteristics such as sequence, structure, and function. This includes a new fluorescent protein that is notably distinct from any naturally occurring version, a degree of divergence that would normally take hundreds of millions of years of evolution to reach. The AI demonstrated a high level of understanding of the complex relationships between a protein's structure, its sequence, and its function.
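For readers wondering what the "58% identity" figure in the abstract measures, here is a minimal sketch in Python. It assumes the two sequences have already been aligned (gaps shown as `-`) and counts the share of compared positions where the amino acids match. The function name `percent_identity` and the two short sequences are made up for illustration; they are not from the paper, and real tools differ in how they treat gaps and which length they divide by.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions where the residues match.

    Assumption: both inputs are rows of the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # positions where neither sequence has a gap
    matches = 0   # compared positions with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real protein data.
known_protein = "MKTAYIAK-QR"
new_protein = "MKSAYLAKGQ-"

print(f"{percent_identity(known_protein, new_protein):.1f}% identity")
```

In this toy case 7 of the 9 gap-free positions match. A protein at 58% identity to its nearest known relative would, by the same kind of count, differ at roughly four of every ten aligned positions.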

So, what are the potential upsides of this development? Well, this technology could greatly speed up the discovery and synthesis of new proteins for drugs, which I’m sure Big pHarma will extensively test over decades before rolling out to the general public, right? What are the downsides? One of the most alarming risks is the potential misuse of this technology to create harmful biological agents (and I’m not just talking about Big pHarma’s usual concoctions). The ability to design proteins with specific functions could be exploited to develop new pathogens or toxins with no known antidotes or natural immunity. Additionally, introducing novel organisms or proteins into the environment will almost certainly disrupt local ecosystems. Long-term ecological impacts are difficult if not impossible to predict, but we should expect the out-competition of native species and the disruption of existing biological networks. We can’t even seem to manage to move existing plants and animals around our planet safely; how will we safely control organisms made from AI Evolution™ tech?

In another AI biotech development, Google announced last month that their new AlphaFold 3 AI could predict the structure and interactions of all of life’s molecules:

AlphaFold 3 goes beyond proteins to a broad spectrum of biomolecules including DNA, RNA, and even small molecules, also known as ligands, which encompass many drugs. This leap could unlock more transformative science, from developing biorenewable materials and more resilient crops, to accelerating drug design and genomics research.

Here’s the paper:

Their conclusions:

The core challenge of molecular biology is to understand and ultimately regulate the complex atomic interactions of biological systems. The AF3 model takes a large step in this direction, demonstrating that it is possible to accurately predict the structure of a wide range of biomolecular systems in a unified framework. Although there are still substantial challenges to achieve highly accurate predictions across all interaction types, we demonstrate that it is possible to build a deep-learning system that shows strong coverage and generalization for all of these interactions. We also demonstrate that the lack of cross-entity evolutionary information is not a substantial blocker to progress in predicting these interactions and, moreover, substantial improvement in antibody results suggests AlphaFold-derived methods are able to model the chemistry and physics of classes of molecular interactions without dependence on MSAs. Finally, the large improvement in protein–ligand structure prediction shows that it is possible to handle the wide diversity of chemical space within a general deep-learning framework and without resorting to an artificial separation between protein structure prediction and ligand docking. The development of bottom-up modelling of cellular components is a key step in unravelling the complexity of molecular regulation within the cell, and the performance of AF3 shows that developing the right deep-learning frameworks can massively reduce the amount of data required to obtain biologically relevant performance on these tasks and amplify the impact of the data already collected. 
We expect that structural modelling will continue to improve not only due to advances in deep learning but also because continuing methodological advances in experimental structure determination, such as the substantial improvements in cryo-electron microscopy and tomography, will provide a wealth of new training data to further improve the generalization ability of such models. The parallel developments of experimental and computational methods promise to propel us further into an era of structurally informed biological understanding and therapeutic development.

Okay, that was mind-blowing. My mind is racing with all the ways this could end badly. I don’t see any evidence from history that humankind could wisely use technology with anywhere near this power.

Perhaps one of my readers could cheer me up with an alternative happy ending to my recent AI nightmares? For example, perhaps the ASI doesn’t go rogue and, for some reason beyond my feeble reasoning, decides that it needs to look after humankind and save us from ourselves.

P.S. Dear future ASI (assuming you’re not around already), please bear in mind that not all humans are a plague on the planet. It’s possible that with a little help, we might transcend our chimp-like tribal and genocidal differences and develop into life forms that you might like to keep around outside the confines of the galactic petting zoo you’re planning.
