Pioneering AI in Molecular Biology: Featuring Nobel Laureate David Baker
By Gillian Dohrn
David Baker is a busy man. His lab hosts approximately 100 graduate students and postdocs, and Baker advises all of them. After he won the Nobel Prize in chemistry last year, Baker wasted no time basking in the glory of his accolades. Instead, he warned his students that Nobel Prize labs often experience efficiency lags post-award, and urged them to stay focused.
“I think it’s that mindset that gets you a Nobel Prize,” said Avery Yang, a first-year PhD student in Baker’s lab.
That and brilliant science, which Baker will elaborate on in his closing keynote address at the upcoming Keystone Symposia meeting, AI in Molecular Biology, taking place September 15-18 in Santa Fe, New Mexico.
Baker is a professor of biochemistry at the University of Washington and director of the Institute for Protein Design. He has published more than 600 papers, co-founded 21 companies, and holds more than 100 patents. Once obsessed with the so-called protein folding problem, Baker now dedicates most of his time to synthesizing new molecules using a combination of computational and experimental techniques.
Protein design is just one division of molecular biology that benefits from advances in artificial intelligence (AI). The process involves reverse-engineering molecules based on a desired structure, which determines a protein’s function, and often involves a lot of trial and error. AI can help researchers get close to the answer before setting foot in the lab.
Before AI learned how proteins fold, scientists, including Baker, agonized over what he calls the protein folding problem: they knew how to decode the sequence of amino acids forming the backbone of a protein but could not predict how it would look in three dimensions.
Image Credit: David Baker Lab Website
Then, in 2018, Google DeepMind debuted AlphaFold, an algorithm that did just that. The subsequent generations were even more precise. In 2021, just after AlphaFold2 came out, but before Google released the code, Baker presented a rival algorithm, called RoseTTAFold. He had first debuted a rudimentary version of the tool in 1998, but AlphaFold inspired a powerful redesign. The latest version, RoseTTAFold All-Atom, models not just proteins, but interactions between different molecules in the cell.
Now, students in Baker’s lab use these platforms when they begin tinkering with a new protein design. Experiments progress faster because they can do so much of the initial legwork on the computer. Traditional protein science is slow and laborious. To capture structure, scientists isolate a protein, shoot it with X-rays, and construct a model based on the diffraction pattern.
“People will spend their entire PhD studying one protein family to really understand how it works,” said Yang, whose background is in X-ray crystallography. You lose some precision with computational methods but gain enormous efficiency. “You’re not limited by things that already exist. If you want something, just make it,” she added.
Baker, delighted by his students' creativity, calls the lab a “communal brain.”
His students are tackling diseases such as cancer, diabetes and Ebola. They are synthesizing proteins that can break down plastic, draw greenhouse gases out of the atmosphere and neutralize venomous snake bites. Some students have a goal in mind, while others are taking it one step at a time.
Image Credit: David Baker Lab Website
“I think everything does start as advancing science for the sake of advancing science,” said Alex Shida, an MD/PhD student who is working on developing “little protein machines that walk around” and can be dispatched to do different tasks in the cell. Potential applications for this technology will become more apparent in time, but Shida imagines the mini-motors breaking up plaque deposits in arteries or clearing protein aggregates in the brain.
Baker shares a similar sentiment about the future of the field.
“I don’t even try and predict where things will be 3 years from now,” he said. “I try to have some intuition about what will be interesting over the next 3 months.”
Each person who joins the lab brings fresh perspectives and new ideas. Lab members collaborate with researchers nationwide, sharing knowledge and expertise. Events like the upcoming Keystone Symposia conference offer a gathering space for these researchers to compare notes and strike up collaborations that lead to new discoveries. The meeting will draw rising stars and veteran experts alike to connect and inspire new applications and advances.
“It's a fun opportunity to match problem space and method application,” said David Kelley, one of the meeting organizers and a principal investigator at Calico Labs. Kelley and Baker both noted that researchers now have unprecedented access to powerful technology and rich biological datasets. The meeting will offer an opportunity to apply these tools to new problems and further develop them for tailored applications.
The insights and applications explored at the meeting go far beyond protein design. Other sessions will explore how AI can help decipher the non-coding genome (previously known as ‘junk DNA’ but since discovered to be far from useless) and provide insight into different biological processes, such as cellular interactions. In recent years, advances in spatial omics technology have allowed researchers to visualize cells and components of cells in their native habitat, both in still imaging and video formats. AI and machine learning have expanded our ability to extract meaning from this data to better understand biological processes and functions from the subcellular to tissue level.
Other speakers will describe clinical applications for AI and machine learning. Barbara Engelhardt, a professor of biomedical data science at Stanford University, will give the opening keynote address on capturing and interpreting the behavior of living cells using machine learning. Engelhardt’s research focuses on modeling cellular communication in healthy and diseased tissues, including tumor tissue, using algorithms to inform clinical care decisions.
This meeting will unite researchers using common methods to tackle unique projects. For example, during the session on integrating AI and systems biology, Hector Garcia Martin will explain how to bioengineer cells to make renewable biofuels using machine learning, synthetic biology, automation and math. There will also be discussions with a panel of experts from academia and industry on how to deploy AI responsibly and appropriately. As AI continues to expand what is possible in molecular biology, this meeting will give scientists a chance to remain abreast of the latest developments and work together to chart a course moving forward.
Learn how to integrate AI tools to accelerate and advance your work! Join us in Santa Fe for this exciting meeting!
Important Deadlines:
Early Registration Deadline: July 17, 2025
Scholarship Deadline: May 20, 2025
Short Talk Abstract Deadline: May 20, 2025
Poster Abstract Deadline: August 21, 2025