AlphaFold: Momentary or Revolutionary?
In the 21st century, we have almost come to expect technologies to cross the boundary from science fiction into science fact. In recent years, artificial intelligence and machine learning have joined their ranks. AI and ML methods have made quite a splash in the world of computational biology. The Alphabet-owned company DeepMind has promised to solve biology’s ‘grand challenge’: using AI to predict how proteins fold, with its AlphaFold technology.
Govinda Bhisetti, Principal Investigator and Head of Computational Chemistry at Biogen (currently Vice President, Computational Chemistry, Cellarity) hosted the panel discussion, The Use of AlphaFold: Something Useful or Complete Hype? at Oxford Global’s Discovery Europe 2022 conference. Bhisetti said that predicting a protein’s structure with experimental-level accuracy would be a huge achievement. “Because then,” he explained, “you wouldn’t have to wait years to get a crystal structure – you could just get it from the AlphaFold database.”
Bhisetti also touched upon several subsequent developments that have used AI/ML methods to solve protein folding since AlphaFold2. “RoseTTAFold have claimed that they can achieve similar accuracy to AlphaFold,” said Bhisetti. Both AlphaFold2 and RoseTTAFold have been released as open source to allow for collaboration between developers. Bhisetti added: “You can run these algorithms yourself to predict a protein’s structure. This enables everybody all over the world to look at these predicted structures.”
Despite the obviously huge advances, such as AlphaFold2 winning CASP14, the method is not perfect, like any other. Bhisetti stated that the software’s structure predictions are not 100% accurate: “it’s more like 70%, which is still very impressive.”
Biases and Limitations of AlphaFold
Joining Bhisetti on the panel was Matthias Frech, Director of Molecular Interactions & Biophysics at Merck Healthcare KGaA. Frech said that he was surprised in the beginning by the power of AlphaFold. “Crystallographers in the department were very much afraid about losing their jobs due to these developments,” he said. “However, we know now that this is definitely not the case.”
Frech has looked into the use of AlphaFold for structural biology and said that he had detected some limitations of the software. “It’s fine to use it for the time-being within known domains and in getting new models,” he confirmed, but pointed out that there were other difficulties. For example, Frech mentioned there were challenges in using AlphaFold to predict full-length proteins, protein complexes, and proteins with increased flexibility.
Frech then raised an interesting open question. “We have so far analysed public domain structures – for example those in the PDB (Protein Data Bank),” he explained. Frech added: “If we look into the history of structural biology, we have been using engineered protein constructs which are prone to crystallisation. So, there is a bias in the models which we derive. Does this in turn mean that there is a bias in the model building by AlphaFold?”
Frech suggested that perhaps the information derived from AlphaFold could be used to get a good model for refining difficult proteins. “We are looking into the capability of AlphaFold-derived information to be used in protein engineering,” continued Frech, adding that it was a huge learning endeavour that he and his team had only just started.
Another advantage that Frech mentioned was that surrogate models can be used for proteins which cannot be crystallised. “What we do in structural biology and biophysics is always bring a package of data together: structural, thermodynamic, pharmacokinetic data, or any other affinity data. We also need to look into how we can support the AlphaFold models with these sorts of data.”
How Much is Just Hype?
Simone Fulle, Head of Molecular Modelling & Design at Novo Nordisk, said that “whether AlphaFold is hype or not depends on your expectation and your use-case.” Fulle explained that Novo Nordisk has a portfolio of peptide and protein drugs in its pipeline, so the technology is well suited to the company’s use-case. However, Fulle was careful to point out that AlphaFold has not been game-changing yet: “Is AlphaFold revolutionising drug design? Probably not. Is it a useful tool? Definitely yes.”
Fulle asked us to consider a broader perspective when thinking about AlphaFold. Her interest in the tool started when AlphaFold-Multimer was launched. AlphaFold-Multimer is an extension of AlphaFold2 that focusses on predicting the structures of protein-protein complexes. “Those weren’t trained on monomers but on dimers which then allowed for the prediction of complex structures,” Fulle explained. Additions to and spin-offs of AlphaFold join the cohort of AI/ML tools by the day; the broader perspective is the underlying idea from which AlphaFold was spawned.
Finding the Limitations of AlphaFold
Michael Wilson, Co-Founder and CEO of DrugBank, said that it had been interesting to see how AlphaFold has managed to produce inferences from existing databases such as the PDB. He went on to comment that the technology had made a significant impact on drug discovery news due to its fast and vast improvements, “but I think there are a lot of lessons that we need to learn from it.”
Wilson said that the use of AlphaFold had reached the stage where researchers were now conducting experiments to push the limits, “to build up trust in this tool in certain areas.” Building up that trust, according to Wilson, requires a greater understanding of the kind of information that researchers need to provide to their models in order to make their predictions.
“It has been interesting to see so many people picking this tool up, using it, and performing experiments with the tool to understand how it works,” added Wilson. Even though AlphaFold was built on knowledge that researchers already had, Wilson said that “once it had been launched, people didn’t fully understand the limitations of the tool.” Wilson suggested that this could be an opportunity – “something to learn for the next versions of tools like AlphaFold would be how can we make tools that help explain those limitations better.”
One Small Step for Machine Learning?
Richard Lewis, Director of Computer Aided Drug Design at Novartis, was optimistic about AlphaFold, although he cautioned against inflated expectations and getting ahead of oneself. “AlphaFold is great. But it’s like the first space missions: Sputnik or Yuri Gagarin’s first manned mission. We may have taken the first step, but that didn’t mean we were going to immediately fly to Mars.”
Lewis said that with hindsight, AlphaFold2 was always going to be successful compared to physics-based methods: “I say that because if you look at the PDB, the number of novel folds that were discovered in structures had been rising and then began to plateau. So, we had started to saturate the data.” That being said, Lewis described AlphaFold as being able to provide a “fascinating insight” into proteins that were once difficult to obtain structural information for. “Now we can just look up those structures in the database.”
However, Lewis still advised caution. He urged researchers to better understand how AlphaFold works, namely via pattern recognition followed by a physics-based method, “and it gives you a structure purely of the amino acids. No co-factors, no metals, no waters,” Lewis said. “Just imagine looking at a novel cytochrome P450 and there’s no heme in there! It makes no sense. A zinc finger with no zinc? We still have to solve those problems to actually turn it into something that might really drive decision making,” Lewis concluded.
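Lewis’s point about missing co-factors is easy to check on any structure file. What follows is a minimal sketch, not a method described by the panel. It assumes the model is available as PDB-format text, where non-polymer groups such as heme, metal ions and waters are stored in HETATM records and the residue name sits in columns 18 to 20. The function names and the sample fragments are hypothetical and written by hand for illustration; they are not real database entries.

```python
from collections import Counter


def count_heteroatom_groups(pdb_text: str) -> Counter:
    """Count non-polymer groups (HETATM records) by residue name.

    Each group is counted once, however many atoms it contains.
    """
    groups = Counter()
    seen = set()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # PDB columns 18-20
        chain_id = line[21:22]          # PDB column 22
        res_seq = line[22:26].strip()   # PDB columns 23-26
        key = (res_name, chain_id, res_seq)
        if key not in seen:
            seen.add(key)
            groups[res_name] += 1
    return groups


def pdb_line(record, serial, atom, res, chain, seq):
    """Build the first 26 columns of a PDB coordinate record (illustrative helper)."""
    return f"{record:<6}{serial:>5} {atom:<4} {res:>3} {chain}{seq:>4}"


# Hypothetical fragment shaped like a predicted model: amino acids only.
predicted_like = "\n".join([
    pdb_line("ATOM", 1, "N", "CYS", "A", 1),
    pdb_line("ATOM", 2, "CA", "CYS", "A", 1),
])

# Hypothetical fragment shaped like an experimental entry: a zinc ion and a water.
experimental_like = "\n".join([
    pdb_line("ATOM", 1, "N", "CYS", "A", 1),
    pdb_line("ATOM", 2, "CA", "CYS", "A", 1),
    pdb_line("HETATM", 101, "ZN", "ZN", "A", 201),
    pdb_line("HETATM", 102, "O", "HOH", "A", 301),
])

print(dict(count_heteroatom_groups(predicted_like)))
print(dict(count_heteroatom_groups(experimental_like)))
```

An empty result for a zinc finger or a cytochrome P450 model would be a prompt to place the metal or heme by other means before relying on the structure.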