Just pointing out some facts here as found in
https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology.
The way AI learns:
"...We trained this system on publicly available data consisting of ~170,000 protein structures from the protein data bank together with large databases containing protein sequences of unknown structure. It uses approximately 128 TPUv3 cores (roughly equivalent to ~100-200 GPUs) run over a few weeks, which is a relatively modest amount of compute in the context of most large state-of-the-art models used in machine learning today. As with our CASP13 AlphaFold system, we are preparing a paper on our system to submit to a peer-reviewed journal in due course..."
The software uses a protein database, and the structures of these proteins had been determined beforehand, experimentally in the laboratory, using NMR spectroscopy, cryo-electron microscopy (cryo-EM), and X-ray crystallography.
Something about the code:
"...A folded protein can be thought of as a “spatial graph”, where residues are the nodes and edges connect the residues in close proximity. This graph is important for understanding the physical interactions within proteins, as well as their evolutionary history. For the latest version of AlphaFold, used at CASP14, we created an attention-based neural network system, trained end-to-end, that attempts to interpret the structure of this graph, while reasoning over the implicit graph that it’s building. It uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph...By iterating this process, the system develops strong predictions of the underlying physical structure of the protein and is able to determine highly-accurate structures in a matter of days. Additionally, AlphaFold can predict which parts of each predicted protein structure are reliable using an internal confidence measure."
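The "spatial graph" described in the quote can be illustrated with a toy contact map: residues are nodes, and an edge joins any pair of residues closer than some cutoff distance. Everything below (the coordinates, residue names, and the 8 Å cutoff) is invented for illustration; it is not AlphaFold's actual representation, which is learned end-to-end.

```python
import math

# Hypothetical C-alpha coordinates (angstroms) for a 4-residue chain.
# R4 "folds back" near R2, creating a non-sequential contact.
coords = {
    "R1": (0.0, 0.0, 0.0),
    "R2": (3.8, 0.0, 0.0),
    "R3": (11.4, 0.0, 0.0),
    "R4": (3.8, 3.0, 0.0),
}

CUTOFF = 8.0  # a commonly used contact-map threshold, in angstroms

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

names = list(coords)
# Edge = any residue pair within the cutoff distance.
edges = [
    (u, v)
    for i, u in enumerate(names)
    for v in names[i + 1:]
    if dist(coords[u], coords[v]) <= CUTOFF
]
print(edges)  # R1-R3 and R3-R4 are too far apart to be connected
```

Note how the folded-back residue R4 gains edges to both R1 and R2 even though it is not adjacent to them in sequence; contacts like these are what make the graph informative about 3D structure rather than just chain order.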
"...A major challenge, however, is that the number of ways a protein could theoretically fold before settling into its final 3D structure is astronomical. In 1969 Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation – Levinthal estimated 10^300 possible conformations for a typical protein. Yet in nature, proteins fold spontaneously, some within milliseconds – a dichotomy sometimes referred to as Levinthal’s paradox..."
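Levinthal's counting argument is easy to reproduce as back-of-the-envelope arithmetic. The numbers below (150 residues, 3 conformations per residue, one conformation sampled per picosecond) are illustrative assumptions, far more conservative than Levinthal's 10^300, yet exhaustive enumeration still dwarfs the age of the universe:

```python
# Toy version of Levinthal's argument with deliberately modest assumptions.
residues = 150              # a typical small protein
states_per_residue = 3      # assumed backbone conformations per residue
conformations = states_per_residue ** residues  # 3^150, about 3.7e71

samples_per_second = 1e12   # assume one conformation per picosecond
seconds_needed = conformations / samples_per_second
age_of_universe_s = 4.3e17  # ~13.8 billion years, in seconds

print(f"{conformations:.1e} conformations")
print(f"{seconds_needed / age_of_universe_s:.1e} x the age of the universe")
```

Even with only 3 states per residue, a brute-force search needs on the order of 10^41 universe lifetimes, which is exactly why the spontaneous millisecond folding seen in nature implies a guided search rather than enumeration.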
"Levinthal himself was aware that proteins fold spontaneously and on short timescales. He suggested that the paradox can be resolved if "protein folding is sped up and guided by the rapid formation of local interactions which then determine the further folding of the peptide; this suggests local amino acid sequences which form stable interactions and serve as nucleation points in the folding process".[4] Indeed, the protein folding intermediates and the partially folded transition states were experimentally detected, which explains the fast protein folding. This is also described as protein folding directed within funnel-like energy landscapes.[5][6][7] Some computational approaches to protein structure prediction have sought to identify and simulate the mechanism of protein folding.[8]"
https://en.wikipedia.org/wiki/Levinthal's_paradox
After the AI has learned as much as it can about folding and unfolding, the best it can do is attempt to speed up this process via simulation.
The major issue with AI, and with simulations in general, is that the results will only be accurate IF all of the relevant physical, chemical, and kinetic attributes and interactions are known.
I do NOT think they are yet known.