Installing AlphaFold2 on Apple Silicon

AlphaFold2 is an artificial intelligence (AI) program developed by Alphabet’s/Google’s DeepMind which predicts protein structure. Despite the name, AlphaFold2 does not actually predict the folding mechanism; instead it predicts the final 3D structure of a protein from the protein sequence (DOI).

Source code for the AlphaFold model, trained weights and inference script are available under an open-source license at https://github.com/deepmind/alphafold.

It is possible to get easy access to AlphaFold2 via a Google Colab notebook (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb); however, there is a 2 hour timeout, and in my testing many of the runs timed out.

Fortunately it is possible to run the notebook locally on your machine, as written up in a brilliant description by Yoshitaka Moriwaki (https://github.com/YoshitakaMo/localcolabfold).

Installing LocalColabFold

There are instructions for multiple platforms, but I thought I’d show details and pictures for installing on Apple Silicon. I’m using a MacBook Pro M1 Max with 64 GB memory under macOS 12.1.

First install Homebrew if it is not already installed (Homebrew is a free and open-source software package management system that simplifies the installation of software on Macs).

Then install a couple of packages

The next step is to create a folder called Alphafold then in the Terminal type

to enter the newly created folder, and then install Miniconda using Homebrew

Then download the colabfold download/install script

You should now have a file called install_colabbatch_M1mac.sh

After a few minutes a new folder should have been created as shown below.

When I tried to run the program I got an error saying SciPy was not installed, so I installed it using the colabfold conda

This has been corrected in the latest commit https://github.com/YoshitakaMo/localcolabfold/issues/55

To generate a 3D protein structure you need a protein sequence in FASTA format.

These can be obtained from the UniProt database, for example human free fatty acid receptor 2 https://www.uniprot.org/uniprot/O15552

>sp|O15552|FFAR2_HUMAN Free fatty acid receptor 2 OS=Homo sapiens OX=9606 GN=FFAR2 PE=1 SV=1
MLPDWKSSLILMAYIIIFLTGLPANLLALRAFVGRIRQPQPAPVHILLLSLTLADLLLLL
LLPFKIIEAASNFRWYLPKVVCALTSFGFYSSIYCSTWLLAGISIERYLGVAFPVQYKLS
RRPLYGVIAALVAWVMSFGHCTIVIIVQYLNTTEQVRSGNEITCYENFTDNQLDVVLPVR
LELCLVLFFIPMAVTIFCYWRFVWIMLSQPLVGAQRRRRAVGLAVVTLLNFLVCFGPYNV
SHLVGYHQRKSPWWRSIAVVFSSLNASLDPLLFYFSSSVVRRAFGRGLQVLRNQGSSLLG
RRGKDTAEGTNEDRGVGQGEGMPSSDFTTE

Save the file as ffa2.fasta
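Before starting a long run it can be worth checking that the saved file really is a clean single-record FASTA file. The following is only a minimal Python sketch of such a check; the function names read_single_fasta and unexpected_residues are my own and are not part of ColabFold.

```python
from pathlib import Path

# The 20 standard amino acid one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_single_fasta(path):
    """Return (header, sequence) for a FASTA file holding exactly one record."""
    header = None
    chunks = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                raise ValueError("more than one record found")
            header = line[1:]
        else:
            if header is None:
                raise ValueError("sequence data found before the header line")
            chunks.append(line.upper())
    if header is None:
        raise ValueError("no header line starting with '>' found")
    return header, "".join(chunks)


def unexpected_residues(sequence):
    """Return the sorted characters that are not standard amino acid codes."""
    return sorted(set(sequence) - STANDARD_AMINO_ACIDS)
```

For example, calling read_single_fasta("ffa2.fasta") and passing the returned sequence to unexpected_residues should give an empty list if the file was saved correctly. Note that this check is meant for single-chain input only.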

We can now run a prediction thus

You will get warnings about this being minimally tested on ARM machines

On your first run the AlphaFold2 weight parameters will be downloaded to the ~/Library/Caches/colabfold/params directory; in subsequent runs these will not be downloaded again.
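If you want to confirm that the download has happened, a short Python sketch like the one below lists whatever is in that directory. The path is the one quoted above; the function name cached_params is hypothetical, and I make no assumption about what the individual files are called.

```python
from pathlib import Path

# Cache location quoted in the text; adjust it if your install differs.
DEFAULT_PARAMS_DIR = Path.home() / "Library" / "Caches" / "colabfold" / "params"


def cached_params(params_dir=DEFAULT_PARAMS_DIR):
    """Return a sorted list of (file name, size in bytes) for the cache directory.

    An empty list means the directory is missing or holds no files yet.
    """
    params_dir = Path(params_dir)
    if not params_dir.is_dir():
        return []
    return sorted(
        (entry.name, entry.stat().st_size)
        for entry in params_dir.iterdir()
        if entry.is_file()
    )
```

An empty result before the first run and a non-empty one afterwards is what you would expect to see.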

You should now have an output like this.

This folder contains various files. The “env” folder contains the templates used, and the log file contains the timings for each of the models. There are also the unrelaxed PDB files, which are the direct output from the models, and PDB format text files containing the predicted structure after an Amber relaxation procedure has been performed on the unrelaxed structure prediction, plus the images shown below.

If you open the PDB files in a viewer like ChimeraX you can display the structure as shown below. The pLDDT confidence measure is stored in the B-factor field of the output PDB files so you can colour by B-factor in ChimeraX to get a visual representation (red is high confidence, blue is low confidence).
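Because the pLDDT values sit in the B-factor column, you can also pull them out as numbers. Here is a rough Python sketch that reads the value from each alpha carbon using the standard fixed-width PDB columns; the function names are my own, and it assumes a plain PDB file with ATOM records like those written by the prediction.

```python
def plddt_per_residue(pdb_path):
    """Return {(chain, residue number): pLDDT} read from the CA atoms."""
    scores = {}
    with open(pdb_path) as handle:
        for line in handle:
            # Fixed-width PDB columns (1-based): atom name 13-16, chain 22,
            # residue number 23-26, B-factor 61-66.
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                chain = line[21]
                residue_number = int(line[22:26])
                scores[(chain, residue_number)] = float(line[60:66])
    return scores


def mean_plddt(scores):
    """Average the per-residue values returned by plddt_per_residue."""
    if not scores:
        raise ValueError("no CA atoms found")
    return sum(scores.values()) / len(scores)
```

Passing one of the output PDB files to plddt_per_residue and then the result to mean_plddt gives a single number that is handy for comparing the models.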

I got a couple of tips from Yoshitaka:

I recommend adding the --model-order 1,2,3,4,5 argument to reduce the calculation time when one uses --templates. By default, two JAX compilations are required when starting the calculation for model 3 and model 1.
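If you script your runs, the tip can be folded into the argument list. The Python sketch below only builds the list and does not run anything. It assumes the executable is called colabfold_batch and takes the input file followed by the output directory as positional arguments; that is my assumption, so check it against the help output of your install. The output directory name ffa2_out is made up for the example.

```python
def build_command(input_fasta, output_dir, use_templates=True):
    """Build an argument list for a prediction run (sketch only, nothing is run).

    Assumption: the executable is 'colabfold_batch' and accepts the input file
    and the output directory as positional arguments.
    """
    command = ["colabfold_batch"]
    if use_templates:
        # Per the tip above, fix the model order when templates are used.
        command += ["--templates", "--model-order", "1,2,3,4,5"]
    command += [str(input_fasta), str(output_dir)]
    return command


if __name__ == "__main__":
    print(" ".join(build_command("ffa2.fasta", "ffa2_out")))
```

The resulting list can be handed to subprocess.run once you are happy that it matches the options your version accepts.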

And 

Preparing the input file for complex prediction is a bit more complicated and differs from that of the original AlphaFold. Here is an example for localcolabfold:

>3kud_complex
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLC
VFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLV
REIRQH:
PSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARLDWNTDAASLIGEELQVDF
L

This input fasta file (3kud_complex.fasta) will produce complex structures.
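The pattern in the example is a single header followed by the chains, with a colon marking the end of each chain except the last. If you have the chains as separate sequences, a small Python sketch like this one can assemble the record; the function name complex_fasta and the 80 character line width are my own choices, picked to mirror the example.

```python
def complex_fasta(name, chains, width=80):
    """Format several chains as one FASTA record with ':' between the chains."""
    if not chains:
        raise ValueError("at least one chain is required")
    lines = [">" + name]
    for index, chain in enumerate(chains):
        sequence = "".join(chain.split()).upper()  # drop spaces and line breaks
        is_last = index == len(chains) - 1
        body = sequence if is_last else sequence + ":"
        # Wrap each chain separately so every chain starts on a new line.
        lines.extend(body[i:i + width] for i in range(0, len(body), width))
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file, for example with open("3kud_complex.fasta", "w"), gives an input file laid out like the example above.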

List of tools tested https://macinchem.co.uk/software-reviews/cheminformatics-and-compchem-on-apple-silicon/

Last Updated 14 Feb 2022