Run AlphaFold from a notebook

Run AlphaFold via a notebook example

Prior reading: Use cloud apps for analysis

Purpose: This document provides detailed instructions for setting up and running a Verily Workbench notebook instance using the AlphaFold container image.

AlphaFold is a groundbreaking neural network-based model from DeepMind for predicting protein structure. The source code for AlphaFold v2.0 is available in DeepMind's AlphaFold repository on GitHub.

A simplified version of AlphaFold has been packaged as a container image for a Vertex AI Workbench notebook instance (used by Verily Workbench under the hood), along with an example notebook. The prebuilt container image lets you get started without much additional installation. The notebook shows how to predict the structure of one or more proteins. For most targets, this method produces predictions that are nearly identical in accuracy to those of the full version.

This tutorial walks you through the process of setting up a Workbench notebook instance using the AlphaFold custom container image, and running the example notebook.

A blog post accompanies the example notebook. To learn more about how to correctly interpret these predictions, see the “Using the AlphaFold predictions” section of the post. The Supplementary Information article provides a more detailed description of the method.

Create a Workbench notebook instance

The AlphaFold custom container, created for the Vertex AI Workbench, is here: us-west1-docker.pkg.dev/cloud-devrel-public-resources/alphafold/alphafold-on-gcp:latest

After you’ve installed and configured the Workbench command-line tool, run the following command to create a new notebook instance. The arguments select the AlphaFold container image and configure the notebook instance with an n1-standard-8 machine type (8 cores) and one NVIDIA Tesla V100 GPU.

In the following command, af_test is the Workbench resource name. You may want to change the value of the instance-id argument, af-202203, or you can omit the argument and let the system generate an ID for you. The instance ID is the string you’ll see listed for the notebook in the Notebooks panel in the GCP Cloud Console.

wb resource create gcp-notebook --id af_test --instance-id af-202203 \
   --accelerator-core-count=1 --accelerator-type=nvidia-tesla-v100 --machine-type=n1-standard-8 \
   --install-gpu-driver=true --location=us-central1-c \
   --container-repository=us-west1-docker.pkg.dev/cloud-devrel-public-resources/alphafold/alphafold-on-gcp \
   --container-tag=latest

This command may take a while to run. You’ll get a confirmation when it’s finished. You can view the running notebook instance in the GCP Cloud Console if you like.
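
If you prefer to check on the new resource from the command line, something like the following sketch may help. It assumes your version of the wb CLI provides a `resource describe` subcommand that accepts the same `--id` argument used above (run `wb resource --help` to confirm). The function name describe_notebook_resource is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the details of a Workbench resource.
# Assumption: `wb resource describe --id <resource-name>` is available
# in your CLI version; verify with `wb resource --help`.
describe_notebook_resource() {
  local resource_id="$1"
  if [ -z "$resource_id" ]; then
    echo "usage: describe_notebook_resource <resource-id>" >&2
    return 2
  fi
  wb resource describe --id "$resource_id"
}

# Example: describe_notebook_resource af_test
```

Here af_test is the resource name passed to --id when the notebook instance was created.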

Upload and run the example notebook

Once your notebook instance is running, you can upload the AlphaFold example notebook to it. An easy way to do this is to bring up a browser window that is logged in with your Workbench email, and visit this URL:

https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://github.com/GoogleCloudPlatform/vertex-ai-samples/raw/main/community-content/alphafold_on_workbench/AlphaFold.ipynb

You’ll see a dialog that lets you select an existing notebook instance or create a new one. Click Select, then select your new notebook instance.

Figure: Import a notebook file. Screenshot of the notebook deployment screen in the Google Cloud console, highlighting the 'Select an existing notebook' option.

Click CONTINUE, then click Confirm in the next dialog.

Figure: Confirm the import. Screenshot of the 'Confirm deployment to notebook server' dialog with the Confirm button highlighted.

The notebook will be automatically imported and ready for you to run. The notebook example walks you through the process of generating predictions for one or more protein sequences.

Tip

In the imported notebook, some code tries to copy a file called stereo_chemical_props.txt to a directory under /opt/conda/lib/....
If this copy fails because of a permissions issue, try setting run_relax = False at the top of the "Run AlphaFold" cell of the notebook, or alternatively edit the code to read the stereo_chemical_props.txt file from another location.
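
To find out ahead of time whether the copy is likely to fail, you could check from a terminal on the notebook instance whether the destination directory is writable. This is a minimal sketch: the function name check_writable_dir is hypothetical, and because the path is abbreviated above, you need to pass in the actual directory used by the notebook code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current user can write to a directory.
# Pass the destination directory used by the notebook's copy step.
check_writable_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 2
  fi
  if [ -w "$dir" ]; then
    echo "writable: $dir"
  else
    echo "not writable: $dir"
    return 1
  fi
}
```

If the directory is reported as missing or not writable, apply one of the workarounds described in this tip before running the cell.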

Figure: Visualizing prediction results. Visualization of AlphaFold prediction results showing an animated protein sequence.

Download the generated predictions

You can download the generated predictions via the prediction.zip file, which includes a .pdb file. You should see this archive listed in the left sidebar; right-click on the file to see the Download option.
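
After downloading the archive, you may want to inspect it locally. The sketch below assumes the Info-ZIP `unzip` tool is installed. The function names are hypothetical. Because the name of the .pdb file inside the archive is not given here, the first helper lists whatever .pdb entries the archive contains rather than assuming one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the .pdb entries in a zip archive.
# `unzip -Z1` prints one archive member name per line.
list_pdb_files() {
  local archive="$1"
  unzip -Z1 "$archive" | grep -i '\.pdb$'
}

# Hypothetical helper: count ATOM records in a PDB file as a quick
# sanity check. Each atom line in the PDB format begins with "ATOM".
count_atoms() {
  local pdb_file="$1"
  grep -c '^ATOM' "$pdb_file"
}

# Example usage, with the extracted file name left as a placeholder:
#   list_pdb_files prediction.zip
#   unzip prediction.zip -d predictions
#   count_atoms predictions/<name-from-listing>.pdb
```

A nonzero atom count suggests the structure file was written correctly. To examine the structure itself, open the .pdb file in a molecular viewer.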

Shut down the notebook instance when you’re done

Because this notebook instance uses a powerful GPU, it is fairly expensive to run. Shut it down via the Workbench CLI when you’re not using it, as follows, where af_test is the resource name that you defined when you created the notebook instance.

wb notebook stop --id af_test

When you’re ready to use the notebook instance again, restart it via:

wb notebook start --id af_test

You can also delete the notebook resource if you’re entirely done with it.

Last Modified: 13 November 2024