AlphaFold 3

This tutorial walks you through the three setup steps you need to do once before running AlphaFold 3 on the cluster (load the module, get the model weights, fetch the genetic databases) and then how to launch a prediction. The module ships a thin alphafold3 wrapper that hides the apptainer exec --rocm --bind ... boilerplate; you point it at your data with environment variables and pass the same flags that run_alphafold.py expects upstream. For full details and updates, see the AlphaFold 3 GitHub repository.

1. Load the module

The labs-test modulefiles tree isn’t on MODULEPATH by default. Append it, then load AlphaFold 3:

module use --append /cluster/labs/modulefiles-test
module load alphafold3/3.0.2-rocm

Add the module use line to ~/.bashrc (or your Slurm job script) if you don’t want to type it every session.

After loading, you have:

  • alphafold3 — wrapper that runs run_alphafold.py inside the SIF
  • alphafold3-shell — interactive shell inside the container
  • $ALPHAFOLD3_SIF — absolute path to the SIF, for raw apptainer calls

module help alphafold3/3.0.2-rocm summarises the same on the command line.
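For example, alphafold3-shell drops you into the container for interactive exploration; whether that shell starts in /workspace is a wrapper detail, hence the explicit cd in this sketch:

alphafold3-shell
# inside the container:
cd /workspace
python3 run_alphafold.py --help
exit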

2. Get the model parameters

Model weights are not redistributable. Request them from Google DeepMind via this form — access is granted at their discretion, usually within 2–3 working days.

Once you’ve received the download link, unpack the archive somewhere on your group share or scratch, then point the wrapper at it:

export ALPHAFOLD3_MODEL_DIR=/path/to/alphafold3_model_parameters

The wrapper bind-mounts that directory at /root/models inside the container.
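A quick check that the directory is populated before your first run (the exact parameter file names depend on the release you were granted):

ls "$ALPHAFOLD3_MODEL_DIR"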

3. Fetch the public databases

AlphaFold 3 needs ~650 GB of sequence/structure databases (BFD, MGnify, PDB, UniProt, UniRef90, NT, RFam, RNACentral). The container ships fetch_databases.sh, which downloads and decompresses all of them in parallel — about 45 minutes on a fast network.

Pick a directory with enough space (group share or scratch — not your home), then run:

mkdir -p /path/to/public_databases
apptainer exec "$ALPHAFOLD3_SIF" \
    /workspace/fetch_databases.sh /path/to/public_databases
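Run it inside tmux or screen so a dropped SSH session doesn't kill the download, for example (session name is illustrative):

tmux new -s af3_dbs    # start a named session, then run the fetch command above inside it
# detach with Ctrl-b d; reattach later with: tmux attach -t af3_dbs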

When it finishes, export the path:

export ALPHAFOLD3_DB_DIR=/path/to/public_databases

If you have a fast SSD plus a slower fallback, give both as a colon pair — the wrapper binds them as /root/public_databases and /root/public_databases_fallback:

export ALPHAFOLD3_DB_DIR=/ssd/af3_dbs:/lustre/af3_dbs

4. Run a prediction

Place your input JSON in ~/af_input/ (or override ALPHAFOLD3_INPUT_DIR). Results land in ~/af_output/ (or ALPHAFOLD3_OUTPUT_DIR).

mkdir -p ~/af_input ~/af_output
cp my_fold.json ~/af_input/fold_input.json
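If you don't have an input ready yet, a minimal single-protein file following the upstream JSON schema looks like this (name and sequence are placeholders; see the upstream input documentation for the full schema):

cat > ~/af_input/fold_input.json <<'EOF'
{
  "name": "my_fold",
  "modelSeeds": [1],
  "sequences": [
    {"protein": {"id": "A", "sequence": "MVLSPADKTNVKAAWGKVGA"}}
  ],
  "dialect": "alphafold3",
  "version": 1
}
EOF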

alphafold3 \
    --json_path=/root/af_input/fold_input.json \
    --model_dir=/root/models \
    --output_dir=/root/af_output

Note that the paths in the flags are container-internal mount points (/root/...), not host paths — those are fixed by the wrapper’s bind layout. The host-side directories come from the env vars you set above.
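For batch jobs, the same invocation drops straight into a Slurm script. A minimal sketch, assuming one GPU per run; the partition name, GPU request syntax, and resource limits are placeholders for your site's configuration:

#!/bin/bash
#SBATCH --job-name=af3
#SBATCH --partition=gpu          # placeholder: your site's GPU partition
#SBATCH --gres=gpu:1             # placeholder: your site's GPU request syntax
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
#SBATCH --time=12:00:00

module use --append /cluster/labs/modulefiles-test
module load alphafold3/3.0.2-rocm

export ALPHAFOLD3_MODEL_DIR=/path/to/alphafold3_model_parameters
export ALPHAFOLD3_DB_DIR=/path/to/public_databases

alphafold3 \
    --json_path=/root/af_input/fold_input.json \
    --model_dir=/root/models \
    --output_dir=/root/af_output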

Selecting a specific GPU

ROCR_VISIBLE_DEVICES=1 alphafold3 ...

Sanity-checking GPU access

apptainer exec --rocm "$ALPHAFOLD3_SIF" rocm-smi

Listing all run_alphafold.py flags

apptainer exec --pwd /workspace "$ALPHAFOLD3_SIF" python3 run_alphafold.py --help

Reference

Upstream documentation lives at the AlphaFold 3 GitHub repository — see README.md and docs/installation.md for the full flag reference, input JSON schema, and modelling background.

Note: the upstream commands target NVIDIA GPUs and Docker (docker run --gpus all alphafold3 python run_alphafold.py ...). On this cluster, replace that invocation with alphafold3 ... after module load alphafold3/3.0.2-rocm — the wrapper handles --rocm, the SIF path, the /workspace working directory, and all required bind mounts. Flag arguments to run_alphafold.py are unchanged.
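As a concrete translation, an upstream invocation of the form docker run --gpus all alphafold3 python run_alphafold.py --json_path=... becomes, on this cluster, simply:

module load alphafold3/3.0.2-rocm    # after module use --append /cluster/labs/modulefiles-test
alphafold3 --json_path=/root/af_input/fold_input.json --model_dir=/root/models --output_dir=/root/af_output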