AlphaFold 3
This tutorial walks you through the three things you need to do once before
running AlphaFold 3 on the cluster (load the module, get the model
weights, fetch the genetic databases) and then how to launch a prediction.
The module ships a thin alphafold3 wrapper that hides the apptainer exec --rocm --bind ... boilerplate; you point it at your data with
environment variables and pass the same flags that run_alphafold.py
expects upstream. For full details and updates, see the AlphaFold 3 GitHub repository.
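Conceptually, a wrapper call expands to something close to the raw command below. This is only a sketch assembled from the bind layout described in the rest of this page, not the wrapper's exact implementation:
apptainer exec --rocm \
  --bind "$ALPHAFOLD3_MODEL_DIR":/root/models \
  --bind "$ALPHAFOLD3_DB_DIR":/root/public_databases \
  --bind ~/af_input:/root/af_input \
  --bind ~/af_output:/root/af_output \
  --pwd /workspace \
  "$ALPHAFOLD3_SIF" python3 run_alphafold.py <run_alphafold.py flags>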
1. Load the module
The labs-test modulefiles tree isn’t on MODULEPATH by default. Append it,
then load AlphaFold 3:
module use --append /cluster/labs/modulefiles-test
module load alphafold3/3.0.2-rocm
Add the module use line to ~/.bashrc (or your Slurm job script) if you
don’t want to type it every session.
After loading, you have:
alphafold3 — wrapper that runs run_alphafold.py inside the SIF
alphafold3-shell — interactive shell inside the container
$ALPHAFOLD3_SIF — absolute path to the SIF, for raw apptainer calls
module help alphafold3/3.0.2-rocm summarises the same on the command line.
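To confirm the module took effect, a couple of ordinary shell checks are enough (these are generic commands, not part of the module itself):
which alphafold3 alphafold3-shell    # both wrapper commands should now resolve
ls -lh "$ALPHAFOLD3_SIF"             # the container image should exist and be readable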
2. Get the model parameters
Model weights are not redistributable. Request them from Google DeepMind via this form — access is granted at their discretion, usually within 2–3 working days.
Once you’ve received the download link, unpack the archive somewhere on your group share or scratch, then point the wrapper at it:
export ALPHAFOLD3_MODEL_DIR=/path/to/alphafold3_model_parameters
The wrapper bind-mounts that directory at /root/models inside the
container.
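The exact archive name and format depend on what DeepMind sends you. As an illustration only, if the weights arrived as a gzipped tarball the steps might be (all paths and the archive name are placeholders):
mkdir -p /cluster/proj/mygroup/alphafold3_model_parameters
cd /cluster/proj/mygroup/alphafold3_model_parameters
tar -xzf ~/alphafold3_params.tar.gz    # hypothetical archive name
export ALPHAFOLD3_MODEL_DIR="$PWD"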
3. Fetch the public databases
AlphaFold 3 needs ~650 GB of sequence/structure databases (BFD, MGnify,
PDB, UniProt, UniRef90, NT, RFam, RNACentral). The container ships
fetch_databases.sh, which downloads and decompresses all of them in
parallel — about 45 minutes on a fast network.
Pick a directory with enough space (group share or scratch — not your home), then run:
mkdir -p /path/to/public_databases
apptainer exec "$ALPHAFOLD3_SIF" \
  /workspace/fetch_databases.sh /path/to/public_databases
Run it inside tmux or screen so a dropped SSH session doesn’t kill the
download.
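For example, a minimal tmux workflow might look like this (the session name af3_dbs is arbitrary):
tmux new -s af3_dbs                    # start a named session
apptainer exec "$ALPHAFOLD3_SIF" \
  /workspace/fetch_databases.sh /path/to/public_databases
# detach with Ctrl-b d; reattach later with: tmux attach -t af3_dbs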
When it finishes, export the path:
export ALPHAFOLD3_DB_DIR=/path/to/public_databases
If you have a fast SSD plus a slower fallback, give both as a colon pair —
the wrapper binds them as /root/public_databases and
/root/public_databases_fallback:
export ALPHAFOLD3_DB_DIR=/ssd/af3_dbs:/lustre/af3_dbs
4. Run a prediction
Place your input JSON in ~/af_input/ (or override
ALPHAFOLD3_INPUT_DIR). Results land in ~/af_output/ (or
ALPHAFOLD3_OUTPUT_DIR).
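For reference, a minimal single-protein input in the upstream JSON dialect looks roughly like this (check the upstream docs for the authoritative schema; the name, seed, and sequence below are arbitrary illustrations):
{
  "name": "my_fold",
  "modelSeeds": [1],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSF"
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 1
}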
mkdir -p ~/af_input ~/af_output
cp my_fold.json ~/af_input/fold_input.json
alphafold3 \
  --json_path=/root/af_input/fold_input.json \
  --model_dir=/root/models \
  --output_dir=/root/af_output
Note that the paths in the flags are container-internal mount points
(/root/...), not host paths — those are fixed by the wrapper’s bind
layout. The host-side directories come from the env vars you set above.
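To submit the same prediction as a batch job, a minimal Slurm script could look like the sketch below; the GPU request and walltime lines are placeholders, so adapt them to this cluster's scheduler conventions:
#!/bin/bash
#SBATCH --job-name=af3
#SBATCH --gpus=1                      # placeholder; use your site's GPU request syntax
#SBATCH --time=04:00:00               # placeholder walltime

module use --append /cluster/labs/modulefiles-test
module load alphafold3/3.0.2-rocm

export ALPHAFOLD3_MODEL_DIR=/path/to/alphafold3_model_parameters
export ALPHAFOLD3_DB_DIR=/path/to/public_databases

alphafold3 \
  --json_path=/root/af_input/fold_input.json \
  --model_dir=/root/models \
  --output_dir=/root/af_output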
Selecting a specific GPU
ROCR_VISIBLE_DEVICES=1 alphafold3 ...
Sanity-checking GPU access
apptainer exec --rocm "$ALPHAFOLD3_SIF" rocm-smi
Listing all run_alphafold.py flags
apptainer exec --pwd /workspace "$ALPHAFOLD3_SIF" python3 run_alphafold.py --help
Reference
Upstream documentation lives at the AlphaFold 3 GitHub repository — see
README.md and docs/installation.md for the full flag reference, input JSON
schema, and modelling background.
Note: the upstream commands target NVIDIA GPUs and Docker
(docker run --gpus all alphafold3 python run_alphafold.py ...). On this
cluster, replace that invocation with alphafold3 ... after
module load alphafold3/3.0.2-rocm — the wrapper handles --rocm, the SIF
path, the /workspace working directory, and all required bind mounts. Flag
arguments to run_alphafold.py are unchanged.
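As a concrete translation, an upstream README-style invocation (the --volume paths below are illustrative) collapses to the wrapper call from step 4:
# Upstream (Docker, NVIDIA GPUs):
docker run --gpus all \
  --volume "$HOME/af_input":/root/af_input \
  --volume "$HOME/af_output":/root/af_output \
  --volume /path/to/models:/root/models \
  --volume /path/to/public_databases:/root/public_databases \
  alphafold3 python run_alphafold.py \
  --json_path=/root/af_input/fold_input.json \
  --model_dir=/root/models \
  --output_dir=/root/af_output

# On this cluster, after module load alphafold3/3.0.2-rocm:
alphafold3 \
  --json_path=/root/af_input/fold_input.json \
  --model_dir=/root/models \
  --output_dir=/root/af_output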