Description
Hi,
we're planning to run AF3 at large scale to query PPIs proteome-wide. To that end, we're using H100s on a local cluster managed by SLURM.
We're running a few test folds with pools of proteins (between 20 and 35 proteins each), all totaling close to 5,000 tokens.
We're consistently seeing runtimes of around 45 minutes for this process, which is roughly twice the runtime reported for this token count on an H100 in the performance docs.
Could you give us some hints on potential causes? I assume I/O is not a major factor here, but would putting the JSON files on a scratch disk make sense? Do you think the number of proteins per input is an issue? Any input would be greatly appreciated.
Below is the SLURM configuration we're using. If you need any other info, please let us know.
```bash
#!/bin/bash
#SBATCH -p gpu-el8
#SBATCH -C gpu=H100
#SBATCH --cpus-per-task=32
#SBATCH -G 1
#SBATCH --mem-per-gpu=193028
#SBATCH --time=24:00:00
#SBATCH -o slurm.%N.%j.out
#SBATCH -e slurm.%N.%j.err
module load AlphaFold3/3.0.1.20250908_a8ecdb2-foss-2024a-CUDA-12.6.0
```
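For context, two things commonly inflate end-to-end wall-clock time well beyond the published inference numbers: the CPU-bound MSA/template data pipeline running in the same job as GPU inference, and XLA recompiling the model on every fresh job. A sketch of how we could separate the stages, assuming the `run_alphafold.py` flags documented in the AF3 repo (`--run_data_pipeline`, `--run_inference`, `--jax_compilation_cache_dir`); all paths, the `pool_01` job name, and the `_data.json` output filename are placeholders:

```shell
# Stage 1 (CPU-only node): run just the data pipeline. This writes an
# augmented input JSON (MSAs/templates included) into the output directory.
python run_alphafold.py \
    --json_path=/scratch/$USER/inputs/pool_01.json \
    --output_dir=/scratch/$USER/af3_out \
    --norun_inference

# Stage 2 (GPU node): inference only, reusing the precomputed data JSON.
# A persistent JAX compilation cache on shared storage means the XLA
# compilation cost (several minutes at ~5k tokens) is paid once per
# bucket size rather than once per SLURM job.
python run_alphafold.py \
    --json_path=/scratch/$USER/af3_out/pool_01/pool_01_data.json \
    --output_dir=/scratch/$USER/af3_out \
    --norun_data_pipeline \
    --jax_compilation_cache_dir=/scratch/$USER/af3_jax_cache
```

If the timed 45 minutes includes the data pipeline and/or first-run compilation, comparing it against the inference-only numbers in the performance docs would overstate the gap by design.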