alphafold3 installation #406
Comments
@dsajdak thank you for the update. The use of AlphaFold3 is very timely for projects currently underway at UB. If there’s a way for our group (rams) to get involved to expedite the installation, I’d be happy to help. The instructions are straightforward if followed carefully; I built a Docker image and converted it to Singularity outside of CCR. Slightly off-topic, but is there any possibility of CCR supporting Docker? I’ve noticed some HPCs are supporting it, potentially through a rootless version. |
Strongly support this installation request. I was about to make this request and saw that @rakabari opened one 3 days ago. In fact, I need alphafold3 (not alphafold2) right now to perform DNA/Protein/Protein analyses, which alphafold2 cannot do. I hope the installation can be expedited as @rakabari requested. |
@rakabari, if you have one running locally, is there any possibility that I run a dataset of about 4000 proteins when your system is not in use? Thanks, Guojun |
@gy2025 I am still testing it on CCR using the GPU nodes. I built the Docker image outside CCR due to the lack of Docker support. I can provide you with the built Singularity image if you’d like to run it yourself. |
@rakabari, Appreciate the offer, that will be great, probably also need your knowhow after your successful testing. |
@rakabari We are working on a rootless docker service but it is not in production yet. For now, you have to use Apptainer/Singularity for containers. @gy2025 what group are you working with/for? We appreciate that you are both interested in this and would ask for your patience as we manage an incredible amount of requests that are critical to over 400 research groups and several thousand users at UB and beyond. Our small team works through them as quickly as possible in as fair of a way as possible while balancing the rest of our workload. |
@gy2025 I successfully ran a test sequence on CCR. Let me know the directory (with sufficient space and write permission) where I should place the Singularity image. After cloning alphafold3.git, you will need to download the database by running fetch_databases.py (refer to the guide). Then, execute the Singularity command from the installation guide, ensuring you update the paths to point to your directories. You will also need to get the model parameters from DeepMind (check the guide for the form). It will take a few days to get it. |
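For readers following the steps above, here is a minimal sketch of how the Singularity invocation might be assembled once the image, model parameters, and databases are in place. It mirrors the general shape of the command in the AlphaFold 3 installation guide as I understand it; the host paths are hypothetical placeholders, the container-side mount points and flags are assumptions that should be checked against the guide, and nothing here is specific to CCR.

```python
def build_af3_command(sif_image, input_dir, output_dir, model_dir, db_dir,
                      json_name="fold_input.json"):
    """Return the argv list for running AlphaFold 3 inside a Singularity image.

    Container-side paths (/root/...) are assumed from the installation guide.
    """
    # (host path, container path) pairs to bind-mount into the container.
    binds = [
        (input_dir, "/root/af_input"),
        (output_dir, "/root/af_output"),
        (model_dir, "/root/models"),
        (db_dir, "/root/public_databases"),
    ]
    cmd = ["singularity", "exec", "--nv"]  # --nv exposes the host NVIDIA GPUs
    for host_path, container_path in binds:
        cmd += ["--bind", f"{host_path}:{container_path}"]
    cmd += [
        sif_image,
        "python", "run_alphafold.py",
        f"--json_path=/root/af_input/{json_name}",
        "--model_dir=/root/models",
        "--db_dir=/root/public_databases",
        "--output_dir=/root/af_output",
    ]
    return cmd


# Hypothetical host directories; replace with your own.
example = build_af3_command(
    "alphafold3.sif",
    "/path/to/af_input",
    "/path/to/af_output",
    "/path/to/models",
    "/path/to/databases",
)
print(" ".join(example))
```

Building the command as a list makes it easy to pass to `subprocess.run(example, check=True)` from a batch job script without shell quoting issues.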
@dsajdak we truly appreciate the incredible effort your team puts into supporting so many research groups and users. I’m happy to assist in any way I can to help streamline the process. Thank you again for starting this discussion. |
You need to adjust the permissions so that I can copy it to your directory. The image is about 5 GB, but the database is >600 GB. |
@rakabari, looks a bit of hassle to go through Globus, anyway my own directory /user/gyang24 is basically empty with about 20G space available. You can put the Singularity image there now (write allowed temporarily). I will download the database myself. Thanks a lot! |
@gy2025 Globus is very simple to use and you can create a share easily. You should never change the permissions on your home directory of a shared system. This will cause you problems with logging in via SSH. It is also insecure to have access open to everyone on our systems. Worse, we're discussing this in an open forum on the internet. If Globus is not going to work for you, please submit a ticket to CCR Help and I'll connect you both and provide an alternative. I am changing the permissions back on your home directory |
@dsajdak OK, I will submit a ticket. I looked at the Globus instructions and seems a bit too much to set it up just to do this once in probably years. |
Shared via UB dropbox. You can scp the image to your directory. Also shared on Globus |
@rakabari, Saw it, Thanks a lot! |
@dsajdak, Thank you too for facilitating! |
@carrollea Happy to provide the image. You can request it through Globus as well. |
@rakabari Thank you very much! I just want to confirm, is the collection named "GY_AlphaFold3"? |
@carrollea Yes, I believe it was set by @gy2025. |
@dsajdak @rakabari It seems to have started running for me, but I got this warning: 2024-11-25 14:52:34.705054: W external/xla/xla/service/gpu/nvptx_compiler.cc:930] The NVIDIA driver's CUDA version is 12.2 which is older than the PTX compiler version 12.6.77. Because the driver is older than the PTX compiler version, XLA is disabling parallel compilation, which may slow down compilation. You should update your NVIDIA driver or use the NVIDIA-provided CUDA forward compatibility packages. Not sure whether this is something we can change or whether we need to live with it for now. |
@gy2025 The CUDA version on the CCR nodes is 12.2, but the module load version is 11.8. I created another SIF image using 11.8 but decided not to run it because 12.2 should be compatible with 12.6, despite the performance warnings. @dsajdak, can the module load version be upgraded (separate installation) to 12.6.77? |
@rakabari Very glad to report that Alphafold3 produced expected output, your singularity image and suggestions helped greatly. Thanks a bunch! |
@rakabari, did you notice a problem in the af3 output: the model.cif is incorrect, but the model file in the sample-0 folder is correct. I thought they would simply copy the 0 ranked cif file as model.cif, but that does not seem to be the case. |
It is not necessarily sample-0 that is model.cif; it is the highest-ranked sample in ranking_scores.csv. |
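To make the point above concrete, here is a small sketch that picks the top-ranked sample from a ranking scores file. It assumes the CSV has `seed`, `sample`, and `ranking_score` columns and that per-sample folders are named like `seed-<seed>_sample-<sample>`; both are assumptions about the output layout, and the scores below are made-up illustrative numbers, not real results.

```python
import csv
import io


def best_sample(csv_text):
    """Return (seed, sample) of the row with the highest ranking_score."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    top = max(rows, key=lambda row: float(row["ranking_score"]))
    return int(top["seed"]), int(top["sample"])


# Illustrative contents only; a real run writes its own scores.
example_csv = """seed,sample,ranking_score
1,0,0.71
1,1,0.68
1,2,0.74
1,3,0.80
1,4,0.66
"""

seed, sample = best_sample(example_csv)
best_dir = f"seed-{seed}_sample-{sample}"  # assumed folder naming
print(best_dir)
```

In this made-up example the top-level model would correspond to sample 3, not sample 0, which is the kind of mismatch described above.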
I see, I thought sample-0 meant ranking 0. Anyway, that helped me debug my code. Thanks. |
We know this is a highly anticipated software application and it is high on our priority list. While you wait, please be aware there is an Alphafold server available for non-commercial use. They also supply container installation information if you'd like to create your own container. |
Thank you for your efforts, Dori! The server is limited to 20 jobs per day, while we have thousands of jobs. We have been using the containerized Singularity image on CCR. While we wait for CCR to implement this, would it be possible to allocate additional database storage? Our group directory is running out of space. Additionally, there is a CUDA version mismatch between the GPU nodes and the version on which the image is built, which is impacting its performance. If it is not too time-consuming, would it be possible to make CUDA/12.6.77 available as a module? This would also support the CCR's implementation of AlphaFold3 in the future. |
@dsajdak Thank you for the information. I tried the server a while ago; the reason we want to run locally is to automate custom processing of the results afterward. So far, processing speed is an issue: it runs really slowly with a GPU, around 30 min per input. |
Hi all! I understand the server might not be the best option for you. I was mentioning it for others who see this and don't have a working container themselves. @rakabari we have
The way the software environments are created is based off of the available Easybuild compilers and toolchains where we build one version per application. I don't expect we'll have a newer version of CUDA until the next software environment is rolled out which is a long time away. Your group leader can purchase additional storage following these instructions. |
FYI the Alphafold3 databases can be found on our systems here: We are working on installing Alphafold3. |
Thank you. If possible, I suggest addressing CUDA version compatibility. |
Thank you @dsajdak for the databases! I've noticed that the step that takes the most time in the pipeline is the multiple sequence alignment (msa). It took me 37 min to get a model with a size of 165 tokens. I've been trying a few things to increase the speed and found that setting the flags "--jackhmmer_n_cpu=56" and "--nhmmer_n_cpu=56" reduced the time to 22 min, which is nice but could still be better. Luckily, Alphafold3 lets you supply an msa, so I ended up using mmseqs2 instead of jackhmmer since it is much faster. After supplying the msa, the time was cut down from 37 min (my initial run) to 3 min (1 min 30 sec for msa creation and 1 min 30 sec for model creation). Just as a note, the mmseqs2 on ccr is version 13, but you need version 14 or higher to get the correct .a3m output file needed for Alphafold3. |
@carrollea, Thanks for sharing the trick, maybe we need to request an mmseqs2 14 installation? |
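As a rough illustration of supplying a precomputed msa, the sketch below builds an input JSON with the alignment embedded for a single protein chain. The field names (`unpairedMsa`, `pairedMsa`, `templates`, `dialect`, `version`) reflect my understanding of the AlphaFold 3 input format and should be verified against its input documentation; the job name, chain id, and sequence are hypothetical placeholders, and leaving `templates` empty means no templates are used.

```python
import json


def build_fold_input(name, sequence, a3m_text, seeds=(1,)):
    """Build an AlphaFold 3 style input dict with a user-supplied MSA.

    Field names are assumed from the AlphaFold 3 input format docs.
    """
    return {
        "name": name,
        "modelSeeds": list(seeds),
        "sequences": [
            {
                "protein": {
                    "id": "A",
                    "sequence": sequence,
                    "unpairedMsa": a3m_text,  # A3M text; first entry is the query
                    "pairedMsa": "",          # no paired MSA in this sketch
                    "templates": [],          # template-free
                }
            }
        ],
        "dialect": "alphafold3",
        "version": 1,
    }


# Placeholder sequence and a trivial one-sequence A3M for illustration.
query = "MKTAYIAKQR"
a3m = f">query\n{query}\n"
fold_input = build_fold_input("example_job", query, a3m)
print(json.dumps(fold_input, indent=2))
```

In practice the `a3m` string would be read from the file produced by mmseqs2, and the resulting dict written to the JSON file passed to the run script.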
@carrollea, just wonder whether you can share the code block to use mmseqs2 to make a3m MSA for af3 as input. I tried mmseqs2 (release 16), it took longer than 30min for one protein with the db file: uniref90_2022_05.fa to make alignment. Thanks |
I first create an mmseqs2 database from the sequence database and index it. I can reuse this for different queries.
I then create a database for the query.
Then, to search the database, I first load it into memory with "touchdb" and pass the "--db-load-mode 2" flag to the search command. In the search parameters I changed the sensitivity to 1, which makes it much faster. I figured that since alphafold3 has deemphasized the msa module compared to alphafold2, I could reduce the sensitivity. I used the same e-value and max-sequence values that colabfold uses. Then I make the a3m file with "result2msa".
edited: Changed "target" to "query" for clarity. |
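The exact commands from the comment above are not preserved here, so the following is only a sketch of the described sequence of mmseqs2 steps, written as argument lists that could be passed to `subprocess.run`. The file names and working directory are hypothetical, the `-a` and `--msa-format-mode 6` options are my assumptions about what is needed to get A3M output, and the e-value and max-sequence numbers in the example call are illustrative placeholders rather than confirmed colabfold settings.

```python
def mmseqs_msa_commands(target_fasta, query_fasta, e_value, max_seqs,
                        workdir="work", sensitivity=1):
    """Return the mmseqs2 commands for: index target, build query DB, search, write MSA."""
    target_db = f"{workdir}/targetDB"
    query_db = f"{workdir}/queryDB"
    result_db = f"{workdir}/resultDB"
    tmp_dir = f"{workdir}/tmp"
    msa_out = f"{workdir}/query.a3m"
    return [
        # One-time setup: build and index the target database (reusable across queries).
        ["mmseqs", "createdb", target_fasta, target_db],
        ["mmseqs", "createindex", target_db, tmp_dir],
        # Per query: build the query database.
        ["mmseqs", "createdb", query_fasta, query_db],
        # Preload the target database into memory before searching.
        ["mmseqs", "touchdb", target_db],
        ["mmseqs", "search", query_db, target_db, result_db, tmp_dir,
         "--db-load-mode", "2",
         "-s", str(sensitivity),
         "-e", str(e_value),
         "--max-seqs", str(max_seqs),
         "-a"],  # assumed: keep alignment backtraces for result2msa
        # Convert search results to an MSA; mode 6 is assumed to be A3M.
        ["mmseqs", "result2msa", query_db, target_db, result_db, msa_out,
         "--msa-format-mode", "6"],
    ]


# Placeholder inputs and parameter values for illustration only.
commands = mmseqs_msa_commands("uniref90_2022_05.fa", "query.fasta",
                               e_value=0.1, max_seqs=10000)
for command in commands:
    print(" ".join(command))
```

Depending on the mmseqs2 version, `result2msa` may write its output as an mmseqs2 database rather than a flat file, in which case an extra unpacking step would be needed before handing the A3M to AlphaFold3.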
@carrollea, Glad to report that using your parameters it does process each input file in about 3 min for me. Thanks for the help! |
I am writing to request the installation of AlphaFold 3 on CCR, as the source code for the latest version is now available -
https://github.com/google-deepmind/alphafold3/blob/main/docs/installation.md
This tool will be valuable for various ongoing projects at UB, supporting advanced research across departments. Please let me know if there are any prerequisites or additional information needed from our end to facilitate this installation.