Merge branch 'main' into VASP-podman
Showing 30 changed files with 208 additions and 7 deletions.
@@ -0,0 +1,11 @@
# Example faculty cluster job

This is an example of how to set up a Slurm job on the faculty cluster. Replace `partition_name` with the partition you'd like to run your job on and ensure that this same name is used in the `--qos` line. To see what access you have to faculty partitions, please view your allocations in [ColdFront](https://coldfront.ccr.buffalo.edu).

## How to use

TODO: write me

## How to launch an interactive job on the faculty cluster

Use the `salloc` command and the same Slurm directives as you use in a batch script to request an interactive job session. Please refer to our [documentation](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#interactive-job-submission) for proper setup of the request and the command to use to access the allocated node.
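
As a hedged illustration (the partition and QOS names below are placeholders; check your allocations in ColdFront for the real ones), an interactive request might look like:

```bash
# Request a 1-hour interactive session on a faculty partition
salloc --clusters=faculty --partition=partition_name --qos=partition_name \
       --time=01:00:00 --ntasks=1 --cpus-per-task=4 --mem=16G
```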
@@ -0,0 +1,13 @@
#!/bin/bash -l

#SBATCH --clusters=faculty
#SBATCH --partition=partition_name
#SBATCH --qos=partition_name
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=64000

## Commands to run your job go here
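
Assuming this script is saved as `faculty-job.sh` (a hypothetical filename), it would be submitted with `sbatch`:

```bash
# Submit the batch script; Slurm prints the assigned job ID
sbatch faculty-job.sh
```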
@@ -0,0 +1,11 @@
# Example ub-hpc cluster job

These are examples of how to set up a Slurm job on the debug and general-compute partitions of the ub-hpc cluster. Refer to our documentation on [requesting cores and nodes](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#requesting-cores-and-nodes) to understand these options.

## How to use

TODO: write me

## How to launch an interactive job on the ub-hpc cluster

Use the `salloc` command and the same Slurm directives as you use in a batch script to request an interactive job session. Please refer to our [documentation](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#interactive-job-submission) for proper setup of the request and the command to use to access the allocated node.
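
As a sketch (adjust the resources to your needs), a short interactive session on the debug partition might be requested like this:

```bash
# Request a 1-hour interactive session on the ub-hpc debug partition
salloc --clusters=ub-hpc --partition=debug --qos=debug \
       --time=01:00:00 --ntasks=1 --cpus-per-task=4 --mem=8G
```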
@@ -0,0 +1,13 @@
#!/bin/bash -l

#SBATCH --clusters=ub-hpc
#SBATCH --partition=debug
#SBATCH --qos=debug
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=64G

## Commands to run your job go here
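
After submitting, you can check on the job. Since CCR runs multiple clusters, `squeue` takes an `-M` flag to select one; a hedged example:

```bash
# List your queued and running jobs on the ub-hpc cluster
squeue -M ub-hpc -u $USER
```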
@@ -0,0 +1,13 @@
#!/bin/bash -l

#SBATCH --clusters=ub-hpc
#SBATCH --partition=general-compute
#SBATCH --qos=general-compute
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=64G

## Commands to run your job go here
@@ -0,0 +1,8 @@
# Example script for job arrays

TODO: write me

## How to use

TODO: write me
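
As a minimal hedged sketch of the idea (the cluster, partition, QOS, and array range below are placeholder assumptions): a job array submits many copies of the same script, and each copy receives its own `$SLURM_ARRAY_TASK_ID`:

```bash
#!/bin/bash -l
#SBATCH --clusters=ub-hpc
#SBATCH --partition=general-compute
#SBATCH --qos=general-compute
#SBATCH --time=01:00:00
#SBATCH --array=1-10    # run 10 copies of this script, with task IDs 1 through 10

# Each array task can use its ID to pick its own input, e.g. a numbered file
echo "Processing input_${SLURM_ARRAY_TASK_ID}.dat"
```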
@@ -0,0 +1,7 @@
#!/bin/bash -l
@@ -0,0 +1,8 @@
# Example scavenger job

TODO: write me

## How to use

TODO: write me
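
A minimal sketch under stated assumptions: scavenger jobs run on otherwise-idle nodes and can be preempted when the owning group reclaims them, so `--requeue` is often useful. The cluster, partition, and QOS names below are assumptions; check CCR's documentation for the real values:

```bash
#!/bin/bash -l
#SBATCH --clusters=faculty      # assumption: adjust to the cluster you're scavenging on
#SBATCH --partition=scavenger   # assumption: partition/QOS names follow CCR convention
#SBATCH --qos=scavenger
#SBATCH --time=12:00:00
#SBATCH --requeue               # scavenger jobs can be preempted; requeue rather than fail

echo "Running on $(hostname)"
```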
@@ -0,0 +1,7 @@
#!/bin/bash -l
17 files renamed without changes.
@@ -0,0 +1,18 @@
# Example Slurm scripts

These are examples of how to set up a Slurm job on CCR's clusters. Refer to our documentation on [running and monitoring jobs](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/) for detailed information. These examples supplement the documentation; it's important to understand the concepts of batch computing and CCR's specific cluster usage and limits before using these examples.

## How to use

The `slurm-options.sh` file in this directory provides a list of the most commonly used Slurm directives with a short explanation of each one. It is not necessary to use all of these directives in every job script; in the sample scripts throughout this repository, we list the required Slurm directives and a few others just as examples. Refer to the `slurm-options.sh` file for a more complete list of directives, and to our [documentation](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#slurm-directives-partitions-qos) for specific cluster and partition limits. Know that the more specific you get when requesting resources on CCR's clusters, the fewer options the job scheduler has to place your job. When possible, it's best to specify only what you need and let the scheduler do its job. If you're unsure what resources your program will require, we recommend starting small, [monitoring the progress](https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#monitoring-jobs) of the job, and then scaling up.
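
For example, after a job finishes you can compare what it actually used against what you requested. This is a hedged sketch using Slurm's `sacct` accounting tool; the job ID and output fields are placeholders to adjust:

```bash
# Show elapsed time, peak memory, and final state for job 1234567 (placeholder ID)
sacct -j 1234567 --format=JobID,Elapsed,MaxRSS,State
```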

At CCR you should use the bash shell for your Slurm scripts; you'll see this on the first line of every example we share. In a bash script, anything after a `#` is considered a comment and is not interpreted when the script is run. In Slurm scripts, though, the Slurm scheduler specifically looks for lines that start with `#SBATCH` and interprets those as requests for your job. Do NOT remove the `#` in front of the `SBATCH` command or your batch script will not work properly. If you don't want Slurm to look at a particular `SBATCH` line in your script, put two `#` in front of the line.
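
For instance, both lines below are valid bash comments, but only the first is a directive Slurm will act on; the doubled `#` disables the second:

```bash
#SBATCH --time=01:00:00    # active: Slurm reads this as a resource request
##SBATCH --mem=64G         # disabled: Slurm ignores lines starting with ##
```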

## Navigating these directories

- `0_Introductory` - contains beginner batch scripts for the ub-hpc and faculty clusters
- `1_Advanced` - contains batch scripts for more complicated use cases such as job arrays, parallel computing, and using the scavenger partition
- `2_ApplicationSpecific` - contains batch scripts for a variety of applications that have special setup requirements. You will not find an example script for every piece of software installed on CCR's systems.
@@ -0,0 +1,81 @@
#!/bin/bash -l
##
## How long do you want to reserve the node(s) for? By default, if you don't specify,
## you will get 24 hours. Referred to as walltime, this is how long the job will be
## scheduled to run for once it begins. If your program runs longer than what is
## requested here, the job will be cancelled by Slurm when time runs out.
## If you make the expected time too long, it may take longer for resources to
## become available and for the job to start. The various partitions in CCR's
## clusters have various maximum walltimes. Refer to the documentation for more info.
## Walltime format: dd-hh:mm:ss (days-hours:minutes:seconds)
#SBATCH --time=00:01:00

## Define how many nodes you need. We ask for 1 node.
#SBATCH --nodes=1

## Refer to docs on proper usage of the next 3 Slurm directives: https://docs.ccr.buffalo.edu/en/latest/hpc/jobs/#requesting-cores-and-nodes
## Number of "tasks" (use with distributed parallelism)
#SBATCH --ntasks=12

## Number of "tasks" per node (use with distributed parallelism)
#SBATCH --ntasks-per-node=12

## Number of CPUs allocated to each task (use with shared memory parallelism)
#SBATCH --cpus-per-task=32

## Specify the real memory required per node. Default units are megabytes.
## Different units can be specified using the suffix [K|M|G|T].
#SBATCH --mem=20G

## Give your job a name, so you can recognize it in the queue
#SBATCH --job-name="example-debug-job"

## Tell Slurm the names of the files to write to. If not specified, stdout and
## stderr are combined into a single file named slurm-<jobid>.out.
#SBATCH --output=example-job.out
#SBATCH --error=example-job.err

## Tell Slurm where to send emails about this job
#SBATCH --mail-user=myemailaddress@institution.edu

## Tell Slurm the types of emails to send.
## Options: NONE, BEGIN, END, FAIL, ALL
#SBATCH --mail-type=END

## Tell Slurm which cluster, partition, and QOS to use to schedule this job.
#SBATCH --clusters=ub-hpc
## OR:
##SBATCH --clusters=faculty

## Refer to documentation on what partitions are available and determining what you have access to
#SBATCH --partition=[partition_name]

## QOS usually matches the partition name, but some users have access to priority boost QOS values.
#SBATCH --qos=[qos]

## Request exclusive access to the node you're assigned, even if you haven't requested all of the node's resources.
## This prevents other users' jobs from running on the same node as you. Only recommended if you're having trouble
## with network bandwidth and sharing the node is causing problems for your job.
#SBATCH --exclusive

## Use the snodes command to see node tags used to allow requesting specific types of hardware
## such as specific GPUs, CPUs, high-speed networks, or rack locations.
#SBATCH --constraint=[Slurm tag]

## Multiple options for requesting GPUs
## Request a GPU - refer to snodes output for a breakdown of node capabilities
#SBATCH --gpus-per-node=1

## Request a specific type of GPU
#SBATCH --gpus-per-node=1
#SBATCH --constraint=V100

## Request a specific GPU & GPU memory configuration
#SBATCH --gpus-per-node=tesla_v100-pcie-32gb:1

## Request a specific GPU, GPU memory, and GPU slot location - (S:0) or (S:1)
#SBATCH --gpus-per-node=tesla_v100-pcie-16gb:1(S:0)

## To use all cores on a node with more than 1 GPU you must disable CPU binding
#SBATCH --gres-flags=disable-binding

## For more Slurm directives, refer to the Slurm documentation: https://slurm.schedmd.com/documentation.html