Most Commonly Used Commands
Please see below for a list of commands the instructors use on almost a daily basis! Some of these have been covered in our workshop, but they're so important, they bear repeating :)
Check how many cores and how much memory are in use on the server. This shows overall usage for everyone on the server, not just you.
htop
Check which jobs you are running.
ps -ef | grep <your_username>
Check how much memory your own processes are using.
smem -k
Compress a file.
tar cvzf <filename.tar.gz> <filename.txt>
Decompress a file.
tar xvzf <filename.tar.gz>
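If you want to be sure an archive is good before deleting the original, a quick round trip helps. Here is a minimal sketch that works in a throwaway directory; the file name notes.txt is made up for illustration, and the v flag is left out just to keep the output quiet.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# notes.txt is a made-up example file.
printf 'line one\nline two\n' > notes.txt

# c = create, z = gzip, f = archive file name.
tar czf notes.tar.gz notes.txt

# t = list the archive contents without extracting anything.
listing=$(tar tzf notes.tar.gz)
echo "$listing"

# Extract into a separate folder (-C) and compare with the original.
mkdir restored
tar xzf notes.tar.gz -C restored
cmp notes.txt restored/notes.txt && echo "round trip OK"
```

Listing with t first is a cheap way to see what an archive will unpack before you actually unpack it.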
Check the size of a folder.
du -sh <folder_name>
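When a disk is filling up, it can help to compare several folders at once. The sketch below is one way to do that, using made-up folder names (small, big) in a throwaway directory. It uses -k (sizes in kilobytes) instead of a human-readable size so the numbers sort cleanly with sort -n.

```shell
# Throwaway directory with two made-up subfolders.
workdir=$(mktemp -d)
mkdir "$workdir/small" "$workdir/big"
printf 'x' > "$workdir/small/a.txt"

# Write about 1 MB of random data so "big" really is bigger on disk.
dd if=/dev/urandom of="$workdir/big/b.bin" bs=1024 count=1024 2>/dev/null

# -s = one total per folder, -k = kilobytes; sort -n puts the largest last.
sizes=$(du -sk "$workdir"/*/ | sort -n)
echo "$sizes"

# du separates size and path with a tab, so cut -f2 gives the path.
largest=$(printf '%s\n' "$sizes" | tail -n 1 | cut -f2)
echo "largest folder: $largest"
```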
Check the number of sequences in a fasta file. The ^ makes sure only lines that start with > are counted.
grep -c '^>' <fasta_filename.fasta>
Look at the first 5 fasta headers in a file.
grep '^>' <fasta_filename.fasta> | head -n 5
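To see both fasta commands in action without touching real data, here is a small sketch on a made-up three-sequence file. The sequence names are invented, and the pattern is anchored with ^ so that only lines starting with > count as headers.

```shell
# Build a tiny example fasta file (sequence names are made up).
fasta=$(mktemp)
cat > "$fasta" <<'EOF'
>seq1 sample A
ACGTACGT
ACGT
>seq2 sample B
GGCC
>seq3 sample C
TTAA
EOF

# Count header lines; one header per sequence, even when a sequence wraps.
n_seqs=$(grep -c '^>' "$fasta")
echo "sequences: $n_seqs"

# Show only the first two headers.
first_two=$(grep '^>' "$fasta" | head -n 2)
echo "$first_two"
```

Note that seq1 spans two sequence lines but is still counted once, because only the header line starts with >.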
If you are using a wildcard to copy, delete, link, etc., multiple files, you can hit tab twice to double check which files that command will be run on.
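Tab completion only helps at the prompt, so inside a script a similar check is to echo the pattern first. This is just a sketch with made-up file names in a throwaway directory: the shell expands the wildcard before echo runs, so you see the same list of files the real command would receive.

```shell
# Throwaway directory with a few made-up file names.
workdir=$(mktemp -d)
touch "$workdir/sample1.fasta" "$workdir/sample2.fasta" "$workdir/notes.txt"

# Preview: prints exactly the files the pattern would hand to rm, cp, ln, etc.
preview=$(cd "$workdir" && echo *.fasta)
echo "$preview"

# Once the preview looks right, run the real command with the same pattern.
(cd "$workdir" && rm -- *.fasta)
remaining=$(ls "$workdir")
echo "$remaining"
```

If nothing matches, echo prints the pattern itself (for example *.fasta), which is a useful hint that the wildcard is wrong.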
Most of the jobs that you will end up running on the server will take much longer than a few minutes. To make sure that your job doesn't get killed because your connection timed out or you were temporarily disconnected, you can run your job in a screen session. This page has a lot of helpful info if you want to read more about it, but below are the commands I use the most.
Create and name a new screen session.
screen -S <name>
When you want to detach from this screen session, hold ctrl and a, release the keys, and then type d. Once you are detached from that screen session, you can reattach and resume it with:
screen -r <name>
Awk is its own programming language, but it is best known for its super powerful one-liners for manipulating text files. What I use awk for the most is filtering tab-separated or comma-separated files. There are a million ways to filter these, but one way I've been using it recently is to make a new tab-separated file with only the rows in which the first column matches the sixth column. For a tab-separated file, this can be accomplished with:
awk -F'\t' '$1 == $6 {print $0}' <input_text_file> > <name_of_output_file>
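Here is a small sketch of that filter on a made-up six-column table, so you can see what goes in and what comes out. The gene names and values are invented. The -F'\t' option tells awk to split on tabs only, which matters when a field contains spaces.

```shell
# Build a small tab-separated example with six columns (values are made up).
tsv=$(mktemp)
printf 'geneA\t1\t2\t3\t4\tgeneA\n'  > "$tsv"
printf 'geneB\t1\t2\t3\t4\tgeneC\n' >> "$tsv"
printf 'geneD\t1\t2\t3\t4\tgeneD\n' >> "$tsv"

# Keep whole rows whose column 1 equals column 6.
matches=$(awk -F'\t' '$1 == $6 {print $0}' "$tsv")
echo "$matches"

# Variation: print just columns 1 and 6; OFS sets the output separator.
pairs=$(awk -F'\t' -v OFS='\t' '$1 == $6 {print $1, $6}' "$tsv")
echo "$pairs"
```

For a comma-separated file, the same idea should work with -F',' as long as no field contains a quoted comma.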
It's worth noting that an extension to awk has been created for parsing common biological file formats (e.g. fasta, fastq, etc.), and it's known as bioawk. It's also super powerful and worth looking into!
Check where python (or any other program you want to call) is stored in your system. This is sometimes helpful for debugging purposes.
which python
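Inside a script, a common alternative to which is the shell builtin command -v, which also lets you check that a tool is installed before you rely on it. A minimal sketch, using sh as the example only because it exists on practically every system:

```shell
# command -v prints where a program would be run from.
sh_path=$(command -v sh)
echo "sh resolves to: $sh_path"

# Guard on a tool being installed; "python" may be absent
# on systems that only ship python3.
if command -v python >/dev/null 2>&1; then
    echo "python found at: $(command -v python)"
else
    echo "python is not on PATH"
fi
```

If the path printed is not the one you expected (for example, a system python instead of the one in your environment), that is usually the first clue when debugging.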
Happy coding :)