-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME
50 lines (31 loc) · 2.06 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
This folder contains scripts to do EPPIC pre-computation
Author : Kumaran Baskaran
Date : 05-07-2013
EPPIC pre-computation is usually carried out in two parts.
Part 1: to be done on a local machine
1. download the necessary files: DonwloadFiles
2. upload to mysql database: UploadToDatabase
3. update blastdb files: UpdateBlastDB
4. create unique fast seq: CreateUniqueFasta
all the above steps can be done manually using the corresponding scripts (or) eppic-precomp-local script will run all those jobs in sequence and transfer the necessary files to Merlin.
Hint: the best way to do it is to use the “eppic-precomp-local” script. If there is a problem in the middle, comment out the finished step and rerun “eppic-precomp-local”
At the end of this local computing part, a folder named “eppic-precomp-yyyy-mm-dd” will be created and transferred to the home folder of Merlin (if you have passwd-free access to merlin, otherwise one has to manually transfer the folder to merlin).
Part 2: cluster computing
This part consists of two steps:
step 1: create blast cache files
Simply edit the paths in the eppic-precomp-merlin script and run it. This will generate a qsub script to create blast cache files.
step 2: EPPIC pre computation:
1. rsync the local pdb database to latest one
2. set up the right UniProt version in the eppic.conf file
3. go to eppic_precomp folder inside eppic-precomp-yyyy-mm-dd and run ./configure_jobs 0
Note: 0 indicates initial run
It will generate folders named input, output, qsubscripts
4. go to qsubscripts and submit the jobs one by one
5. once the first chunk is finished you can run ./configure_jobs 1 1
first number indicates the rerun count
second number indicates the chunk number
this will generate the new files for the missing entries
similarly for the second chunk ./configure_jobs 1 2
in the rerun the amount of RAM is increased to 16G
after the first rerun if still some files are missing then carry out a second rerun ./configure_jobs 2 1
In this case one has to manually edit the RAM value in qsubscripts/eppic_chunk1run_2.sh