These instructions are not to be followed slavishly. Customization to fit your setup might be needed.
Use work_queue etc. from the CC lab:
export PATH=/afs/$PATH
or, for tcsh
setenv PATH /afs/${PATH}
These statements are best put into your shell startup scripts. Test the cctools with
parrot_run ls /
If you see any errors, try replacing cctools with cctools-autobuild.
Should that also fail, please contact us or the CC lab.
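Before contacting anyone, it can help to see which tools the shell actually finds. The following is a minimal sketch for sh-compatible shells; the helper name check_tools is hypothetical and not part of cctools.

```shell
# Report which of the given commands cannot be found on PATH.
# Returns 0 when all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_tools parrot_run condor_submit_workers chirp_server names every tool from this guide that is missing from your PATH.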
Then enter a CMSSW release directory and source the software environment:
Use the following command to install the python setuptools, then proceed as above:
wget -O - | python - --user
Then, to actually install lobster, run
to get the most recent version, or
for the latest stable release. Then add .local/bin to your PATH:
export PATH=$PATH:$HOME/.local/bin
or, for tcsh
setenv PATH $PATH:$HOME/.local/bin
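If the export line lives in a startup script that is sourced more than once, PATH collects duplicate entries. A small sketch for sh-compatible shells that appends only when the directory is not yet listed; the function name path_append is an assumption made for illustration.

```shell
# Append a directory to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;     # not present yet, append it
    esac
    export PATH
}

path_append "$HOME/.local/bin"
```

Calling path_append again with the same directory leaves PATH unchanged.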
Download an example configuration with
wget --no-check-certificate \
and edit it with your favorite editor. The first two lines specify the ID that references your lobster instance, and the working directory for log files and the payload database.
Be sure that you issued
in the release you want your jobs to run in. Then obtain a proxy via
voms-proxy-init -voms cms -valid 192:00
or similar. Then start lobster:
lobster process <your_config_file>
This should print the locations of a log file and of the stderr output. You can follow these with
tail -f <your_working_directory>/lobster.{err,log}
If you see statements regarding failed jobs (exit code >0), some exit codes are listed here.
To stop lobster, use
lobster terminate <your_config_file/your_working_directory>
And to obtain progress plots:
lobster plot --outdir <some_web_directory> <your_config_file/your_working_directory>
The CRC login nodes opteron, newcell, and crcfe01 are connected to the ND opportunistic computing pool. On these, multicore jobs are preferred and can be run with
cores=4
condor_submit_workers -N lobster_<your_id> --cores $cores \
--memory $(($cores * 1100)) --disk $(($cores * 4500)) 10
or, for tcsh
set cores=4
condor_submit_workers -N lobster_<your_id> --cores $cores \
--memory `dc -e "$cores 1100 *p"` --disk `dc -e "$cores 4500 *p"` 10
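Both variants scale memory and disk linearly with the number of cores, using 1100 and 4500 per core. A sketch of that arithmetic as a reusable helper for sh-compatible shells; the function name worker_resources is hypothetical.

```shell
# Print the resource flags for a worker with the given number of cores,
# using the per-core figures from the commands above:
# 1100 (memory) and 4500 (disk) per core.
worker_resources() {
    cores=$1
    echo "--cores $cores --memory $((cores * 1100)) --disk $((cores * 4500))"
}

worker_resources 4    # prints: --cores 4 --memory 4400 --disk 18000
```

The output can be spliced into the submission, for instance as condor_submit_workers -N lobster_<your_id> $(worker_resources 4) 10.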
To submit 10 workers (= 10 cores) to the T3 at ND, run the following on earth:
condor_submit_workers -N lobster_<your_id> --cores 1 \
--memory 1000 --disk 4500 10
Create a file called acl with default access permissions in your home directory (you will need a valid proxy for this!):
echo "globus:$(voms-proxy-info -identity|sed 's/ /_/g') rwlda" > ~/acl
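To see what ends up in the file, the sketch below applies the same substitution to a made-up identity: spaces in the certificate subject are replaced by underscores, and the rights string rwlda is appended. The function name acl_entry and the sample identity are assumptions for illustration only.

```shell
# Build an ACL entry from a certificate identity string, replacing
# spaces with underscores as the command above does.
acl_entry() {
    printf 'globus:%s rwlda\n' "$(printf '%s' "$1" | sed 's/ /_/g')"
}

# Hypothetical identity, for illustration only.
acl_entry "/DC=org/DC=example/CN=Jane Doe 123"
# prints: globus:/DC=org/DC=example/CN=Jane_Doe_123 rwlda
```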
On earth, do something akin to the following commands:
chirp_server --root=<your_stageout_directory> -A ~/acl -p <your_port>
where the default port is 9094. This port may already be occupied, in which case it is best to increment the port number by one until you find a free one.
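The search for a free port can be sketched as a small loop. This is only an illustration: the names port_in_use and find_free_port are hypothetical, the default probe assumes bash with its /dev/tcp pseudo-device is available, and it only detects listeners reachable on the loopback address.

```shell
# Return 0 if something accepts connections on the given local TCP port.
# Assumption: bash is installed and supports /dev/tcp redirections.
port_in_use() {
    bash -c "exec 3<>/dev/tcp/127.0.0.1/$1" 2>/dev/null
}

# Print the first port >= $1 that the probe reports as not in use.
# $2 names the probe command (default: port_in_use),
# $3 limits the number of ports tried (default: 100).
find_free_port() {
    port=$1
    probe=${2:-port_in_use}
    tries=${3:-100}
    while [ "$tries" -gt 0 ]; do
        if ! "$probe" "$port"; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
        tries=$((tries - 1))
    done
    return 1    # no free port found within the allowed number of tries
}
```

Running find_free_port 9094 prints the first port at or above 9094 that the probe considers free, which can then be passed to chirp_server via -p.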
If you are using chirp to stage out to /store, limit the connections by adding -M 50 to the arguments.
You should test chirp on ndcms or any computer other than earth:
voms-proxy-init -voms cms -valid 192:00
chirp_put <some_file> earth:<your_port> spam
If this command fails with a permission issue, make sure you do not have any .__acl files lingering around in your stageout directory:
find <your_stageout_directory> -name .__acl -exec rm \{} \;
and try again.
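Since the find command above deletes files, it may be worth listing the matches first. A minimal sketch; the function name list_acl_files is hypothetical.

```shell
# List the .__acl files under a directory without deleting anything.
# Usage: list_acl_files DIRECTORY
list_acl_files() {
    find "$1" -name .__acl -print
}
```

If the list contains only files you expect, run the deleting variant shown above.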
Then add the following line to your lobster configuration and you should be all set:
stageout server: "<your_port>"
This is optional, but will improve performance.
To run chirp with a direct connection to hadoop, the server command has to be altered slightly (change to suit your needs):
cd /var/tmp/
cp -r /usr/lib/hadoop/ .
cp /usr/lib64/libhdfs* hadoop/lib/
env JAVA_HOME=/etc/alternatives/java_sdk/ HADOOP_HOME=$PWD/hadoop chirp_server \
--root=hdfs://<your_stageout_directory_wo_leading_hadoop> \
-A ~/acl -p <your_port>
Test the command above, then add the server to the configuration to talk to hadoop directly.
- CMS dashboard
- CMS squid statistics
- Condor usage
- NDCMS trends to monitor squid bandwidth
- External bandwidth
on the command line