====== Differences ====== This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
gtspring2009:howto:pace [2009/02/22 16:54] predrag fixed a few PACE links |
gtspring2009:howto:pace [2010/02/02 07:55] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== How to use the PACE cluster ====== | ====== How to use the PACE cluster ====== | ||
- | |||
- | FIXME This document is a mere stub of what it ought to be. | ||
The Georgia Tech Public Access Cluster Environment (PACE) is a cluster of | The Georgia Tech Public Access Cluster Environment (PACE) is a cluster of | ||
Line 16: | Line 14: | ||
====== PACE documentation ====== | ====== PACE documentation ====== | ||
+ | |||
+ | [[http://www.pace.gatech.edu/facilities/pacecc/starting.php|The official PACE documentation page from OIT.]] | ||
+ | |||
{{:gtspring2009:pc.png|}} temptation to Balkanization is irresistible, but there is a | {{:gtspring2009:pc.png|}} temptation to Balkanization is irresistible, but there is a | ||
[[http://www.cns.gatech.edu/CNS-only/PACE.html|CNS PACE homepage]] on the svn wwwcns repository, | [[http://www.cns.gatech.edu/CNS-only/PACE.html|CNS PACE homepage]] on the svn wwwcns repository, | ||
so please enter general (not channelflow specific) PACE documentation there. --- //[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:41// | so please enter general (not channelflow specific) PACE documentation there. --- //[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:41// | ||
+ | |||
+ | {{:gtspring2009:gibson.png?24}} You might be the only person on the planet with enough goodwill, | ||
+ | patience, and CNS permissions to document via svn checkout, editing html, checkin, and recheckout | ||
+ | as www on zero. Maybe we can get some contributions by lowering the overhead with dokuwiki editing. | ||
====== PACE login ====== | ====== PACE login ====== | ||
{{:gtspring2009:pc.png|}} Your PACE login ID is your GTID, but I have to create an account for each user, because CNS gets charged for PACE usage. --- //[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:46// | {{:gtspring2009:pc.png|}} Your PACE login ID is your GTID, but I have to create an account for each user, because CNS gets charged for PACE usage. --- //[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:46// | ||
- | |||
====== ssh tunneling ====== | ====== ssh tunneling ====== | ||
- | The denizens of CNS have written [[http://www.cns.gatech.edu/CNS-only/index.html|a bunch of shell scripts]] that simplify the | + | The denizens of CNS have written shell scripts that simplify the process of logging in to the CNS machines and the PACE cluster from outside Georgia |
- | process of logging in to the CNS machines and the PACE cluster from outside Georgia | + | Tech, and using the svn repositories. You will probably want to install these: [[http://www.cns.gatech.edu/CNS-only/fetchTunnel.html|"Fetch CNS ssh tunneling scripts"]]. |
- | Tech. You will probably want to install these and figure out how to use them. Ask oldtimers you do not know the CNS webpages password. | + | |
+ | Ask old-timers if you do not know the CNS webpages password. | ||
+ | |||
+ | If you run into a problem and then manage to solve it, please edit [[http://www.cns.gatech.edu/CNS-only/fetchTunnel.html|"Fetch CNS ssh tunneling scripts"]] in the svn wwwcns repository accordingly, so the next person has less trouble figuring this out. | ||
+ | |||
+ | Channelflow is hosted at [[http://svn.channelflow.org]], so you can access it directly with svn commands from anywhere; no ssh tunneling is required. See the [[docs:install|Installation instructions]]. | ||
====== install channelflow ====== | ====== install channelflow ====== | ||
Line 35: | Line 45: | ||
Install channelflow on the PACE cluster. This should be straightforward. Follow | Install channelflow on the PACE cluster. This should be straightforward. Follow | ||
the [[docs:install]] directions. | the [[docs:install]] directions. | ||
- | |||
====== submit a job to the queue ====== | ====== submit a job to the queue ====== | ||
- | ====== view the results ====== | + | The PACE cluster has PBS job-queue software that automatically distributes processes among the nodes (computers) |
+ | in the cluster (e.g. pace1, pace2, ...). You //must// use this system to start long-running jobs (anything that | ||
+ | runs for more than a few minutes), instead of logging onto a particular node, starting a job on the command-line, | ||
+ | and backgrounding it. If you do the latter it'll interfere with the queuing system and annoy other users and | ||
+ | the PACE administrators. | ||
+ | Below are two sets of instructions for submitting jobs to the queue. The first describes direct use of the | ||
+ | PBS ''qsub'' command. The second uses a shell script named ''qsubmit'' I wrote to enable one-line | ||
+ | queue submissions. | ||
+ | For further information see the ''qsub'' and ''qstat'' man pages on pacemaker.gatech.edu (type ''man qsub'' at the command-line) | ||
+ | and [[http://www.pace.gatech.edu/facilities/pacecc/starting.php?section=developing&topic=Submitting+Your+Jobs|The official | ||
+ | PACE documentation]]. | ||
+ | ===== using the PBS qsub command ===== | ||
+ | To submit a job using the PBS queue software, | ||
+ | |||
+ | 1. Log on to pacemaker.gatech.edu. | ||
+ | |||
+ | 2. Create a PBS job description file with a text editor (e.g. xemacs, vi) | ||
+ | and with a ''.pbs'' filename extension. I'll use the filename ''foo.pbs''. | ||
+ | Use the following as a template. | ||
+ | |||
+ | <code> | ||
+ | #PBS -N arnoldi-EQ9-Re380 | ||
+ | #PBS -l nodes=1:ppn=1,walltime=72:00:00 | ||
+ | #PBS -q pace-cns | ||
+ | #PBS -k oe | ||
+ | #PBS -m abe | ||
+ | cd /nv/hp1/jg356/simulations/narrowbox/arnoldi-EQ9/Re380 | ||
+ | arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff | ||
+ | </code> | ||
+ | |||
+ | The lines in the above file specify | ||
+ | * -N a name for the job | ||
+ | * -l how many nodes, processors per node, and how much wallclock time the job is allowed | ||
+ | * -q the name of the queue the job should be submitted to | ||
+ | * -k save output and error files to ''~/arnoldi-EQ9-Re380.oXXXXX'' and ''.eXXXXX'' | ||
+ | * -m send email when the job starts, stops, or is terminated | ||
+ | * ''cd /nv/hp1/jg356/....'' The Unix commands to execute to start the job | ||
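Before submitting, it can help to confirm that a job file actually contains the directives listed above. The following is a minimal sketch, not part of PBS or of the CNS scripts: the function names ''pbs_directives'' and ''pbs_check'' are hypothetical, and the choice of -N, -l, and -q as the "required" set is an assumption made only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the #PBS directives found in a job file,
# one per line, with the leading "#PBS" prefix stripped.
pbs_directives() {
    sed -n 's/^#PBS[[:space:]]*//p' "$1"
}

# Hypothetical helper: succeed only if the job file sets a job name (-N),
# a resource request (-l), and a queue (-q). Which directives count as
# required is an assumption of this sketch.
pbs_check() {
    local file=$1 flag missing=0
    for flag in -N -l -q; do
        if ! pbs_directives "$file" | grep -q -e "^$flag "; then
            echo "missing directive: $flag" >&2
            missing=1
        fi
    done
    return $missing
}
```

With these definitions loaded, ''pbs_check foo.pbs && qsub foo.pbs'' would submit the job only when all three directives are present.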
+ | |||
+ | |||
+ | 3. Submit the job using ''qsub'' | ||
+ | |||
+ | <code> | ||
+ | jg356@pacemaker$ qsub foo.pbs | ||
+ | </code> | ||
+ | ===== using the qsubmit shell script ===== | ||
+ | |||
+ | 1. Put the following bash function definition in your ''~/.bash_aliases'' or ''~/.bashrc'' file | ||
+ | on PACE. | ||
+ | |||
+ | <code> | ||
+ | function qsubmit() { | ||
+ | tag=$1 | ||
+ | shift | ||
+ | echo "#PBS -N $tag" > tmp.pbs | ||
+ | echo "#PBS -l nodes=1:ppn=1,walltime=72:00:00" >> tmp.pbs | ||
+ | echo "#PBS -q pace-cns" >> tmp.pbs | ||
+ | echo "#PBS -k oe" >> tmp.pbs | ||
+ | echo "#PBS -m abe" >> tmp.pbs | ||
+ | echo "cd $(pwd)" >> tmp.pbs | ||
+ | echo "$*" >> tmp.pbs | ||
+ | cat tmp.pbs | ||
+ | qsub tmp.pbs | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | 2. Load the new definition into your current shell (this will happen automatically the next time | ||
+ | you log in). | ||
+ | |||
+ | <code> | ||
+ | jg356@pacemaker$ source ~/.bash_aliases | ||
+ | </code> | ||
+ | |||
+ | 3. Submit a job using the script | ||
+ | |||
+ | <code> | ||
+ | jg356@pacemaker$ qsubmit arnoldi-EQ9-Re380 arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff | ||
+ | </code> | ||
+ | |||
+ | The first argument after ''qsubmit'' is the job name and the remaining arguments specify what is to be run, | ||
+ | i.e. the Unix commands you would type if you were just running the process directly in the shell. | ||
+ | |||
+ | The ''qsubmit'' script works by creating a ''tmp.pbs'' file with some default values for wallclock time | ||
+ | etc., and then submitting that file to ''qsub''. Modify it to your liking. | ||
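As one example of such a modification, the sketch below writes the job description to standard output instead of submitting it, and lets the wallclock limit and queue be overridden per call. The function name ''make_pbs'' and the environment variables ''QSUB_WALLTIME'' and ''QSUB_QUEUE'' are hypothetical names introduced here; the defaults are the ones used in ''qsubmit''.

```shell
#!/bin/bash
# Sketch of a qsubmit variant that only generates the job file text.
# QSUB_WALLTIME and QSUB_QUEUE are hypothetical environment variables;
# when unset, the defaults from the original qsubmit function are used.
make_pbs() {
    local tag=$1    # first argument: job name
    shift           # remaining arguments: the command to run
    cat <<EOF
#PBS -N $tag
#PBS -l nodes=1:ppn=1,walltime=${QSUB_WALLTIME:-72:00:00}
#PBS -q ${QSUB_QUEUE:-pace-cns}
#PBS -k oe
#PBS -m abe
cd $(pwd)
$*
EOF
}
```

For instance, ''QSUB_WALLTIME=12:00:00 make_pbs myjob ./run.sh > tmp.pbs'' would write a job file with a 12-hour limit, which can be inspected and then passed to ''qsub tmp.pbs''.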
+ | |||
+ | ====== view the results ====== | ||
+ | The standard output and standard error streams of your process (what is normally printed in | ||
+ | a terminal) will be saved in the files ''~/jobname.oXXXX'' and ''~/jobname.eXXXX'' where XXXX | ||
+ | is a job ID number set by the PBS queueing system. Any data saved to disk will be placed in the | ||
+ | directory where the job was started. |
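A quick way to inspect these files is sketched below. The function name ''job_output'' is hypothetical, and the optional second argument (the directory to search, defaulting to the home directory) is an assumption added so the helper can also be pointed elsewhere.

```shell
#!/bin/bash
# Hypothetical helper: print the last lines of every saved stdout (.o*)
# and stderr (.e*) file for a given job name. The second argument is the
# directory to search and defaults to $HOME.
job_output() {
    local name=$1 dir=${2:-$HOME} f
    for f in "$dir/$name".o* "$dir/$name".e*; do
        [ -e "$f" ] || continue    # skip a pattern that matched no file
        echo "==> $f"
        tail -n 20 "$f"
    done
}
```

For example, ''job_output arnoldi-EQ9-Re380'' would show the tail of each output and error file saved under that job name.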