gtspring2009:howto:pace

====== Differences ====== This shows you the differences between two versions of the page.

gtspring2009:howto:pace [2009/02/13 06:51]
gibson
gtspring2009:howto:pace [2010/02/02 07:55] (current)
Line 1: Line 1:
 ====== How to use the PACE cluster ======
- 
-FIXME This document is a mere stub of what it ought to be.  
  
 The Georgia Tech Public Access Cluster Environment (PACE) is a cluster of 
Line 15: Line 13:
 keeps your laptop from melting.
  
-====== PACE login ======+====== PACE documentation ​======
  
-FIXME Get a PACE login ID and password. Though I did this once in the ancient past,  +[[http://www.pace.gatech.edu/facilities/pacecc/starting.php|The official PACE documentation page from OIT.]]
-I have no memory of it, and I can't find any documentation ​for it+
  
 +
 +{{:​gtspring2009:​pc.png|}} temptation to Balkanization is irresistible,​ but there is a
 +[[http://​www.cns.gatech.edu/​CNS-only/​PACE.html|CNS PACE homepage]] on the svn wwwcns repository,
 +so please enter general (not channelflow specific) PACE documentation there. ​ --- //​[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:41//
 +
 +{{:​gtspring2009:​gibson.png?​24}} You might be the only person on the planet with enough goodwill,
 +patience, and CNS permissions to document via svn checkout, editing html, checkin, and recheckout ​
 +as www on zero. Maybe we can get some contributions by lowering the overhead with dokuwiki editing.
 +
 +====== PACE login ======
 +
 +{{:​gtspring2009:​pc.png|}} Your PACE login ID is your GTID, but I have to create an account for each user, because CNS gets charged for PACE usage. ​ --- //​[[predrag.cvitanovic@physics.gatech.edu|Predrag Cvitanovic]] 2009-02-22 16:46//
 ====== ssh tunneling ======
  
-FIXME The denizens of CNS have written ​a bunch of shell scripts that simplify the +The denizens of CNS have written shell scripts that simplify the process of logging in to the CNS machines and the PACE cluster from outside Georgia 
-process of logging in to the CNS machines and the PACE cluster from outside Georgia +Tech, and using the svn repositories. You will probably want to install these: [[http://​www.cns.gatech.edu/​CNS-only/​fetchTunnel.html|"​Fetch CNS ssh tunneling scripts"​]].  
-Tech. You will probably want to install these and figure out how to use them.+ 
 +Ask old-timers if you do not know the CNS webpages password. 
 + 
 +If you run into a problem and then manage to solve it, please edit [[http://www.cns.gatech.edu/CNS-only/fetchTunnel.html|"Fetch CNS ssh tunneling scripts"]] in the svn wwwcns repository accordingly, so the next person has less trouble figuring this out.
 + 
 +Channelflow is hosted at [[http://svn.channelflow.org]], so you can access it directly with svn commands from anywhere; no ssh tunneling is required. See the [[docs:install|Installation instructions]].
  
 ====== install channelflow ======
Line 30: Line 45:
 Install channelflow on the PACE cluster. This should be straightforward. Follow
 the [[docs:install]] directions.
- 
 ====== submit a job to the queue ======
  
-====== view the results ======+The PACE cluster has PBS job-queue software that automatically distributes processes among the nodes (computers) 
 +in the cluster (e.g. pace1, pace2, ...). You //must// use this system to start long-running jobs (anything that 
 +runs for more than a few minutes), instead of logging onto a particular node, starting a job on the command-line,​  
 +and backgrounding it. If you do the latter it'll interfere with the queuing system and annoy other users and 
 +the PACE administrators. ​
  
 +Below are two sets of instructions for submitting jobs to the queue. The first describes direct use of the 
 +PBS ''​qsub''​ command. The second uses a shell script named ''​qsubmit''​ I wrote to enable one-line ​
 +queue submissions. ​
  
 +For further information see the ''​qsub''​ and ''​qstat''​ man pages on pacemaker.gatech.edu (type ''​man qsub''​ at the command-line)
 +and [[http://​www.pace.gatech.edu/​facilities/​pacecc/​starting.php?​section=developing&​topic=Submitting+Your+Jobs|The official ​
 +PACE documentation]].
 +===== using the PBS qsub command =====
  
 +To submit a job using the PBS queue software, ​
 +
 +1. Log on to pacemaker.gatech.edu.
 +
 +2. Create a PBS job description file with a text editor (e.g. xemacs, vi)
 +and with a ''​.pbs''​ filename extension. I'll use the filename ''​foo.pbs''​. ​
 +Use the following as a template.
 +
 +<​code>​
 +#PBS -N arnoldi-EQ9-Re380
 +#PBS -l nodes=1:ppn=1,walltime=72:00:00
 +#PBS -q pace-cns
 +#PBS -k oe
 +#PBS -m abe
 +cd /nv/hp1/jg356/simulations/narrowbox/arnoldi-EQ9/Re380
 +arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff
 +</​code>​
 +
 +The lines in the above file specify
 +  * -N a name for the job
 +  * -l how many nodes, how many processors per node, and how much wallclock time the job is allowed
 +  * -q the name of the queue the job should be submitted to
 +  * -k keep the output and error files as ''~/arnoldi-EQ9-Re380.oXXXXX'' and ''~/arnoldi-EQ9-Re380.eXXXXX''
 +  * -m send email when the job aborts, begins, or ends (''abe'')
 +  * ''cd /nv/hp1/jg356/....'' the Unix commands to execute to start the job
 +
 +
 +3. Submit the job using ''qsub''
 +
 +<​code>​
 +jg356@pacemaker$ qsub foo.pbs
 +</​code>​
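The template and the ''qsub'' step above can also be produced by a small generator. This is only a minimal sketch, not part of the original instructions: the function name ''make_pbs'' and its argument order are hypothetical, while the directives and the ''pace-cns'' queue are copied from the template. It prints the job description rather than submitting it, so the file can be reviewed first.

```shell
# make_pbs NAME WALLTIME WORKDIR COMMAND [ARGS...]
# Hypothetical helper (not a PBS tool): prints a PBS job description to stdout.
make_pbs() {
  local name="$1" walltime="$2" workdir="$3"
  shift 3
  printf '#PBS -N %s\n' "$name"
  printf '#PBS -l nodes=1:ppn=1,walltime=%s\n' "$walltime"
  printf '#PBS -q pace-cns\n'
  printf '#PBS -k oe\n'
  printf '#PBS -m abe\n'
  # Run the command from the given working directory.
  printf 'cd "%s"\n' "$workdir"
  # Everything after the third argument is the command line of the job.
  printf '%s\n' "$*"
}

# Print a job description for inspection; nothing is submitted here.
make_pbs arnoldi-EQ9-Re380 72:00:00 "$PWD" \
  arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff
```

Redirect the output to a file such as ''foo.pbs'' and pass that file to ''qsub'' once it looks right.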
 +===== using the qsubmit shell script ​ =====
 +
 +1. Put the following bash function definition in your ''​~/​.bash_aliases''​ or ''​~/​.bashrc''​ file
 +on PACE.
 +
 +<​code>​
 +function qsubmit() {
 +  # First argument is the job name; the rest is the command to run.
 +  local tag="$1"
 +  shift
 +  echo "#PBS -N $tag" > tmp.pbs
 +  echo "#PBS -l nodes=1:ppn=1,walltime=72:00:00" >> tmp.pbs
 +  echo "#PBS -q pace-cns" >> tmp.pbs
 +  echo "#PBS -k oe"   >> tmp.pbs
 +  echo "#PBS -m abe"  >> tmp.pbs
 +  echo "cd $(pwd)"    >> tmp.pbs
 +  # Quote the command so the shell does not glob-expand it here.
 +  echo "$*"           >> tmp.pbs
 +  cat tmp.pbs
 +  qsub tmp.pbs
 +}
 +</​code>​
 +
 +2. Load the new definition into your current shell (this will happen automatically the next time
 +you log in). 
 +
 +<​code>​
 +jg356@pacemaker$ source ~/.bash_aliases
 +</​code>​
 +
 +3. Submit a job using the script
 +
 +<​code>​
 +jg356@pacemaker$ qsubmit arnoldi-EQ9-Re380 arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff
 +</​code>​
 +
 +The first argument after ''​qsubmit''​ is the job name and the remaining arguments specify what is to be run,
 +i.e. the Unix commands you would type if you were just running the process directly in the shell.
 +
 +The ''qsubmit'' script works by creating a ''tmp.pbs'' file with some default values for wallclock time 
 +etc., and then submitting that file to ''qsub''. Modify it to your liking. 
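One possible modification, given as a hedged sketch rather than a tested PACE recipe: let the wallclock time and queue be overridden per call, and write the job file to a unique temporary name so that two submissions cannot overwrite each other's ''tmp.pbs''. The name ''qsubmit2'' and the environment variables ''QSUBMIT_WALLTIME'', ''QSUBMIT_QUEUE'' and ''QSUBMIT_DRYRUN'' are made up for this example; they are not PBS features.

```shell
# qsubmit2 NAME COMMAND [ARGS...]
# Hypothetical variant of qsubmit. The QSUBMIT_* variables are assumptions
# of this sketch, not options understood by PBS itself.
qsubmit2() {
  local tag="$1"
  shift
  local jobfile
  # Unique job file per submission, instead of a shared tmp.pbs.
  jobfile=$(mktemp "${TMPDIR:-/tmp}/${tag}.XXXXXX") || return 1
  {
    printf '#PBS -N %s\n' "$tag"
    printf '#PBS -l nodes=1:ppn=1,walltime=%s\n' "${QSUBMIT_WALLTIME:-72:00:00}"
    printf '#PBS -q %s\n' "${QSUBMIT_QUEUE:-pace-cns}"
    printf '#PBS -k oe\n'
    printf '#PBS -m abe\n'
    printf 'cd "%s"\n' "$PWD"
    printf '%s\n' "$*"
  } > "$jobfile"
  cat "$jobfile"
  # With QSUBMIT_DRYRUN set, only show the job file and clean up.
  if [ -n "${QSUBMIT_DRYRUN:-}" ]; then
    rm -f "$jobfile"
    return 0
  fi
  qsub "$jobfile"
}

# Dry run: print what would be submitted without calling qsub.
QSUBMIT_DRYRUN=1 qsubmit2 arnoldi-EQ9-Re380 \
  arnoldi --flow -Na 100 -R 380 ../../continue-EQ9/Re380/ubest.ff
```

For example, ''QSUBMIT_WALLTIME=12:00:00 qsubmit2 myjob mycommand'' would request twelve hours instead of the 72-hour default.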
 +
 +====== view the results ======
  
 +The standard output and standard error streams of your process (what is normally printed in 
 +a terminal) will be saved in the files ''~/jobname.oXXXX'' and ''~/jobname.eXXXX'' where XXXX
 +is a job ID number set by the PBS queueing system. Any data saved to disk will be placed in the 
 +directory where the job was started.
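Because the job ID in the file name is not known in advance, a small helper can look the files up by job name. This is a sketch under stated assumptions: ''job_outputs'' is a hypothetical function name, and it assumes the files sit in your home directory (or a directory you pass in) and end in ''.o'' or ''.e'' followed by a numeric job ID, as described above.

```shell
# job_outputs NAME [DIR]
# Hypothetical helper: lists the stdout/stderr files left behind for job
# NAME, looking in DIR (default: the home directory).
job_outputs() {
  local name="$1" dir="${2:-$HOME}"
  local f
  for f in "$dir/$name".o[0-9]* "$dir/$name".e[0-9]*; do
    # An unmatched glob stays literal, so keep only files that exist.
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  return 0
}

# List the output files of the example job, if any exist yet.
job_outputs arnoldi-EQ9-Re380
```

The result can be fed to other tools, e.g. ''tail -n 20 $(job_outputs arnoldi-EQ9-Re380)'' to see the end of each file.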
gtspring2009/howto/pace.1234536672.txt.gz · Last modified: 2009/02/13 06:51 by gibson