Running Gaussian Jobs on the CIP-Cluster
Running Gaussian 03 jobs through the PBS queueing system on the
CIP-F cluster requires a single job file that combines the directives for the
queueing system with the input for Gaussian 03. A typical example, named
test1, has the following structure:
#!/bin/csh
#PBS -l mem=128mb
#PBS -q long
setenv g03root /usr/local
setenv GAUSS_SCRDIR /scratch
setenv GAUSS_EXEDIR /usr/local/g03b3
setenv GAUSS_ARCHDIR /usr/local/g03b3
setenv LD_LIBRARY_PATH "${GAUSS_EXEDIR}:/usr/lib"
cat >$GAUSS_SCRDIR/$PBS_JOBNAME << EOF
%chk=/scratch/test1.chk
%mem=6000000
#P HF/6-31G(d) scf=tight

test1 HF/6-31G(d) sp formaldehyde

0 1
C1
O2 1 r2
H3 1 r3 2 a3
H4 1 r4 2 a4 3 d4

r2=1.20
r3=1.0
r4=1.0
a3=120.
a4=120.
d4=180.

EOF
touch $PBS_O_WORKDIR/$PBS_JOBNAME.$HOST
/usr/local/g03b3/g03 < $GAUSS_SCRDIR/$PBS_JOBNAME > $GAUSS_SCRDIR/$PBS_JOBNAME.log
mv $GAUSS_SCRDIR/$PBS_JOBNAME.log $PBS_O_WORKDIR/$PBS_JOBNAME.log
mv $GAUSS_SCRDIR/$PBS_JOBNAME.chk $PBS_O_WORKDIR/$PBS_JOBNAME.chk
rm -f $GAUSS_SCRDIR/$PBS_JOBNAME
exit
The first three lines of this input file, which start with the #
symbol, tell the PBS queueing system to execute all following
commands in the csh environment, to allocate a maximum of 128 MB of memory
to the job, and to submit it to the queue long.
Including the queue name in the input file is not mandatory, but it makes it easier
to manage multiple jobs on a compute cluster. The following lines starting with
setenv define Gaussian-specific environment
variables.
The cat command is then used to create a file called
"$GAUSS_SCRDIR/$PBS_JOBNAME".
GAUSS_SCRDIR is one of the environment variables set
earlier; it designates the scratch directory for Gaussian 03 (a directory for
large files that are deleted after the job terminates successfully). The
environment variable PBS_JOBNAME is provided by the
PBS queueing system itself and defaults to the name of the script file (here: test1).
In this particular example the cat command
copies all lines that follow, up to the
EOF (end of file) marker, into the file
/scratch/test1.
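The way cat collects the lines up to the EOF marker can be tried outside the cluster with the small sketch below. The file name heredoc_demo.txt is a hypothetical stand-in for $GAUSS_SCRDIR/$PBS_JOBNAME, and the two text lines are placeholders, not Gaussian input. The lines behave the same way in csh and in sh.

```shell
# Write everything up to the EOF marker into a demo file in the current directory.
# heredoc_demo.txt is a hypothetical name used only for this illustration.
cat > heredoc_demo.txt << EOF
%chk=/scratch/demo.chk
home directory: $HOME
EOF

# Print the file: the first line is copied unchanged, while $HOME on the
# second line has been replaced by its value before the file was written.
cat heredoc_demo.txt
```

Because the EOF marker is not quoted, the shell substitutes variables inside the copied lines. This does not affect the formaldehyde input above, which contains no $ characters, but it is worth keeping in mind for inputs that do.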
The remaining lines after the EOF marker first generate an empty file in the user's
working directory whose name indicates the node on which the calculation is
executed, and then execute Gaussian 03 using
/scratch/test1 as the input file and
/scratch/test1.log as the output file. After job completion
the output file /scratch/test1.log is moved to the
working directory of the user. The location of the latter is stored in the PBS
environment variable $PBS_O_WORKDIR. The value of this
variable is set at submit time to the directory from which qsub was invoked, usually
the directory containing the job file. A second (binary) file containing additional
information on the calculation, called /scratch/test1.chk, is also moved to the
user's working directory. The final rm command removes
the input file that is still left in the scratch directory.
Reading and writing input and output files in /scratch rather than directly in the
user's home directory is advantageous because /scratch is a local file
system, while the home directory is accessible only through a file server (here cicum1). If
the file server responds too slowly, Gaussian 03 jobs simply terminate
with a file I/O error. Access to a local file system such as /scratch is much more
reliable. Unfortunately, this concept also has two disadvantages. First, as the
/scratch directories are local file systems on local disks, each is accessible only
from its own node of the cluster. A job running on cicum82 therefore cannot
access files located in the /scratch directory on cicum93. Second, each /scratch
directory is open to all users of the cluster. This becomes problematic when multiple
users choose identical file names such as test1. If
two users run jobs in sequence on the same cluster node using identical file names,
the second job will typically die due to file access restrictions. This problem can be
circumvented by choosing appropriate job and file names.
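One possible way to avoid such name clashes, beyond picking distinctive names by hand, is to give every job its own scratch subdirectory. The fragment below is only a sketch under assumptions: it relies on PBS setting the environment variable PBS_JOBID for the running job and on users being allowed to create directories under /scratch. The naming scheme (user name plus job ID) is hypothetical and not part of the setup described here.

```csh
# Sketch only: replaces the line "setenv GAUSS_SCRDIR /scratch" in the job file.
# Assumes PBS provides PBS_JOBID and that /scratch is writable for the user.
setenv GAUSS_SCRDIR /scratch/${USER}.${PBS_JOBID}
mkdir -p $GAUSS_SCRDIR

# ... create the input file with cat, run g03, and move the .log and .chk
#     files to $PBS_O_WORKDIR exactly as in the example above ...

# Replaces the final "rm -f" line: remove the whole per-job directory.
rm -rf $GAUSS_SCRDIR
```

With this change the %chk line of the Gaussian input has to point into the same directory, for example %chk=${GAUSS_SCRDIR}/${PBS_JOBNAME}.chk. The shell substitutes both variables while cat writes the input file.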
The script described above can be extended with all common csh commands. A particularly
useful extension is to copy or rename files before Gaussian 03 executes.
If, for example, results from a previous run are already stored in the binary checkpoint
file named test1.chk, one can first copy this file to the
/scratch directory by inserting:
cp $PBS_O_WORKDIR/$PBS_JOBNAME.chk $GAUSS_SCRDIR/$PBS_JOBNAME.chk
directly before the touch command in the above example.
Finally, this file can be submitted to the queueing system with qsub test1.
If no queueing information has been included in the input file itself, this information can be
added at submit time using, for example, qsub -l nodes=cicum92 test1.
last changes: 16.10.2004, HZ
questions & comments to: zipse@cup.uni-muenchen.de