SQS - Simple Queueing System
----------------------------

Version 2.8 June, 2008
----------------------

There are many queueing and scheduling systems around, but they were all
rather overkill for what we needed. This was originally a single queue on
a single machine allowing one or two jobs to run at once. This version
allows several queues, primarily because it is useful to have a main
working queue and a queue for runs that just test the input files. It
also adds the ability to submit jobs to other machines (see the section
at the end on remote use). It has two uses:-

1. For the general running of computational chemistry codes, which use a
   lot of scratch disk space, a lot of time and often a lot of memory.

2. For memory- and disk-intensive jobs submitted from web form pages.

Here are some examples of using SQS with cluster or remote machines:-

1. We use two clusters of Linux Pentium IV machines. They are not
   connected directly to the internet and communicate with rlogin, rsh
   and friends. Almost the whole filesystem of each cluster is in common,
   mounted using NFS. We want to work on one machine, the master node,
   and run jobs on all machines. We call this cluster use. There is one
   set of work directories, one set of SQS queue files and one set of
   executables.

2. We use two machines that are identical, but we do not have root access
   and nothing is mounted in common, as the administrator does not want
   us to do that. We want to run jobs on both machines, with each running
   its own SQS. We want to inspect and modify queues on each machine from
   the other, depending on which one we are connected to. We use slogin,
   ssh and friends on this pair of machines. We call this remote use.

3. I link a laptop running Linux to a PC running Linux, with a direct
   connection. We can treat this pair as cluster use or remote use. It
   has been used to develop this code.

The system runs under Linux and is written in Perl. It should run on any
system with Perl.
It has been used with several versions of perl, the latest being 5.8.6.
It has also been tested on a Tru64 Compaq Alpha and to a limited extent
under Cygwin.

It consists of 12 scripts, including 'install.pl' for making the
installation and uninstallation more automatic. The original files are
templates named 'file.t', which are used by 'install.pl' to create the
following scripts:-

qsub
----

qsub [-d] [-h] [-q queue-name] [-p priority] 'jobname'

Submits the job 'jobname' to the queue named 'queue-name'. If
'-q queue-name' is absent, it uses the default queue, which is the first
local one defined in sqs.conf. If '-p priority' is present, it sets that
priority for the job, otherwise it sets the priority to 1. The priority
is an integer in the range 1 - 9.

qsub is usually called by us from inside another script which takes the
local name of the job, puts together a file to run the specific code and
then passes that file by name to qsub.

If '-h' is present the job will be held in the queue and not start until
released with 'qrls'.

'jobname' must be a complete path, so the script described above uses
$PWD to get the full path. The correct direct use of qsub to run a file
called 'file.job' is:-

    qsub [-d] [-q queue-name] [-p priority] $PWD/file.job

where it is assumed that file.job directs output to a file. If '-d' is
present as an argument, the file 'file.job' will be deleted.

Note that '-d', '-h', '-q queue-name' and '-p priority' can appear in any
order, but 'jobname' must be the last argument. qsub will not let you
submit a job that already exists in one of the queues as this causes
serious conflicts.

If the queue is empty, or only contains held jobs, or only contains
running jobs to a total less than the maximum allowed, the job is
started. If jobs are added to the queue when the maximum number of jobs
allowed is running, the jobs are kept in the queue. qsub starts qseek if
it is not running.
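
The argument rules for qsub given above (job name last, complete path,
priority from 1 to 9) can be sketched as a small shell helper. This is
only an illustration under stated assumptions: the function name
build_qsub_cmd is hypothetical and not part of SQS, and it prints the
qsub command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper, not part of SQS: build a qsub command line that
# follows the argument rules described in this README.
# Usage: build_qsub_cmd QUEUE PRIORITY JOBFILE
build_qsub_cmd() {
    queue="$1"; priority="$2"; job="$3"

    # The README states the priority is an integer in the range 1 - 9.
    case "$priority" in
        [1-9]) ;;
        *) echo "priority must be an integer from 1 to 9" >&2; return 1 ;;
    esac

    # qsub needs a complete path, so relative names are prefixed with $PWD.
    case "$job" in
        /*) ;;
        *) job="$PWD/$job" ;;
    esac

    # Options may come in any order, but the job name must be last.
    echo "qsub -q $queue -p $priority $job"
}

build_qsub_cmd quantum 3 file.job
```

A wrapper script of the kind described above could run the printed
command once it has written the job file.
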
If you are submitting to a cluster queue using ssh and password
authentication, you will be asked to type in your password on the cluster
machine. qseek is not started if the job is submitted with '-h' to hold
the job.

The script returns the 'job no' to standard output, largely for our web
scripts, so this can be directed to /dev/null in normal scripts.

qseek
-----

This file essentially runs as a daemon, looking for jobs to run. Each
user starts his/her own qseek. qseek is started by a call to
'qinit start'. If it is not already running for that user, qseek is
started by qsub, for obvious reasons; by qrls, since only held jobs may
remain in the queue across a shutdown or reboot; and by qmove if you are
moving a non-held job to a queue on another machine.

qrun
----

The script that is started by qseek to run a particular job in the
background. The user should not be concerned with qrun and qseek, but if
things go wrong qseek may need to be killed or restarted. They are not
called directly by the user.

qinit [-r host]
---------------

qinit [-r host] start [queue]

This starts qseek. It is called by several other scripts, but can be
called directly. It does not start a second qseek if qseek is already
running for the user on the machine that is linked to the queue. Starting
qseek sets the priorities back from negative to their original values
(see qinit stop).

qinit [-r host] stop [queue]

This kills the running qseek. You may need to do this, for example, after
you make changes to sqs.conf. Note that it does not kill the running jobs
started by that qseek. Stopping qseek will set the priority of each of
your jobs on the machine to the negative of its original value. If qseek
stops in some other way, the script qclear can be used to set the
priorities to the correct negative value.

'queue' is the name of a queue that is linked to the machine on which you
want to start or stop the qseek daemon. This allows qinit to get the host
details from the queue attributes in sqs.conf.
If several queues are linked to a particular machine, any one of them can
be used. Users will be more familiar with the queue names than with the
precise hostname details. If [queue] is omitted, the local machine is
assumed.

If the queue is on a cluster machine and you are using ssh, you may be
asked for password authentication. This will also occur when this command
is called by qsub, qrls and qmove. You will be asked for a password twice
if qseek is actually started and only once if it finds it is already
running.

qinit [-r host] show

This simply displays the queues, indicating which job is, hopefully,
running, which are queued and which are held.

  Column 1  The 'job no', needed by qdel, qhold, qrls, qmove and qprior.
  Column 2  The queue name.
  Column 3  The user name.
  Column 4  The 'job', with the last 20 characters of the full path to
            the job.
  Column 5  The priority. If the priority shows as a negative number, it
            means that qseek is not running for that user. This is to
            ensure that one user can run jobs when another user has jobs
            of higher priority but qseek is not running.
  Column 6  The time and date. This information is also stored in the log
            file, although there the full job path is given.
  Column 7  The status - "Running", "Queueing", "Holding" or "No daemon".
            "No daemon" is merely emphasising that the priority is
            negative.
  Column 8  The PGID (process group ID) for running jobs.

Note that 'job no' is unique across queues. Note that if 'qinit show' is
run just after submitting a job, the job may be listed as "Queued" as it
takes a little time for qseek to find the queued job and start it. We
generally alias 'qinit show' to 'qu'.

qinit [-r host] status

This checks the status of all qseek daemons belonging to the user. It
also gives details of the important parameters for each queue.
Note that if you are accessing the cluster machines by ssh with password
authentication, you have to type in the password once for every machine
in the cluster that has queues defined. For this reason, we have split
the functionality of 'status' and 'show', which previously were together
as 'status'. We generally alias 'qinit status' to 'qstat'.

qinit [-r host] status-all

This checks the status of all qseek daemons belonging to the user and any
other user listed in $sqsdir/sqs.users. It does not check whether entries
in that file are valid users on each machine. It also gives details of
the important parameters for each queue. Note that if you are accessing
the cluster machines by ssh with password authentication, you have to
type in the password once for every machine in the cluster that has
queues defined, as indicated under 'qinit status'.

Both 'qinit status' and 'qinit status-all' add the user to
$sqsdir/sqs.users if it is not already there. That file can be edited by
hand, but there is no reason to do so.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".
It may not be advisable to use 'qinit -r host start', for the reasons
discussed in that section.

qdel [-r host] [-nd] 'job no' [list of other job nos.]
------------------------------------------------------

This script merely deletes the job(s) specified by 'job no', or
optionally by a list of 'job nos'. It can delete a running job. qdel
cannot delete other users' jobs, except that root can delete any job.
Note that qdel does not start qseek, but it should be running.

If you are deleting a job that is running on a cluster machine and you
are using ssh, you will be asked for password authentication.

The argument '-nd' stops the job file from being deleted. Note that this
is the opposite behaviour from qsub, where the job file is kept unless
the '-d' argument is present.
The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qhold [-r host] 'job no' [list of other job nos.]
-------------------------------------------------

This puts on hold the job(s) indicated by 'job no', or optionally by a
list of 'job nos'. You cannot hold a running job. You cannot put on hold
jobs for other users. root however can use this command for jobs
belonging to any user, for the reason given in the 'BUGS' section below.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qrls [-r host] 'job no' [list of other job nos.]
------------------------------------------------

This releases the job(s), previously held in the queue, indicated by
'job no', or optionally by a list of 'job nos'. You cannot release jobs
for other users. root however can use this command for jobs belonging to
any user.

qrls starts qseek if it is not running and if the queue where it found
the job to release previously contained only held jobs. These could have
been held over a reboot where qseek was stopped. Note that this does not
mean qseek is not running, as there may be running jobs on that machine
in other queues, or qseek may not have been stopped. It does however
limit the number of checks on qseek, particularly on cluster machines.

If the job is in a queue on a cluster machine and you are using ssh with
password authentication, you will be asked for your password on the
cluster machine if qrls tries to start qseek.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qmove [-r host] 'job no' 'queue-name'
-------------------------------------

qmove moves the job specified by 'job no' from its current queue to the
queue specified by 'queue-name'. The job can be queued or holding, but it
cannot be running.
You have to own the job. The job in the new queue keeps its original
status - "Holding" or "Queued" - and its priority, but if the priority is
negative it is made positive, as it is assumed that you are moving the
job to a queue where qseek is running. If this is not the case, run
'qclear user queue' to set the situation right. If the job is not holding
and the new queue is a cluster queue, qmove starts qseek on the cluster
machine if it is not running.

If you are moving a job from a queue on one machine to a queue on another
machine, take care that the job file does not refer to something specific
to the original queue. For example, we use scratch files in a directory
that is named after the queue. Moving the job cannot change the reference
to that directory in the job file, and the directory may not exist on the
machine you are moving the job to. The script 'sub_g98' mentioned in the
INSTALL file does exactly that.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qprior [-r host] 'job no' 'new priority'
----------------------------------------

qprior changes the priority of 'job no' to 'new priority'. The job can be
"Queued" or "Holding", but not "Running", as it is pointless to try to
change the priority of a running job. If the priority is negative because
qseek is not running, you can still change it: give a valid positive
number and it will be made negative.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qclear [-r host]
----------------

Carries out various tasks to clean up various problems.

qclear [-r host] 'user' ['queue']

If another user has non-held jobs in the queue with priority larger than
yours and the other user for some reason does not have qseek running,
your jobs will not start.
Running qclear, with 'user' the name of the offending other user and
'queue' the name of the queue, will set the priority of the other user's
jobs to negative. This is the correct setting if qseek has been stopped
with 'qinit stop'. Your jobs will now start. If 'queue' is omitted the
default (first) queue is assumed.

qclear [-r host] empty ['queue']

If nothing is running, but there are jobs in the queue, qclear sets the
queue to empty. All jobs will have to be resubmitted. If 'queue' is
omitted the default (first) queue is assumed.

qclear [-r host] ID-no ['queue']

If the machine crashes when a job is running and the machine is then
rebooted, 'qinit show' will still show the job as running. It does
however try to check whether the job is running by looking at whether
qrun is running with the PID for the job, and if not it gives a long
warning message. If the job really is not running, this use of qclear
just removes the job with ID equal to 'no' (in the call always type 'ID-'
followed by a number) from the given queue, or from the default queue if
'queue' is missing. Take care using this command.

qclear [-r host] zero

If the sqs.id file is found, it is kept and the running job ID is
retained. If it is not found, it is created with the running job ID set
back to zero, just as install.pl does.

Use of 'qclear empty' and 'qclear zero' is not generally recommended, but
they may be useful for the SQS administrator if things go badly wrong.
With improvements in the code they are now less needed than they were.
Their use will affect users other than yourself. In general it is
recommended that you delete all queued jobs and wait for the running job
to finish. This may indeed solve the problem anyway.

The argument [-r host] makes the script run on a remote (N.B. not a
cluster) machine. See the section below on "Running on remote machines".

qexample
--------

'qexample' is a general Perl script that calls 'qsub' to run a file.
This version prompts the user about the use of all the arguments for
qsub. qexample uses the Term::ReadKey Perl module.

    qexample file.job

is equivalent to:-

    qsub [arguments] $PWD/file.job

with the arguments given by the user in response to prompts.

The earlier version of qexample is kept as 'qexample.short.t'. Note that
you can install this with 'install.pl qexample.short', but that it is not
installed by a full install. If you want to use it as the main qexample,
save qexample.t and then copy qexample.short.t to qexample.t. You will
probably want to add the flags you normally use to the exec line for qsub
in this simple qexample.t.

If you do not have the Term::ReadKey Perl module, you can try
qexample.getc.t, which uses stuff I do not understand to get getc to work
properly. Again you can install this individually, but it is not
installed with a full install. It should be equivalent to qexample
itself, but is much longer and more difficult to understand.

install.pl
----------

This script installs the other scripts and files, but needs some editing
as indicated in the INSTALL file. This script also installs the man
pages, which are in the man sub-directory. install.pl also copies README
into $sqsdir.

A call of install.pl with one parameter, being the name of one of the
scripts above, will just install that script. Similarly, a call with
'sqs.conf' as argument will simply install that item and a call with
'man' as argument will install all the man pages. A call with 'perl' as
argument will install all the Perl scripts only and a call with 'README'
as the argument will just install this file.

If you call install.pl with the first parameter as '-c', it uses perlcc
to compile the Perl scripts. In this case the single script or the 'perl'
argument is the second parameter.

If you run './install.pl queue {queue-name}' it will create the queue
file for queue {queue-name}. This is useful when you have altered
sqs.conf, adding a new queue, and not done a full install.
install.pl removes everything installed if called as 'install.pl -u'.

There are also two include files:-

sqs.conf
--------

This file is essentially a configuration file that contains declarations
and assignments that are common to all scripts, and is included into the
scripts at run-time. It needs editing to fit a particular local system
(see INSTALL file). See the section below about altering sqs.conf and the
comments in sqs.conf itself.

sqs.inc
-------

This file contains declarations that are common to all scripts and is
inserted into the scripts by install.pl. You should not need to alter
this.

Four associated files are used:-

sqs.id
------

This holds the job serial number, which is incremented every time a job
is submitted. Initially installation makes it contain just 0 (zero).

sqs.queue.queue-name
--------------------

These contain the jobs in the queue 'queue-name', 1 line per job, with 8
items separated by commas:-

  1. '$going'     - the process group ID for qrun if this is a running
                    job, otherwise 0.
  2. 'job no'
  3. 'queue-name'
  4. 'flag'       - d if '-d' was set on qsub, n otherwise.
  5. 'myname'
  6. 'job'
  7. 'priority'
  8. the time/date information.

Initially it should be an empty file.

sqs.log
-------

A log file for progress information and errors.

sqs.pid.$USER.hostname
----------------------

Stores the PID of qseek for use by 'qinit stop', but not, note, by qdel
on running jobs. This file is not created by install.pl, but by each user
when qseek is started. There is such a file for every user on every
machine where the user is running jobs. It is removed by 'qinit stop'.

The system also uses temporary files called 'sqs.queue.tmp.{queue-name}',
but these should be deleted by the various scripts that use them.

The version number can be obtained by typing "qsub -v" or "qinit -v",
these being the two most frequently used scripts at the prompt. The '-v'
flag is not documented below.
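
The queue file layout described above under sqs.queue.queue-name can be
read with a few lines of shell. This is a sketch only: the function name
show_job is hypothetical, the sample line is invented for illustration,
and the exact time/date format is an assumption. How a held job is marked
in the file is not described in this README, so the sketch does not try
to tell queued and held jobs apart.

```shell
#!/bin/sh
# Hypothetical sketch, not part of SQS: split one queue-file line into
# the eight comma-separated items listed in this README.
show_job() {
    # The last variable takes the rest of the line (the time/date item).
    IFS=, read -r going jobno queue flag user job priority when <<EOF
$1
EOF

    # $going is the process group ID of qrun for a running job, else 0.
    if [ "$going" -ne 0 ]; then
        state="running (pgid $going)"
    elif [ "$priority" -lt 0 ]; then
        # A negative priority means qseek is not running for that user.
        state="no daemon"
    else
        state="queued or held"
    fi
    echo "job $jobno in $queue for $user: $state, priority $priority"
}

# Invented sample line; the date format is an assumption.
show_job "0,17,quantum,d,brian,/home/brian/work/file.job,3,Mon Jun 2 10:15:00 2008"
```
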
In fact '-v' works with all the scripts except qrun, but apart from qsub
and qinit this use is not documented anywhere.

PROBLEMS
--------

If no jobs are running, first use 'qinit status' to check whether qseek
is running. If it is not, start it with 'qinit start'.

Second, look to see if other users have jobs not running and not held
that have higher priorities than yours. It could be that their qseek has
died. If qseek has been stopped properly by that user, the priorities
should be negative. If they are positive, check that qseek is not running
using 'qinit status' and then run 'qclear user queue', where 'user' is
the other user's username and 'queue' is the queue where the problem is.
This makes the priorities negative and your jobs should start. Do not
worry about affecting the other user. When they start qseek, the queue
will be set right for them and their jobs will start.

BUGS
----

Not so much a bug, but possibly a piece of non-portable code, is the use
of /bin/kill with the full path to delete process groups in qdel. I
believe some errors arise on cluster machines if the shell version of
kill is selected. This has been partially corrected in that /bin/kill has
been replaced by $kill, where this variable is what `which kill` returns.
This of course is restricted to Unix.

The code in qinit for starting the qseek daemon is a bit of a kludge and
makes specific reference to ssh. The problem is ensuring that qseek is
running in the background and is effectively nohup'ed. At the same time,
if using ssh, you have to get the code to pause for you to enter the
password. I think there must be a better way to do all this.

The way of starting qseek on a remote machine is even more of a kludge.
Note that the way it is done assumes that the machine is known by exactly
the same name on both machines.

WEB USE
-------

When using SQS for submission of programs from web forms pages, some care
has to be taken.
First, the CGI script that calls qsub must use the full path to qsub and
must get all the parameters correct, as normal output does not go
anywhere. Some messages are written to the log file 'sqs.log' only in web
use.

Second, the web software, e.g. Apache, generally runs as a user such as
'nobody', 'web' or 'apache', which does not have a good PATH variable. To
deal with this, you must define $webuser and $webhost in sqs.conf. It is
assumed that $webuser is different from any other user, and the value of
$webuser is used to tell the scripts whether they are running off the web
or not. You should also define the server name that the web user sees.
This might be different from the actual machine name, as it is with my
server.

Care should also be taken to make sure that '$sqsdir' is somewhere that
the web user can write to. You should do a local install in the web area.
You may need to have $bindir in the web area also, along with any
directories for scratch files that your software needs.

In web use, we only use the basic scripts - 'qsub', 'qseek' and 'qrun' -
along with one CGI script that uses 'qinit show'. The other scripts are
unlikely to be used, even by the web administrator. However, in the web
use directory there is one html page and one CGI script. Together they
allow a number of administrative tasks to be carried out if you do not
have root access. The tasks are run as the web user, which of course is
what is required. These include the use of qinit show, qinit start, qdel,
qhold, qrls, qprior and qclear, plus the deletion of rogue files created
by our scripts. You should move the html file into your htdocs area and
the CGI script into your cgi-bin. You are recommended to put these into
directories that are protected by user/password protection. You will need
to make some changes to the html page and the CGI script, as they refer
to some temporary files and directories that our web scripts create and
do not always properly delete.
It is recommended that you define only one local queue for web use. It
would be normal to run only one job at a time with '$maxqu = 1'.

ALTERING sqs.conf
-----------------

To alter sqs.conf, first allow all jobs to complete or delete them from
the queues. Then run 'qinit stop' to stop qseek. Then edit sqs.conf and
run './install.pl sqs.conf' to install it correctly. For a temporary
change, you can just edit sqs.conf in $sqsdir. You probably will only be
altering $maxqu or $maxperuser. Of course, if there is a root install,
you will have to be root or see your systems administrator. If you
installed the scripts as compiled executables with './install.pl -c', you
will have to recompile them. If you add new queues, you will have to
create the queue files with './install.pl queue {queue-name}'.

RUNNING ON REMOTE MACHINES.
---------------------------

First, we define 'cluster' machines and 'remote' machines.

'cluster' machines all access the same work area and the same $sqsdir
directory where the queue files are kept. Running on these machines is
fairly well developed in this version. In this system no one machine is
privileged. When logged on to one machine, that appears as the local
machine and the other machines in the cluster appear as "cluster"
machines accessed by 'rsh' or 'ssh'. The queues are however always listed
in the order given in the definition of @queue in sqs.conf. This makes
the machine that has the first queue in the list appear to be a
privileged machine. The default queue however is always the first queue
listed that is local when logged on to a particular machine.

'remote' machines are ones that stand alone, running their own queueing
system and hence their own queue files. Again none are privileged. They
all act the same. We have added a limited facility in this version to
inspect queues, delete jobs, hold or release jobs, move jobs and change
job priorities on other remote machines.
You can stop the qseek daemon on other remote machines, but there are
still some difficulties with starting it (see below).

Running on cluster machines.
----------------------------

It is assumed that the machines have a common file system, or at least
that $sqsdir and your work directory are common to all machines, using,
for example, NFS. We describe such machines as cluster machines. The
remote queue will run the job file from your work directory and return
output to that directory.

You can have one set of scripts in a directory common to all machines in
the cluster, or you can install the scripts on each machine, but with the
same path. You have a common work directory for the job files you want to
run, mounted on the cluster machine by NFS. It has to be mounted with
'exec' in the mount line in fstab. It is recommended that the binaries
directory also be in common, mounted using NFS.

If you do not want the work directory to be common between all machines,
you can have work directories that have the same path. You then use the
script that calls qsub to remote copy (rcp or scp) the job file, and any
input files that the job file calls, to the work directory on the cluster
machine. The output then resides on the cluster machine and you may want
to remote copy it back after the job has run. An example script,
sub_g98_cluster, is in the scripts directory. It uses the 'remote' queue
on a machine called 'monster'. Note that g98 has the same directory
structure on 'monster' as on the local machine.

Each queue now has two extra attributes. @quhost lists the host names, as
given by `hostname -s`, for each queue. You must give these in sqs.conf.
@quloc is then constructed, listing either 'l' or 'c' depending on
whether the queue is on the local machine or another machine in the
cluster. The elements of this array are then used to put the command used
to run 'qinit start' on each machine from this machine into the @qucomm
array.
This would be '' for the local machine and the value of $clustercommand
for the other cluster machines. $clustercommand, to give one example,
might be 'rsh', indicating that qseek will be started on the cluster
machine by 'rsh host-name qinit start cluster-queue'. The system has been
tested with rsh and ssh, but with ssh it gets a bit tedious if password
authentication is retained. The command for ssh should be 'ssh -f'.

qsub adds the job to the appropriate queue, without reference to the
cluster machine, except that qseek, if not running, is started on the
cluster machine. On a given machine, the default queue is the first
listed in @queue that is local.

qseek on the cluster machine then finds the job in the queue and runs it.
qseek only checks the queues that are specified by the @quhost variable
as being on the same host as the one qseek is running on.

qrls starts qseek on the cluster machine if the queue is on a cluster
machine. Similarly qmove starts qseek on the cluster machine if the moved
job is not held. qdel can kill a job on a cluster machine. All other
scripts are unaltered as they only manipulate the queue files.

There is now a pid file for every user on every machine, formed when
qseek is started by a user on a machine.

'qinit start' and 'qinit stop' now have an extra optional argument - a
queue linked to the required machine if it is a cluster machine.

'qinit status' has been split into 'qinit status', which queries the
cluster machine about the status of qseek, and 'qinit show', which just
shows the contents of the queues. The latter will be more widely used by
users than the former, and it runs only on the master host.

Note that cluster use does not involve a privileged member of the
cluster. When on one machine, that machine appears as the local machine
and the others as cluster machines.
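
The local/cluster view described in this section can be sketched in
shell. This is an illustration only, not the SQS Perl code: the function
name queue_commands and its output format are hypothetical. It takes the
local host name, the value of $clustercommand and one host name per queue
(as in @quhost), and prints for each queue the 'l' or 'c' flag followed
by the command prefix that would be used to reach that host.

```shell
#!/bin/sh
# Hypothetical sketch, not the SQS Perl code: derive the location flag
# and command prefix for each queue from the list of queue hosts.
# Usage: queue_commands LOCALHOST CLUSTERCOMMAND HOST...
queue_commands() {
    localhost="$1"; clustercommand="$2"; shift 2
    for host in "$@"; do
        if [ "$host" = "$localhost" ]; then
            # Local queue: flag 'l', empty command prefix.
            echo "$host l"
        else
            # Cluster queue: flag 'c', reached via $clustercommand.
            echo "$host c $clustercommand $host"
        fi
    done
}

# Two queues on 'master' and one on 'node1', as seen from 'master'.
queue_commands master "ssh -f" master master node1
```

Calling the same function with 'node1' as the local host gives the view
from the other machine, with the queues still listed in the same order.
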
It should look the same from all machines in the cluster, except that the
queues are always listed in the same order as given in @queue and that
might suggest a privileged machine as the one having the first listed
queue.

Running on remote machines.
---------------------------

Seven scripts have been modified to add the '-r host' arguments. This
allows the script to use a wrapper to call the script again, but this
time on a remote machine. Thus if the remote machine is called 'remote':-

qinit -r remote show   - will show the queues on the remote machine.

qinit -r remote status - will show the status of the queues on the remote
                         machine.

qinit -r remote stop   - will stop the qseek daemon on the remote machine.

qinit -r remote start  - will start the qseek daemon on the remote
                         machine.

You may consider it not advisable to start qseek on the remote machine,
as there are several problems. It works if the remote command is
'ssh -f', but be warned that qseek and its child processes do not know
the environment set in your profile or other start-up files. For 'rsh' as
the remote command, qinit starts qseek directly and not via a second call
to qinit on the remote machine. The latter is a clear kludge. In later
versions I will explore having qinit source the user's start-up files
(taken from the sqs.conf file on the local machine). You must be certain
that qseek is not already running. Even then you are better advised to go
to the remote machine and start qseek there.

qdel -r remote 11             - will delete job 11 on the remote machine.

qhold -r remote 12            - will hold job 12 on the remote machine.

qrls -r remote 12             - will release job 12 on the remote machine.

qmove -r remote 15 queue2     - will move job 15 on the remote machine to
                                the queue2 queue.

qprior -r remote 16 9         - will change the priority of job 16 to 9
                                on the remote machine.

qclear -r remote brian queue2 - will correct positive priorities for user
                                brian in the queue queue2, if qseek is
                                not running.
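
The '-r host' wrapper idea can be sketched in shell as follows. This is
not the SQS code: the function name remote_call is hypothetical, and the
sketch assumes that '-r host' is given as the first pair of arguments. It
only prints the command that would be run, with the '-r host' pair
removed so that the script runs as a local call on the remote machine.

```shell
#!/bin/sh
# Hypothetical sketch of the '-r host' wrapper idea; not the SQS code.
# Usage: remote_call REMOTECOMMAND SCRIPT [ARGS...]
remote_call() {
    remotecommand="$1"; script="$2"; shift 2
    if [ "${1:-}" = "-r" ] && [ -n "${2:-}" ]; then
        # Strip '-r host' and re-run the script on that host.
        host="$2"; shift 2
        echo "$remotecommand $host $script $*"
    else
        # No '-r host': the script runs locally, unchanged.
        echo "$script $*"
    fi
}

remote_call "ssh -f" qdel -r remote 11
```
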
The name of the remote machine is set in sqs.conf and linked to perhaps
the full host name or an alias set in /etc/hosts that can be used to
access the remote machine. If there is an alias set, use that in
sqs.conf. Some changes have been made in sqs.conf to allow remote use in
this way.

Jobs can be submitted to a queue on the remote machine and will run in
the same directory as the one they are submitted from. This script
submits a job to the remote queue, quantum, on the remote machine,
monster. Note that the full path has to be given to qsub in the last
line.

    #!/bin/bash
    # Write the job file. Unescaped variables are expanded now, on the
    # submitting machine; escaped ones are expanded when the job runs.
    cat > $1.job << END
    cd $PWD
    export g98root="/home/qchem"
    . \$g98root/g98/bsd/g98.profile
    export GAUSS_SCRDIR="/scr/molecule"
    rm -f \$GAUSS_SCRDIR/Gau*
    rm -f core
    g98 $1
    END
    chmod 0755 $1.job
    # Copy the job file and its input to the same path on monster.
    rcp $1.job monster:$PWD
    rcp $1.com monster:$PWD
    # qsub needs the full path to the job file.
    rsh monster /home/brian/bin/qsub -d -q quantum $PWD/$1.job > /dev/null

This script is in the scripts directory as sub_g98_remote.

ACKNOWLEDGEMENTS.
-----------------

Nicolas Ferre' for testing and suggestions.

-----------------------------------------------------------------------------
Brian Salter-Duke
b_duke@bigpond.net.au
June, 2008