Computational Cluster Programs

Commonly-Used SGE Commands

This page lists some of the more frequently used Sun Grid Engine commands. It does not list all of the options for each command. The man command can be used to see the detailed description of any of these commands. For example, to see a detailed description of the qsub command, enter:

man qsub

List of Commands:

qsub     Submit a Job
qstat    Determine the Status of a Job
qhost    Display Node Information
qdel     Cancel a Job
qhold    Place a hold on a queued job to prevent it from running
qrls     Release a job held with qhold
qmon     An X-Windows interface to SGE commands

Use the qsub Command to Submit a Job

The qsub command is used to submit jobs to SGE. The syntax of the qsub command is:

qsub [-cwd] [-v SOME_VAR] [-o path] [-e path] [-M mail_address] [-m mail_options] [-l resources] script

Where:

-cwd
Directs SGE to run the job in the same directory from which you submitted it. Alternatively, you can specify this flag in the SGE command file for the job.

-v SOME_VAR
Passes environment variable SOME_VAR to the job. Alternatively, you can specify this flag in the SGE command file for the job.

-o path
Redirects stdout from the SGE script. The default is your home directory. Specify /dev/null to discard SGE messages. Alternatively, you can specify this flag in the SGE command file for the job.

-e path
Redirects stderr from the SGE script. The default is your home directory. Specify /dev/null to discard SGE error messages. Alternatively, you can specify this flag in the SGE command file for the job.

-M mail_address
Specifies the email address to which job notifications are sent. On Hoffman2 it is always login_id@mail.

-m mail_options
Specifies the circumstances under which mail is sent to the job owner defined by the -M option. For example, the options "bea" mean that mail is sent at the beginning (b), at the end (e), and on abort (a) of the job, if an abort happens. The option "n" means that no mail is sent.

-l resources
Specifies a list of resources required for your job, for example memory and time per core:

-l h_data=1024M,h_rt=24:00:00

script
Either the SGE command file or the script that starts up your job.

The qsub command line switches and options can also be used as active comments or embedded directives in an SGE command file that you submit with the qsub command. Advantages of this approach are: you have a record of what options were used to run your job; you can easily resubmit jobs; and you can use one command file as the basis for creating other similar command files. For example, if the file myjob.cmd contains:

#!/bin/csh
/path/to/executable

and the qsub command used to submit it is:

qsub -cwd -o path -M login_id@mail -m bea -l h_data=1024M,h_rt=24:00:00 myjob.cmd

then the same result could be achieved by adding the following lines to the myjob.cmd file before the /path/to/executable line:

#$ -cwd
#$ -o path
#$ -M login_id@mail
#$ -m bea
#$ -l h_data=1024M,h_rt=24:00:00

and submitting the myjob.cmd script with:

qsub myjob.cmd
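Putting the pieces above together, the complete command file could look like the one written out below. This is only a sketch: path, login_id@mail and /path/to/executable are the same placeholders used in the text and must be replaced with real values, and the surrounding commands assume a POSIX shell at the prompt.

```shell
# Write the complete command file described above. The quoted 'EOF' keeps
# the shell from expanding anything inside the here-document.
cat > myjob.cmd <<'EOF'
#!/bin/csh
#$ -cwd
#$ -o path
#$ -M login_id@mail
#$ -m bea
#$ -l h_data=1024M,h_rt=24:00:00
/path/to/executable
EOF

# List the embedded directives to confirm what qsub will read.
grep '^#\$' myjob.cmd

# Then, on a system where SGE is installed, submit it with:
#   qsub myjob.cmd
```

Because every option lives in the file, the file itself is the record of how the job was run.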

After submitting a job with qsub, SGE will respond with something like:

Your job 624556 ("myjob.cmd") has been submitted

where 624556 is the job number assigned by SGE to your job.
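If a script needs the job number for later qstat or qdel calls, it can be pulled out of that response. The sketch below assumes the response has exactly the form shown above, with the job number as the third word; the variable names are illustrative only. Some SGE versions also provide a -terse switch on qsub that prints just the job number; check man qsub before relying on it.

```shell
# Sample response copied from the text above. On the cluster you would
# capture the real one instead with: response=$(qsub myjob.cmd)
response='Your job 624556 ("myjob.cmd") has been submitted'

# The job number is the third whitespace-separated field of the response.
job_number=$(printf '%s\n' "$response" | awk '{print $3}')

echo "Submitted job $job_number"
```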

Use the qstat Command to Determine the Status of a Job

The qstat command displays information about the jobs in the SGE queues, both running and waiting to run. The syntax of the qstat command is:

qstat [-f] [-j job_number] [-U login_id] [-u login_id]

where:

(qstat alone with no arguments)
Displays a list of all running and waiting jobs.

-f
Displays summary information on each queue as well as the job list.

-j job_number
Displays the status of the job whose job number is job_number.

-U login_id
Displays a list of running and waiting jobs for those queues which login_id can access. Alternatively, use the groupjobs script for this information; enter groupjobs -help for usage information.

-u login_id
Displays a list of login_id's running and waiting jobs. Alternatively, use the myjobs script to get this information for your own login_id.
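The -j option described above can also be used from a script to wait for a job to leave the queues. The function below is a hypothetical helper, not part of SGE, and it rests on one assumption that should be verified on your system: that qstat -j returns a non-zero exit status once SGE no longer knows about the job.

```shell
# wait_for_job JOB_NUMBER [INTERVAL_SECONDS]
# Polls qstat until SGE no longer reports the job, then returns.
# Assumption: "qstat -j" exits non-zero for a job that has left the queues.
wait_for_job() {
    job_number=$1
    interval=${2:-60}    # seconds between polls; defaults to 60

    while qstat -j "$job_number" >/dev/null 2>&1; do
        sleep "$interval"
    done

    echo "Job $job_number is no longer in the queue"
}

# Example, to be run on the cluster:
#   wait_for_job 624556 30
```

A generous polling interval keeps the load on the SGE master low; a job that has left the queues may have finished, failed or been cancelled, so check its output files afterwards.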

Use the qhost Command to Display Node Information

The qhost command displays information about compute nodes: their architectures, number of processors, load, etc. The syntax of the qhost command is:

qhost [-j] [-q]

where:

(qhost alone with no arguments)
Displays a table of information about the compute nodes.

-j
Adds information about the specific jobs that are running on each compute node.

-q
Shows the queues each compute node accepts.

Use the qdel Command to Cancel a Job

The qdel command is used to cancel a job either while it is waiting to execute or while it is running. The syntax of the qdel command is:

qdel job_number

If a running job does not get cancelled right away, enter:

qdel -f job_number

to force it to be cancelled. Jobs in the "dr" state (running, but marked for deletion) cannot be cancelled by the job owner; they must be cancelled by a system administrator. A job in the "dr" state usually indicates a system hardware problem.

Additional SGE Commands

qhold job_number
Place a hold on a queued job to prevent it from running

qrls job_number
Release a job held with qhold

qmon
An X-Windows interface to SGE commands
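As an illustration of how qhold and qrls pair up, the sketch below holds a queued job, runs some other command (for example, one that fixes an input file), and then releases the job. The function name with_job_held is hypothetical and not part of SGE; only qhold job_number and qrls job_number as described above are used.

```shell
# with_job_held JOB_NUMBER COMMAND [ARGS...]
# Holds a queued job, runs COMMAND, then releases the job again.
with_job_held() {
    job_number=$1
    shift

    # Stop here if the hold could not be placed.
    qhold "$job_number" || return 1

    # Run the caller's command while the job is held.
    "$@"
    rc=$?

    # Release the job whether or not the command succeeded.
    qrls "$job_number"
    return "$rc"
}

# Example, to be run on the cluster:
#   with_job_held 624556 cp fixed_input.dat input.dat
```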