Managing Jobs and Processes
Determining current system load
Load average is probably the best guide for anticipating how much delay to expect when running your jobs. A load average of 10 definitely indicates a heavily loaded system, and you can expect delays. Of course, it's not possible to know how long a heavily loaded system will stay that way; only experience with daily usage patterns will help there.
The command uptime gives one line of stats for the current machine, including the load average.
almaak.usc.edu(36): uptime
5:25pm up 6 days, 16 mins, 27 users, load average: 4.57, 4.73, 4.52
The last three numbers are the average number of jobs in the run queue over the last 1, 5, and 15 minutes.
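As a minimal sketch of reading those three numbers in a script, the helper below strips everything except the load averages from one line of uptime output. It is written in Bourne shell (sh) rather than the C shell used in the prompts on this page, and load_averages is a hypothetical name chosen for illustration, not a system command.

```shell
#!/bin/sh
# Print the 1-, 5-, and 15-minute load averages from one line of uptime output.
# load_averages is a hypothetical helper name, not a system command.
load_averages() {
    # Drop everything up to "load average:" (or "load averages:"), then the commas.
    printf '%s\n' "$1" | sed -e 's/.*load averages*: *//' -e 's/,//g'
}

# Sample line copied from the uptime example above.
sample='5:25pm up 6 days, 16 mins, 27 users, load average: 4.57, 4.73, 4.52'
load_averages "$sample"

# On a live system the same helper can be fed the real output:
#   load_averages "$(uptime)"
```

For the sample line this prints the three values separated by spaces, which makes them easy to compare against a threshold of your own choosing.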
On RCF, the command /usr/rcf/bin/rcf-load will give load averages for all RCF machines, exactly as seen when you first log in. Not all RCF machines are equally usable for a given purpose, so you will have to apply that load information to your particular use.
The top command displays currently running jobs ranked in order of CPU usage and shows
various stats for each, including CPU %, CPU time, resident memory size, nice value (to be explained below), load averages,
etc. By default it updates every 5 seconds and shows all users.
Example, showing top niced to +19, to reduce impact on the system:
almaak.usc.edu(36): nice +19 top
last pid: 19097; load averages: 7.03, 7.12, 7.46 14:56:47
625 processes: 581 sleeping, 22 zombie, 15 stopped, 7 on cpu
CPU states: 0.0% idle, 69.1% user, 14.1% kernel, 16.7% iowait, 0.0% swap
Memory: 2005M real, 32M free, 902M swap, 2807M free swap
  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
19728 guanggon -25   10 1016K  744K cpu    27.1H 12.48% 12.50% a.out
22826 potatov  -25   10   52M 8928K cpu   138:08 12.50% 12.50% Readncm
25370 mitaim   -25   10  912K  832K cpu   124:51 12.50% 12.50% gaussh10
11848 poller   -25   10 7160K 3560K cpu    41:11 12.50% 12.50% matlab
16838 shariati -25    0   17M 2576K cpu    12:42 12.50% 12.50% vfehs.out
16793 perryros -25   11   12M   11M cpu    12:22 12.10% 11.96% sht3ell2dprmn2
Options include restricting output to one user only, to a fixed number of processes, etc. Use
<Control>-C to exit top.
Server status may also be found at Systems Status.
Running jobs sequentially
You can combine commands so that they run sequentially rather than simultaneously, which avoids competing with yourself for computing cycles. Do this by separating them with semicolons on the same command line:
almaak.usc.edu(36): ls; date; whoami
Long command lines may be wrapped by continuing to type without pressing the Return key (up to 256 characters), or by typing a backslash (\) between words, immediately before pressing Return.
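The backslash form of wrapping can be sketched as follows. This is an illustration in Bourne shell (sh) syntax, not the C shell shown in the prompts on this page, and the echo commands are stand-ins for real jobs; the variable name result is arbitrary.

```shell
#!/bin/sh
# A long sequential command line split across three physical lines.
# Each backslash is the last character on its line, so the shell joins the
# lines and runs the commands one after another, as if they had been typed
# on a single line.
result=$(echo "first job"; \
         echo "second job"; \
         echo "third job")
printf '%s\n' "$result"
```

Note that nothing, not even a space, may follow the backslash on its line; otherwise the shell treats the newline as the end of the command.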
Delaying command execution
With the at command you can delay execution of commands with slightly reduced priority. This
can be done either interactively from the command line or from commands contained in a file, called a shell script (see below).
Mail is automatically sent by the system to the user upon completion (jobs with standard output send the results in this
mail).
The at command executes your commands at a specified later time by putting them into
queue a, which has a nice value of 1 (see nicing below). You can only submit up to 4 simultaneous jobs. The following
example demonstrates how to execute the commands contained in the file myscript at
5pm on Friday:
at -f myscript 5pm Friday
The next example shows how to execute commands interactively from the command line (you will need the double quotes), in this case, at 3am on Sunday:
almaak.usc.edu(36): echo "cmd -options" | at 3:00am Sunday
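Once jobs are queued it is often useful to list or remove them. The sketch below uses the POSIX at options -f (read commands from a file), -l (list pending jobs), and -r (remove a job); it is written in Bourne shell (sh) syntax. The run helper and the DRY_RUN variable are hypothetical names added here so the sketch can be tried without queueing anything, and myscript is the example file name from the text.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints each command instead of
# running it, so the file can be tried without queueing anything.
DRY_RUN=${DRY_RUN:-1}

# run is a hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run at -f myscript 5pm Friday   # queue the commands in the file myscript
run at -l                       # list your pending jobs and their job ids
# run at -r <job id>            # remove a job, using an id printed by at -l
```

Setting DRY_RUN=0 in the environment makes the helper run the at commands for real.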
Lowering job priority
You can reduce the priority level of your job in several ways.
Using the nice Command
The nice command allows you to lower the priority of your command by a specific value. By default, the nice value is zero (20 under Solaris 2.5). Niceness represents a scheduling priority based on CPU usage, wait time, etc. A nice value of zero (20 under Solaris) is high priority, while a value of 19 (39 under Solaris) is the lowest. Processes with a high nice value will run slower when the system is busy, but will still run quickly when few other jobs are running. The following are some examples (system prompts are omitted for clarity; C Shell usage shown):
Solaris 2.5:
nice <command> -options [without increment value, sets nice value to 24]
nice +10 <command> -options [increments nice value by 10, now equals 30]
nice +20 <command> -options [any value above 18 gets set to 39, the max]
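The arithmetic behind those three examples can be sketched as a small function, assuming the numbers given in the text: a Solaris base value of 20, a default increment of 4 when none is given, and a maximum of 39. The function is written in Bourne shell (sh) syntax, and solaris_nice is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Resulting Solaris nice value for a given increment, using the numbers in
# the text: base value 20, maximum 39. solaris_nice is a hypothetical helper.
solaris_nice() {
    increment=${1:-4}            # with no increment given, 4 is added
    value=$((20 + increment))
    if [ "$value" -gt 39 ]; then
        value=39                 # values past the maximum are clamped
    fi
    echo "$value"
}

solaris_nice        # prints 24
solaris_nice 10     # prints 30
solaris_nice 20     # prints 39
```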
Using the renice Command
The renice command changes the priority of a process that is already running. You can only decrease the priority; you cannot raise it again, even if you are the user who lowered it originally. The PID is the process ID as shown by the ps command.
Solaris 2.5:
renice 10 <pid> [increments base value of 20 to 30]
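A minimal sketch of the whole cycle (start a job, look up its nice value, lower its priority, and check the result) might look like this. It uses Bourne shell (sh) syntax and the POSIX spelling renice -n ... -p ..., which may behave slightly differently from the older form shown above; the sleep command is only a stand-in for a real long-running job, and the variable names are arbitrary.

```shell
#!/bin/sh
# Start a long-running stand-in job in the background and remember its PID.
sleep 60 &
pid=$!

# Nice value before the change, as reported by ps.
before=$(ps -o nice= -p "$pid" | tr -d ' ')

# Lower the priority of the running job.
renice -n 10 -p "$pid"

# Nice value after the change.
after=$(ps -o nice= -p "$pid" | tr -d ' ')
echo "nice value of $pid went from $before to $after"

kill "$pid"    # clean up the stand-in job
```

The numbers printed depend on the system: a machine that counts from a base of 20, as described above for Solaris, reports larger values than one that counts from zero.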
Using the batch Command
The batch command executes your commands by placing them in batch queue b, which has a nice value of 2 and allows at most 2 simultaneous jobs. The following example shows how to batch commands contained in the file called myscript:
batch < myscript
The next example shows how to batch commands given interactively at the command line:
almaak.usc.edu(36): batch
at> sas house1
at> sas house2
at> sas house3
at> speakez model1
at> sas house4
at> ^D
C Shell Scripts (command files)
You can place your command or series of commands into an executable file which can be "run" just as system commands are run.
Create a file with any editor, as long as you save it as text only.
Type your commands into the file just as you would type them on the command line.
Make the file executable by changing the permissions (For help with this step, please see the Permissions page).
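The three steps above can be sketched from the command line as follows. A here-document stands in for the editor, and the inner script uses #!/bin/sh so that the sketch also runs on machines without the C shell; that choice is an assumption of this example, and a C shell script would start with #!/bin/csh instead. The file name myscript is the example name used elsewhere on this page.

```shell
#!/bin/sh
# Steps 1 and 2: create the file and put the commands in it.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > myscript <<'EOF'
#!/bin/sh
# commands go here, one per line, as you would type them
echo job done
EOF

# Step 3: make the file executable for its owner.
chmod u+x myscript

# Run it like any other command; ./ tells the shell to look in the
# current directory.
./myscript
```

Running ./myscript prints the message from the echo line, confirming that the file is being executed as a command.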
A sample script might look like this:
#!/bin/csh
# always start with the line above, exactly as written
# any line except line 1 starting with a # is just a comment
sas mycmd #this is also a way to make a comment
echo sas job done #send a message to the screen when the job is done
Last updated: February 03, 2011