USC Center for High Performance Computing and Communications

Using HPCC's Globus Universe - A Simple Example

The example below covers a standard submission configuration: from the Sun computing resource (almaak-01.usc.edu) to the Linux cluster (hpc.usc.edu).

Submitting a Job from Almaak-01 to the Linux Cluster

The first step in submitting a job from almaak-01.usc.edu to run on the Linux cluster hpc.usc.edu is to create a job description as a plain-text file using your favorite editor (e.g., emacs or vi). Our example is condortest1:

executable = /usr/bin/uptime
transfer_executable = false
globusscheduler = hpc-master/jobmanager-pbs
universe = globus
output = condortest1.out
log = condortest1.log
queue

To run the job on almaak-01.usc.edu instead of the Linux cluster, specify globusscheduler = almaak-01/jobmanager-pbs.

The job description above tells Condor-G that:

  • The program it will run is /usr/bin/uptime.
  • It does not need to transfer the program to the remote site.
  • It should request that the program be run on the remote machine named in globusscheduler (here hpc-master).
  • That remote machine should submit the job to the PBS scheduler (jobmanager-pbs) for job management.
  • Output from the remote execution should be put in the local file condortest1.out.
  • A log of the execution of the Condor-G job should be put in the local file condortest1.log.
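
Because only the globusscheduler line differs between the two targets, the description can be generated from a template. The following is a minimal sketch for a Bourne-compatible shell; the function name make_submit_description is hypothetical and is not part of Condor-G.

```shell
# Hypothetical helper: print a Condor-G job description for the given
# Globus scheduler contact (for example hpc-master/jobmanager-pbs).
make_submit_description() {
    scheduler=$1
    cat <<EOF
executable = /usr/bin/uptime
transfer_executable = false
globusscheduler = $scheduler
universe = globus
output = condortest1.out
log = condortest1.log
queue
EOF
}

# Print the description used in this example.
make_submit_description hpc-master/jobmanager-pbs
```

Redirecting the output to a file, for example with "> condortest1", gives a description that can be submitted as shown in the fifth step.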

The second step is to set up your shell environment to include Globus Toolkit settings by sourcing the setup file.

  for csh or tcsh:
  almaak-01.usc.edu(5): source /usr/usc/globus/default/setup.csh

  for sh, ksh, bash, or another Bourne-compatible shell:
  almaak-01.usc.edu(5): . /usr/usc/globus/default/setup.sh
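
After loading the setup file, it can be useful to confirm that the commands used in the remaining steps are reachable. The sketch below is for Bourne-compatible shells only (not csh or tcsh); check_tools is a hypothetical helper, and whether each command is provided by the Globus setup file or by another part of the system configuration is not something this example assumes.

```shell
# Hypothetical helper: report any named command that cannot be found
# on PATH. Returns 0 when all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# Commands used in the remaining steps of this example.
check_tools kinit klist kx509 kxlist grid-proxy-info condor_submit condor_q \
    || echo "one or more commands are missing from PATH"
```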

The third step is to request a Kerberos ticket, which authenticates you to the security infrastructure at USC.

almaak-01.usc.edu(6): kinit
Password for shelley@ISD.USC.EDU:
almaak-01.usc.edu(7): klist
Ticket cache: FILE:/tmp/krb5cc_584
Default principal: shelley@ISD.USC.EDU

Valid starting     Expires            Service principal
10/22/02 10:31:50  10/22/02 20:31:50  krbtgt/ISD.USC.EDU@ISD.USC.EDU

In the above example, the user typed kinit at the command-line prompt and was asked for their login password. As no error messages appeared, the user then typed klist to verify that a Kerberos ticket had been issued.
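
The check-then-request pattern can be wrapped so that kinit runs only when no valid ticket is cached. This is a sketch under one assumption: the installed klist is the MIT Kerberos version, where the -s option suppresses output and sets the exit status according to whether valid credentials are cached. The function name ensure_ticket is hypothetical.

```shell
# Hypothetical helper: request a Kerberos ticket only if none is cached.
# Assumes MIT Kerberos, where `klist -s` prints nothing and exits 0
# only when a valid credentials cache is found.
ensure_ticket() {
    if klist -s; then
        echo "Kerberos ticket already present"
    else
        kinit
    fi
}
```

Calling ensure_ticket at the start of a session then prompts for a password only when one is needed.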

The fourth step is to translate the existing Kerberos ticket into an X.509 certificate.

almaak-01.usc.edu(8): kx509
almaak-01.usc.edu(9): kxlist -p
Service kx509/certificate
 issuer= /C=US/ST=California/L=Los Angeles/O=University of Southern California/CN=usc.edu
 subject= /C=US/ST=California/L=Los Angeles/O=University of Southern California/OU=usc.edu/CN=shelley/USERID=shelley/Email=shelley@USC.EDU
 serial=68
 hash=25b24e07
almaak-01.usc.edu(10): grid-proxy-info
subject  : /C=US/ST=California/L=Los Angeles/O=University of Southern California/OU=usc.edu/CN=shelley/USERID=shelley/Email=shelley@USC.EDU
issuer   : /C=US/ST=California/L=Los Angeles/O=University of Southern California/CN=usc.edu
type     : not a proxy
strength : 512 bits
timeleft : 9:58:39

In the above example, the user typed kx509 to perform the translation, then kxlist -p to verify that the resulting kx509 certificate is valid. To verify that the grid software recognizes the certificate, the user typed grid-proxy-info. The timeleft field shows how much longer the certificate can be used.
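
Since a job cannot be submitted with an expired certificate, it may help to turn the timeleft field into a number that a script can compare. The sketch below only parses the "timeleft : H:MM:SS" line in the format shown in the session above; timeleft_seconds is a hypothetical helper, not a Globus command.

```shell
# Hypothetical helper: read grid-proxy-info output on stdin and print
# the remaining lifetime in seconds, taken from the
# "timeleft : H:MM:SS" line.
timeleft_seconds() {
    awk -F' : ' '/^timeleft/ {
        split($2, t, ":")
        print t[1] * 3600 + t[2] * 60 + t[3]
    }'
}

# The timeleft line from the session above; prints 35919.
echo "timeleft : 9:58:39" | timeleft_seconds
```

In a live session the input would come from the command itself, as in grid-proxy-info | timeleft_seconds.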

The fifth step is to submit the defined job using Condor-G:

almaak-01.usc.edu(11): condor_submit condortest1
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 5.
almaak-01.usc.edu(12): condor_q


-- Submitter: almaak-01.usc.edu : <128.125.253.166:41674> : almaak-01.usc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   5.0   shelley        10/22 10:34   0+00:00:00 I  0   0.0  uptime

1 jobs; 1 idle, 0 running, 0 held

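In the above example, the user typed condor_submit condortest1 to tell Condor-G to run the job. The user then typed condor_q to check the status of the job. The I in the ST column and the summary line show that the job is idle, that is, queued but not yet running.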
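
The summary line at the end of the condor_q output is convenient for scripts that only need the job counts. The sketch below parses that line in the format shown above ("N jobs; N idle, N running, N held"); queue_counts is a hypothetical helper, and other Condor versions may print the summary differently.

```shell
# Hypothetical helper: read condor_q output on stdin and print the
# idle, running, and held counts from the summary line.
queue_counts() {
    awk '/jobs;.*idle,.*running,.*held/ {
        printf "idle=%s running=%s held=%s\n", $3, $5, $7
    }'
}

# The summary line from the session above.
echo "1 jobs; 1 idle, 0 running, 0 held" | queue_counts
```

In a live session the input would come from condor_q | queue_counts.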

Output files from the sample run

To see the output, the user can display the contents of the output file:

almaak-01.usc.edu(19): cat condortest1.out
----------------------------------------
Begin PBS Prologue Tue Oct 22 10:34:18 PDT 2002
Job ID:         15650.hpc-master.usc.edu
Username:       shelley
Group:          rds
Nodes:          hpc018
End PBS Prologue Tue Oct 22 10:34:18 PDT 2002
----------------------------------------
 10:34am  up 27 days, 11:14,  0 users,  load average: 0.00, 0.01, 0.00
 10:34am  up 27 days, 11:14,  0 users,  load average: 0.00, 0.01, 0.00
--------------------------------------------------
Begin PBS Epilogue Tue Oct 22 10:34:24 PDT 2002
Job ID:         15650.hpc-master.usc.edu
Username:       shelley
Group:          rds
Job Name:       STDIN
Session:        16755
Limits:         neednodes=hpc018:ppn=2,walltime=00:30:00
Resources:      cput=00:00:00,mem=0kb,vmem=0kb,walltime=00:00:01
Queue:          pbs_allnodes
Account:
Nodes:          hpc018
Killing leftovers...
End PBS Epilogue Tue Oct 22 10:34:24 PDT 2002
--------------------------------------------------
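
The program's own output is wrapped in a PBS prologue and epilogue. To keep only the program output, the banner blocks can be filtered out. This sketch assumes the banner format seen in this sample run, which is site-specific and may change; strip_pbs_banners is a hypothetical helper.

```shell
# Hypothetical helper: read a job output file on stdin and drop the
# PBS prologue block, the PBS epilogue block, and the dashed rules,
# leaving only what the program itself printed.
strip_pbs_banners() {
    sed -e '/^Begin PBS Prologue/,/^End PBS Prologue/d' \
        -e '/^Begin PBS Epilogue/,/^End PBS Epilogue/d' \
        -e '/^--*$/d'
}
```

For the run above, strip_pbs_banners < condortest1.out would leave just the uptime lines.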

To see information about the execution of the job, the user can display the contents of the log file:

almaak-01.usc.edu(20): cat condortest1.log
...
000 (005.000.000) 10/22 10:34:00 Job submitted from host: <128.125.253.166:41674>
...
017 (005.000.000) 10/22 10:34:13 Job submitted to Globus
    RM-Contact: hpc/jobmanager-pbs
    JM-Contact: https://hpc-master.usc.edu:1440/3694/1035308043/
    Can-Restart-JM: 1
...
001 (005.000.000) 10/22 10:34:43 Job executing on host: hpc
...
005 (005.000.000) 10/22 10:35:25 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
        0  -  Total Bytes Sent By Job
        0  -  Total Bytes Received By Job
...
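
The log records each stage of the job as a numbered event: 000 for submission, 017 for the hand-off to Globus, 001 for the start of execution, and 005 for termination. A script can read the termination record to find out whether the job finished and with what return value. The sketch below relies only on the "Normal termination (return value N)" text shown above; job_return_value is a hypothetical helper.

```shell
# Hypothetical helper: read a Condor user log on stdin. Print the
# return value from the last "Normal termination" record and succeed;
# print nothing and fail if no such record is present.
job_return_value() {
    value=$(sed -n 's/.*Normal termination (return value \([0-9][0-9]*\)).*/\1/p' | tail -n 1)
    [ -n "$value" ] && echo "$value"
}

# The termination record from the log above.
printf '%s\n' '005 (005.000.000) 10/22 10:35:25 Job terminated.' \
    '        (1) Normal termination (return value 0)' | job_return_value
```

For the run above, job_return_value < condortest1.log would print the value recorded in the log.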

