Difference between revisions of "Using SweGrid resources"

[[Category:Grid computing]]
[[Category:SweGrid user guide]]
 
 
 
 
This chapter goes through the basic steps needed to ''get on the grid''. The procedure is not much different from the one used on a normal cluster resource and consists of the following steps:
 
 
 
# Authentication
 
# Defining the job parameters
 
# Job submission
 
# Job monitoring
 
 
 
Authentication on the grid is done by delegation. A short-lived proxy certificate is created and delegated to the resource used (see the section on creating a proxy certificate). This gives the resource the mandate to act on behalf of the user when accessing storage resources specified by the job.
 
 
 
On a normal compute resource, jobs are submitted to the queuing system, typically as a special script containing the job parameters, such as walltime, number of processors and memory requirements. The job script also contains the actual statements that execute the job. Jobs on grid resources consist of a job description written in one of the available job description languages (XRSL, JSDL or JDL) and a set of input files and scripts. For the current SweGrid resources, the ARC middleware 0.8.x only supports the XRSL job description language. The job description describes the job parameters and files. The description file itself is not executable, but contains a reference to the file that will be used to execute the job.
 
 
 
Job submission and monitoring are done in a similar way as on an existing resource. The only difference is that the tools for job submission and monitoring are executed on the user's own computer.
 
 
 
== Describing your grid job ==
 
 
 
When a job is to be submitted to a SweGrid resource, it is described in a special task description language. The NorduGrid software uses XRSL for describing a grid task. An XRSL description contains a set of attribute definitions. The file starts with an '''&amp;''' (AND) to define the default relation between the attributes. Every attribute definition is enclosed in '''(''' and ''')'''. An example:
 
 
 
<pre>&amp;(executable=&quot;/bin/echo&quot;)(arguments=&quot;Hello, World!&quot;)</pre>
 
This example defines the '''executable''' attribute, which names the executable to run ('''/bin/echo'''), and the '''arguments''' attribute, which defines the arguments passed to the executable. The '''&amp;''' means that all attributes must be used. The job writes &quot;Hello, World!&quot; to the standard output of the job.
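The attribute syntax is regular enough that a description can be generated from a program. The following Python function is a minimal sketch of that idea, not part of ARC; the name '''build_xrsl''' is an assumption made for illustration, and the sketch does not escape quotation marks inside values.

```python
def build_xrsl(attributes):
    """Render (name, value) pairs as a single-line XRSL description.

    Each pair becomes (name="value"); the leading & joins them with AND.
    Values containing double quotes are not handled by this sketch.
    """
    parts = ['({}="{}")'.format(name, value) for name, value in attributes]
    return "&" + "".join(parts)


job = build_xrsl([
    ("executable", "/bin/echo"),
    ("arguments", "Hello, World!"),
])
print(job)
```

The printed string has the same form as the one-line example above and could be saved to a file for submission.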
 
 
 
=== Specifying an executable and arguments ===
 
 
 
An executable specified without a leading slash is treated as a local file, which is transferred to the remote system and executed there. If the executable is given with a leading slash, it is treated as if it were located on the remote machine. See the following examples:
 
 
 
<pre>&amp;(executable=&quot;/bin/echo&quot;)(arguments=&quot;Hello, World!&quot;)</pre>
 
In this example the executable '''/bin/echo''' is treated as a remote executable located in the system folder '''/bin'''.
 
 
 
<pre>&amp;(executable=&quot;bin/echo&quot;)(arguments=&quot;Hello, World!&quot;)</pre>
 
In this example the executable is treated as a local file located in a directory '''bin''' relative to the current directory and automatically transferred to the remote system.
 
 
 
The executable will be called with the arguments specified in the '''arguments''' attribute as shown in the above examples.
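The leading-slash rule can be written down as a small check. This is only a sketch of the rule as described in this section; the function name '''executable_location''' is hypothetical and the code is not taken from ARC.

```python
def executable_location(executable):
    """Classify an XRSL executable value using the leading-slash rule.

    A leading slash means the file is expected to exist on the remote
    resource; anything else is a local file that gets uploaded.
    """
    if executable.startswith("/"):
        return "remote"
    return "local"


for value in ("/bin/echo", "bin/echo", "myapp"):
    print(value, "->", executable_location(value))
```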
 
 
 
=== Handling job input and output ===
 
 
 
In the previous example the output of the application is silently thrown away. If the application needs standard input, or if its output should be kept, this can be specified using the attributes '''stdout''', '''stdin''' and '''stderr'''. The values of these attributes are file names. If '''stdin''' is used, the file used for standard input must be transferred as an input file, see the section on specifying input and output files. An example of redirecting standard output and standard error is shown in the following listing:
 
 
 
<pre>&amp;
 
(executable=&quot;/bin/ls&quot;)
 
(arguments=&quot;-la&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)</pre>
 
Here standard output is directed to the file '''stdout.txt''' and standard error to '''stderr.txt'''. If standard input is used a typical XRSL description can be:
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(inputFiles=(&quot;stdin.txt&quot; &quot;&quot;))</pre>
 
As '''stdin''' refers to a file, that file has to be transferred to the resource. This is done by the last attribute, '''inputFiles''', which is described in more detail in the section on specifying input and output files.
 
 
 
=== Giving jobs meaningful names ===
 
 
 
To make jobs easier to retrieve, a meaningful name can be given to a job using the '''jobName''' attribute. This name can then be used by the ARC commands instead of the normal job id to refer to the job. In the following example the job is given the name '''job0001''':
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(inputFiles=(&quot;stdin.txt&quot; &quot;&quot;))
 
(jobName=&quot;job0001&quot;)</pre>
 
=== Specifying input and output files ===
 
 
 
Jobs often depend on a set of input files that must be transferred to the job directory on the grid resource before execution. When the job has finished, it often also produces a set of output files that should be kept or transferred to other storage resources. Input and output files are defined using the '''inputFiles''' and '''outputFiles''' attributes in XRSL.
 
 
 
Input and output files can also be transferred to and from resources other than the user's client machine. By default, files are transferred to and from the user's local directory, but a URL can be given to specify a file located elsewhere.
 
 
 
The syntax for the '''inputFiles''' directive is as follows:
 
 
 
<pre>(inputFiles=(&lt;filename&gt; &lt;source&gt;) ... )</pre>
 
'''filename''' is the name the file will be given in the job directory. '''source''' specifies the location from which the input file is retrieved. The source can be either a local path or a URL. If '''source''' is empty, the input file is taken from the directory from which the job is submitted.
 
 
 
The syntax for the '''outputFiles''' directive is as follows:
 
 
 
<pre>(outputFiles=(&lt;string&gt; &lt;URL&gt;) ... )</pre>
 
'''string''' is a file located in the job directory on the computational resource. '''URL''' sets the destination where the output file should be transferred after job execution. If '''string''' is set to &quot;/&quot; and the '''URL''' is empty the entire job directory is kept for later retrieval by the user. If the '''string''' is set to &quot;/&quot; and the URL is not empty the entire job directory is transferred to the destination.
 
 
 
In the following example all input files are located in the job submission directory. The output files '''outputfile1.dat''' and '''outputfile2.dat''' will be kept in the job directory on the computational resource until the user retrieves the files:
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(inputFiles=
 
    (&quot;stdin.txt&quot; &quot;&quot;)
 
    (&quot;datafile1.dat&quot; &quot;&quot;)
 
    (&quot;datafile2.dat&quot; &quot;&quot;)
 
)
 
(outputFiles=
 
    (&quot;outputfile1.dat&quot; &quot;&quot;)
 
    (&quot;outputfile2.dat&quot; &quot;&quot;)
 
)</pre>
 
A similar example, using only files located on external resources is shown below:
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(inputFiles=
 
    (&quot;stdin.txt&quot; &quot;http://www.swegrid.se/example/stdin.txt&quot;)
 
    (&quot;datafile1.dat&quot; &quot;gsiftp://swegrid.se/storage/datafile1.dat&quot;)
 
    (&quot;datafile2.dat&quot; &quot;rc://swegrid.se/datafile2.dat&quot;)
 
)
 
(outputFiles=
 
    (&quot;outputfile1.dat&quot; &quot;srm://swegrid.se/storage/outputfile1.dat&quot;)
 
    (&quot;outputfile2.dat&quot; &quot;srm://swegrid.se/storage/outputfile2.dat&quot;)
 
)</pre>
 
Sometimes it is useful to transfer all files in the output directory or input directory. The following example shows how this is accomplished:
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(inputFiles=
 
    (&quot;/&quot; &quot;&quot;)
 
)
 
(outputFiles=
 
    (&quot;/&quot; &quot;&quot;)
 
)</pre>
 
A URL can also be used in conjunction with &quot;/&quot;. This will transfer the entire output directory to the resource given by the URL.
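When a job uses many files, the '''inputFiles''' and '''outputFiles''' attributes can be generated rather than typed by hand. The Python sketch below assumes a hypothetical helper, '''render_file_list''', that takes (name, location) pairs and lays them out as in the listings above; it is an illustration only and does not validate URLs or escape quotation marks.

```python
def render_file_list(attribute, files):
    """Render an inputFiles or outputFiles attribute as XRSL text.

    files is a list of (name, location) pairs. An empty location means
    the submission directory for input files, or "keep on the resource
    for later retrieval" for output files.
    """
    lines = ["({}=".format(attribute)]
    for name, location in files:
        lines.append('    ("{}" "{}")'.format(name, location))
    lines.append(")")
    return "\n".join(lines)


print(render_file_list("inputFiles", [("stdin.txt", ""), ("datafile1.dat", "")]))
print(render_file_list("outputFiles", [("/", "")]))
```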
 
 
 
=== Specifying resource usage ===
 
 
 
An important part of the job description is specifying job resource limits such as required walltime, nodes and memory. If any of these parameters is not given, the default limits of the resource will be used, which can differ between resources.
 
 
 
Walltime is given by the '''wallTime''' attribute. The time can be given in many different units. If no unit is specified, minutes are assumed. The following list shows different allowed '''wallTime''' specifications:
 
 
 
<pre>1 week
 
3 days
 
2 days, 12 hours
 
1 hour, 30 minutes
 
36 hours
 
9 days
 
240 minutes
 
240</pre>
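To make the unit conventions concrete, the following Python sketch converts specifications of the kind listed above into minutes. It is an illustration written for this guide, not the parser used by ARC; the function name '''walltime_to_minutes''' is an assumption, and only the units that appear in the list (weeks, days, hours, minutes) are handled.

```python
import re

# Minutes per unit, covering the units used in the list above.
MINUTES_PER_UNIT = {
    "minute": 1,
    "hour": 60,
    "day": 24 * 60,
    "week": 7 * 24 * 60,
}


def walltime_to_minutes(spec):
    """Convert a wallTime string such as '2 days, 12 hours' to minutes."""
    total = 0
    for part in spec.split(","):
        match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", part.lower())
        if match is None:
            raise ValueError("cannot parse wallTime part: {!r}".format(part))
        amount = int(match.group(1))
        # No unit given: minutes are assumed. Plural forms are accepted.
        unit = match.group(2).rstrip("s") or "minute"
        if unit not in MINUTES_PER_UNIT:
            raise ValueError("unknown wallTime unit: {!r}".format(unit))
        total += amount * MINUTES_PER_UNIT[unit]
    return total


for spec in ("1 week", "2 days, 12 hours", "1 hour, 30 minutes", "240"):
    print(spec, "=", walltime_to_minutes(spec), "minutes")
```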
 
Memory and disk requirements are given in '''MB'''. They are usually given using relational operators such as '''&gt;=''', because a job typically needs at least a certain amount of memory or disk. The disk attribute is best avoided, as it is better to specify disk requirements using a runtime environment, described in the following sections. The following example illustrates how the '''wallTime''' and '''memory''' attributes can be used:
 
 
 
<pre>&amp;
 
(executable=&quot;myapp&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(stdin=&quot;stdin.txt&quot;)
 
(wallTime=240)
 
(memory&gt;=500)
 
(inputFiles=
 
    (&quot;stdin.txt&quot; &quot;&quot;)
 
    (&quot;datafile1.dat&quot; &quot;&quot;)
 
    (&quot;datafile2.dat&quot; &quot;&quot;)
 
)

(outputFiles=
 
    (&quot;outputfile1.dat&quot; &quot;&quot;)
 
    (&quot;outputfile2.dat&quot; &quot;&quot;)
 
)</pre>
 
=== Runtime environments ===
 
 
 
A runtime environment is a special script that sets up a number of standard variables and search paths for applications or special application needs. The runtime environment shields the user from differences between the available grid resources. The available runtime environments are published in the information system, which guarantees that jobs will only be submitted to resources with the correct environments installed.
 
 
 
There are a number of supported runtime environments available on the SweGrid resources, listed at:
 
 
 
* [http://www.nordugrid.org http://www.nordugrid.org]
* [http://docs.swegrid.se http://docs.swegrid.se]
 
 
 
In the job description, runtime environments are specified using the '''runTimeEnvironment''' attribute. Runtime environments support versioning, so different versions can be specified. If no version is specified, the highest version is chosen. If no specific version is required, the relational operator '''&gt;=''' should be used to select the minimum required version. If the environment sets up the path to the application executable on the remote resource, the application should not be specified in the '''executable''' attribute, but called from a script file that is passed as the executable in the XRSL file. An example:
 
 
 
<pre>&amp;
 
(executable=run.sh)
 
(arguments=inputfile.dat)
 
(inputFiles=(inputfile.dat &quot;&quot;))
 
(outputFiles=(outputfile.dat &quot;&quot;))
 
(wallTime=240)
 
(runTimeEnvironment&gt;=MYAPP-1.42)</pre>
 
The '''run.sh''' is a script-file using the paths set up in the runtime-environment. An example '''run.sh''' is illustrated below:
 
 
 
<pre>#!/bin/sh
 
myapp $1</pre>
 
The '''myapp''' executable is made available by the environment set up by the runtime environment '''MYAPP-1.42'''.
 
 
 
=== Job log information ===
 
 
 
As an aid in debugging grid jobs additional information on the execution of the job can be added to the job results using the '''gmlog''' attribute. The attribute defines a directory containing job logs and other useful information for debugging. The following example shows a job description with the '''gmlog''' attribute set:
 
 
 
<pre>&amp;
 
(executable=run.sh)
 
(wallTime=&quot;5 minutes&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)
 
(gmlog=&quot;gm.log&quot;)</pre>
 
In this example a special directory &quot;gm.log&quot; will be added to the retrieved job directory containing the following files:
 
 
 
* '''description''' - contains the parsed and transformed XRSL description transferred to the resource.
 
* '''diag''' - front-end and job information.
 
* '''errors''' - complete log of job activity.
 
* '''input''' - job input files.
 
* '''local''' - local job information specific to resource management system.
 
* '''output''' - job output files.
 
* '''status''' - job status. FINISHED/FAILED etc.
 
 
 
== Job submission ==
 
 
 
When the job description has been created and any additional files needed have been set up, the job can be submitted to a grid resource. In ARC, job submission is done using the '''arcsub''' command. The command is similar to the '''qsub''' command found on non-grid resources. The job submission procedure can be described in the following steps:
 
 
 
# Parse XRSL definition.
 
# Query the information system for available resources, taking into account any constraints defined in the XRSL definition, such as memory, wallTime and runtime environments.

# Submit the job to the selected resource, transferring any files local to the submission directory.
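The resource selection step can be pictured as a filter over the resources known to the information system. The Python sketch below is purely illustrative: the dictionary fields, the host names and the function '''matching_resources''' are assumptions made for this example and do not reflect how ARC stores or queries this information. Runtime environments are matched by exact name here, so the version handling of the '''&gt;=''' operator is not modelled.

```python
def matching_resources(resources, job):
    """Return the names of resources that satisfy the job's constraints."""
    selected = []
    for resource in resources:
        enough_time = resource["max_walltime"] >= job["wallTime"]
        enough_memory = resource["memory"] >= job["memory"]
        # Every requested runtime environment must be installed.
        wanted = set(job["runtime_environments"])
        has_environments = wanted <= set(resource["runtime_environments"])
        if enough_time and enough_memory and has_environments:
            selected.append(resource["name"])
    return selected


# Hypothetical resources (walltime in minutes, memory in MB).
resources = [
    {"name": "cluster-a.example.org", "max_walltime": 10080, "memory": 2000,
     "runtime_environments": ["MYAPP-1.42"]},
    {"name": "cluster-b.example.org", "max_walltime": 120, "memory": 4000,
     "runtime_environments": []},
]
job = {"wallTime": 240, "memory": 500, "runtime_environments": ["MYAPP-1.42"]}
print(matching_resources(resources, job))
```

In this example only the first resource qualifies, since the second one has a walltime limit below the requested 240 minutes and lacks the requested runtime environment.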
 
 
 
The general syntax of the '''arcsub''' command is as follows:
 
 
 
<blockquote>arcsub [options] [filename]
 
</blockquote>
 
The most important argument is '''filename''', which names the job description file to be used for job submission.
 
 
 
To illustrate the job submission process we use the following example descriptions and scripts.
 
 
 
Job description:
 
 
 
<pre>&amp;
 
(executable=run.sh)
 
(wallTime=&quot;5 minutes&quot;)
 
(stdout=&quot;stdout.txt&quot;)
 
(stderr=&quot;stderr.txt&quot;)</pre>
 
Executable script '''run.sh''':
 
 
 
<pre>#!/bin/sh
 
echo &quot;Hello, grid&quot;</pre>
 
The simplest form of submission of this job is shown in the following example:
 
 
 
<pre>[user@localhost ex1]$ arcsub ex1.xrsl
 
Job submitted with jobid: gsiftp://siri.lunarc.lu.se:2811/jobs/2817512964675921399075910</pre>
 
If the submission is successful, the command displays the job id of the job. The job id is a URL which uniquely identifies the job and the resource to which it was submitted. The '''arcsub''' command also stores submitted job ids in the '''$HOME/.arc/jobs.xml''' file for later use by other commands.
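Because the job id is an ordinary URL, the resource it refers to can be extracted with standard URL tools. The following Python sketch uses the standard library's '''urllib.parse'''; the function name '''describe_job_id''' and the keys of the returned dictionary are chosen for this example only.

```python
from urllib.parse import urlparse


def describe_job_id(job_id):
    """Split a job id URL into the resource host, port and job number."""
    parsed = urlparse(job_id)
    return {
        "resource": parsed.hostname,
        "port": parsed.port,
        # The last path component identifies the job on the resource.
        "job_number": parsed.path.rsplit("/", 1)[-1],
    }


info = describe_job_id(
    "gsiftp://siri.lunarc.lu.se:2811/jobs/2817512964675921399075910"
)
print(info["resource"], info["port"], info["job_number"])
```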
 
 
 
To see more output from the job submission, the '''-d''' flag can be used with a parameter for the level of debug output. It is often enough to use '''-d 1''' to get more useful information, as in the following example:
 
 
 
<pre>[user@localhost ex1]$ arcsub -d 1 ex1.xrsl
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=User/CN=1212121
 
Proxy valid to: 2011-01-31 22:32:58
 
Proxy valid for: 10 hours, 32 minutes, 59 seconds
 
Queue selected: arc@arc-ce.smokerings.nsc.liu.se
 
File uploaded: /tmp/user/rsl.1lY843
 
File uploaded: /home/user/usersguide/examples/ex1/run.sh
 
Job submitted with jobid: gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/213512964716031312926006</pre>
 
The debug output also shows information on proxy lifetime, the queue used and which files have been uploaded to the resource.
 
 
 
In some cases the information system on some resources is overloaded. This means that the job submission can get stuck waiting for a response. To limit the time waiting for non-responsive sites, the '''-t''' flag can be used to set a timeout in seconds. The following example shows a job submission with the '''-t''' flag set to 20 seconds:
 
 
 
<pre>[user@localhost ex1]$ arcsub -t 20 -f ex1.xrsl</pre>
 
It is also possible to bypass the resource brokering and submit a job directly to a resource using the '''-c''' switch. The '''-c''' switch takes a hostname for the resource as input and will only submit to this resource. The '''-c''' switch can be given repeatedly to submit to multiple resources. It is also possible to reject a specific cluster using the switch by adding a minus sign in front of the hostname. The following examples illustrate different options of using this switch.
 
 
 
Job submission directly to the resource given by '''siri.lunarc.lu.se''':
 
 
 
<pre>[user@localhost ex1]$ arcsub -c siri.lunarc.lu.se ex1.xrsl</pre>
 
Job submission to all available resources ''except'' '''siri.lunarc.lu.se''':
 
 
 
<pre>[user@localhost ex1]$ arcsub -c -siri.lunarc.lu.se ex1.xrsl</pre>
 
Instead of using the default '''$HOME/.arc/jobs.xml''' job file a user defined job list file can be specified using the '''-j''' switch. This can be useful when submitting a number of jobs in a parameter sweep:
 
 
 
<pre>[user@localhost ex1]$ arcsub -j job_sweep1 ex1.xrsl</pre>
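For a parameter sweep, the job descriptions and the matching '''arcsub''' command lines can be generated in a loop. The Python sketch below only builds the text and the argument lists; it does not write files or run anything. The function name '''sweep_jobs''', the '''run.sh''' script taking the parameter as its argument and the '''jobNNNN.xrsl''' file naming are assumptions made for this example.

```python
def sweep_jobs(parameters, job_list):
    """Build (name, xrsl, command) triples for a parameter sweep.

    Each job gets a jobName of the form job0001, job0002, ... and is
    submitted with the same user-defined job list file (-j).
    """
    jobs = []
    for index, value in enumerate(parameters, start=1):
        name = "job{:04d}".format(index)
        xrsl = '&(executable="run.sh")(arguments="{}")(jobName="{}")'.format(
            value, name
        )
        # Hypothetical file name for the saved job description.
        command = ["arcsub", "-j", job_list, name + ".xrsl"]
        jobs.append((name, xrsl, command))
    return jobs


for name, xrsl, command in sweep_jobs([0.1, 0.2, 0.5], "job_sweep1"):
    print(" ".join(command))
    print("   ", xrsl)
```

Each description would be saved under the file name given in the command before the command is executed, for instance with '''subprocess.run'''.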
 
Additional options for the '''arcsub''' command can be found by using the '''-h''' switch:
 
 
 
<pre>[user@localhost ex1]$ arcsub -h
 
Usage:
 
  arcsub [OPTION...] [filename ...]
 
 
 
The arcsub command is used for submitting jobs to Grid enabled computing
 
resources.
 
 
 
Help Options:
 
  -?, --help                        Show help options
 
 
 
Application Options:
 
  -h, --help                        Show help options
 
  -c, --cluster=[-]name            explicitly select or reject a specific resource
 
  -g, --index=[-]name              explicitly select or reject an index server
 
  -e, --jobdescrstring=string      jobdescription string describing the job to be submitted
 
  -f, --jobdescrfile=string        jobdescription file describing the job to be submitted
 
  -j, --joblist=filename            the file storing information about active jobs (default ~/.arc/jobs.xml)
 
  -o, --jobids-to-file=filename    the IDs of the submitted jobs will be appended to this file
 
  -D, --dryrun                      submit jobs as dry run (no submission to batch system)
 
  -x, --dumpdescription            do not submit - dump job description in the language accepted by the target
 
  -t, --timeout=seconds            timeout in seconds (default 20)
 
  -z, --conffile=filename          configuration file (default ~/.arc/client.conf)
 
  -d, --debug=debuglevel            FATAL, ERROR, WARNING, INFO, VERBOSE or DEBUG
 
  -b, --broker=broker              select broker method (list available brokers with --listplugins flag)
 
  -P, --listplugins                list the available plugins
 
  -v, --version                    print version information
 
 
 
  ...</pre>
 
 
 
More information on arcsub can also be found using the arcsub man-page by issuing:
 
 
 
<pre>man arcsub
 
</pre>
 
 
 
== Job status information ==
 
 
 
The status of the submitted job can be queried using the '''ngstat''' command which is similar to the '''qstat''' or '''showq''' commands on normal computational resources. The command takes a job id as input and queries the resource for information on the status of the job. This is shown in the following examples:
 
 
 
<pre>[user@localhost ex1]$ ngstat gsiftp://siri.lunarc.lu.se:2811/jobs/261871296472351384107384
 
Job gsiftp://siri.lunarc.lu.se:2811/jobs/261871296472351384107384
 
  Status: FINISHED
 
  Exit Code: 0</pre>
 
This shows the status and exit code of the job, in this case that it has been executed and returned with an exit code of 0.
 
 
 
To query all submitted jobs found in the '''$HOME/.ngjobs''' file, the '''-a''' switch can be used. A typical output from this command is shown below:
 
 
 
<pre>[user@localhost ex1]$ ngstat -a
 
Job gsiftp://siri.lunarc.lu.se:2811/jobs/2817512964675921399075910
 
  Status: FINISHED
 
  Exit Code: 0
 
Job gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/213512964716031312926006
 
  Status: INLRMS:Q
 
Job gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/359712964719111059953303
 
  Status: INLRMS:Q
 
Job gsiftp://siri.lunarc.lu.se:2811/jobs/261871296472351384107384
 
  Status: FINISHED
 
  Exit Code: 0
 
Job gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/84411296472389643492645
 
  Status: INLRMS:Q</pre>
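Output in this form is easy to post-process, for example to find out which jobs have finished. The following Python sketch parses text laid out like the listing above into a dictionary; it assumes the layout shown here (a '''Job''' line followed by indented '''key: value''' lines) and is not an official interface. The function name '''parse_status_output''' is hypothetical.

```python
def parse_status_output(text):
    """Map each job id to a dict of the fields listed under it."""
    jobs = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("Job "):
            current = stripped[len("Job "):]
            jobs[current] = {}
        elif current is not None and ":" in stripped:
            # Split on the first colon only, so INLRMS:Q stays intact.
            key, value = stripped.split(":", 1)
            jobs[current][key.strip()] = value.strip()
    return jobs


sample = """Job gsiftp://siri.lunarc.lu.se:2811/jobs/2817512964675921399075910
  Status: FINISHED
  Exit Code: 0
Job gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/213512964716031312926006
  Status: INLRMS:Q
"""
status = parse_status_output(sample)
finished = [job for job, fields in status.items() if fields["Status"] == "FINISHED"]
print(finished)
```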
 
If more information on the job is needed, the '''-l''' switch can be used to provide additional information on the job as in the following example:
 
 
 
<pre>[user@localhost ex1]$ ngstat -l gsiftp://siri.lunarc.lu.se:2811/jobs/261871296472351384107384
 
Job gsiftp://siri.lunarc.lu.se:2811/jobs/261871296472351384107384
 
  Status: FINISHED
 
  Exit Code: 0
 
  Owner: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=User Userson
 
  Cluster: siri.lunarc.lu.se
 
  Queue: arc
 
  Requested Number of CPUs: 1
 
  Execution Nodes:
 
    sn001
 
    sn001.mpi
 
  stdout: stdout.txt
 
  stderr: stderr.txt
 
  Submitted: 2011-01-31 12:12:32
 
  Completed: 2011-01-31 12:12:41
 
  Submitted from: 81.230.189.149:48701;localhost.localdomain
 
  Submitting Client: nordugrid-arc-0.8.3.1
 
  Required CPU Time: 5 minutes
 
  Used CPU Time: 0
 
  Used Wall Time: 1 minute
 
  Results must be retrieved before: 2011-02-11 03:19:21
 
  Proxy valid to: 2011-01-31 22:32:58
 
  Entry valid from: 2011-01-31 12:43:07
 
  Entry valid to: 2011-01-31 12:44:37</pre>
 
The '''ngstat''' command can also be used to query the status of jobs from job lists created with the '''-o''' switch in the '''ngsub''' command. In '''ngstat''' this is accomplished using the '''-i''' switch:
 
 
 
<pre>[user@localhost ex1]$ ngstat -a -i job_sweep1</pre>
 
The timeout ('''-t'''), debug ('''-d''') and cluster ('''-c''') options can be used in the same way as in the '''ngsub''' command.
 
 
 
== Retrieving finished jobs ==
 
 
 
When a job has finished executing on a grid resource the job output files and results can be downloaded using the '''ngget''' command. The general syntax:
 
 
 
<pre>ngget [options] [jobid|jobname]</pre>
 
Retrieving a single job is accomplished by using '''ngget''' and the job identifier:
 
 
 
<pre>[user@localhost ex2]$ ngget gsiftp://siri.lunarc.lu.se:2811/jobs/126181296507986385553351
 
Results stored at /home/user/usersguide/examples/ex2/126181296507986385553351
 
Jobs processed: 1, successfuly downloaded: 1</pre>
 
Retrieving a job by its job name is done by specifying the job name instead of the job id. In the following example the job description had the '''jobName''' attribute set to '''job0001''':
 
 
 
<pre>[user@localhost ex3]$ ngget job0001
 
Results stored at /home/user/usersguide/examples/ex3/69012965093911208764313
 
Jobs processed: 1, successfuly downloaded: 1</pre>
 
Downloading all jobs in the '''$HOME/.ngjobs''' file is done by using the '''-a''' switch:
 
 
 
<pre>[user@localhost ex3]$ ngget -a
 
Results stored at /home/user/usersguide/examples/ex3/131291296508001121515985
 
Results stored at /home/user/usersguide/examples/ex3/133541296508018329511759
 
Jobs processed: 2, successfuly downloaded: 2</pre>
 
By default, downloaded jobs are stored in directories named after the last part of the job id. A job with the job id '''gsiftp://siri.lunarc.lu.se:2811/jobs/126181296507986385553351''' will be stored in the directory '''126181296507986385553351''' in the directory where the '''ngget''' command was executed. This behavior can be changed using the '''-dir''' switch, which creates the downloaded job directories in the directory given to the switch. The following example downloads all jobs to the '''job_sweep1''' directory:
 
 
 
<pre>[user@localhost ex3]$ ngget -a -dir ./job_sweep1
 
Results stored at ./job_sweep1/198512965106411445114039
 
Results stored at ./job_sweep1/221412965106421926028190
 
Results stored at ./job_sweep1/265812965106431621700746
 
Results stored at ./job_sweep1/293912965106452076344440
 
Jobs processed: 4, successfuly downloaded: 4</pre>
 
It is also possible to use the job name as the job directory by using the '''-j''' switch as shown in the following example:
 
 
 
<pre>[user@localhost ex3]$ ngget -a -j -dir ./job_sweep2
 
Results stored at ./job_sweep2/job0001
 
Results stored at ./job_sweep2/job0002
 
Results stored at ./job_sweep2/job0003
 
Results stored at ./job_sweep2/job0004
 
Jobs processed: 4, successfuly downloaded: 4</pre>
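The directory naming described in this section can be summarised in a few lines of Python. This is a sketch of the naming rule only, written for this guide; the function '''download_directory''' and its parameters are hypothetical and no files are downloaded.

```python
import posixpath


def download_directory(job_id, base_dir=".", job_name=None):
    """Return the directory in which the results of a job would be stored.

    By default the last part of the job id is used; if a job name is
    given it is used instead. base_dir plays the role of the target
    directory for downloads.
    """
    if job_name is not None:
        leaf = job_name
    else:
        leaf = job_id.rstrip("/").rsplit("/", 1)[-1]
    return posixpath.join(base_dir, leaf)


job_id = "gsiftp://siri.lunarc.lu.se:2811/jobs/126181296507986385553351"
print(download_directory(job_id))
print(download_directory(job_id, "./job_sweep2", job_name="job0001"))
```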
 
== Killing running jobs ==
 
 
 
If for some reason you need to kill any of the jobs submitted to a resource the '''ngkill''' command can be used. In the most basic form the command takes a job id or a job name as input. Killing a job using a job id is shown below:
 
 
 
<pre>[jonas@localhost ex4]$ ngkill gsiftp://siri.lunarc.lu.se:2811/jobs/312511297331801573492658
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 10 hours, 9 minutes, 20 seconds
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/312511297331801573492658
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/312511297331801573492658
 
Jobs processed: 1, killed: 1, deleted: 1</pre>
 
Killing a job using a job name is done in a similar way:
 
 
 
<pre>[jonas@localhost ex4]$ ngkill job0001
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 10 hours, 9 minutes, 39 seconds
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/3071412973317791892872442
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/3071412973317791892872442
 
Jobs processed: 1, killed: 1, deleted: 1</pre>
 
To kill all running jobs, the '''-a''' switch can be used, as illustrated in the following example:
 
 
 
<pre>[jonas@localhost ex4]$ ngkill -a
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 10 hours, 4 minutes, 25 seconds
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/130621297332339928417805
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/130621297332339928417805
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/1324612973323401688656595
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/1324612973323401688656595
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/1327812973323411385420097
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/1327812973323411385420097
 
Killing job: gsiftp://siri.lunarc.lu.se:2811/jobs/1329512973323431857144927
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/1329512973323431857144927
 
Jobs processed: 4, killed: 4, deleted: 4</pre>
 
Custom job lists can also be used by specifying the job list file using the '''-i''' switch.
 
 
 
== Cleaning job data on resources ==
 
 
 
The output files and logs from finished jobs are kept on the resource for a couple of days and are eventually erased automatically. If you are not interested in downloading results from jobs or want to remove old job results, the '''ngclean''' command can instruct the resources to clean the data from the jobs. '''ngclean''' works in the same way as '''ngkill'''. Cleaning a job using a job id can be achieved in the following way:
 
 
 
<pre>[jonas@localhost ex4]$ ngclean gsiftp://siri.lunarc.lu.se:2811/jobs/1388112973341962029409606
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 9 hours, 22 minutes, 28 seconds
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/1388112973341962029409606
 
Jobs processed: 1, deleted: 1</pre>
 
Job names can also be used:
 
 
 
<pre>[jonas@localhost ex4]$ ngclean job0001
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 9 hours, 24 minutes, 45 seconds
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/135081297334194489234891
 
Jobs processed: 1, deleted: 1</pre>
 
All jobs can be cleaned by using the '''-a''' switch:
 
 
 
<pre>[jonas@localhost ex4]$ ngclean -a
 
Proxy subject name: /O=Grid/O=NorduGrid/OU=lunarc.lu.se/CN=Jonas Lindemann/CN=792268717
 
Proxy valid to: 2011-02-10 21:11:16
 
Proxy valid for: 9 hours, 16 minutes, 40 seconds
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/137781297334195171034347
 
Deleting job: gsiftp://siri.lunarc.lu.se:2811/jobs/1394612973341971718841954
 
Jobs processed: 2, deleted: 2</pre>
 
Custom job lists can also be used by specifying the job list file using the '''-i''' switch.
 

Latest revision as of 10:19, 8 February 2023