Difference between revisions of "Application examples"

From SNIC Documentation

Revision as of 14:07, 8 November 2011

Generic examples

Example 1

This job sends a script to a grid resource, where it writes "Hello, grid" to standard output. The job stores standard output and standard error in the files stdout.txt and stderr.txt. All output is collected upon retrieval using the directive "/" in the outputFiles attribute. Walltime is set to 5 minutes. The executable is also added to the input files section.

XRSL job description (ex1.xrsl):

&(executable=run.sh)
(wallTime="5 minutes")
(stdout="stdout.txt")
(stderr="stderr.txt")
(inputFiles=("run.sh" ""))
(outputFiles=("/" ""))

Executable shell script (run.sh):

#!/bin/sh
echo "Hello, grid"

Usage:

arcsub ex1.xrsl

More verbose output is achieved with:

arcsub --debug=INFO ex1.xrsl

Example 2

Debug information from the resources used can be retrieved using the "gmlog" attribute. The value of the attribute specifies the name of the directory in which the debug information is stored.

XRSL job description (ex2.xrsl):

&(executable=run.sh)
(cpuTime="5 minutes")
(stdout="stdout.txt")
(stderr="stderr.txt")
(inputFiles=("run.sh" ""))
(outputFiles=("/" ""))
(gmlog="grid.debug")

Executable shell script (run.sh):

#!/bin/sh
echo "Hello, grid"

Usage:

arcsub ex2.xrsl

Example 3 - Job sweep

This example illustrates how to do a simple job sweep and manage the sweep using ARC job lists. The job sweep is implemented in Python here, but it could just as easily be implemented using bash scripts or any other scripting language.

To enable dynamic generation of the XRSL job description, a template XRSL is defined in the jobDescription string variable.

#!/usr/bin/python

import os, sys

jobDescription = '''&(executable=run.sh)
(cpuTime='5 minutes')
(stdout=stdout.txt)
(stderr=stderr.txt)
(inputFiles=('run.sh' ''))
(jobName=job%04d)'''

In the string template, "%04d" is replaced by the loop index formatted as a zero-padded four-digit integer, so n jobs are named "job0000" through the name for index n-1 (for example "job0003" when n is 4).
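The substitution can be tried on its own before any job is submitted. The following is a minimal sketch; the names total_jobs and job_names are only used for this illustration and do not appear in submit.py.

```python
# Expand the "job%04d" pattern for every index in the sweep.
total_jobs = 4

# "%04d" pads the integer with leading zeros to a width of four digits.
job_names = ["job%04d" % i for i in range(total_jobs)]

print(job_names)  # ['job0000', 'job0001', 'job0002', 'job0003']
```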

Next, a variable totalJobs is set to the total number of jobs to be submitted.

totalJobs = 4

The job submission loop is implemented using a standard Python loop.

for i in range(totalJobs):

In the job submission we are going to pass the XRSL as a string to the arcsub command. To do this we have to remove the line breaks in the template. This is done using the following statement:

	
	jobDescriptionString = "".join(jobDescription.split("\n"))
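To see what the statement produces, the flattening can be run separately from the submission. This is a small sketch using the same template as above; the variable names flattened and equivalent are illustrative only.

```python
# Same XRSL template as in the example.
job_description = '''&(executable=run.sh)
(cpuTime='5 minutes')
(stdout=stdout.txt)
(stderr=stderr.txt)
(inputFiles=('run.sh' ''))
(jobName=job%04d)'''

# Split on newlines and glue the pieces back together without a separator.
flattened = "".join(job_description.split("\n"))

# str.replace gives the same result in a single call.
equivalent = job_description.replace("\n", "")

# Fill in the job index afterwards; only "%04d" is a format placeholder.
print(flattened % 2)
```

Since the flattened template does not depend on the loop index, it could also be computed once before the loop instead of on every iteration.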

To keep track of the jobs we submit, we instruct the arcsub command to store the submitted jobs in the job list file ex3.list. This file can then be used with the other ARC commands to manage these specific jobs. A job list is specified in the ARC commands using the "-j" or "--joblist" command line parameters.

Finally, arcsub is called using the os.system function, which executes a system command and blocks until the command has finished. There are several other ways of executing external commands which provide more control over the execution, but for this example os.system is enough. Substitution of variables is done using the "%" operator and creates the correct names for the jobs.

	os.system('arcsub --joblist=ex3.list --jobdescrstring="%s"' % (jobDescriptionString % i))
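As one example of the alternatives mentioned above, the standard subprocess module accepts the command as a list of arguments, which avoids the shell quoting around the XRSL string and returns the exit code of arcsub. This is only a sketch: the helper names build_arcsub_command and submit are hypothetical, and the arcsub options are the ones used in this example.

```python
import subprocess

# Flattened XRSL template from the example (newlines already removed).
JOB_TEMPLATE = ("&(executable=run.sh)(cpuTime='5 minutes')"
                "(stdout=stdout.txt)(stderr=stderr.txt)"
                "(inputFiles=('run.sh' ''))(jobName=job%04d)")


def build_arcsub_command(index, joblist="ex3.list"):
    """Return the arcsub argument list for one job of the sweep."""
    return [
        "arcsub",
        "--joblist=%s" % joblist,
        "--jobdescrstring=%s" % (JOB_TEMPLATE % index),
    ]


def submit(index, joblist="ex3.list"):
    """Run arcsub for one job and return its exit code (0 means success)."""
    # No shell is involved, so the XRSL needs no extra quoting.
    return subprocess.call(build_arcsub_command(index, joblist))
```

Calling submit(i) for each i in range(totalJobs) would replace the os.system line in the loop; a non-zero return value indicates that arcsub reported a failure for that job.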

Python script (submit.py):

#!/usr/bin/python

import os, sys

jobDescription = '''&(executable=run.sh)
(cpuTime='5 minutes')
(stdout=stdout.txt)
(stderr=stderr.txt)
(inputFiles=('run.sh' ''))
(jobName=job%04d)'''

totalJobs = 4

for i in range(totalJobs):
	
	# Remove the newlines from jobDescription so that the XRSL
	# can be passed to arcsub as a single string
	
	jobDescriptionString = "".join(jobDescription.split("\n"))
	os.system('arcsub --joblist=ex3.list --jobdescrstring="%s"' % (jobDescriptionString % i))

Shell script (run.sh):

#!/bin/sh
echo "Hello, grid"

Usage:

python submit.py

Monitoring status of grid jobs in job list:

arcstat -j ex3.list

Retrieving results from grid jobs:

arcget -j ex3.list

Octave / MATLAB examples

SciPy / Numpy examples

ARC Python binding examples

Installing the ARC Python binding

RHEL/CentOS:

yum install nordugrid-arc-python

Ubuntu/Debian:

apt-get install nordugrid-arc-python

Importing ARC and logging

The ARC Python binding is located in the module *arc*. Before using any ARC functions or classes, logging should be set up. In the following example the *arc* module is imported and logging is set up with standard output as its destination. Application-specific logging is also set up, tagged with *Ex1*.

import arc, sys

# ----- Setup logging 

logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().removeDestinations()       
arc.Logger_getRootLogger().addDestination(logcout)

# ----- Setup logging threshold (optional)

#arc.Logger_getRootLogger().setThreshold(arc.INFO)

# ----- Setup application specific logging

logger = arc.Logger(arc.Logger_getRootLogger(), "Ex1")

# ----- Application logging

logger.msg(arc.INFO, "Example 1 starting")

This code will produce the following output:

$ python ex1.py
[2011-11-08 15:05:28] [Arc.Ex1] [INFO] [954/157611936] Example 1 starting