Accessing Swestore with the ARC client

From SNIC Documentation
Revision as of 07:45, 10 March 2022



This guide describes how to use the Nordugrid ARC client for storing and retrieving files from Swestore. The ARC client is usually used for sending grid jobs to grid clusters, but it also contains commands for data management. A complete user guide for the ARC client can be found in http://www.nordugrid.org/documents/arc-ui.pdf.

Requirements

To access Swestore using the ARC client you need to have completed the Certificate Setup for Swestore and be a member of a Swestore storage project, see Swestore#Getting access to Swestore.

You also need to have the certificate installed on the resource where you want to run the ARC commands. For SNIC resources this process includes exporting the certificate from your browser, transferring it to the intended SNIC resource and preparing it for use with grid tools.

All SNIC HPC systems should have the ARC client installed. If yours doesn't, please contact support at your centre so they can fix this as soon as possible. To install the ARC client on your own computer, please follow the instructions on the ARC client installation page in this wiki, or see the official Nordugrid ARC installation page for more information.

Quickstart

Basic commands

arcproxy - unlocks your certificate so you can use it. See Proxy certificates for details.
arcls - lists files. Works similarly to ls. Example: arcls gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME
arcmkdir - creates directories. Works similarly to mkdir. Example: arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir
arccp - copies files. Works similarly to cp. Example: arccp myfile.txt gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/myfile.txt
arcrm - deletes files. Works similarly to rm. Example: arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/whoops.txt

Use man and --help to get more info on each command. Examples: man arcrm or arcls --help

Paths

The ARC commands support multiple storage protocols. We recommend using GridFTP with paths of the form gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/....
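
The recommended path form can be wrapped in a small shell function so that it does not have to be typed out each time. This is only a sketch: the function name swestore_url and the project name myproject are made up for illustration and are not part of the ARC client.

```shell
# Build a Swestore GridFTP URL from a project name and an optional
# path inside the project. swestore_url is a hypothetical helper,
# not part of the ARC client.
swestore_url() {
    project=$1
    path=${2:-}
    printf 'gsiftp://gsiftp.swestore.se/snic/%s/%s\n' "$project" "$path"
}

# Example (myproject is a placeholder for your own project name):
# arcls "$(swestore_url myproject)"
```

The function only assembles the string; it does not check that the project or the path exists.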

Unlock your certificate

Your certificate needs to be unlocked before you can do anything. Think of the process as logging in. When successful, a proxy certificate is the result.

arcproxy

To see the lifetime of your session, use:

arcproxy -I
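
In a script it can be convenient to check for a session before starting any transfers. The following is a minimal sketch, not an official recipe: ensure_proxy is a made-up name, and the sketch assumes that arcproxy -I exits with a non-zero status when no usable proxy is found. Check this assumption against your installed version, since an expired proxy might still be reported with a zero status.

```shell
# ensure_proxy is a hypothetical helper. It runs `arcproxy -I` quietly and,
# ASSUMING a non-zero exit status means that no usable proxy was found,
# falls back to plain `arcproxy` to create a new one.
ensure_proxy() {
    if ! command -v arcproxy >/dev/null 2>&1; then
        echo "arcproxy not found; is the ARC client installed?" >&2
        return 127
    fi
    if arcproxy -I >/dev/null 2>&1; then
        echo "proxy present"
    else
        echo "no usable proxy, running arcproxy" >&2
        arcproxy
    fi
}
```

Call ensure_proxy at the top of a transfer script; arcproxy will then prompt for your certificate password only when needed.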

Copying files

Copying files to and from resources is accomplished using the arccp command.

Copying single files

Copying single files is accomplished in the same way as using the normal cp command as shown in the following example:

arccp archive.tar.gz gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/

Please note the trailing / which marks the destination as a directory. Without the trailing /, the destination is treated as a file, which may or may not be what you wanted. All required directories are created when needed, so the destination may be a directory that does not exist yet.

Recursive copying

Recursive copying is accomplished using the --recursive option to arccp. The argument to the option determines the depth of the recursive copy. If you want the entire source directory tree, just supply a really big number like 999.

Example:

arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/

NOTE: The above example will copy all files in the directory foobar into the destination directory YOUR_PROJECT_NAME. If you want the directory foobar to be part of the destination path you have to explicitly supply it as shown in the example below:

arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/foobar/
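
The rule about repeating the directory name in the destination can be sketched as a small shell function. The name upload_tree is hypothetical, and the function deliberately prints the arccp command instead of running it, so you can inspect the result first.

```shell
# upload_tree is a hypothetical helper that prints (rather than runs) an
# arccp command which keeps the source directory name in the destination
# path. Remove the leading "echo" to actually run the copy.
upload_tree() {
    src=${1%/}          # strip one trailing slash, if any
    dest_base=${2%/}    # e.g. gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME
    name=$(basename "$src")
    echo arccp --recursive=999 "$src/" "$dest_base/$name/"
}

upload_tree foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/
# prints: arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/foobar/
```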

Long-running operations

Note that copying large directory trees can take quite some time, and might fail if you're not aware of the following:

  • Your login session created with the arcproxy command has a limited lifetime. Use arcproxy -I to show the remaining time. Use arcproxy -c validityPeriod=xxH to initiate a session with a longer lifetime.
  • The command will abort if you lose your network connection to the computer where you are running arccp. A utility such as screen or tmux can be used to create a terminal session you can reattach to.
  • Transfer rates depend largely on the average file size. If you have a lot of small files, the transfer will be slower than if you have large files.
  • We recommend limiting each transfer session (i.e. the directory tree copied with each arccp command) to 1 TB if you have mostly large (100+ MB) files and to 100 GB if you have smaller files.
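
One way to keep each transfer session small is to copy a large tree one subdirectory at a time. The sketch below is an illustration under stated assumptions, not a supported tool: plan_transfers is a made-up name, sizes come from the local du command, and the arccp commands are only printed so that you can review them and compare the sizes with the limits above.

```shell
# plan_transfers is a hypothetical helper. For each immediate subdirectory
# of a local tree it prints the size reported by du (in KiB) and one arccp
# command, so that a large tree can be copied in several smaller sessions.
plan_transfers() {
    src=${1%/}
    dest=${2%/}
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue    # glob did not match: no subdirectories
        name=$(basename "$dir")
        size_kb=$(du -sk "$dir" | cut -f1)
        echo "# $name: ${size_kb} KiB"
        echo arccp --recursive=999 "$src/$name/" "$dest/$name/"
    done
}

# Example (paths are placeholders):
# plan_transfers mydata gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/mydata
```

Files that sit directly in the top-level directory are not covered by this sketch and would need a separate arccp command. Running the reviewed commands inside screen or tmux protects them from a lost network connection.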

Listing files

Listing files on a resource is done using the arcls command. In its simplest form the command just takes a URL as input and displays the names of files and directories without any extra information, as shown in the following example:

arcls gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05

Example output:

reldate.txt
speclist.txt
uniprot_sprot.dat.gz
uniprot_sprot.fasta.gz
uniprot_trembl.dat.gz
uniprot_trembl.fasta.gz

Additional information can be listed by adding the --long option:

arcls --long gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05

Example output:

<Name> <Type> <Size> <Creation> <Validity> <CheckSum> <Latency>
reldate.txt file 151 2012-05-23 03:00:19 (n/a) adler32:f3f52f1d (n/a)
speclist.txt file 1715169 2012-05-23 03:00:17 (n/a) adler32:91e59dae (n/a)
uniprot_sprot.dat.gz file 462895141 2012-05-23 02:57:18 (n/a) adler32:0f131bb2 (n/a)
uniprot_sprot.fasta.gz file 79935897 2012-05-23 03:00:20 (n/a) adler32:89844c57 (n/a)
uniprot_trembl.dat.gz file 9162678278 2012-05-23 02:52:01 (n/a) adler32:b2d7cfd5 (n/a)
uniprot_trembl.fasta.gz file 4456514443 2012-05-23 02:57:34 (n/a) adler32:2b73b2a1 (n/a)
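
The long listing is easy to post-process with standard tools. As a small sketch, the function below sums the size column for all entries of type file. The name total_size is hypothetical, and the code assumes the column layout shown above and names without spaces.

```shell
# total_size is a hypothetical helper: it reads `arcls --long` output on
# standard input and prints the summed size in bytes of all entries whose
# type column is "file". The header line and directories are skipped
# because their second column is not "file".
total_size() {
    awk '$2 == "file" { sum += $3 } END { printf "%.0f\n", sum }'
}

# Example:
# arcls --long gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05 | total_size
```

For the example listing above, the six file sizes add up to 14163739079 bytes.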

Metadata

Metadata information on a specific file can be listed by specifying the -m or --metadata option. Note that the amount of metadata available differs depending on which protocol is used.

Examples:

arcls --metadata gsiftp://gsiftp.swestore.se/ops/nikke/smallfile

Example output:

/ops/nikke/smallfile
checksum:adler32:762606eb
mtime:2013-04-12 11:06:56
path:/ops/nikke/smallfile
size:30
type:file

arcls --metadata srm://srm.swegrid.se/ops/nikke/smallfile

Example output:

/ops/nikke/smallfile
accessperm:rw-r-----
checksum:adler32:762606eb
ctime:2013-04-12 11:06:56
filestoragetype:PERMANENT
group:25001
latency:ONLINE
lifetimeassigned:PT1S
lifetimeleft:PT1S
mtime:2013-04-12 11:06:56
owner:25001
path:/ops/nikke/smallfile
size:30
spacetokens:
type:file
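
Because the metadata is printed as one name:value pair per line, a single field can be picked out with sed. This is a minimal sketch; meta_field is a made-up helper name, and the field names are the ones visible in the example output above.

```shell
# meta_field is a hypothetical helper: it reads `arcls --metadata` output
# on standard input and prints the value of the named field. Everything
# after the first "name:" is kept, so values that contain colons
# (checksums, timestamps) come through intact.
meta_field() {
    sed -n "s/^[[:space:]]*$1://p"
}

# Example:
# arcls --metadata gsiftp://gsiftp.swestore.se/ops/nikke/smallfile | meta_field checksum
```

With the example output above, the checksum field yields adler32:762606eb, which can be compared with a checksum computed locally after a download.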

Creating directories

Directories are generally created on demand. If you copy a file with the destination /snic/YOUR_PROJECT_NAME/newdir/dummyfile, the newdir directory will be created if it is missing. You can also create directories explicitly using the arcmkdir command.

arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir

Removing files or directories

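Files and directories are removed using the arcrm command. The first command below removes a file, and the second removes the directory that contained it:

arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/dummyfile
arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/

Directories must be empty before they can be removed.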
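
Since a directory has to be emptied first, removing one that holds several files takes one arcrm call per file. The sketch below shows one way to generate those calls. remove_flat_dir is a hypothetical helper, it assumes the directory contains only files (no subdirectories) with names that arcls prints one per line, and it only prints the arcrm commands so that nothing is deleted by accident.

```shell
# remove_flat_dir is a hypothetical helper for a directory that contains
# only files. It lists the directory with arcls and prints one arcrm
# command per entry, followed by one for the directory itself.
# Drop the "echo" once the printed commands look right.
remove_flat_dir() {
    url=${1%/}
    arcls "$url/" | while IFS= read -r name; do
        [ -n "$name" ] || continue    # skip empty lines
        echo arcrm "$url/$name"
    done
    echo arcrm "$url/"
}

# Example:
# remove_flat_dir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/
```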

FAQ

1) I get this message when I try to list files:

 $ arcls gsiftp://gsiftp.swestore.se/snic/
 ERROR: Unsupported URL given
  • The nordugrid-arc-plugins-globus package is missing. Without it ARC is not able to use the gsiftp protocol.

2) arcproxy gives WARNING or ERROR messages.

  • The most common reason is a missing certificate file. See the Requirements section above.

3) I get this warning when using the gsiftp:// protocol:

 WARNING: Can not find voms service configuration file (vomses) in default locations: ~/.arc/vomses, ~/.voms/vomses, $ARC_LOCATION/etc/vomses, $ARC_LOCATION/etc/grid-security/vomses, $PWD/vomses, /etc/vomses, /etc/grid-security/vomses
  • This warning is benign and due to a bug in the ARC client.
  • A workaround to silence the warning is to create an empty vomses file, for example: mkdir -p ~/.arc ; touch ~/.arc/vomses