Service Card for Nordugrid Arc CE
It is the entry point for NorduGrid jobs, which are very common within Switzerland.
Definition
Operations
Client tools
Note: in https://git.cscs.ch/miguelgi/jobs, under arc-lcg, there are some examples that can be used to test any ARC system.
Installation/Upgrade
Site modifications
The following modifications have been tested with these ARC package versions:
nordugrid-release-15.03-1.el6.noarch
nordugrid-arc-compute-element-1.0.7-1.el6.noarch
nordugrid-arc-gridmap-utils-5.0.0-2.el6.noarch
nordugrid-arc-plugins-needed-5.0.0-2.el6.x86_64
nordugrid-arc-gridftpd-5.0.0-2.el6.x86_64
nordugrid-arc-hed-5.0.0-2.el6.x86_64
nordugrid-arc-plugins-globus-5.0.0-2.el6.x86_64
nordugrid-arc-plugins-xrootd-5.0.0-2.el6.x86_64
nordugrid-arc-arex-5.0.0-2.el6.x86_64
nordugrid-arc-python-5.0.0-2.el6.x86_64
nordugrid-arc-doc-2.0.3-1.el6.noarch
nordugrid-arc-aris-5.0.0-2.el6.noarch
nordugrid-arc-ldap-infosys-5.0.0-2.el6.noarch
nordugrid-arc-5.0.0-2.el6.x86_64
- Infosys: ARC does not supply correct Glue 1.2 values when not all VOs have access to all queues. GLUE2, however, is OK. CMS and LHCb currently use only Glue 1.2, so the following hack needs to be put in place so that those VOs can see these ARC services:
# diff /usr/share/arc/glue-generator.pl.orig /usr/share/arc/glue-generator.pl.modif
465a466,475
> # MG/CSCS
> # Can be done better, but its a proof of concept
> if ($queue_attributes{'nordugrid-queue-name'} =~ /lhcb/ )
> {
> @vos= ('lhcb');
> }
> if ($queue_attributes{'nordugrid-queue-name'} =~ /cms/ )
> {
> @vos= ('cms');
> }
- SLURM:
- Job submission to reservations: In order to make certain jobs run in specific reservations (ops or dteam), the following modification needs to be put in place:
# diff /usr/share/arc/submit-SLURM-job.orig /usr/share/arc/submit-SLURM-job
79c79
< #We set the priority as 100 - arc priority.
---
> #We set the priority as 100 - arc piority.
112a113,144
> ###############################################################
> # CSCS specific - added comment to jobs for accounting purpose
> #
> PRIORITY_JOBS_RESERVATION="priority_jobs"
> USER=`whoami`
> MYUSERDN=$(/usr/bin/openssl x509 -in ${X509_USER_PROXY} -subject -noout | sed -r 's/.*= (.*)/\1/g' 2>&1)
> MYHN=$(hostname -s)
> COMMENT="\"$MYHN,$MYUSERDN\""
> echo "#SBATCH --comment=$COMMENT" >> $LRMS_JOB_SCRIPT
> #
> # CSCS specific - added reservation direction for specific users
> #
>
> # OPS jobs
> REGEX="ops[0-9][0-9][0-9]"
> if [[ ( $USER =~ $REGEX ) ]] ; then
> echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}">> $LRMS_JOB_SCRIPT
> fi
> # SGM jobs
> if [[ ( $USER == 'opssgm' ) || ( $USER == 'cmssgm' ) || ( $USER == 'atlassgm' ) || ( $USER == 'lhcbsgm' ) || ( $USER == 'honesgm' ) ]] ; then
> echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
> else
> [[ ! -z ${MYUSERDN} ]] && [[ ( ${MYUSERDN} =~ 'Andrea Sciaba' ) ]] && echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
> fi
> # dteam jobs
> REGEX="dteam[0-9][0-9][0-9]"
> if [[ ( $USER =~ $REGEX ) ]] ; then
> #echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
> echo "#SBATCH --reservation=Docker" >> $LRMS_JOB_SCRIPT
> fi
> #### Done CSCS specific ######################################
>
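Condensed into a standalone sketch, the reservation-selection logic of the patch above behaves as follows. The account patterns and reservation names (priority_jobs, Docker) are taken from the diff; the helper function itself is hypothetical and not part of ARC, and the per-DN whitelist from the patch is omitted here.

```shell
#!/bin/bash
# Sketch of the reservation selection from the submit-SLURM-job patch above.
# reservation_for_user is a hypothetical helper, not shipped with ARC.
reservation_for_user() {
    local user="$1" reservation=""
    if [[ $user =~ ops[0-9][0-9][0-9] ]]; then
        reservation="priority_jobs"        # OPS pool accounts
    elif [[ $user =~ ^(ops|cms|atlas|lhcb|hone)sgm$ ]]; then
        reservation="priority_jobs"        # SGM accounts
    elif [[ $user =~ dteam[0-9][0-9][0-9] ]]; then
        reservation="Docker"               # dteam jobs currently go to the Docker reservation
    fi
    if [[ -n $reservation ]]; then
        echo "#SBATCH --reservation=${reservation}"
    fi
}
```

An ordinary account such as atlas001 produces no directive, so its jobs are scheduled without a reservation.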
- Links:
# find /usr/share/arc/ -type l -ls
2776448 0 lrwxrwxrwx 1 root root 29 Jun 4 14:57 /usr/share/arc/scan-slurm-job -> /usr/share/arc/scan-SLURM-job
2769747 0 lrwxrwxrwx 1 root root 31 Jun 4 14:57 /usr/share/arc/cancel-slurm-job -> /usr/share/arc/cancel-SLURM-job
2776509 0 lrwxrwxrwx 1 root root 26 Jun 4 14:59 /usr/share/arc/slurmmod.pm -> /usr/share/arc/SLURMmod.pm
2776510 0 lrwxrwxrwx 1 root root 23 Jun 4 14:59 /usr/share/arc/slurm.pm -> /usr/share/arc/SLURM.pm
2769674 0 lrwxrwxrwx 1 root root 31 Jun 4 14:30 /usr/share/arc/submit-slurm-job -> /usr/share/arc/submit-SLURM-job
2776508 0 lrwxrwxrwx 1 root root 37 Jun 4 14:58 /usr/share/arc/configure-slurm-env.sh -> /usr/share/arc/configure-SLURM-env.sh
- Docker:
# diff /usr/share/arc/submit_common.sh.orig /usr/share/arc/submit_common.sh.modif
709a710,712
> # CSCS modifications to make ARC work also with Docker
> BLAH_AUX_JOBWRAPPER=/opt/cscs/bin/blah_aux_jobwrapper
> if [ -e \${BLAH_AUX_JOBWRAPPER} ]; then
711a715,730
> \$BLAH_AUX_JOBWRAPPER $joboption_args $input_redirect $output_redirect
> else
> \$GNU_TIME -o "\$RUNTIME_JOB_DIAG" -a -f '\
> WallTime=%es\nKernelTime=%Ss\nUserTime=%Us\nCPUUsage=%P\n\
> MaxResidentMemory=%MkB\nAverageResidentMemory=%tkB\n\
> AverageTotalMemory=%KkB\nAverageUnsharedMemory=%DkB\n\
> AverageUnsharedStack=%pkB\nAverageSharedMemory=%XkB\n\
> PageSize=%ZB\nMajorPageFaults=%F\nMinorPageFaults=%R\n\
> Swaps=%W\nForcedSwitches=%c\nWaitSwitches=%w\n\
> Inputs=%I\nOutputs=%O\nSocketReceived=%r\nSocketSent=%s\n\
> Signals=%k\n' \
> \$BLAH_AUX_JOBWRAPPER $joboption_args $input_redirect $output_redirect
>
> fi
> else
> if [ -z "\$GNU_TIME" ] ; then
725a745
> fi
Config
Register to the Swiss level GIIS.
Testing
Start/stop procedures
Logs
[old] Operations
Useful information on how to deal with the service.
Client tools
- ngsub to submit a task
- ngstat to obtain the status of jobs and clusters
- ngcat to display the stdout or stderr of a running job
- ngget to retrieve the result from a finished job
- ngkill to cancel a job request
- ngclean to delete a job from a remote cluster
- ngrenew to renew a user's proxy
- ngsync to synchronize the local job info with the MDS
- ngcopy to transfer files to, from and between clusters
- ngremove to remove files
Installation/Upgrade
Installation of the EMI-3 release of ARC CE is done by CFEngine using the category ARC_CE3. Please check that the emi-release package is installed; after running cfagent, make sure that the directory /var/run/bdii exists and belongs to user ldap.
Then there are some changes to the arc.conf file:
- Change all NULL values to * in the VOMS mapping.
- Comment out all log files and debug levels, except for joblog, which needs to stay.
- The controldir variable needs to be in [grid-manager], not [common]; otherwise part of Infosys does not work correctly.
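As a sketch, the controldir point corresponds to an arc.conf fragment like this (the path shown is only an example of a typical controldir location, not necessarily the one used at this site):

```
[common]
# no controldir here

[grid-manager]
controldir="/var/spool/nordugrid/jobstatus"
```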
Config
If you see something in the config file that is not listed here please add it.
Register to the Swiss level GIIS.
[infosys/cluster/registration/ClusterToSwitzerland]
targethostname="giis.lhep.unibe.ch"
targetport="2135"
targetsuffix="mds-vo-name=Switzerland,o=grid"
regperiod="40"
Testing
From ui.lcg.cscs.ch, with a valid grid proxy (atlas, dech or dteam), the simplest yet complete test is this:
ngsub -c arc01.lcg.cscs.ch -e '&("executable" = "env2" )("stdout" = "stdout" )("queue" = "cscs")(inputfiles=("env2" "file:///bin/env"))'
ngget gsiftp://arc01.lcg.cscs.ch:2811/jobs/6246129786993110697454
Also, you can retrieve some information about other ARC sites with =ngtest -R=.
When submitting a job using a dteam certificate, the lcgadmin queue can be used.
Another way to test an ARC CE is to run from a UI:
$ arctest -c arc01.lcg.cscs.ch -J 1
Test submitted with jobid: gsiftp://arc01.lcg.cscs.ch:2811/jobs/gHHODmRf7zinZOuIepQ9oyOmABFKDmABFKDmRWFKDmABFKDmqRqWsn
$ arcstat gsiftp://arc01.lcg.cscs.ch:2811/jobs/gHHODmRf7zinZOuIepQ9oyOmABFKDmABFKDmRWFKDmABFKDmqRqWsn
Start/stop procedures
All three services can be restarted individually, but if you want to restart them all at once you can use grid-service restart.
Note: in the EMI 3 release grid-infosys no longer exists; its functionality is split between service nordugrid-arc-slapd and service nordugrid-arc-bdii. These two need to be started in this order (after a-rex has started) and stopped in reverse. There is also a new service, nordugrid-arc-inforeg, which needs to be started last. The grid-service2 script has been modified in PPARC_CE/files/opt/cscs/... as we still have older ARC machines in production that use grid-infosys.
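A minimal sketch of that start/stop ordering (using the standard service command; this is not the actual grid-service2 script):

```shell
#!/bin/bash
# Start order: a-rex first, then slapd, then bdii, with inforeg last.
ARC_SERVICES=(a-rex nordugrid-arc-slapd nordugrid-arc-bdii nordugrid-arc-inforeg)

start_all() {
    local s
    for s in "${ARC_SERVICES[@]}"; do
        service "$s" start
    done
}

stop_all() {
    # Stop in reverse order.
    local i
    for ((i=${#ARC_SERVICES[@]}-1; i>=0; i--)); do
        service "${ARC_SERVICES[i]}" stop
    done
}
```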
Checking logs
- ARC logs can be complex to analyze. We have developed a tool, ngtracejob (available in the PATH), that can do it for you; just give an ARC jobid as an argument.
[old] Manuals
[old] Issues
Information about issues found with this service, and how to deal with them.
Max memory increased by a factor of 1.5
Note: This is no longer an issue in the current Slurm setup running in production. The below is kept for historical purposes.
ATLAS needs their hard memory limit to be 3 GB instead of 2 GB. For that, a hack needs to be put in place in /usr/share/arc/submit-pbs-job after upgrading the software:
if [ ! -z "$joboption_memory" ] ; then
# echo "#PBS -l pvmem=${joboption_memory}mb" >> $LRMS_JOB_SCRIPT
joboption_memory_hard=`echo $joboption_memory \* 1.5 | bc | awk -F . '{print $1}'`
echo "#PBS -l pvmem=${joboption_memory_hard}mb" >> $LRMS_JOB_SCRIPT
echo "#PBS -l pmem=${joboption_memory}mb" >> $LRMS_JOB_SCRIPT
fi
grid-infosys was publishing the GlueSubCluster information, but our CreamCEs are the ones supposed to publish the number of physical and logical CPUs for GSTAT. For that, you can edit /usr/share/arc/glue-generator.pl and change the values by hand:
GlueSubClusterPhysicalCPUs: 0
GlueSubClusterLogicalCPUs: 0
OPS jobs getting into the cscs internal queue
ARC clients rely only on the information system (not Glue, but the NorduGrid part) to choose the queue to submit to, and the CSCS queue was publishing all users in the grid-mapfile as authorized. There are a few tricks to prevent this (because arc.conf is not enough); one of them is this:
Oct 15 14:03 [root@arc02:arc]# pwd
/usr/share/arc
Oct 15 14:03 [root@arc02:arc]# diff -C3 ARC0ClusterInfo.pm.backup ARC0ClusterInfo.pm
*** ARC0ClusterInfo.pm.backup 2012-10-15 13:15:44.000000000 +0200
--- ARC0ClusterInfo.pm 2012-10-15 14:03:43.000000000 +0200
***************
*** 441,446 ****
--- 441,447 ----
while (1) {
return undef unless ($sn, $localid) = each %$usermap;
$lrms_user = $qinfo->{users}{$localid};
+ if ($q->{'name'} eq "cscs" and $sn !~ m/Pablo Fern/) { next; }
last if not exists $qinfo->{acl_users};
last if grep { $_ eq $localid } @{$qinfo->{acl_users}};
}
Publish accounting records using APEL (with a custom blah parser)
[To be updated]
Publishing accounting information with APEL has never worked within NorduGrid software (they use a different accounting method). We have, however, developed a script that converts the NorduGrid usage records into blahp format. This is how the script works:
- It looks into /var/spool/nordugrid/usagerecords/archive for last month's job records (one XML file per job)
- It selects only those that have a PBS entry (i.e. jobs that were correctly submitted) and generates a blahp record for each job (one line per job, in one file per full day)
- At the end, you have a directory /var/spool/nordugrid/usagerecords/blahp with the right format (like a CreamCE's /var/log/cream/accounting)
Then, if you configure a blah parser that checks that directory and run it every day (like on any other CreamCE), it will populate the APEL database correctly. The script does not have special requirements, but it needs to be modified to match the site parameters (it makes some assumptions, though there may be no changes needed at all).
This is what is needed for the whole accounting to work:
- Get the apel_parser_fake.py script (attached to this document) and create a CRON job to run it every day.
- Install the APEL parser: get the UMD1 base/updates repository, and yum install glite-apel-pbs. You may need to work on the dependencies a bit.
- To fix the geronimo dependency, you need to downgrade log4j:
rpm -U ftp://mirror.switch.ch/pool/1/mirror/scientificlinux/5rolling/x86_64/SL/log4j-1.2.13-3jpp.2.x86_64.rpm --oldpackage; rpm -e geronimo-jms-1.1-api-1.2-13.jpp5.noarch
- Copy the /etc/glite-apel-pbs/parser-config-yaim.xml file from a CreamCE (generated by yaim) and modify it:
- Modify the parser-config-yaim.xml parameters to match the specific server you are configuring. In particular, /var/spool/nordugrid/usagerecords/blahp should be in the BlahdLogProcessor directory section. Also, the BDII port could be 2135 instead of 2170 (in the GIIS section).
- Remove the EventsRecord section if the PBS entries are already created by another CE (to avoid duplication; although it does no harm to keep it).
- Create a cron job that runs it every day (after the apel_parser_fake.py). You could probably take it from a CreamCE too.
- Create a mysql username in the APEL server for this purpose (find instructions in the ApelServiceCard) that matches what you specified in the parser-config-yaim.xml file.
- Have ARC publish GLUE information with a BDII, with the right CPU performance information (it should have the GlueCEUniqueID entry). It should be visible to the APEL publisher too.
- Add your ARC server to GOCDB as an APEL entry.
How to disable job submissions on an ARC CE
To disable job submission, it is enough to modify the arc.conf file through CFEngine:
/etc/arc.conf
[...]
[gridftpd/jobs]
path="/jobs"
plugin="jobplugin.so"
allownew="yes" # set this to "no"
[...]
Run the CFEngine agent and then restart the grid services:
# grid-service2 restart