
Service Card for Nordugrid Arc CE

This service is the entry point for Nordugrid jobs, which are very common within Switzerland.

Definition

Operations

Client tools

  • arcinfo allows you to see the status of an ARC cluster.
  • arcstat gives you information on a specific job submitted to ARC.
  • arckill cancels a job.
  • arcget fetches the output of a job.
  • ldapsearch gives you information on the LDAP tree of an ARC server.
    ldapsearch -LLL -x -h pparc01.lcg.cscs.ch:2135 -b 'Mds-Vo-name=resource,o=grid'
  • arcsub submits an xRSL job to an ARC server (see the workflow example below).

Note: In https://git.cscs.ch/miguelgi/jobs, under arc-lcg there are some examples that can be used to test any ARC system.
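
A typical client-side workflow looks roughly like this (a minimal sketch, assuming a valid grid proxy and an xRSL description in test.xrsl; the gsiftp:// job ID printed by arcsub is what the other commands expect):

arcinfo -c arc01.lcg.cscs.ch
arcsub -c arc01.lcg.cscs.ch test.xrsl
arcstat gsiftp://arc01.lcg.cscs.ch:2811/jobs/<JOBID>
arcget gsiftp://arc01.lcg.cscs.ch:2811/jobs/<JOBID>
arckill gsiftp://arc01.lcg.cscs.ch:2811/jobs/<JOBID>    # only if the job has to be cancelled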

Installation/Upgrade

  • arc01 is partially puppetised, as some things are not in the Puppet config yet. However, these are the packages to install:
    # yum install nordugrid-arc-compute-element
    # yum install nordugrid-arc-client nordugrid-arc-doc nordugrid-arc-devel
  • arc02 is not installed yet.

Config

cachedir="/tmpdir_slurm/arc_cache"
sessiondir="/tmpdir_slurm/arc_sessiondir"
controldir="/var/spool/nordugrid/jobstatus"
runtimedir="/experiment_software/atlas/nordugrid/runtime"
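
These directories are arc.conf directives; a minimal sketch of how they fit together, assuming the pre-ARC6 arc.conf layout where all four belong to the [grid-manager] section:

[grid-manager]
controldir="/var/spool/nordugrid/jobstatus"
sessiondir="/tmpdir_slurm/arc_sessiondir"
cachedir="/tmpdir_slurm/arc_cache"
runtimedir="/experiment_software/atlas/nordugrid/runtime"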

Site modifications

The following modifications have been tested with these ARC package versions:
nordugrid-release-15.03-1.el6.noarch
nordugrid-arc-compute-element-1.0.7-1.el6.noarch
nordugrid-arc-gridmap-utils-5.0.0-2.el6.noarch
nordugrid-arc-plugins-needed-5.0.0-2.el6.x86_64
nordugrid-arc-gridftpd-5.0.0-2.el6.x86_64
nordugrid-arc-hed-5.0.0-2.el6.x86_64
nordugrid-arc-plugins-globus-5.0.0-2.el6.x86_64
nordugrid-arc-plugins-xrootd-5.0.0-2.el6.x86_64
nordugrid-arc-arex-5.0.0-2.el6.x86_64
nordugrid-arc-python-5.0.0-2.el6.x86_64
nordugrid-arc-doc-2.0.3-1.el6.noarch
nordugrid-arc-aris-5.0.0-2.el6.noarch
nordugrid-arc-ldap-infosys-5.0.0-2.el6.noarch
nordugrid-arc-5.0.0-2.el6.x86_64
  • Infosys: ARC does not supply correct Glue 1.2 values when not all VOs have access to all queues (GLUE2, however, is OK). CMS and LHCb currently use only Glue 1.2, so the following hack needs to be put in place to make those VOs see these ARC services (a verification sketch follows the diff):
    # diff /usr/share/arc/glue-generator.pl.orig /usr/share/arc/glue-generator.pl.modif
    465a466,475
    >             # MG/CSCS
    >             # Can be done better, but its a proof of concept
    >             if ($queue_attributes{'nordugrid-queue-name'} =~ /lhcb/ )
    >             {
    >                 @vos= ('lhcb');
    >             }
    >             if ($queue_attributes{'nordugrid-queue-name'} =~ /cms/ )
    >             {
    >                 @vos= ('cms');
    >             }
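    To check the effect on the published Glue 1.2 tree, something along these lines can be used (a sketch; the base DN follows the ldapsearch example above, and GlueCEUniqueID/GlueCEAccessControlBaseRule are standard Glue 1.2 attributes):
    ldapsearch -LLL -x -h arc01.lcg.cscs.ch:2135 -b 'Mds-Vo-name=resource,o=grid' \
        '(objectClass=GlueCE)' GlueCEUniqueID GlueCEAccessControlBaseRule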
  • SLURM:
    • Job submission to reservations: In order to make certain jobs run in specific reservations (ops or dteam), the following modification needs to be put in place:
      # diff /usr/share/arc/submit-SLURM-job.orig /usr/share/arc/submit-SLURM-job
      79c79
      <   #We set the priority as 100 - arc priority.
      ---
      >   #We set the priority as 100 - arc piority.
      112a113,144
      > ###############################################################
      > # CSCS specific - added comment to jobs for accounting purpose
      > #
      > PRIORITY_JOBS_RESERVATION="priority_jobs"
      > USER=`whoami`
      > MYUSERDN=$(/usr/bin/openssl x509 -in ${X509_USER_PROXY} -subject -noout | sed -r 's/.*= (.*)/\1/g' 2>&1)
      > MYHN=$(hostname -s)
      > COMMENT="\"$MYHN,$MYUSERDN\""
      > echo "#SBATCH --comment=$COMMENT" >> $LRMS_JOB_SCRIPT
      > #
      > # CSCS specific - added reservation direction for specific users
      > #
      >
      > # OPS jobs
      > REGEX="ops[0-9][0-9][0-9]"
      > if [[ ( $USER =~ $REGEX ) ]] ; then
      >         echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}">> $LRMS_JOB_SCRIPT
      > fi
      > # SGM jobs
      > if [[ ( $USER == 'opssgm' ) || ( $USER == 'cmssgm' ) || ( $USER == 'atlassgm' ) || ( $USER == 'lhcbsgm' ) || ( $USER == 'honesgm' ) ]] ; then
      >         echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
      > else
      >    [[ ! -z ${MYUSERDN} ]] && [[ ( ${MYUSERDN} =~ 'Andrea Sciaba' ) ]] && echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
      > fi
      > # dteam jobs
      > REGEX="dteam[0-9][0-9][0-9]"
      > if [[ ( $USER =~ $REGEX ) ]] ; then
      >         #echo "#SBATCH --reservation=${PRIORITY_JOBS_RESERVATION}" >> $LRMS_JOB_SCRIPT
      >         echo "#SBATCH --reservation=Docker" >> $LRMS_JOB_SCRIPT
      > fi
      > #### Done CSCS specific ######################################
      >
    • Links (a sketch for recreating them follows this listing):
      # find /usr/share/arc/ -type l -ls
      2776448    0 lrwxrwxrwx   1 root     root           29 Jun  4 14:57 /usr/share/arc/scan-slurm-job -> /usr/share/arc/scan-SLURM-job
      2769747    0 lrwxrwxrwx   1 root     root           31 Jun  4 14:57 /usr/share/arc/cancel-slurm-job -> /usr/share/arc/cancel-SLURM-job
      2776509    0 lrwxrwxrwx   1 root     root           26 Jun  4 14:59 /usr/share/arc/slurmmod.pm -> /usr/share/arc/SLURMmod.pm
      2776510    0 lrwxrwxrwx   1 root     root           23 Jun  4 14:59 /usr/share/arc/slurm.pm -> /usr/share/arc/SLURM.pm
      2769674    0 lrwxrwxrwx   1 root     root           31 Jun  4 14:30 /usr/share/arc/submit-slurm-job -> /usr/share/arc/submit-SLURM-job
      2776508    0 lrwxrwxrwx   1 root     root           37 Jun  4 14:58 /usr/share/arc/configure-slurm-env.sh -> /usr/share/arc/configure-SLURM-env.sh
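      If any of these lower-case links is missing, it can be recreated along these lines (a sketch, assuming the SLURM backend scripts live under /usr/share/arc):
      cd /usr/share/arc
      for f in scan cancel submit; do ln -sf ${f}-SLURM-job ${f}-slurm-job; done
      ln -sf SLURMmod.pm slurmmod.pm
      ln -sf SLURM.pm slurm.pm
      ln -sf configure-SLURM-env.sh configure-slurm-env.sh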
  • Docker:
    # diff /usr/share/arc/submit_common.sh.orig  /usr/share/arc/submit_common.sh.modif
    709a710,712
    > # CSCS modifications to make ARC work also with Docker
    > BLAH_AUX_JOBWRAPPER=/opt/cscs/bin/blah_aux_jobwrapper
    > if [ -e \${BLAH_AUX_JOBWRAPPER} ]; then
    711a715,730
    >   \$BLAH_AUX_JOBWRAPPER $joboption_args $input_redirect $output_redirect
    > else
    >   \$GNU_TIME -o "\$RUNTIME_JOB_DIAG" -a -f '\
    > WallTime=%es\nKernelTime=%Ss\nUserTime=%Us\nCPUUsage=%P\n\
    > MaxResidentMemory=%MkB\nAverageResidentMemory=%tkB\n\
    > AverageTotalMemory=%KkB\nAverageUnsharedMemory=%DkB\n\
    > AverageUnsharedStack=%pkB\nAverageSharedMemory=%XkB\n\
    > PageSize=%ZB\nMajorPageFaults=%F\nMinorPageFaults=%R\n\
    > Swaps=%W\nForcedSwitches=%c\nWaitSwitches=%w\n\
    > Inputs=%I\nOutputs=%O\nSocketReceived=%r\nSocketSent=%s\n\
    > Signals=%k\n' \
    > \$BLAH_AUX_JOBWRAPPER $joboption_args $input_redirect $output_redirect
    >
    > fi
    > else
    > if [ -z "\$GNU_TIME" ] ; then
    725a745
    > fi
  • Currently LHCb cannot include the required RTEs (mostly ENV/PROXY) when submitting jobs to ARC, so the following workaround to inject a default RTE has been put in place (RT #21357 / https://www.gridpp.ac.uk/wiki/ARC_CE_Tips):
    • Added to /etc/arc.conf
      [grid-manager]
      authplugin="PREPARING timeout=60,onfailure=pass,onsuccess=pass /usr/local/bin/default_rte_plugin.py %S %C %I ENV/PROXY" 
    • Added to /usr/local/bin/default_rte_plugin.py
      #!/usr/bin/python
      
      """Usage: default_rte_plugin.py <status> <control dir> <jobid> <runtime environment>
      
      Authplugin for PREPARING STATE
      
      Example:
      
        authplugin="PREPARING timeout=60,onfailure=pass,onsuccess=pass /usr/local/bin/default_rte_plugin.py %S %C %I <rte>"
      
      """
      
      def ExitError(msg,code):
          """Print error message and exit"""
          from sys import exit
          print(msg)
          exit(code)
      
      def SetDefaultRTE(control_dir, jobid, default_rte):
      
          from os.path import isfile
      
          desc_file = '%s/job.%s.description' %(control_dir,jobid)
      
          if not isfile(desc_file):
             ExitError("No such description file: %s"%desc_file,1)
      
          f = open(desc_file)
          desc = f.read()
          f.close()
      
          if default_rte not in desc:
             with open(desc_file, "a") as myfile:
                myfile.write("( runtimeenvironment = \"" + default_rte + "\" )")
      
          return 0
      
      def main():
          """Main"""
      
          import sys
      
          # Parse arguments
      
          if len(sys.argv) == 5:
              (exe, status, control_dir, jobid, default_rte) = sys.argv
          else:
              ExitError("Wrong number of arguments\n"+__doc__,1)
      
          if status == "PREPARING":
              SetDefaultRTE(control_dir, jobid, default_rte)
              sys.exit(0)
      
          sys.exit(1)
      
      if __name__ == "__main__":
          main()
    • Modified /experiment_software/atlas/nordugrid/runtime/ENV/PROXY to include this:
      #!/bin/bash
      
      x509_cert_dir="/etc/grid-security/certificates"
      
      case $1 in
        0) mkdir -pv $joboption_directory/arc/certificates/
           cp -rv $x509_cert_dir/ $joboption_directory/arc
           cat ${joboption_controldir}/job.${joboption_gridid}.proxy >$joboption_directory/user.proxy
           ;;
        1) export X509_USER_PROXY=$RUNTIME_JOB_DIR/user.proxy
           export X509_USER_CERT=$RUNTIME_JOB_DIR/user.proxy
           # export X509_CERT_DIR=`pwd`/arc/certificates
           export X509_CERT_DIR=$RUNTIME_JOB_DIR/arc/certificates
           ;;
        2) :
           ;;
      esac
      
      if [ "x$1" = "x0" ]; then
        # Set environment variable containing queue name
        env_idx=0
        env_var="joboption_env_$env_idx"
        while [ -n "${!env_var}" ]; do
           env_idx=$((env_idx+1))
           env_var="joboption_env_$env_idx"
        done
        eval joboption_env_$env_idx="NORDUGRID_ARC_QUEUE=$joboption_queue"
      fi

Register to the Swiss-level GIIS (the registration block is shown under the [old] Config section below).

Start/stop procedures

# /etc/init.d/gridftpd start
# /etc/init.d/a-rex start
# /etc/init.d/nordugrid-arc-ldap-infosys start
And this can be used to test that LDAP is OK:
# ldapsearch -LLL -x -h arc01.lcg.cscs.ch:2135 -b o=grid

Logs

  • ARC logs are located in /var/log/arc. Some rotated log files have a timestamp appended to their name.
    • gm-jobs.log
    • grid-manager.log
    • gridftpd.log: Note that this log stops receiving output after a while; in order to see output again, the gridftpd service needs to be restarted.
    • cache-clean.log
    • infoprovider.log
    • inforegistration.log
  • The ARC spooldir /var/spool/nordugrid/jobstatus/ contains information on the jobs in the system:
    • job.JOBID.errors is the file containing the sbatch script submitted to SLURM (see the sketch below).
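
A quick way to inspect what ARC handed to SLURM for a given job (a sketch; <JOBID> stands for the ARC job identifier):

ls -l /var/spool/nordugrid/jobstatus/job.<JOBID>.*
grep '^#SBATCH' /var/spool/nordugrid/jobstatus/job.<JOBID>.errors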

Testing

Documentation

[old] Operations

Useful information, such as how to operate the service.

Client tools

  • ngsub to submit a task
  • ngstat to obtain the status of jobs and clusters
  • ngcat to display the stdout or stderr of a running job
  • ngget to retrieve the result from a finished job
  • ngkill to cancel a job request
  • ngclean to delete a job from a remote cluster
  • ngrenew to renew a user's proxy
  • ngsync to synchronize the local job info with the MDS
  • ngcopy to transfer files to, from and between clusters
  • ngremove to remove files

Installation/Upgrade

Installation of the EMI-3 release of ARC CE is done by CFEngine using the category ARC_CE3. Please check that the emi-release package is installed; after running cfagent, make sure that the directory /var/run/bdii exists and belongs to the ldap user.
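
A minimal post-run sanity check (a sketch; the mkdir/chown are only needed if the directory is missing or has the wrong owner):

rpm -q emi-release
mkdir -p /var/run/bdii
chown ldap:ldap /var/run/bdii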

Then there are some changes to the arc.conf file:

  • changing all NULL values to * in the VOMS mappings.
  • commenting out all log file and debug level settings, except for joblog, which needs to stay.
  • the controldir variable needs to be in [grid-manager], not [common]; otherwise part of the infosys does not work properly (see the sketch below).
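
A minimal sketch of the resulting placement (the value is the one from the Config section above):

[common]
# no controldir here
[grid-manager]
controldir="/var/spool/nordugrid/jobstatus"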

Config

If you see something in the config file that is not listed here, please add it.

Register to the Swiss level GIIS.

   [infosys/cluster/registration/ClusterToSwitzerland]
   targethostname="giis.lhep.unibe.ch"
   targetport="2135"
   targetsuffix="mds-vo-name=Switzerland,o=grid"
   regperiod="40"

Testing

From ui.lcg.cscs.ch, with a valid grid proxy (atlas, dech or dteam), the simplest yet complete test is this:

ngsub -c arc01.lcg.cscs.ch -e '&("executable" = "env2" )("stdout" = "stdout" )("queue" = "cscs")(inputfiles=("env2" "file:///bin/env"))'
ngget gsiftp://arc01.lcg.cscs.ch:2811/jobs/6246129786993110697454

Also, you can retrieve some information about other ARC sites with ngtest -R.

When submitting a job using a dteam certificate the lcgadmin queue can be used.
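
For instance, the same test as above but directed at the lcgadmin queue (a sketch, assuming a valid dteam proxy):

ngsub -c arc01.lcg.cscs.ch -e '&("executable" = "env2" )("stdout" = "stdout" )("queue" = "lcgadmin")(inputfiles=("env2" "file:///bin/env"))'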

Another way to test an ARC CE is to run from a UI:

$ arctest -c arc01.lcg.cscs.ch -J 1
Test submitted with jobid: gsiftp://arc01.lcg.cscs.ch:2811/jobs/gHHODmRf7zinZOuIepQ9oyOmABFKDmABFKDmRWFKDmABFKDmqRqWsn

$ arcstat gsiftp://arc01.lcg.cscs.ch:2811/jobs/gHHODmRf7zinZOuIepQ9oyOmABFKDmABFKDmRWFKDmABFKDmqRqWsn

Start/stop procedures

All three services can be individually restarted, but if you want to do it at once you can use grid-service restart.

Note: In the EMI-3 release grid-infosys no longer exists and its functionality is split between service nordugrid-arc-slapd and service nordugrid-arc-bdii. These two need to be started in that order (after a-rex has started) and stopped in reverse. There is also a new service, nordugrid-arc-inforeg, which needs to be started last. The grid-service2 script has been modified in PPARC_CE/files/opt/cscs/... as we still have older ARC machines in production that use grid-infosys.
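
In other words, on an EMI-3 node the start order is roughly the following (a sketch; stop the services in the reverse order):

service a-rex start
service nordugrid-arc-slapd start
service nordugrid-arc-bdii start
service nordugrid-arc-inforeg start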

Checking logs

  • ARC logs can be complex to analyze. We have developed a tool, ngtracejob (available in the PATH), that can do it for you. Just give it an ARC jobid as an argument.

[old] Manuals

[old] Issues

Information about issues found with this service, and how to deal with them.

Max memory increased by a factor of 1.5

Note: This is no longer an issue in the current Slurm setup running in production. The below is kept for historical purposes.

ATLAS needs its hard memory limit to be 3 GB instead of 2 GB. For that, a hack needs to be put back in place in /usr/share/arc/submit-pbs-job after upgrading the software:

if [ ! -z "$joboption_memory" ] ; then
#  echo "#PBS -l pvmem=${joboption_memory}mb" >> $LRMS_JOB_SCRIPT
  joboption_memory_hard=`echo  $joboption_memory \* 1.5  | bc | awk -F . '{print $1}'`
  echo "#PBS -l pvmem=${joboption_memory_hard}mb" >> $LRMS_JOB_SCRIPT
  echo "#PBS -l pmem=${joboption_memory}mb" >> $LRMS_JOB_SCRIPT
fi
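
As a worked example of the factor 1.5, a 2048 MB request ends up as pvmem=3072mb and pmem=2048mb:

$ echo 2048 \* 1.5 | bc | awk -F . '{print $1}'
3072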

SubCluster publishing PhysicalCPUs and LogicalCPUs

grid-infosys was publishing the GlueSubCluster information, but our CreamCEs are the ones that are supposed to publish the number of physical and logical CPUs for GSTAT.

For that, you can edit /usr/share/arc/glue-generator.pl and change the values by hand:

GlueSubClusterPhysicalCPUs: 0
GlueSubClusterLogicalCPUs: 0 

OPS jobs getting into the cscs internal queue

ARC clients rely only on the information system (not the Glue part, but the nordugrid part) to choose the queue to send to, and the CSCS queue was publishing all users in the grid-mapfile as authorized. There are a few tricks to prevent this (because arc.conf is not enough), and one of them is this:

Oct 15 14:03 [root@arc02:arc]# pwd
/usr/share/arc
Oct 15 14:03 [root@arc02:arc]# diff -C3 ARC0ClusterInfo.pm.backup ARC0ClusterInfo.pm
*** ARC0ClusterInfo.pm.backup   2012-10-15 13:15:44.000000000 +0200
--- ARC0ClusterInfo.pm  2012-10-15 14:03:43.000000000 +0200
***************
*** 441,446 ****
--- 441,447 ----
                  while (1) {
                      return undef unless ($sn, $localid) = each %$usermap;
                      $lrms_user = $qinfo->{users}{$localid};
+                   if ($q->{'name'} eq "cscs" and $sn !~ m/Pablo Fern/) { next; }
                      last if not exists $qinfo->{acl_users};
                      last if grep { $_ eq $localid } @{$qinfo->{acl_users}};
                  } 

Publish accounting records using APEL (with a custom blah parser)

[To be updated] Publishing accounting information with APEL has never worked within the Nordugrid software (it uses a different accounting method). We have, however, developed a script that converts the nordugrid usage records into blahp format. This is how the script works:

  • It looks into /var/spool/nordugrid/usagerecords/archive for last month's job records (one XML file per job).
  • It selects only those that have a PBS entry (i.e. correctly submitted jobs) and generates a blahp record for each job (one line per job, in one file per full day).
  • At the end, you have a directory /var/spool/nordugrid/usagerecords/blahp with the right format (the same as the CreamCE /var/log/cream/accounting).
Then, if you configure a blah parser that checks that directory and run it every day (like on any other CreamCE), it will populate the APEL database correctly. The script has no special requirements, but it needs to be modified to match the site parameters (it makes some assumptions, but there may be no changes needed at all).

This is what is needed for the whole accounting to work:

  • Get the apel_parser_fake.py script (attached to this document) and create a CRON job to run it every day.
  • Install the APEL parser: get the UMD1 base/updates repository, and yum install glite-apel-pbs. You may need to work on the dependencies a bit.
    • To fix the geronimo dependency, you need to downgrade log4j:
      rpm -U ftp://mirror.switch.ch/pool/1/mirror/scientificlinux/5rolling/x86_64/SL/log4j-1.2.13-3jpp.2.x86_64.rpm --oldpackage
      rpm -e geronimo-jms-1.1-api-1.2-13.jpp5.noarch
    • Copy the /etc/glite-apel-pbs/parser-config-yaim.xml file from a CreamCE (generated by yaim) and modify it:
    • Modify the parser-config-yaim.xml parameters to match the specific server you are configuring. Especially, /var/spool/nordugrid/usagerecords/blahp should be in the BlahdLogProcessor directory section. Also, the BDII port could be 2135 instead of 2170 (in the GIIS section).
    • Remove the EventsRecord section if the PBS entries are already created by another CE (to avoid duplication, although it does not hurt to have it).
    • Create a cron job that runs the parser every day (after apel_parser_fake.py); you could probably take it from a CreamCE too. A cron sketch follows this list.
  • Create a mysql username in the APEL server for this purpose (find instructions in the ApelServiceCard) that matches what you specified in the parser-config-yaim.xml file.
  • Have ARC publish GLUE information with a BDII, with the right CPU performance information (it should have the GlueCEUniqueID entry). It should be visible to the APEL publisher too.
  • Add your ARC server to GOCDB as an APEL entry.
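
A sketch of the two daily cron jobs (the path to apel_parser_fake.py is an assumption; the exact APEL parser invocation should be copied from a CreamCE as noted above):

# /etc/cron.d/arc-apel (sketch)
30 2 * * * root /opt/cscs/bin/apel_parser_fake.py
0 3 * * * root <APEL parser command copied from a CreamCE>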

How to disable job submissions on an ARC CE

To disable job submission, it is enough to modify the arc.conf file through CFEngine:

/etc/arc.conf

[...]
[gridftpd/jobs]
path="/jobs"
plugin="jobplugin.so"
allownew="yes"                      # set this to "no"
[...]

Make the CFEngine agent run and then restart the grid services:

# grid-service2 restart
ServiceCardForm
Service name: Arc CE
Machines this service is installed in: arc[01,02]
Is Grid service: Yes
Depends on the following services: lrms, nas, gpfs
Expert: Gianni Ricciardi
CM: Puppet
Provisioning: PuppetForeman