How to access, set up, and test your account
Compulsory Setups
All the documentation is maintained in the T3 twiki pages:
https://wiki.chipp.ch/twiki/bin/view/CmsTier3/WebHome . Please bookmark it and explore the T3's capabilities.
Information about the two T3 mailing lists:
- Subscribe to the cms-tier3-users@lists.psi.ch mailing list using its web interface (list archives): this mailing list is used to communicate information on Tier-3 matters (downtimes, outages, news, upgrades, etc.) and for discussions among users and admins.
- To contact the CMS Tier-3 administrators privately, write to cms-tier3@lists.psi.ch instead; no subscription is needed for this mailing list.
- Both lists are read by the administrators and are archived. You can submit support requests to either of them, but emails addressed to cms-tier3-users@lists.psi.ch are read by everyone, so they may get answered better and sooner, especially if you ask about specific CMS software (CRAB3, CMSSW, Xrootd, ...).
T3 policies
Read and respect the Tier3Policies.
Linux groups
Each T3 account belongs to a primary group and to the common secondary group cms, which is used for shared files like the ones downloaded on demand by the PhEDEx service. The primary groups are:
| ETHZ | UniZ | PSI |
| ethz-bphys | uniz-pixel | psi-pixel |
| ethz-ecal | uniz-higgs | psi-bphys |
| ethz-ewk | uniz-bphys | |
| ethz-higgs | | |
| ethz-susy | | |
These primary groups allow:
- easy monitoring of the /pnfs , /shome , /scratch and /tmp spaces
- easy batch system monitoring
- protecting the user /pnfs files, since only the owner will be able to delete his/her own files; that's usually NOT guaranteed by the other CMS T1/T2/T3 sites!
- protecting the T3 "group dirs" /pnfs/psi.ch/cms/trivcat/store/t3groups/ , since only the group members will be able to upload/delete their own files
For instance, these are the primary and the secondary groups of a generic T3 account:
$ id auser
uid=571(auser) gid=532(ethz-higgs) groups=532(ethz-higgs),500(cms)
The following is a partial list of the private user dirs under /pnfs/psi.ch/cms/trivcat/store/user/ , where the dir protection is clearly visible:
$ ls -l /pnfs/psi.ch/cms/trivcat/store/user | grep -v cms
total 56
drwxr-xr-x 2 alschmid uniz-bphys 512 Feb 21 2013 alschmid
drwxr-xr-x 5 amarini ethz-ewk 512 Nov 7 15:37 amarini
drwxr-xr-x 2 arizzi ethz-bphys 512 Sep 16 17:49 arizzi
drwxr-xr-x 5 bean psi-bphys 512 Aug 24 2010 bean
drwxr-xr-x 5 bianchi ethz-higgs 512 Sep 9 09:40 bianchi
drwxr-xr-x 98 buchmann ethz-susy 512 Nov 5 20:36 buchmann
...
The following are the T3 "group dirs" under /pnfs/psi.ch/cms/trivcat/store/t3groups/ ; since they're meant to serve a whole group, they belong to the root account, so no user will be able to remove these dirs:
$ ls -l /pnfs/psi.ch/cms/trivcat/store/t3groups/
total 5
drwxrwxr-x 2 root ethz-bphys 512 Nov 8 15:18 ethz-bphys
drwxrwxr-x 2 root ethz-ecal 512 Nov 8 15:18 ethz-ecal
drwxrwxr-x 2 root ethz-ewk 512 Nov 8 15:18 ethz-ewk
drwxrwxr-x 2 root ethz-higgs 512 Nov 8 15:18 ethz-higgs
drwxrwxr-x 2 root ethz-susy 512 Nov 8 15:18 ethz-susy
drwxrwxr-x 2 root psi-bphys 512 Nov 8 15:18 psi-bphys
drwxrwxr-x 2 root psi-pixel 512 Nov 8 15:18 psi-pixel
drwxrwxr-x 2 root uniz-bphys 512 Nov 8 15:18 uniz-bphys
drwxrwxr-x 2 root uniz-higgs 512 Nov 8 15:18 uniz-higgs
drwxrwxr-x 2 root uniz-pixel 512 Nov 8 15:18 uniz-pixel
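For instance, a group member could upload a file into his/her group dir through the T3 xrootd LAN door (a sketch assuming the xrdcp client is available on the UIs; the door is the one reported by test-dCacheProtocols further below, while the group dir and file name are just illustrative):
$ xrdcp myfile.root root://t3dcachedb03.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/t3groups/ethz-higgs/myfile.root   # illustrative group dir and file name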
User Interfaces ( UI )
Three similar User Interface (UI) servers are available both to develop your programs and to submit computational jobs to the T3 batch system by the qsub command.
Access to the login nodes is based on the institution; the access is not technically restricted, to allow for some freedom, but you are requested to use the UI dedicated to your institution:
| UI Login node | for institution | HW specs |
| t3ui01.psi.ch | ETHZ, PSI | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
| t3ui02.psi.ch | All | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
| t3ui03.psi.ch | UNIZ | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
- Log into a t3ui0* machine by ssh ; use the -Y or -X flag for working with X applications. You might also try to connect by an NX client, which allows you to work efficiently with your graphical applications:
ssh -Y username@t3ui02.psi.ch
- If you are an external PSI user you'll have to change your initial password the first time you log in; simply use the standard passwd tool.
- Copy your grid credentials to the standard places, i.e. to ~/.globus/userkey.pem and ~/.globus/usercert.pem , and make sure that their file permissions are properly set:
-rw-r--r-- 1 feichtinger cms 2961 Mar 17 2008 usercert.pem
-r-------- 1 feichtinger cms 1917 Mar 17 2008 userkey.pem
For details about how to extract those .pem files from your CERN User Grid-Certificate please read https://gridca.cern.ch/gridca/Help/?kbid=024010.
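A minimal sketch of that extraction, assuming you exported the certificate from your browser as mycert.p12 (the file name is just an example; the link above is the authoritative procedure):
$ mkdir -p ~/.globus
$ openssl pkcs12 -in mycert.p12 -clcerts -nokeys -out ~/.globus/usercert.pem   # mycert.p12 is an assumed name
$ openssl pkcs12 -in mycert.p12 -nocerts -out ~/.globus/userkey.pem
$ chmod 644 ~/.globus/usercert.pem ; chmod 400 ~/.globus/userkey.pem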
- Source the grid environment associated with your login shell:
source /swshare/psit3/etc/profile.d/cms_ui_env.sh # for bash
source /swshare/psit3/etc/profile.d/cms_ui_env.csh # for tcsh
- ( Optional ) Modify your shell init files in order to automatically load the grid environment ; for BASH that means placing :
[ `echo $HOSTNAME | grep t3ui` ] && [ -r /swshare/psit3/etc/profile.d/cms_ui_env.sh ] && source /swshare/psit3/etc/profile.d/cms_ui_env.sh && echo "UI features enabled"
into your ~/.bash_profile file.
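For tcsh, a possible equivalent for your ~/.cshrc would be (a sketch, assuming your login shell is tcsh and relying on the $HOST variable that tcsh sets):
# sketch: auto-load the grid environment on the T3 UIs only
if ( "$HOST" =~ t3ui* ) then
    if ( -r /swshare/psit3/etc/profile.d/cms_ui_env.csh ) then
        source /swshare/psit3/etc/profile.d/cms_ui_env.csh
        echo "UI features enabled"
    endif
endif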
- Run env|sort and verify that /swshare/psit3/etc/profile.d/cms_ui_env.{sh,csh} has properly activated the setting X509_USER_PROXY=/shome/$(id -un)/.x509up_u$(id -u) ; that setting is crucial to access a CMS Grid SE from your T3 jobs.
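For instance, re-using the generic account shown earlier:
$ env | sort | grep X509_USER_PROXY
X509_USER_PROXY=/shome/auser/.x509up_u571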
- You must register to the CMS "Virtual Organization" service, or the following command voms-proxy-init -voms cms won't work; CERN provides the details about that registration, e.g. who your representative is.
- Create a proxy certificate for CMS by:
voms-proxy-init -voms cms
If the command fails, run it again with an additional -debug flag; the error message will usually be sufficient for the T3 Admins to troubleshoot the problem.
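For example:
$ voms-proxy-init -voms cms -debug
$ voms-proxy-info --all    # shows the proxy lifetime and the VOMS attributes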
- Test your access to the PSI Storage Element by the test-dCacheProtocols command; you should get an output like this (possibly without failed tests). Sometimes the XROOTD-WAN-* tests might get stuck due to I/O traffic coming from the Internet, but as a local T3 user you're supposed to use the XROOTD-LAN-* I/O doors, which are protected from Internet users, so you can simply skip the XROOTD-WAN-* tests, either by pressing Ctrl-C or by passing the option -i "XROOTD-WAN-write" (see below).
$ test-dCacheProtocols
Test directory: /tmp/dcachetest-20150529-1449-14476
TEST: GFTP-write ...... [OK] <-- vs gsiftp://t3se01.psi.ch:2811/
TEST: GFTP-ls ...... [OK]
TEST: GFTP-read ...... [OK]
TEST: DCAP-read ...... [OK] <-- vs dcap://t3se01.psi.ch:22125/
TEST: SRMv2-write ...... [OK] <-- vs srm://t3se01.psi.ch:8443/
TEST: SRMv2-ls ...... [OK]
TEST: SRMv2-read ...... [OK]
TEST: SRMv2-rm ...... [OK]
TEST: XROOTD-LAN-write ...... [OK] <-- vs root://t3dcachedb03.psi.ch:1094/ <-- Use this if you run LOCAL jobs at T3 and you need root:// access to the T3 files
TEST: XROOTD-LAN-ls ...... [OK]
TEST: XROOTD-LAN-read ...... [OK]
TEST: XROOTD-LAN-rm ...... [OK]
TEST: XROOTD-WAN-write ...... [OK] <-- vs root://t3se01.psi.ch:1094/ <-- Use this if you run REMOTE jobs and you need root:// access to the T3 files ; e.g. you're working on lxplus
TEST: XROOTD-WAN-ls ...... [OK]
TEST: XROOTD-WAN-read ...... [OK]
TEST: XROOTD-WAN-rm ...... [OK]
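To skip the WAN tests entirely, you can ignore the single test that the other WAN tests depend on (as the remote-SE example below shows, ignored tests are reported as [IGNORE] and their dependents as [SKIPPED]):
$ test-dCacheProtocols -i "XROOTD-WAN-write"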
- The test-dCacheProtocols tool can also be used to check a remote storage element (use the -h flag to get more info about it); e.g. to check the CSCS storage element storage01.lcg.cscs.ch :
$ test-dCacheProtocols -s storage01.lcg.cscs.ch -x storage01.lcg.cscs.ch -p /pnfs/lcg.cscs.ch/cms/trivcat/store/user/martinel -i "DCAP-read XROOTD-LAN-write XROOTD-WAN-write"
Test directory: /tmp/dcachetest-20150529-1545-16302
TEST: GFTP-write ...... [OK]
TEST: GFTP-ls ...... [OK]
TEST: GFTP-read ...... [OK]
TEST: DCAP-read ...... [IGNORE]
TEST: SRMv2-write ...... [OK]
TEST: SRMv2-ls ...... [OK]
TEST: SRMv2-read ...... [OK]
TEST: SRMv2-rm ...... [OK]
TEST: XROOTD-LAN-write ...... [IGNORE]
TEST: XROOTD-LAN-ls ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-LAN-read ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-LAN-rm ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-WAN-write ...... [IGNORE]
TEST: XROOTD-WAN-ls ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
TEST: XROOTD-WAN-read ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
TEST: XROOTD-WAN-rm ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
Optional Setups
Installing the CERN CA files into your Web Browser
Install in your Web Browser the CERN CA files, otherwise your Web Browser might constantly bother you about all the CERN https:// URLs; Web Browsers typically feature many well-known CA files by default, but not the CERN CA files.
Applying for the VOMS Group /cms/chcms membership
A 'Swiss' VOMS Group /cms/chcms is available to assign more CPU/Storage priority to the community of LHC physicists in Switzerland.
All the CMS Swiss users should apply for the VOMS Group /cms/chcms membership in order to get:
- a higher priority on the T2_CH_CSCS batch queues
- more job slots on the T2_CH_CSCS batch queues
- more /pnfs space in the T2_CH_CSCS grid storage
- in the future, the same file protection mechanism featured by the PSI T3
Once the /cms/chcms membership is granted, a proxy created by voms-proxy-init --voms cms will carry the additional attribute:
$ voms-proxy-info --all | grep /cms
attribute : /cms/Role=NULL/Capability=NULL
attribute : /cms/chcms/Role=NULL/Capability=NULL
To apply for the /cms/chcms membership, load your X509 into your daily Web Browser (probably the X509 is already loaded there), then click on https://voms2.cern.ch:8443/voms/cms/group/edit.action?groupId=5 and ask for the /cms/chcms membership; be aware that the port :8443 might be blocked by your Institute firewall, and if that's the case contact your Institute network team or simply try from another network.
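A quick way to test whether that port is reachable from your network (assuming the netcat client nc is installed):
$ nc -zv voms2.cern.ch 8443    # reports whether the TCP connection succeeds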
Saving the UIs SSH pub host keys
Hackers are constantly waiting for user mistakes, even a single misspelled letter, like in this case that occurred in 2015:
$ ssh t3ui02.psi.sh
The authenticity of host 't3ui02.psi.sh (62.210.217.195)' can't be established.
RSA key fingerprint is c0:c5:af:36:4b:2d:1f:88:0d:f3:9c:08:cc:87:df:42.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 't3ui02.psi.sh,62.210.217.195' (RSA) to the list of known hosts.
at3user@t3ui02.psi.sh's password:
The T3 Admins can't prevent a T3 user from confusing a .ch with a .sh , so pay attention to these cases! To avoid mistyping the T3 hostnames you can define the following aliases in your shell:
$ grep alias ~/.bash_profile | grep t3ui
alias ui01='ssh -X at3user@t3ui01.psi.ch'
alias ui02='ssh -X at3user@t3ui02.psi.ch'
alias ui03='ssh -X at3user@t3ui03.psi.ch'
Another hacker attack is the SSH man-in-the-middle attack; to prevent it, proactively save each t3ui0* SSH RSA public key in $HOME/.ssh/known_hosts by running these commands on each of your daily laptops/PCs/servers (also on lxplus !):
cp -p $HOME/.ssh/known_hosts $HOME/.ssh/known_hosts.`date +"%d-%m-%Y"`
mkdir /tmp/t3ssh/
for X in 01 02 03 ; do TMPFILE=`mktemp /tmp/t3ssh/XXXXXX` && ssh-keyscan -t rsa t3ui$X.psi.ch,t3ui$X,`host t3ui$X.psi.ch | awk '{ print $4}'` | cat - $HOME/.ssh/known_hosts | grep -v 'psi\.sh' > $TMPFILE && mv $TMPFILE $HOME/.ssh/known_hosts ; done
rm -rf /tmp/t3ssh
for X in 01 02 03 ; do echo -n "# entries for t3ui$X = " ; grep -c t3ui$X $HOME/.ssh/known_hosts ; grep -Hn --color t3ui$X $HOME/.ssh/known_hosts ; echo ; done
echo done
The last for loop reports whether there are duplicated rows in $HOME/.ssh/known_hosts for a t3ui0* server; if there are, you have to preserve the correct occurrence and delete the others, either with sed -i or with an editor like vim / emacs / nano / nedit . Once you get just one row per t3ui0* server, run this command and carefully compare your output with this output:
$ ssh-keygen -l -f $HOME/.ssh/known_hosts | grep t3ui
2048 SHA256:0Z8Su5R4aZthbePGMM14mEKxYFOuKyrnUe9GjU0m6vM t3ui01.psi.ch,192.33.123.23 (RSA)
2048 SHA256:2qA9YDNeOEbGYjIdpRdBJpywQDne5gRbRvN/myL5P8o t3ui02.psi.ch,192.33.123.29 (RSA)
2048 SHA256:SoIL0H0ueyASNkyYID3a16AIHuAEP7AQ5iaQ6vrvzfk t3ui03.psi.ch,192.33.123.85 (RSA)
Then modify your client $HOME/.ssh/config in order to force the ssh command to always check whether the server you're connecting to is already reported in the $HOME/.ssh/known_hosts file, and to ask for your 'OK' for any server that is missing:
StrictHostKeyChecking ask
Your $HOME/.ssh/config can be more complex than just that line; study the ssh_config man page or contact the T3 Admins. Ideally you should put StrictHostKeyChecking yes , but in real life that's impractical.
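For instance, a possible (not mandatory) $HOME/.ssh/config layout that scopes the strict setting to the T3 UIs only, relying on ssh_config's first-match rule:
# strict checking for the T3 UIs, whose host keys never change
Host t3ui0?.psi.ch
    StrictHostKeyChecking yes

# ask for confirmation everywhere else
Host *
    StrictHostKeyChecking ask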
Now your ssh client will be able to detect the SSH man-in-the-middle attacks, and in that case it will report:
WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The t3ui0* SSH RSA public/private keys will never change, so the case "It is also possible that the RSA host key has just been changed" will actually never occur.
Creating an AFS CERN Ticket
To access the CERN /afs protected dirs (e.g. your CERN home on AFS) you'll need to create a ticket from CERN AFS:
kinit ${Your_CERN_Username}@CERN.CH
aklog cern.ch
The first command will provide you with a Kerberos ticket, while the second command will use the Kerberos ticket to obtain an authentication token from CERN's AFS service.
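You can verify both credentials afterwards (assuming the standard Kerberos and OpenAFS client tools are installed on the UI):
klist      # lists your Kerberos tickets
tokens     # lists your AFS tokens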
The T3 Admins Skype Accounts
To get help with your daily T3/T2 errors and to discuss your 'what-if' T3/T2 future plans, add the principal T3 Administrator 'Fabio Martinelli' fabio.martinelli_2 to your professional Skype account.
Please notice that this Skype account is meant as a 2nd level of support; first of all, always write an email to cms-tier3@lists.psi.ch describing your error, possibly how to reproduce it, on which UI you're working, and providing as many meaningful logs as you can.
Backup policies
The user /shome files are backed up every hour (kept at most 36h) and every day (kept at most 10 days); recovering a file is as simple as running a cp command. Further details are here: HowToRetrieveBackupFiles. There are NO backups for the /scratch or /pnfs files instead, so be careful!
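A purely hypothetical example of such a cp (the snapshot path below is invented for illustration only; HowToRetrieveBackupFiles documents the actual location of the backup copies):
$ cp /shome/.snapshot/hourly.0/$USER/myanalysis.C ~/myanalysis.C.restored   # hypothetical snapshot path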
Web browsing your /shome files on demand
We don't provide a http{s}:// URL to browse your /shome logs/errors/programs, because there has always been only modest interest in such a web portal, but you can turn on a private website rooted on an arbitrary dir of yours by simply using SSH + Python, like in the following example (replace t3ui02 with your daily t3ui server and the dir with a dir meaningful for your case, for instance ~ ):
ssh -L 8000:t3ui02.psi.ch:8000 t3user@t3ui02.psi.ch "killall python ; cd /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ && python -m SimpleHTTPServer"
Then open your Web browser at http://localhost:8000/ . That's it.
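A quick sanity check from your laptop while the tunnel is active (assuming curl is available):
$ curl -sI http://localhost:8000/ | head -1    # HEAD request against the tunnelled port
HTTP/1.0 200 OK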
The preliminary killall python command is meant to kill a previous python -m SimpleHTTPServer run that might be still active, but if you have other python programs running they will also be killed; in that case delete the initial killall python command and kill a previous python -m SimpleHTTPServer by:
t3ui02 $ kill -9 `pgrep -f "^python -m SimpleHTTPServer"`
If some other T3 user is already using the t3ui02.psi.ch:8000 TCP port then use another port like 8001, 8002, etc.:
ssh -L 8000:t3ui02.psi.ch:8001 t3user@t3ui02.psi.ch "killall python ; cd /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ && python -m SimpleHTTPServer 8001"