How to access, set up, and test your account
Preliminary steps
All the documentation is maintained in the T3 twiki pages:
https://wiki.chipp.ch/twiki/bin/view/CmsTier3/WebHome . Please bookmark it and explore the T3's capabilities.
Information about the two T3 mailing lists:
- Subscribe to the cms-tier3-users@lists.psi.ch mailing list using its web interface (list archives): this mailing list is used to communicate information on Tier-3 matters (downtimes, outages, news, upgrades, etc.) and for discussions among users and admins.
- To contact the CMS Tier-3 administrators privately, write to cms-tier3@lists.psi.ch instead; no subscription is needed for this mailing list.
- Both lists are read by the administrators and are archived. You can submit support requests to either of them, but emails addressed to cms-tier3-users@lists.psi.ch are read by everyone, so they may get answered better and sooner, especially if you ask about specific CMS software ( CRAB3, CMSSW, Xrootd, ... ).
T3 policies
Read and follow the Tier3Policies.
Specific primary groups + a common secondary group
Each T3 account belongs to a specific primary group and to the general secondary group cms, which is used for common files like the ones uploaded to the T3 by PhEDEx, the CMS dataset transfer service installed at each T1/T2/T3. The primary groups are:
| ETHZ | UniZ | PSI |
| ethz-ewk | uniz-bphys | |
| ethz-higgs | | |
| ethz-susy | | |
| ethz-ecal | uniz-higgs | psi-bphys |
| ethz-bphys | uniz-pixel | psi-pixel |
The primary groups allow:
- intuitive /pnfs , /shome , /scratch and /tmp space usage monitoring and batch system accounting
- /pnfs file and dir locking, since only the file owner can delete his/her files; this protection is usually NOT guaranteed at other CMS T1/T2/T3 sites!
- safe management of the T3 "group dirs" /pnfs/psi.ch/cms/trivcat/store/t3groups/ where only the group members can upload/delete files
As an example, these are the primary and secondary groups of a generic T3 account:
$ id auser
uid=571(auser) gid=532(ethz-higgs) groups=532(ethz-higgs),500(cms)
The following is an overview of the T3 account dirs in /pnfs/psi.ch/cms/trivcat/store/user/ , where the dir protection is clearly visible:
$ ls -l /pnfs/psi.ch/cms/trivcat/store/user | grep -v cms
total 56
drwxr-xr-x 2 alschmid uniz-bphys 512 Feb 21 2013 alschmid
drwxr-xr-x 5 amarini ethz-ewk 512 Nov 7 15:37 amarini
drwxr-xr-x 2 arizzi ethz-bphys 512 Sep 16 17:49 arizzi
drwxr-xr-x 5 bean psi-bphys 512 Aug 24 2010 bean
drwxr-xr-x 5 bianchi ethz-higgs 512 Sep 9 09:40 bianchi
drwxr-xr-x 98 buchmann ethz-susy 512 Nov 5 20:36 buchmann
...
The following are instead the T3 "group dirs" /pnfs/psi.ch/cms/trivcat/store/t3groups/ ; since they're "group dirs" they don't have a specific owner and belong to the root account:
$ ls -l /pnfs/psi.ch/cms/trivcat/store/t3groups/
total 5
drwxrwxr-x 2 root ethz-bphys 512 Nov 8 15:18 ethz-bphys
drwxrwxr-x 2 root ethz-ecal 512 Nov 8 15:18 ethz-ecal
drwxrwxr-x 2 root ethz-ewk 512 Nov 8 15:18 ethz-ewk
drwxrwxr-x 2 root ethz-higgs 512 Nov 8 15:18 ethz-higgs
drwxrwxr-x 2 root ethz-susy 512 Nov 8 15:18 ethz-susy
drwxrwxr-x 2 root psi-bphys 512 Nov 8 15:18 psi-bphys
drwxrwxr-x 2 root psi-pixel 512 Nov 8 15:18 psi-pixel
drwxrwxr-x 2 root uniz-bphys 512 Nov 8 15:18 uniz-bphys
drwxrwxr-x 2 root uniz-higgs 512 Nov 8 15:18 uniz-higgs
drwxrwxr-x 2 root uniz-pixel 512 Nov 8 15:18 uniz-pixel
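For instance, a member of the ethz-higgs group (just an example group) could upload a file into the corresponding group dir through the local xrootd door, as in this sketch (a valid grid proxy, set up in the steps below, is assumed; myfile.root is a hypothetical file):
$ xrdcp myfile.root root://t3dcachedb03.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/t3groups/ethz-higgs/myfile.root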
Your T3 account and the UIs
We provide the following SL6 user interfaces (UIs), meant both to develop/test your programs and to submit them as jobs to the batch system.
Access to the login nodes is based on the institution: access is not restricted, to allow for some freedom, but you are requested to use the UI dedicated to your institution.
| UI Login node | for institution | HW specs |
| t3ui01.psi.ch | ETHZ, PSI | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
| t3ui02.psi.ch | All | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
| t3ui03.psi.ch | UNIZ | 132GB RAM, 72 CPU cores (HT), 5TB /scratch |
- Log into a t3ui1* machine by ssh; use the -Y or -X flag for working with X applications; you might also try to connect by NX client, which allows you to work efficiently with your graphical applications:
ssh -Y username@t3ui12.psi.ch
- If you are an external PSI user you'll have to change your initial password the first time you log in; simply use the standard passwd tool.
- Copy your grid credentials to the standard places, i.e. to ~/.globus/userkey.pem and ~/.globus/usercert.pem , and make sure that their file permissions are properly set:
-rw-r--r-- 1 feichtinger cms 2961 Mar 17 2008 usercert.pem
-r-------- 1 feichtinger cms 1917 Mar 17 2008 userkey.pem
For details about how to extract those .pem files from your CERN User Grid-Certificate please read https://gridca.cern.ch/gridca/Help/?kbid=024010 ; a sketch of the usual commands is shown below.
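A minimal sketch, assuming your browser-exported certificate is called myCertificate.p12 (a hypothetical file name); openssl will prompt for the export password and, for the key, a new passphrase:
$ openssl pkcs12 -in myCertificate.p12 -clcerts -nokeys -out ~/.globus/usercert.pem
$ openssl pkcs12 -in myCertificate.p12 -nocerts -out ~/.globus/userkey.pem
$ chmod 644 ~/.globus/usercert.pem
$ chmod 400 ~/.globus/userkey.pem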
- Source the grid environment associated with your login shell:
source /swshare/psit3/etc/profile.d/cms_ui_env.sh # for bash
source /swshare/psit3/etc/profile.d/cms_ui_env.csh # for tcsh
- ( Optional ) Modify your shell init files to automatically load the grid environment; for BASH that means placing:
[ `echo $HOSTNAME | grep t3ui` ] && [ -r /swshare/psit3/etc/profile.d/cms_ui_env.sh ] && source /swshare/psit3/etc/profile.d/cms_ui_env.sh && echo "UI features enabled"
into your ~/.bash_profile file.
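A TCSH equivalent for your ~/.cshrc could look like this (a sketch only, not tested on the T3):
if ( `hostname` =~ t3ui* && -r /swshare/psit3/etc/profile.d/cms_ui_env.csh ) then
    source /swshare/psit3/etc/profile.d/cms_ui_env.csh
    echo "UI features enabled"
endif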
- Run env|sort and verify that /swshare/psit3/etc/profile.d/cms_ui_env.{sh,csh} has properly activated the setting X509_USER_PROXY=/shome/$(id -un)/.x509up_u$(id -u) ; that setting is crucial to access a CMS Grid SE from your T3 jobs. For instance:
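A quick check; the path below corresponds to the generic auser account shown earlier:
$ env | sort | grep X509_USER_PROXY
X509_USER_PROXY=/shome/auser/.x509up_u571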
- You must register with the CMS "Virtual Organization" service or the following command voms-proxy-init -voms cms won't work; see the CERN documentation about that, e.g. to find out who your representative is.
- Create a proxy certificate for CMS by:
voms-proxy-init -voms cms
If the voms-proxy-init -voms cms command fails, run it again with an additional -debug flag; the error message will usually be sufficient for the T3 Admins to troubleshoot the problem.
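To inspect the resulting proxy (attributes, remaining lifetime) you can run:
voms-proxy-info --all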
- Test your access to the PSI Storage Element by the test-dCacheProtocols command; you should get an output like the following (possibly without failed tests). Sometimes the XROOTD-WAN-* tests might get stuck due to I/O traffic coming from the Internet, but as a local T3 user you're actually supposed to use the XROOTD-LAN-* I/O doors, which are protected from Internet users, so you can simply skip the XROOTD-WAN-* tests by either pressing Ctrl-C or by passing the option -i "XROOTD-WAN-write" (see the CSCS example below).
$ test-dCacheProtocols
Test directory: /tmp/dcachetest-20150529-1449-14476
TEST: GFTP-write ...... [OK] <-- vs gsiftp://t3se01.psi.ch:2811/
TEST: GFTP-ls ...... [OK]
TEST: GFTP-read ...... [OK]
TEST: DCAP-read ...... [OK] <-- vs dcap://t3se01.psi.ch:22125/
TEST: SRMv2-write ...... [OK] <-- vs srm://t3se01.psi.ch:8443/
TEST: SRMv2-ls ...... [OK]
TEST: SRMv2-read ...... [OK]
TEST: SRMv2-rm ...... [OK]
TEST: XROOTD-LAN-write ...... [OK] <-- vs root://t3dcachedb03.psi.ch:1094/ <-- Use this if you run LOCAL jobs at T3 and you need root:// access to the T3 files
TEST: XROOTD-LAN-ls ...... [OK]
TEST: XROOTD-LAN-read ...... [OK]
TEST: XROOTD-LAN-rm ...... [OK]
TEST: XROOTD-WAN-write ...... [OK] <-- vs root://t3se01.psi.ch:1094/ <-- Use this if you run REMOTE jobs and you need root:// access to the T3 files ; e.g. you're working on lxplus
TEST: XROOTD-WAN-ls ...... [OK]
TEST: XROOTD-WAN-read ...... [OK]
TEST: XROOTD-WAN-rm ...... [OK]
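As a practical illustration of the LAN and WAN doors flagged in the output above (a sketch; test.root is a hypothetical file, and on a remote machine replace $USER with your T3 username):
# local T3 job or UI session (LAN door):
$ xrdcp root://t3dcachedb03.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/user/$USER/test.root /scratch/
# remote session, e.g. on lxplus (WAN door):
$ xrdcp root://t3se01.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/user/$USER/test.root .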
- The test-dCacheProtocols tool can also be used to check a remote storage element (use the -h flag to get more info about it); e.g. to check the CSCS storage element storage01.lcg.cscs.ch :
$ test-dCacheProtocols -s storage01.lcg.cscs.ch -x storage01.lcg.cscs.ch -p /pnfs/lcg.cscs.ch/cms/trivcat/store/user/martinel -i "DCAP-read XROOTD-LAN-write XROOTD-WAN-write"
Test directory: /tmp/dcachetest-20150529-1545-16302
TEST: GFTP-write ...... [OK]
TEST: GFTP-ls ...... [OK]
TEST: GFTP-read ...... [OK]
TEST: DCAP-read ...... [IGNORE]
TEST: SRMv2-write ...... [OK]
TEST: SRMv2-ls ...... [OK]
TEST: SRMv2-read ...... [OK]
TEST: SRMv2-rm ...... [OK]
TEST: XROOTD-LAN-write ...... [IGNORE]
TEST: XROOTD-LAN-ls ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-LAN-read ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-LAN-rm ...... [SKIPPED] (dependencies did not run: XROOTD-LAN-write)
TEST: XROOTD-WAN-write ...... [IGNORE]
TEST: XROOTD-WAN-ls ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
TEST: XROOTD-WAN-read ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
TEST: XROOTD-WAN-rm ...... [SKIPPED] (dependencies did not run: XROOTD-WAN-write)
Creating an AFS CERN Ticket
In order to access the CERN /afs protected dirs (e.g. your home) you'll need to obtain a CERN Kerberos ticket and an AFS token:
kinit ${Your_CERN_Username}@CERN.CH
aklog cern.ch
The first command will provide you with a Kerberos ticket, while the second will use that Kerberos ticket to obtain an authentication token from CERN's AFS service.
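To verify, assuming the standard Kerberos and OpenAFS client tools are available on the UI:
$ klist    # lists your Kerberos tickets
$ tokens   # lists your AFS tokens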
Saving the t3ui1* SSH pub keys into your daily laptop/desktop/server
On the Internet, hackers are constantly waiting for user mistakes, even a single misspelled letter, like this case that occurred in 2015:
$ ssh t3ui02.psi.sh
The authenticity of host 't3ui02.psi.sh (62.210.217.195)' can't be established.
RSA key fingerprint is c0:c5:af:36:4b:2d:1f:88:0d:f3:9c:08:cc:87:df:42.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 't3ui02.psi.sh,62.210.217.195' (RSA) to the list of known hosts.
at3user@t3ui02.psi.sh's password:
The T3 Admins can't prevent a T3 user from confusing a .ch with a .sh , so pay attention to these cases! To avoid mistyping the T3 hostnames you can define aliases in your shell files, for instance for BASH:
$ grep alias ~/.bash_profile | grep t3ui
alias ui12='ssh -X at3user@t3ui12.psi.ch'
alias ui15='ssh -X at3user@t3ui15.psi.ch'
alias ui16='ssh -X at3user@t3ui16.psi.ch'
alias ui17='ssh -X at3user@t3ui17.psi.ch'
alias ui18='ssh -X at3user@t3ui18.psi.ch'
alias ui19='ssh -X at3user@t3ui19.psi.ch'
Further attacks are the SSH man-in-the-middle attacks; in order to detect them you have to register each t3ui1* SSH RSA public key in $HOME/.ssh/known_hosts by running these steps on each laptop/desktop/server (also lxplus !) that you usually use to log in to the T3:
cp -p $HOME/.ssh/known_hosts $HOME/.ssh/known_hosts.`date +"%d-%m-%Y"`
mkdir /tmp/t3ssh/
for X in 19 18 17 16 15 12 ; do TMPFILE=`mktemp /tmp/t3ssh/XXXXXX` && ssh-keyscan -t rsa t3ui$X.psi.ch,t3ui$X,`host t3ui$X.psi.ch| awk '{ print $4}'` | cat - $HOME/.ssh/known_hosts | grep -v 'psi\.sh' > $TMPFILE && mv $TMPFILE $HOME/.ssh/known_hosts ; done
rm -rf /tmp/t3ssh
for X in 12 15 16 17 18 19 ; do echo -n "# entries for t3ui$X = " ; grep -c t3ui$X $HOME/.ssh/known_hosts ; grep -Hn --color t3ui$X $HOME/.ssh/known_hosts ; echo ; done
echo done
The last for loop reports whether there are duplicated rows in $HOME/.ssh/known_hosts for a t3ui1* server; if there are, you're supposed to preserve the correct occurrence and delete the others; to delete them you can either use a tool like sed -i or simply an editor like vim or emacs. Once you have just one row per t3ui1* server, run this command and carefully compare your output with this output:
$ ssh-keygen -l -f /$HOME/.ssh/known_hosts | grep t3ui
2048 d0:9c:a0:e9:8f:9c:3f:b2:f1:88:6c:15:32:07:fc:a0 t3ui12.psi.ch,t3ui12,192.33.123.132 (RSA)
2048 77:1b:27:5e:c8:74:64:86:f8:50:f6:58:e6:6f:41:65 t3ui15.psi.ch,t3ui15,192.33.123.135 (RSA)
2048 35:bb:d6:be:64:86:8d:db:1d:57:43:ef:05:39:72:c8 t3ui16.psi.ch,t3ui16,192.33.123.136 (RSA)
2048 27:d1:57:f0:ac:da:1d:db:54:11:5c:46:4d:93:63:59 t3ui17.psi.ch,t3ui17,192.33.123.137 (RSA)
2048 b1:56:06:5b:d3:da:1a:79:60:e9:02:16:be:82:fe:f7 t3ui18.psi.ch,t3ui18,192.33.123.138 (RSA)
2048 73:fe:97:b2:e7:54:df:99:50:dc:19:3d:6f:cd:01:11 t3ui19.psi.ch,t3ui19,192.33.123.139 (RSA)
Then modify your client $HOME/.ssh/config to force the ssh command to always check whether the server you're connecting to is already listed in the $HOME/.ssh/known_hosts file, and to ask for your 'ok' for all the servers that are absent:
StrictHostKeyChecking ask
Your $HOME/.ssh/config can be more complex than just that line; study the ssh_config man page or contact the T3 Admins. Ideally you should put StrictHostKeyChecking yes , but in real life that's impractical.
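For instance, a minimal per-host stanza (a sketch; the ForwardX11 line is optional and merely saves typing the -X flag):
Host t3ui*.psi.ch
    StrictHostKeyChecking ask
    ForwardX11 yes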
Now your ssh client will be able to detect the SSH man-in-the-middle attacks, and in that case it will report:
WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The t3ui1* SSH RSA public and private keys will never be changed, so the case "It is also possible that the RSA host key has just been changed" will never be true.
Installing the CERN CA files into your Web Browser
Install and 'trust' every CERN CA file in the Web Browser where your X509 Digital Certificate is also loaded (a certificate that, in turn, you probably got from CERN as well):
https://cafiles.cern.ch/cafiles/
Applying for the VOMS Group /cms/chcms membership
A 'Swiss' VOMS group /cms/chcms is available to assign more CPU/storage priority to the community of LHC physicists in Switzerland; all the Swiss CMS users have to apply for the /cms/chcms membership in order to automatically get:
- higher priority on the T2_CH_CSCS batch queues
- additional job slots on the T2_CH_CSCS batch queues
- additional /pnfs space in the T2_CH_CSCS grid storage
- maybe, in the future, the same file locking mechanism offered by the PSI T3
Once the /cms/chcms membership is granted, the voms-proxy-init --voms cms command will transparently request both the general /cms/ role and the specific /cms/chcms role; the command output will be:
$ voms-proxy-info --all | grep /cms
attribute : /cms/Role=NULL/Capability=NULL
attribute : /cms/chcms/Role=NULL/Capability=NULL
To apply for the /cms/chcms attribute, load your X509 into your Web Browser (it's probably already there), click on https://voms2.cern.ch:8443/voms/cms/group/edit.action?groupId=5 and ask for the /cms/chcms membership. Be aware of that port :8443 , because your Institute network policies might block outgoing traffic to such an unusual TCP port; if that's the case then escalate the problem to your Institute network team, or simply request the /cms/chcms membership from another network (very simple, from your DSL at home) or from lxplus .
The T3 Admins Skype Accounts ( Optional )
In order to help the users both with their daily T3/T2 errors and with their 'what-if' T3/T2 plans, the principal T3 Administrator 'Fabio Martinelli' has created the Skype account fabio.martinelli_2 ; all the users are kindly invited to create a Skype account and add him as a contact.
Nevertheless, his Skype account is meant as a 2nd level of support; first of all always send an email to cms-tier3@lists.psi.ch describing your error, possibly how to reproduce it and from which UI, and providing logs + all the necessary info.
Backup policies
Snapshots of the /shome files are taken every hour (kept for a maximum of 36 hours) and every day (kept for a maximum of 10 days); further details are in HowToRetrieveBackupFiles . NO backups are instead available for the /scratch or /pnfs files, so be careful.
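Purely as an illustration (an assumption: the actual snapshot location and names are documented in HowToRetrieveBackupFiles), on snapshot-capable NFS storage a lost file can typically be recovered like this:
$ ls ~/.snapshot/                                      # list the available hourly/daily snapshots (hypothetical path)
$ cp ~/.snapshot/hourly.0/lostfile.txt ~/lostfile.txt  # lostfile.txt is a hypothetical file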
Browsing your /shome files on the Web
Currently there is no http{s}:// URL available to browse your /shome logs/errors/programs on the Web, but you can autonomously turn on a private website rooted at an arbitrary dir by simply using SSH and Python, like in the following example (replace t3ui12 with your t3ui server):
ssh -L 8000:t3ui12.psi.ch:8000 t3ui12.psi.ch "killall python ; cd /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ && python -m SimpleHTTPServer"
Now open your Web browser at the page http://localhost:8000/ and you'll be browsing the /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ files. That's it.
The preliminary killall python command is meant to kill a previous python -m SimpleHTTPServer invocation of yours that might still be active; but if you have other python programs running on the same t3ui server, that might be too aggressive; in that case remove the initial killall python command and kill the python -m SimpleHTTPServer process by hand.
If somebody else is using the t3ui12.psi.ch:8000 port then use another port like 8001 or 8002, etc.:
ssh -L 8000:t3ui12.psi.ch:8001 t3ui12.psi.ch "killall python ; cd /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ && python -m SimpleHTTPServer 8001"
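Should the UI ever provide Python 3 (an assumption; the SL6 UIs ship Python 2), the equivalent module is http.server :
ssh -L 8000:t3ui12.psi.ch:8001 t3ui12.psi.ch "cd /mnt/t3nfs01/data01/shome/ytakahas/work/TauTau/SFrameAnalysis/Scripts/ && python3 -m http.server 8001"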