<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page to be viewable only by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

---+!! Node Type: %CALC{"$SUBSTITUTE(%TOPIC%,NodeType,)"}%

---++!! Firewall requirements

| *local port* | *open to* | *reason* |
<!-- Example line
#| 22/tcp | * | Example entry for ssh |
-->

----------------

%TOC{title="Table of contents"}%

---+ Emergency Measures

---++ Too many frequent or heavy writes %N%

You can identify which user files are open in write mode with:

%TWISTY%
<pre>
%BLUE%[root@t3admin01 ~]# salt 't3*' cmd.run " lsof -w -N | grep shome | grep REG | egrep ' [0-9]*u | [0-9]*w '| awk '{ print \$9}' | xargs -I {} -i bash -c 'ls -lh {}' "%ENDCOLOR%
t3bdii02.psi.ch:
t3ldap02.psi.ch:
t3frontier01.psi.ch:
t3ldap01:
t3wn42.psi.ch:
t3ce01.psi.ch:
t3wn18.psi.ch:
t3wn28.psi.ch:
t3wn38.psi.ch:
t3wn40.psi.ch:
t3wn12.psi.ch:
t3wn23.psi.ch:
t3wn26.psi.ch:
t3wn17.psi.ch:
t3wn19.psi.ch:
t3fs14.psi.ch:
t3bdii01.psi.ch:
t3nagios.psi.ch:
t3wn31.psi.ch:
t3wn21.psi.ch:
t3ui17.psi.ch:
    -rw-r--r-- 1 mquittna ethz-ecal 16K Jul 8 09:18 /shome/mquittna/CMSSW/EXO_7_4_0_pre9/src/diphotons/Analysis/macros/.combine_maker.py.swp
    -rw-r--r-- 1 gaperrin ethz-susy 20K Jul 8 13:31 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.FitInvMassBkg.C.swp
    -rw-r--r-- 1 gaperrin ethz-susy 12K Jul 8 13:34 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.start_job2.sh.swp
    -rw-r--r-- 1 gaperrin ethz-susy 12K Jul 8 11:43 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.job2.sh.swp
    -rw-r--r-- 1 gaperrin ethz-susy 48K Jul 8 13:50 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.DrawInvMassBkg_combi.cc.swp
    -rw-r--r-- 1 gaperrin ethz-susy 52K Jul 8 13:45 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.MC_Ratio.C.swp
    -rw-r--r-- 1 gaperrin ethz-susy 48K Jul 8 13:47 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.TandP.C.swp
    -rw-r--r-- 1 gaperrin ethz-susy 36K Jul 8 13:50 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.CompareMCvsTandP.cc.swp
    -rw-r--r-- 1 gaperrin ethz-susy 12K Jul 8 13:52 /shome/gaperrin/tnp_gael/SSDLBkgEstimationTP/TandP/.start_job.sh.swp
    -rw-r--r-- 1 mdunser ethz-susy 88K May 5 10:33 /shome/mdunser/FakeLeptonFW/macros/.closure.py.swo
    ls: cannot access /shome/bianchi/TTH-72X-heppy/CMSSW/src/TTH/MEIntegratorStandalone/test/validate_^W7^A: No such file or directory
    -rw-r--r-- 1 mdunser ethz-susy 102K May 5 16:21 /shome/mdunser/.ipython/profile_default/history.sqlite
    -rw-r--r-- 1 mdunser ethz-susy 102K May 5 16:21 /shome/mdunser/.ipython/profile_default/history.sqlite
t3wn27.psi.ch:
t3wn22.psi.ch:
t3wn16.psi.ch:
t3wn11.psi.ch:
t3wn10.psi.ch:
t3wn33.psi.ch:
t3ui05.psi.ch:
    -rw-r--r-- 1 casal ethz-susy 16K May 18 15:22 /shome/casal/CMSSW/sms_prod/CMSSW_5_3_7_patch5/src/MT2analysis/Code/MT2AnalysisCode/RootMacros/.treeConversion.py.swp
t3wn32.psi.ch:
t3wn34.psi.ch:
t3service01:
t3ce02.psi.ch:
t3vmui01.psi.ch:
t3wn20.psi.ch:
t3cmsvobox01.psi.ch:
t3wn39.psi.ch:
t3wn43.psi.ch:
t3wn35.psi.ch:
t3fs13.psi.ch:
t3ui19.psi.ch:
t3ui12.psi.ch:
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/CalibMuon/DTCalibration/plugins/CalibMuonDTCalibrationPlugins/SealModule.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/TrackingTools/TrackAssociator/test/testTrackingToolsTrackAssociator/TestTrackAssociator.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/TrackingTools/TrackAssociator/test/testCaloMatchingExample/CaloMatchingExample.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/TrackingTools/TrackAssociator/plugins/TrackingToolsTrackAssociatorPlugins/DetIdAssociatorESProducer.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/TrackingTools/TrackAssociator/plugins/TrackingToolsTrackAssociatorPlugins/MuonDetIdAssociator.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/TrackingTools/TrackAssociator/plugins/TrackingToolsTrackAssociatorPlugins/modules.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/SimTracker/TrackerHitAssociation/plugins/SimTrackerTrackerHitAssociationPlugins/ClusterTPAssociationProducer.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/SimTracker/VertexAssociatorESProducer/src/SimTrackerVertexAssociatorESProducer/SealModules.o
    -rw-rw-r-- 1 jngadiub uniz-higgs 0 Jul 8 14:05 /shome/jngadiub/EXOVVAnalysisRunII/CMSSW_7_4_3/tmp/slc6_amd64_gcc491/src/SimTracker/VertexAssociatorESProducer/src/SimTrackerVertexAssociatorESProducer/VertexAssociatorByTracksESProducer.o
    -rw-r--r-- 1 jpata ethz-higgs 0 Jul 8 13:40 /shome/jpata/TTH-72X-heppy-dev/CMSSW/src/VHbbAnalysis/Heppy/test/test/log.txt
    -rw-r--r-- 1 jpata ethz-higgs 228 Jul 8 13:40 /shome/jpata/TTH-72X-heppy-dev/CMSSW/src/VHbbAnalysis/Heppy/test/test/pileup.root
    -rw-r--r-- 1 jpata ethz-higgs 14M Jul 8 14:05 /shome/jpata/TTH-72X-heppy-dev/CMSSW/src/VHbbAnalysis/Heppy/test/test/tree.root
t3wn13.psi.ch:
t3wn24.psi.ch:
t3mon01:
t3wn37.psi.ch:
t3wn41.psi.ch:
t3ui18.psi.ch:
t3wn15.psi.ch:
t3se01.psi.ch:
t3wn29.psi.ch:
t3wn25.psi.ch:
t3wn30.psi.ch:
t3wn44.psi.ch:
t3wn50.psi.ch:
t3ui16.psi.ch:
t3ui15.psi.ch:
t3wn14.psi.ch:
t3wn36.psi.ch:
t3dcachedb03.psi.ch:
</pre>
%ENDTWISTY%

<!--
#List any measures that must be taken in case of some
major incident, e.g. whether a mailing
#list must be contacted or whether other services need to be shut down, etc.
-->

---++ RPC program nfs version 3 tcp is not running

In Nov 2014 we had this case: CMSTier3Log67
   * [[https://t3nagios.psi.ch/check_mk/view.py?view_name=host&site=&host=t3fs06][check nagios]]
   * If =t3fs06= fails, the =t3ui1*= and =t3wn*= servers that mount =t3fs06:/shome= are affected immediately. If you cannot recover =t3fs06:/shome= quickly (e.g. because of a failed motherboard), you will have to umount =/shome= on those servers and mount =t3fs05:/shome2=, which is supposed to be an identical copy of =t3fs06:/shome=. You will probably also need to make symbolic links =/shome2 -> /shome=.
   * On =t3fs05=, stop the cron job that rsyncs =/swshare= to =t3fs06=.
   * Adjust =t3nagios= so that it no longer monitors =t3fs06=.

---+ Regular Maintenance work

<!--
#List any regular activities which do not run automatically and need an administrator's action.
-->

---++ Nagios

[[https://t3nagios.psi.ch/check_mk/view.py?view_name=host&site=&host=t3fs06][check nagios]]

---+ Installation

<!--
#Comment here on any peculiarities of the installation, e.g. on special packages needed, special setup
#procedures which are not obvious
-->

---++ crontab -l root

<pre>
#ident "@(#)root 1.21 04/03/23 SMI"
#
# The root crontab should be used to perform accounting data collection.
#
#
10 3 * * * /usr/sbin/logadm
15 3 * * 0 /usr/lib/fs/nfs/nfsfind
30 3 * * * [ -x /usr/lib/gss/gsscred_clean ] && /usr/lib/gss/gsscred_clean
#
# The rtc command is run to adjust the real time clock if and when
# daylight savings time changes.
#
1 2 * * * [ -x /usr/sbin/rtc ] && /usr/sbin/rtc -c > /dev/null 2>&1
#
# create regular snapshots of the shome file system
#
#20 00 * * * /root/psit3-tools/regular-snapshot-new -f shome -v -s t3fs05 -r shome2/shomebup 2>&1 | /usr/bin/tee /var/cron/lastsnap.txt 2>&1 ; [[ $? -ne 0 ]] && /usr/bin/mail cms-tier3@lists.psi.ch < /var/cron/lastsnap.txt
#
# Added by cswcrontab for CSWlogwatch
02 4 * * * /opt/csw/bin/logwatch
#
# for ganglia monitoring of shome space
53 * * * * /root/gmetric/gmetric_partition_space-cron.sh
#
# for detailed local monitoring of user space
44 01 * * * /shome/monuser/shome-du.cron.sh
#
43 3 * * * [ -x /opt/csw/bin/gupdatedb ] && /opt/csw/bin/gupdatedb --prunepaths="/shome /dev /devices /proc /tmp /var/tmp" 1>/dev/null 2>&1 # Added by CSWfindutils
#
# 09/03/2015 - F.Martinelli
22 03 * * * %RED%/opt/zfsnap/zfssnap -v shome && /opt/csw/bin/rsync --progress -v --delete -a -e "ssh -c arcfour" /shome/ t3fs05:/shome2 2>&1 | /usr/bin/tee /var/cron/zfssnap.shome.log 2>&1 %ENDCOLOR%
</pre>

---++ Shared File Systems on ZFS - OLD

   * %GREEN% =/shome=: %ENDCOLOR% Two 9 disk raidz2 sets are used for shome.
   * %BLUE% =/vmshare=: %ENDCOLOR% raidz2 set and spares. Hosts some of the older VMs
   * %BLACK% *spare disks* %ENDCOLOR%

<pre>
---------------------SunFireX4500------Rear----------------------------
36:   37:   38:   39:   40:   41:   42:   43:   44:   45:   46:   47:
c6t3  c6t7  c5t3  c5t7  c8t3  %BLUE%c8t7  c7t3  c7t7  c1t3  c1t7%ENDCOLOR%  %BLACK%c0t3  c0t7%ENDCOLOR%
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
24:   25:   26:   27:   28:   29:   30:   31:   32:   33:   34:   35:
c6t2  c6t6  c5t2  c5t6  c8t2  c8t6  c7t2  c7t6  c1t2  c1t6  c0t2  c0t6
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
12:   13:   14:   15:   16:   17:   18:   19:   20:   21:   22:   23:
%GREEN%c6t1  c6t5  c5t1  c5t5  c8t1  c8t5  c7t1  c7t5%ENDCOLOR%  %BLACK%c1t1%ENDCOLOR%  c1t5  %BLACK%c0t1%ENDCOLOR%  c0t5
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
 0:    1:    2:    3:    4:    5:    6:    7:    8:    9:   10:   11:
%RED%c6t0  c6t4%ENDCOLOR%  %GREEN%c5t0  c5t4  c8t0  c8t4  c7t0  c7t4  c1t0  c1t4  c0t0%ENDCOLOR%  c0t4
%RED%^b+   ^b+%ENDCOLOR%   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
-------*-----------*-SunFireX4500--*---Front-----*-----------*----------
</pre>

---+++ User quotas

After upgrading the ZFS version, it was necessary to initialise the accounting information.
This can take quite some time:
<pre>
zfs userspace shome
</pre>

User quotas can be set and viewed as follows (users can be given by name or by numeric id):
<pre>
zfs set userquota@3896=500G shome
zfs get userquota@3896 shome
</pre>

The current usage of all users can be seen with:
<pre>
zfs userspace shome
zfs userspace -p -s used shome    # exact value and sorted
</pre>

---++ zfs list -t snapshot

<pre>
NAME                                         USED   AVAIL  REFER  MOUNTPOINT
rpool/ROOT/s10x_u8wos_08a@2011-Feb-18_11-51  46.3M  -      3.70G  -
rpool/ROOT/s10x_u8wos_08a@python-20110303    49.7M  -      3.77G  -
rpool/ROOT/s10x_u8wos_08a@31-May-2012        142M   -      4.11G  -
rpool/ROOT/s10x_u8wos_08a@09-Apr-2013        118M   -      5.35G  -
rpool/ROOT/s10x_u8wos_08a@05-Jun-2013        133M   -      5.43G  -
rpool/ROOT/s10x_u8wos_08a@28-Nov-2013        121M   -      5.53G  -
rpool/ROOT/s10x_u8wos_08a@21-03-2014         131M   -      5.55G  -
rpool/ROOT/s10x_u8wos_08a@24-Jun-2014        144M   -      5.62G  -
rpool/ROOT/s10x_u8wos_08a@11-Sep-2014        155M   -      5.64G  -
rpool/ROOT/s10x_u8wos_08a@20-01-2015         163M   -      5.66G  -
rpool/ROOT/s10x_u8wos_08a@30-01-2015         164M   -      5.66G  -
rpool/ROOT/s10x_u8wos_08a@06-03-2015         165M   -      5.66G  -
rpool/ROOT/s10x_u8wos_08a@03-06-2015         0      -      5.67G  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-25_03.22.00--10d       3.08G  -      4.98T  -   <-- /opt/zfsnap/zfssnap -v shome && /opt/csw/bin/rsync --progress -v --delete -a -e "ssh -c arcfour" /shome/ t3fs05:/shome2
shome@%RED%zfssnap%ENDCOLOR%_2015-05-26_03.22.00--10d       3.12G  -      4.98T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-27_03.22.00--10d       6.70G  -      4.94T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-28_03.22.00--10d       6.00G  -      4.95T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-29_03.22.00--10d       4.08G  -      4.95T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-30_03.22.00--10d       2.87G  -      4.94T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-05-31_03.22.00--10d       2.87G  -      4.94T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-06-01_03.22.00--10d       6.48G  -      4.94T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-06-02_03.22.00--10d       3.06G  -      4.96T  -
shome@%RED%zfssnap%ENDCOLOR%_2015-06-03_03.22.00--10d       3.22G  -      4.95T  -
swshare2/swsharebup@auto2015-05-27_06:00:00  2.36G  -      560G   -
swshare2/swsharebup@auto2015-05-28_06:00:01  1.07M  -      560G   -
swshare2/swsharebup@auto2015-05-29_06:00:00  950K   -      559G   -
swshare2/swsharebup@auto2015-05-30_06:00:00  1.47M  -      559G   -
swshare2/swsharebup@auto2015-05-31_06:00:00  1.36M  -      559G   -
swshare2/swsharebup@auto2015-06-01_06:00:00  725K   -      560G   -
swshare2/swsharebup@auto2015-06-02_06:00:00  1.00M  -      559G   -
swshare2/swsharebup@auto2015-06-03_06:00:00  0      -      560G   -
</pre>

---++ Daily snapshots and backup

The script =/root/psit3-tools/regular-snapshot= is called from root's crontab to make a daily incremental snapshot of the =shome= ZFS file system and send it to =t3fs05=. The script also deletes older snapshots. Users can retrieve files from these snapshots by themselves, as documented in HowToRetrieveBackupFiles. See also the tests of incremental snapshot transfers in CMSTier3Log12.

---+++ ZFS Backup server on t3fs05 - OLD

   * %GREEN% *shome2* %ENDCOLOR%: Backup of shome area
   * %BLUE% *vmshare* %ENDCOLOR%: Backup of virtual machine area (for the older vmware-server based machines)
   * %RED% *swshare* %ENDCOLOR%: cluster's shared software space (e.g. experiment SW)
   * %BLACK% *spare disks* %ENDCOLOR%

<pre>
---------------------SunFireX4500------Rear----------------------------
36:   37:   38:   39:   40:   41:   42:   43:   44:   45:   46:   47:
%BLUE%c6t3%ENDCOLOR%  %GREEN%c6t7  c5t3%ENDCOLOR%  %BLUE%c5t7  c8t3  c8t7%ENDCOLOR%  %GREEN%c7t3  c7t7%ENDCOLOR%  %BLUE%c1t3%ENDCOLOR%  %RED%c1t7  c0t3%ENDCOLOR%  %BLACK%c0t7%ENDCOLOR%
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
24:   25:   26:   27:   28:   29:   30:   31:   32:   33:   34:   35:
%BLACK%c6t2  c6t6%ENDCOLOR%  %RED%c5t2%ENDCOLOR%  %GREEN%c5t6  c8t2%ENDCOLOR%  %RED%c8t6  c7t2  c7t6%ENDCOLOR%  %GREEN%c1t2  c1t6%ENDCOLOR%  %BLACK%c0t2%ENDCOLOR%  c0t6
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
12:   13:   14:   15:   16:   17:   18:   19:   20:   21:   22:   23:
%GREEN%c6t1  c6t5%ENDCOLOR%  c5t1  c5t5  c8t1  %GREEN%c8t5  c7t1%ENDCOLOR%  c7t5  c1t1  c1t5  %GREEN%c0t1  c0t5%ENDCOLOR%
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
 0:    1:    2:    3:    4:    5:    6:    7:    8:    9:   10:   11:
c6t0  c6t4  %GREEN%c5t0  c5t4%ENDCOLOR%  c8t0  c8t4  c7t0  %GREEN%c7t4  c1t0%ENDCOLOR%  c1t4  c0t0  c0t4
^b+   ^b+   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
-------*-----------*-SunFireX4500--*---Front-----*-----------*----------
</pre>
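
To check quickly that the nightly =zfssnap= job from root's crontab is still producing snapshots, the snapshot names can be filtered for the newest one. The following is only a sketch under stated assumptions: =latest_zfssnap= is a hypothetical helper name, bash is assumed to be available, and the snapshots are assumed to follow the =shome@zfssnap_YYYY-MM-DD_...= naming seen in the =zfs list -t snapshot= output on this page, so that a plain text sort is also a chronological sort.

```shell
#!/usr/bin/env bash
# Sketch only: print the newest zfssnap snapshot name of one file system.
# Reads snapshot names (one per line) on stdin.
# latest_zfssnap is a hypothetical helper, not part of zfsnap or ZFS.
latest_zfssnap() {
    local fs="${1:-shome}"    # file system name; defaults to shome
    # The names embed a YYYY-MM-DD_HH.MM.SS stamp, so sorting them as
    # text puts the most recent snapshot last.
    grep "^${fs}@zfssnap_" | sort | tail -n 1
}
```

It could be fed from the snapshot list, for example with =zfs list -H -t snapshot -o name | latest_zfssnap shome=. If the date in the printed name is older than one day, the cron job or its log =/var/cron/zfssnap.shome.log= is probably worth a look.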
| *NodeTypeForm* ||
| Hostnames | t3fs06 - OUTDATED ! |
| Services | NFS (user home area), backup on t3fs05 |
| Hardware | SUN X4500 (2*Opt 290, 16GB RAM, 48*500GB SATA) |
| Install Profile | none |
| Guarantee/maintenance until | t3fs05,06: 2011-02-14 |
Topic revision: r20 - 2016-11-04 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.