<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%

---+ 09. 06. 2016 dCache 2.15 stuck on =t3se01=

Be aware of the latest Derek's tools for dCache 2.15: https://github.com/fabiomartinelli/dcache-shellutils <br /><br />

   * Today =t3se01= had a high load.
   * Running =lsof | grep java= showed more than 100 =gsiftp= connections coming from =t3ui17=.
   * I connected to =t3ui17=.
   * Again with =lsof=, I found that they belonged to =gperrin= and his =gfalFS= mount of =srm://t3se01.psi.ch=.
   * I killed his =gfalFS= mount point,
   * but =t3se01= was still stuck because dCache had already run out of memory and was not able to recover :(
   * The =t3se01= dCache logs are:
      * <pre>[root@t3se01 ~]# dcache services
DOMAIN                 SERVICE           CELL                   LOG
t3se01-Domain-dcap     dcap              DCap-t3se01            /var/log/dcache/t3se01-Domain-dcap.log
t3se01-Domain-gsidcap  dcap              DCap-gsi-t3se01        /var/log/dcache/t3se01-Domain-gsidcap.log
t3se01-Domain-gsiftp   ftp               GFTP-t3se01            /var/log/dcache/t3se01-Domain-gsiftp.log
t3se01-Domain-srm      srm               SRM-t3se01             /var/log/dcache/t3se01-Domain-srm.log
t3se01-Domain-srm      spacemanager      SpaceManager           /var/log/dcache/t3se01-Domain-srm.log
t3se01-Domain-srm      transfermanagers  RemoteTransferManager  /var/log/dcache/t3se01-Domain-srm.log
t3se01-Domain-utility  pinmanager        PinManager             /var/log/dcache/t3se01-Domain-utility.log
t3se01-Domain-info     info              info                   /var/log/dcache/t3se01-Domain-info.log
t3se01-Domain-xrootd   xrootd            Xrootd-t3se01          /var/log/dcache/t3se01-Domain-xrootd.log
dCacheDomain           poolmanager       PoolManager            /var/log/dcache/dCacheDomain.log
dCacheDomain           topo              topo                   /var/log/dcache/dCacheDomain.log</pre>
   * You can check those logs in parallel with:
      * <pre>[root@t3se01 ~]# dcache services | grep log | awk '{print $4}' | xargs -iI tail %BLUE%-v%ENDCOLOR% I
==> %BLUE%/var/log/dcache/t3se01-Domain-dcap.log%ENDCOLOR% <==
09 Jun 2016 10:51:30 (DCap-t3se01-<unknown>-AAU01IboGLA) [door:DCap-t3se01-<unknown>-AAU01IboGLA] Executing command: 3 0 client open "dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/cheidegg/sea/11/2016-06-01-17-17-00/QCD_Pt80to120_EMEnriched.root" r t3ui17.psi.ch 45550 -timeout=-1 -onerror=default -passive -uid=609
...
==> %BLUE%/var/log/dcache/t3se01-Domain-gsidcap.log%ENDCOLOR% <==
09 Jun 2016 10:47:27 (System) [info] Message arrived : <CM: S=[>info@t3se01-Domain-info:*@t3se01-Domain-info:*@dCacheDomain];D=[>System@t3se01-Domain-gsidcap];C=java.lang.String;O=<1465462047010:54631>;LO=<1465462047010:54630>;TTL=1000>
...
</pre>
   * %RED%Eventually I restarted dCache on =t3se01= with <pre>dcache restart</pre>%ENDCOLOR%
   * Recall that you can check the live file transfers with:
      * <pre>lynx --dump -width=200 http://t3dcachedb.psi.ch:2288/context/transfers.html</pre>
      * <pre>Door                      Domain              Seq  Prot    Owner  Proc   PnfsId                                Pool          Host            Status                    Since     S        Trans. (KB)  Speed (KB/s)
DCap-t3se01--AAU00-Wza3A  t3se01-Domain-dcap  3    dcap-3  521    7033   00005283B084FFAD4845B6719436AAF7DC2A  t3fs14_cms_1  192.33.123.93   WaitingForDoorTransferOk  00:35:08  RUNNING  10850796     5145
DCap-t3se01--AAU00-XuScA  t3se01-Domain-dcap  829  dcap-3  621    18767  00002FB6919E4FBB4FBBB9C0FAD2BC8BDA33  t3fs03_cms    192.33.123.139  WaitingForDoorTransferOk  00:00:20  RUNNING  315556       15398
...
</pre>

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%
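
The =lsof= check that revealed the many connections from a single host can be summarised per client instead of being read by eye. The following is only a minimal sketch, not a tool used on =t3se01=: the function name =count_peers= is hypothetical, and it assumes the usual =lsof -i= line layout, where the connection =local:port->remote:port= is the second-to-last field and the TCP state is the last one.

```shell
# count_peers (hypothetical helper): read `lsof -i` style lines on stdin
# and print "<connections> <remote host>" sorted by connection count.
count_peers() {
    awk '/ESTABLISHED/ {
        # second-to-last field looks like local:port->remote:port
        split($(NF-1), ends, "->")
        # drop the trailing :port from the remote side
        sub(/:[0-9]+$/, "", ends[2])
        count[ends[2]]++
    }
    END { for (h in count) print count[h], h }' | sort -rn
}

# Possible usage on the storage element (TCP sockets of java processes only):
#   lsof -P -iTCP -a -c java | count_peers
```

A host that appears with a count far above the others, as =t3ui17= did here, is the first place to look for a runaway client.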
Topic revision: r1 - 2016-06-09 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.