<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->
%INCLUDE{"ListNews" NEWSROWS="1"}%

---+!! Monitoring

%CALC{"$SET(jobint,$IF($EXACT(%URLPARAM{"jobint"}%,),day,%URLPARAM{"jobint"}%))"}%
%CALC{"$SET(freestorageint,$IF($EXACT(%URLPARAM{"freestorageint"}%,),week,%URLPARAM{"freestorageint"}%))"}%
%CALC{"$SET(netwint,$IF($EXACT(%URLPARAM{"netwint"}%,),day,%URLPARAM{"netwint"}%))"}%

<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20fileservers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520fileservers-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20services&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520services-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20workers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520workers-pie.png"/></a>

%TOC%

---++ Batch jobs (queuing system)

[[http://t3mon.psi.ch/ganglia/PSIT3-custom/qstat.txt][Current queue]] */* [[http://t3mon.psi.ch/ganglia/PSIT3-custom/accounting.txt][accounting]] */* [[http://dashb-cms-job.cern.ch/dashboard/request.py/jobsummary#site=T3_CH_PSI&sortby=user][CMS Dashboard]]

Number of running and queued jobs:
<form name="formJobint" action="%TOPICURL%?#Batch_jobs_queuing_system" method=GET>
<select name="jobint" onchange="formJobint.submit()">
<option %CALC{"$IF($EXACT($GET(jobint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(jobint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(jobint),week),selected,)"}%>week</option>
<option
%CALC{"$IF($EXACT($GET(jobint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(jobint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

<img src="%GANGLIABASE%/PSIT3-custom/running-%CALC{"$GET(jobint)"}%.gif" />
<br><img src="%GANGLIABASE%/PSIT3-custom/waiting-%CALC{"$GET(jobint)"}%.gif" />

[[%GANGLIABASE%/?c=PSI%20Tier3%20workers&m=&r=day&s=descending&hc=4][Ganglia WN page]]

<img src="%GANGLIABASE%/graph.php?g=load_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=cpu_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />

---++ Storage

---+++ =/pnfs= dir

Show space graphs for
<form name="formFreestorageint" action="%TOPICURL%?#Storage_Element" method=GET>
<select name="freestorageint" onchange="formFreestorageint.submit()">
<option %CALC{"$IF($EXACT($GET(freestorageint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

Links:
   * List all [[https://cmsweb.cern.ch/das/request?view=list&limit=100&instance=cms_dbs_prod_global&input=dataset+site%3DT3_CH_PSI][hosted datasets]] / [[https://cmsweb.cern.ch/phedex/prod/Request::View?type=any&nodes=T3_CH_PSI&state=any&.submit=Submit][requests]] */* [[https://cmsweb.cern.ch/phedex/prod/Reports::SiteUsage?node=T3_CH_PSI][accounting per phys. group]]
   * [[http://t3mon.psi.ch/ganglia/PSIT3-custom/cms_space.txt][/pnfs user/dataset overview]]
   * [[http://t3mon.psi.ch/ganglia/PSIT3-custom/v_pnfs_top_dirs.txt][/pnfs precomputed directory sizes]]
   * [[FastSearchingIntoSlashPNFS][/pnfs by dc_find]]
   * [[http://t3mon.psi.ch/ganglia/PSIT3-custom/transfers.txt][/pnfs current file transfers]]
<!-- * show user/dataset [[http://t3mon.psi.ch/ganglia/PSIT3-custom/sespace.txt][space usage]] -->

<img src="%GANGLIABASE%/PSIT3-custom/sespace-%CALC{"$GET(freestorageint)"}%.gif" />
<img src="%GANGLIABASE%/graph.php?c=PSI%20Tier3%20services&h=%SEHOST%&m=t3se01_free_cms&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&vl=GB&st=now" />

---+++ =/pnfs= dir I/O queues
   * =regular= I/O queue movers = *dcap/gsidcap* movers (heavy random I/O for *internal* analysis); MAX 100 %GREEN%ACTIVE%ENDCOLOR% movers per file server, other requests will get %ORANGE%QUEUED%ENDCOLOR%
   * =wan= I/O queue movers = *SRM/gridftp* movers (whole-file transfers, including from outside); MAX 2 %GREEN%ACTIVE%ENDCOLOR% movers per file server, other requests will get %ORANGE%QUEUED%ENDCOLOR%
   * =xrootd= I/O queue movers = file transfers by [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][xrootd]]; MAX 2 %GREEN%ACTIVE%ENDCOLOR% movers per file server, other requests will get %ORANGE%QUEUED%ENDCOLOR%
   * =[t3uiXY]$ elinks 'http://t3dcachedb:2288/queueInfo'= to check the status of the =regular=, =wan= and =xrootd= I/O queues from the CLI (though this is seldom needed)
<br />
%GREEN%ACTIVE%ENDCOLOR% movers:
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_regular_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img
src="%GANGLIABASE%/graph.php?c=PSI%20Tier3%20services&h=t3se01.psi.ch&v=5&m=t3se01_movers_xrootd_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

%ORANGE%QUEUED%ENDCOLOR% movers (the associated I/O queue has exceeded the maximum number of allowed %GREEN%ACTIVE%ENDCOLOR% movers):
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_regular_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&v=0&m=t3se01_moversQ_xrootd_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

%RED%PENDING%ENDCOLOR% requests (these are hanging file transfers, almost always an error state if they persist):
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_pending_requests&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now&z=medium"/>

---+++ =/shome= and =/swshare= dirs

[[http://t3mon.psi.ch/ganglia/PSIT3-custom/shome-du.txt][/shome space usage]]

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs06.psi.ch&m=t3fs06.psi.ch_diskfull_prcnt_shome&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs05.psi.ch&m=t3fs05.psi.ch_diskfull_prcnt_swshare&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>

---++ Networking and File Transfers (+ PhEDEx)

Links:
   * [[%GANGLIABASE%/?m=network_report&r=day&s=descending&c=PSI+Tier3+fileservers&h=&sh=1&hc=4][Ganglia fileserver page]]
   * show [[http://t3mon.psi.ch/ganglia/PSIT3-custom/phedex-statistics.txt][phedex summary statistics]] */* [[http://cmsweb.cern.ch/phedex/prod/Components::Links?from_filter=T3_CH_PSI&andor=or&to_filter=T3_CH_PSI&Update=Update#][PhEDEx links between PSI and other centers]] */*
     [[https://cmsweb.cern.ch/phedex/prod/Request::View?type=delete&nodes=T3_CH_PSI&state=any&.submit=Submit][deletion request state]]

Plotting interval:
<form name="formNetwint" action="%TOPICURL%?#Networking_and_File_Transfers_Ph" method=GET>
<select name="netwint" onchange="formNetwint.submit()">
<option %CALC{"$IF($EXACT($GET(netwint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(netwint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(netwint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(netwint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(netwint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
</form>

<br><img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20fileservers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20services&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />

---++ Availability reports

These tests are run by the centralized Grid monitoring services; they determine whether the T3 and the T2 are considered to be working correctly:
   * CMS Nagios : [[https://sam-cms-prod.cern.ch/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][T3]] , [[https://sam-cms-prod.cern.ch/nagios/cgi-bin/status.cgi?hostgroup=site-CSCS-LCG2&style=detail][T2]]
   * German Nagios : [[https://ngi-de-nagios.gridka.de/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][T3]] , [[https://ngi-de-nagios.gridka.de/nagios/cgi-bin/status.cgi?hostgroup=site-CSCS-LCG2&style=detail][T2]]
   * Gstat : [[http://goc.grid.sinica.edu.tw/gstat/T3_CH_PSI/][T3]] , [[http://gstat2.grid.sinica.edu.tw/gstat/site/CSCS-LCG2/][T2]]

---++ Computer Room Temps

*private link* <br />
<img src="https://ganglia03.psi.ch/ganglia/graph.php?g=temperature_report&c=rztemp&h=T_18&r=&z=medium&st=now"><img src="https://ganglia03.psi.ch/ganglia/graph.php?g=temperature_report&c=rztemp&h=T_19&r=&z=medium&st=now">
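
---++ Example: ACTIVE vs. QUEUED movers

The per-file-server mover limits listed under "=/pnfs= dir I/O queues" (100 for =regular=, 2 for =wan=, 2 for =xrootd=) can be illustrated with a small Python sketch. It is only arithmetic on the limits quoted on this page and does not reproduce how dCache actually schedules movers; the names =MAX_ACTIVE= and =split_movers= are hypothetical and were introduced purely for illustration.

```python
from typing import Tuple

# Maximum number of ACTIVE movers per file server, as quoted on this page.
# (Hypothetical constant name; the values come from the I/O queue list.)
MAX_ACTIVE = {"regular": 100, "wan": 2, "xrootd": 2}


def split_movers(queue: str, requests: int) -> Tuple[int, int]:
    """Return (active, queued) for `requests` movers on one file server.

    Assumption: requests beyond the queue's limit are simply QUEUED.
    """
    if requests < 0:
        raise ValueError("requests must be non-negative")
    limit = MAX_ACTIVE[queue]  # raises KeyError for an unknown queue name
    active = min(requests, limit)
    return active, requests - active


if __name__ == "__main__":
    for queue, requests in [("regular", 120), ("wan", 2), ("xrootd", 5)]:
        active, queued = split_movers(queue, requests)
        print(f"{queue}: {active} ACTIVE, {queued} QUEUED")
```

Under these assumptions, a persistently non-zero QUEUED count for a queue means that the file server has reached its ACTIVE limit for that queue, which is what the QUEUED graphs above are meant to reveal.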
Topic revision: r65 - 2015-03-12 - FabioMartinelli