<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->
%INCLUDE{"ListNews" NEWSROWS="1"}%

---+!! PSI Tier-3 Monitoring

%CALC{"$SET(jobint,$IF($EXACT(%URLPARAM{"jobint"}%,),day,%URLPARAM{"jobint"}%))"}%
%CALC{"$SET(freestorageint,$IF($EXACT(%URLPARAM{"freestorageint"}%,),week,%URLPARAM{"freestorageint"}%))"}%
%CALC{"$SET(netwint,$IF($EXACT(%URLPARAM{"netwint"}%,),day,%URLPARAM{"netwint"}%))"}%

<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20fileservers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520fileservers-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20services&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520services-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20workers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520workers-pie.png"/></a>

%TOC%

---++ Availability tests

These tests are run by the centralized Grid monitoring services and they determine whether our site is considered to be working correctly for the VOs.
   * [[https://operations-portal.egi.eu/dashboard][EGI Operations Dashboard]]
   * [[https://sam-cms-prod.cern.ch/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][CMS Nagios vs PSI]]
   * [[https://ngi-de-nagios.gridka.de/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][NGI-DE Nagios vs PSI]]
   * [[http://goc.grid.sinica.edu.tw/gstat/T3_CH_PSI/][Gstat]]

---++ Batch jobs (queuing system)

[[http://t3mon.psi.ch/ganglia/PSIT3-custom/qstat.txt][Current queue]] */* [[http://t3mon.psi.ch/ganglia/PSIT3-custom/accounting.txt][accounting]]

Number of running and queued jobs: <form name="formJobint" action="%TOPICURL%?#Batch_jobs_queuing_system" method=GET>
<select name="jobint" onchange="formJobint.submit()">
<option %CALC{"$IF($EXACT($GET(jobint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(jobint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(jobint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(jobint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(jobint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

<img src="%GANGLIABASE%/PSIT3-custom/running-%CALC{"$GET(jobint)"}%.gif" />
<br><img src="%GANGLIABASE%/PSIT3-custom/waiting-%CALC{"$GET(jobint)"}%.gif" />

[[http://dashb-cms-job.cern.ch/dashboard/request.py/jobsummary#site=T3_CH_PSI&sortby=user][CMS Dashboard]]

---+++ Worker nodes load monitoring

[[%GANGLIABASE%/?c=PSI%20Tier3%20workers&m=&r=day&s=descending&hc=4][Ganglia WN page]]

<img src="%GANGLIABASE%/graph.php?g=load_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=cpu_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />

---++ Storage

Show space graphs for <form name="formFreestorageint" action="%TOPICURL%?#Storage_Element" method=GET>
<select name="freestorageint" onchange="formFreestorageint.submit()">
<option %CALC{"$IF($EXACT($GET(freestorageint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

---+++ Storage Element

Links:
   * [[http://t3dcachedb.psi.ch:2288/][dCache GUI]] *Restricted*
   * List all [[https://cmsweb.cern.ch/das/request?view=list&limit=100&instance=cms_dbs_prod_global&input=dataset+site%3DT3_CH_PSI][hosted datasets]] / [[https://cmsweb.cern.ch/phedex/prod/Request::View?type=any&nodes=T3_CH_PSI&state=any&.submit=Submit][requests]] */* [[https://cmsweb.cern.ch/phedex/prod/Reports::SiteUsage?node=T3_CH_PSI][accounting per phys. group]]
   * show user/dataset [[http://t3mon.psi.ch/ganglia/PSIT3-custom/cms_space.txt][space usage]]
   * current Storage Element [[http://t3mon.psi.ch/ganglia/PSIT3-custom/transfers.txt][transfers]]
<!-- * show user/dataset [[http://t3mon.psi.ch/ganglia/PSIT3-custom/sespace.txt][space usage]] -->

<img src="%GANGLIABASE%/PSIT3-custom/sespace-%CALC{"$GET(freestorageint)"}%.gif" />
<img src="%GANGLIABASE%/graph.php?c=PSI%20Tier3%20services&h=%SEHOST%&m=t3se01_free_cms&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&vl=GB&st=now" />

---+++ Home and Software areas

   * show detailed home area [[http://t3mon.psi.ch/ganglia/PSIT3-custom/shome-du.txt][space usage]]

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs06.psi.ch&m=t3fs06.psi.ch_diskfull_prcnt_shome&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs05.psi.ch&m=t3fs05.psi.ch_diskfull_prcnt_swshare&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>

---++ Networking and File Transfers (+ PhEDEx)

Links:
   * [[%GANGLIABASE%/?m=network_report&r=day&s=descending&c=PSI+Tier3+fileservers&h=&sh=1&hc=4][Ganglia fileserver page]]
   * show [[http://t3mon.psi.ch/ganglia/PSIT3-custom/phedex-statistics.txt][phedex summary statistics]] */* [[http://cmsweb.cern.ch/phedex/prod/Components::Links?from_filter=T3_CH_PSI&andor=or&to_filter=T3_CH_PSI&Update=Update#][PhEDEx links between PSI and other centers]] */* [[https://cmsweb.cern.ch/phedex/prod/Request::View?type=delete&nodes=T3_CH_PSI&state=any&.submit=Submit][deletion request state]]

Plotting interval: <form name="formNetwint" action="%TOPICURL%?#Networking_and_File_Transfers_Ph" method=GET>
<select name="netwint" onchange="formNetwint.submit()">
<option %CALC{"$IF($EXACT($GET(netwint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(netwint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(netwint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(netwint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(netwint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
</form>

<br><img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20fileservers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20services&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />

dCache %GREEN%ACTIVE%ENDCOLOR% movers: *regular* movers = *dcap/gsidcap* movers (heavy random IO for internal analysis); *WAN* movers = *SRM/gridftp* movers (transfers of whole files, also from outside); *xrootd* movers = LAN/WAN transfers of files by [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][xrootd]]

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_regular_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?c=PSI%20Tier3%20services&h=t3se01.psi.ch&v=5&m=t3se01_movers_xrootd_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

dCache %ORANGE%QUEUED%ENDCOLOR% movers: *regular* movers = *dcap/gsidcap* movers (heavy random IO for internal analysis); *WAN* movers = *SRM/gridftp* movers (transfers of whole files, also from outside); *xrootd* movers = LAN/WAN transfers of files by [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][xrootd]]

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_regular_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&v=0&m=t3se01_moversQ_xrootd_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

dCache %RED%PENDING%ENDCOLOR% requests (these are hanging transfers, almost always an error state if they persist):

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_pending_requests&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now&z=medium"/>

---++ Temps ( protected )

<img src=https://ganglia03.psi.ch/ganglia/graph.php?g=temperature_report&c=rztemp&h=T_18&r=&z=medium&st=now>
<img src=https://ganglia03.psi.ch/ganglia/graph.php?g=temperature_report&c=rztemp&h=T_19&r=&z=medium&st=now>
Topic revision: r53 - 2014-04-03 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.