<!-- keep this as a security measure:
#uncomment this if the topic should be modifiable only by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if the page should be viewable only by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->
%INCLUDE{"ListNews" NEWSROWS="1"}%

---+!! PSI Tier-3 Monitoring

%CALC{"$SET(jobint,$IF($EXACT(%URLPARAM{"jobint"}%,),day,%URLPARAM{"jobint"}%))"}%
%CALC{"$SET(freestorageint,$IF($EXACT(%URLPARAM{"freestorageint"}%,),week,%URLPARAM{"freestorageint"}%))"}%
%CALC{"$SET(netwint,$IF($EXACT(%URLPARAM{"netwint"}%,),day,%URLPARAM{"netwint"}%))"}%

<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20fileservers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520fileservers-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20services&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520services-pie.png"/></a>
<a href="http://t3mon.psi.ch/ganglia/?c=PSI%20Tier3%20workers&m=&r=hour&s=descending&hc=4"><img src="http://t3mon.psi.ch/ganglia/PSIT3-custom/PSI%2520Tier3%2520workers-pie.png"/></a>

%TOC%

---++ Availability tests

These tests are run by the centralized Grid monitoring services; they determine whether our site is considered to be working correctly for the VOs.

   * [[https://operations-portal.egi.eu/dashboard][EGI Operations Dashboard]]
   * [[https://ngi-de-nagios.gridka.de/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][central Nagios]] */* [[https://ngi-de-nagios.gridka.de/myegi/history/?facelist_values_sites=479][Nagios history]]
   * [[https://forseti.switch.ch/nagios/cgi-bin/status.cgi?hostgroup=site-T3_CH_PSI&style=detail][Swiss Nagios]] */* [[https://forseti.switch.ch/myegi/history/][Swiss Nagios history]]
   * *SAM:* [[https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&sensors=sBDII&vo=ops&nodename=t3bdii.psi.ch][BDII]], [[https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&sensors=SRMv2&vo=ops&nodename=t3se01.psi.ch][SE]], [[http://goc.grid.sinica.edu.tw/gstat/T3_CH_PSI/][Gstat]], [[http://pps-sam.cern.ch/gridview/regions/GermanySwitzerland.html][GermanySwitzerland overview]] */* <literal> <a href="https://gridview.cern.ch/GRIDVIEW/same_index.php?&Information=SiteDetail&DefVO=15&TestVO=-1&DurationOption=current&LComponent=-2&NodeID=-1&TestID=-1&Hour1=0&StartDay=-1&StartMonth=-1&StartYear=-1&Hour2=23&EndDay=28&EndMonth=1&EndYear=2011&LTier1Site=-1&RelOrAvail=Availability&OnlyCritical=ON&SiteFullName=1&Report=0&LTier2Site[]=3055006">Gridview</a> </literal>

---++ Batch jobs (queuing system)

[[http://t3mon.psi.ch/ganglia/PSIT3-custom/qstat.txt][Current queue]] */* [[http://t3mon.psi.ch/ganglia/PSIT3-custom/accounting.txt][accounting]]

Number of running and queued jobs:
<form name="formJobint" action="%TOPICURL%?#Batch_jobs_queuing_system" method=GET>
<select name="jobint" onchange="formJobint.submit()">
<option %CALC{"$IF($EXACT($GET(jobint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(jobint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(jobint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(jobint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(jobint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

<img src="%GANGLIABASE%/PSIT3-custom/running-%CALC{"$GET(jobint)"}%.gif" />
<br><img src="%GANGLIABASE%/PSIT3-custom/waiting-%CALC{"$GET(jobint)"}%.gif" />

[[http://dashb-cms-job.cern.ch/dashboard/request.py/jobsummary#site=T3_CH_PSI&sortby=user][CMS Dashboard]]

---+++ Worker nodes load monitoring

[[%GANGLIABASE%/?c=PSI%20Tier3%20workers&m=&r=day&s=descending&hc=4][Ganglia WN page]]

<img src="%GANGLIABASE%/graph.php?g=load_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=cpu_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(jobint)"}%&s=descending&hc=4&st=now" />

---++ Storage

Show space graphs for
<form name="formFreestorageint" action="%TOPICURL%?#Storage_Element" method=GET>
<select name="freestorageint" onchange="formFreestorageint.submit()">
<option %CALC{"$IF($EXACT($GET(freestorageint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(freestorageint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="netwint" value=%CALC{"$GET(netwint)"}%>
</form>

---+++ Storage Element

Links:
   * [[http://%SEHOST%:2288/][dCache GUI]]
   * List all [[https://cmsweb.cern.ch/phedex/prod/Data::Replicas?rcolumn=Name&dbs=1&node=27&node=821&nvalue=Node+bytes&rows=interesting&view=global&filter=.*][hosted datasets]] / [[https://cmsweb.cern.ch/phedex/prod/Request::View?type=xfer&nodes=T3_CH_PSI&state=pend&.submit=Submit][requests]] */* [[https://cmsweb.cern.ch/phedex/prod/Reports::SiteUsage?node=T3_CH_PSI][accounting per physics group]]
   * Show user/dataset [[http://t3mon.psi.ch/ganglia/PSIT3-custom/sespace.txt][space usage]]

<img src="%GANGLIABASE%/PSIT3-custom/sespace-%CALC{"$GET(freestorageint)"}%.gif" />
<img src="%GANGLIABASE%/graph.php?c=PSI%20Tier3%20services&h=%SEHOST%&m=t3se01_free_cms&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&vl=GB&st=now" />

---+++ Home and Software areas

   * Show detailed home area [[http://t3mon.psi.ch/ganglia/PSIT3-custom/shome-du.txt][space usage]]

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs06.psi.ch&m=t3fs06.psi.ch_diskfull_prcnt_shome&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3fs05.psi.ch&m=t3fs05.psi.ch_diskfull_prcnt_swshare&r=%CALC{"$GET(freestorageint)"}%&z=medium&jr=&js=&st=now"/>

---++ Networking and File Transfers (+ PhEDEx)

Links:
   * [[%GANGLIABASE%/?m=network_report&r=day&s=descending&c=PSI+Tier3+fileservers&h=&sh=1&hc=4][Ganglia fileserver page]]
   * Show [[http://t3mon.psi.ch/ganglia/PSIT3-custom/phedex-statistics.txt][PhEDEx summary statistics]] */* [[http://cmsweb.cern.ch/phedex/prod/Components::Links?from_filter=T3_CH_PSI&andor=or&to_filter=T3_CH_PSI&Update=Update#][PhEDEx links between PSI and other centers]] */* [[http://cmsweb.cern.ch/phedex/prod/Activity::Deletions?reqfilter=*&node=T3_CH_PSI&state=any&blockfilter=.*&Update=Update#][deletion request state]]
   * [[http://gridse.sns.it/phedex/][PhEDEx summary view by S. Sarkar]]

Plotting interval:
<form name="formNetwint" action="%TOPICURL%?#Networking_and_File_Transfers_Ph" method=GET>
<select name="netwint" onchange="formNetwint.submit()">
<option %CALC{"$IF($EXACT($GET(netwint),hour),selected,)"}%>hour</option>
<option %CALC{"$IF($EXACT($GET(netwint),day),selected,)"}%>day</option>
<option %CALC{"$IF($EXACT($GET(netwint),week),selected,)"}%>week</option>
<option %CALC{"$IF($EXACT($GET(netwint),month),selected,)"}%>month</option>
<option %CALC{"$IF($EXACT($GET(netwint),year),selected,)"}%>year</option>
</select>
<input type="hidden" name="jobint" value=%CALC{"$GET(jobint)"}%>
<input type="hidden" name="freestorageint" value=%CALC{"$GET(freestorageint)"}%>
</form>

<br><img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20workers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20fileservers&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />
<img src="%GANGLIABASE%/graph.php?g=network_report&z=medium&c=PSI%20Tier3%20services&m=&r=%CALC{"$GET(netwint)"}%&s=descending&hc=4&st=now" />

dCache active movers (default plot for *dcap* related movers, wan plot for *SRM/gridftp* movers):

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_default_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_movers_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

dCache queued movers (default plot for *dcap* related movers, wan plot for *SRM/gridftp* movers):

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_default_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>
<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_moversQ_wan_cms&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

dCache pending requests (these are hanging transfers; if they persist, this almost always indicates an error state):

<img src="%GANGLIABASE%/graph.php?&c=PSI%20Tier3%20services&h=t3se01.psi.ch&m=t3se01_pending_requests&r=%CALC{"$GET(netwint)"}%&z=medium&jr=&js=&st=now"/>

-- Main.DerekFeichtinger - 12 Nov 2008
Topic revision: r41 - 2012-10-11 - DerekFeichtinger