%TOC%

https://github.com/jpata/cms-ch-ops <br />

---+ Lists to follow
   * https://cms-logbook.cern.ch/elog/ _special account is needed_
   * follow hn-cms-comp-ops@cern.ch and the minutes of the [[https://twiki.cern.ch/twiki/bin/view/CMS/CompOpsMeeting][CompOps meeting]]

---+ Monitoring
Some central monitoring links from the CERN-kibana (modern, experimental)
   * CSCS job type fractions [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:auto,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(field:TaskType,order:desc,orderBy:'1',size:5),schema:group,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:percentage,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-90d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:auto,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(field:TaskType,order:desc,orderBy:'1',size:5),schema:group,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:percentage,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][90d]]
* CSCS job failure rate [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(field:CpuTimeHr),schema:metric,type:sum),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(filters:!((input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%200'))),label:''),(input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%20%3E0')))))),schema:group,type:filters),(id:'3',params:(field:TaskType,order:desc,orderBy:_term,row:!t,size:10),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:stacked,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] 
[[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-90d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(field:CpuTimeHr),schema:metric,type:sum),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(filters:!((input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%200'))),label:''),(input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%20%3E0')))))),schema:group,type:filters),(id:'3',params:(field:TaskType,order:desc,orderBy:_term,row:!t,size:10),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:stacked,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][90d]] * CSCS CPU efficiency [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(field:CpuEff,percents:!(50)),schema:metric,type:median),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'3',params:(field:TaskType,order:desc,orderBy:_term,row:!t,size:10),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:grouped,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] 
[[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-90d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(field:CpuEff,percents:!(50)),schema:metric,type:median),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'3',params:(field:TaskType,order:desc,orderBy:_term,row:!t,size:10),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:grouped,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][90d]] * Job exit code (production) [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(field:ExitCode,order:desc,orderBy:'1',size:10),schema:group,type:terms),(id:'3',params:(field:TaskType,order:desc,orderBy:_term,row:!t,size:10),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:percentage,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] * CRAB exit code (analysis) 
[[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20T2_CH_CSCS')),uiState:(vis:(legendOpen:!f)),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:h,min_doc_count:1),schema:segment,type:date_histogram),(id:'4',params:(field:Chirp_CRAB3_Job_ExitCode,order:desc,orderBy:'1',size:10),schema:group,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:percentage,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] * Compare 3 sites, job failures [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20(T2_CH_CSCS%20OR%20T2_UK_SGrid_RALPP%20OR%20T2_EE_Estonia)')),uiState:(vis:(colors:('ExitCode:%200':%237EB26D,'ExitCode:%20%3E0':%23E24D42))),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:auto,min_doc_count:1),schema:segment,type:date_histogram),(id:'3',params:(field:GLIDEIN_CMSSite,order:desc,orderBy:'1',row:!t,size:5),schema:split,type:terms),(id:'4',params:(filters:!((input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%200'))),label:''),(input:(query:(query_string:(analyze_wildcard:!t,query:'ExitCode:%20%3E0')))))),schema:group,type:filters)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:stacked,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization'
,type:histogram))][7d]] * Compare 3 sites, job fractions [[https://es-cms.cern.ch/app/kibana?#/visualize/create?type=histogram&indexPattern=%5Bcms-%5DYYYY-MM-DD&_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))&_a=(filters:!(),linked:!f,query:(query_string:(analyze_wildcard:!t,query:'GLIDEIN_CMSSite:%20(T2_CH_CSCS%20OR%20T2_UK_SGrid_RALPP%20OR%20T2_EE_Estonia)')),uiState:(vis:(colors:('ExitCode:%200':%237EB26D,'ExitCode:%20%3E0':%23E24D42))),vis:(aggs:!((id:'1',params:(),schema:metric,type:count),(id:'2',params:(customInterval:'2h',extended_bounds:(),field:RecordTime,interval:auto,min_doc_count:1),schema:segment,type:date_histogram),(id:'3',params:(field:GLIDEIN_CMSSite,order:desc,orderBy:'1',row:!t,size:5),schema:split,type:terms),(id:'4',params:(filters:!((input:(query:(query_string:(analyze_wildcard:!t,query:'TaskType:%20analysis'))),label:''),(input:(query:(query_string:(analyze_wildcard:!t,query:'TaskType:%20production')))),(input:(query:(query_string:(analyze_wildcard:!t,query:'TaskType:%20gensim')))),(input:(query:(query_string:(analyze_wildcard:!t,query:'-TaskType:%20(analysis%20OR%20production%20OR%20gensim)'))),label:other))),schema:group,type:filters)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,mode:stacked,scale:linear,setYExtents:!f,shareYAxis:!t,times:!(),yAxis:()),title:'New%20Visualization',type:histogram))][7d]] ---+ Site Readiness * [[https://twiki.cern.ch/twiki/bin/view/CMS/SiteSupportSiteStatusSiteReadiness][Site Readiness Logic]] * [[https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_CH_CSCS][CSCS Site Readiness]] vs [[http://cmssst.web.cern.ch/cmssst/meet_plots//t2readyrank.png][ALL T2s Site Readiness]] * SSB: [[http://dashb-ssb.cern.ch/dashboard/request.py/siteview#currentView=Site+Readiness&search_0=CSCS][1]] [[http://dashb-ssb.cern.ch/dashboard/request.py/siteview#currentView=default&search_0=CSCS][2]] * 
[[https://dashb-ssb.cern.ch/dashboard/request.py/sitehistory?site=T2_CH_CSCS#currentView=Site+Readiness&time=720&start_date=2016-03-01&end_date=2016-04-18&values=false&spline=false&white=true][Site Availability Metrics]], [[http://wlcg-sam-cms.cern.ch/templates/ember/#/historicalsmry/heatMap?profile=CMS_CRITICAL_FULL&site=T2_CH_CSCS&time=Last%20Month&view=Service%20Availability][Computing Element availability]]
   * [[https://meter.cern.ch/public/_plugin/kibana/#/dashboard/temp/CMS::CRABglexec][glExec failures (Kibana)]]
   * [[https://cmst1.web.cern.ch/CMST1/SST/analysis/usableSites.json][Usable sites]] _this file is consumed by CRAB3 to decide whether CSCS is OK or not_
   * <pre>
# Last 24h of CMS Site Readiness @ CSCS
# https://dashb-ssb.cern.ch/dashboard/request.py/sitehistory?site=T2_CH_CSCS#currentView=Site+Readiness&time=custom&start_date=2016-09-28&end_date=2016-09-29&values=false&spline=false&white=false
TODAY=$(date +%F -d "-0 days")
YESTERDAY=$(date +%F -d "-1 days")
lynx --dump "http://dashb-ssb.cern.ch/dashboard/request.py/getsiteplotdata?site=T2_CH_CSCS&view=Site%20Readiness&time=custom&dateFrom=${YESTERDAY}&dateTo=${TODAY}&prettyprint" | egrep --color '"HC glidein"|"Prod Status"|"Site Readiness"|"Site SAM availability"|"TopologyMaintenances"|"Maintenance saddlebrown"|"Maintenance brown"|"Error"|"Warning"|"OK"|$'
</pre>

---+ Tickets vs CSCS
   * [[https://ggus.eu/?mode=ticket_search&cms_site=T2_CH_CSCS&timeframe=any&status=open&search_submit=GO%21][GGUS]] ; if you see a new CMS CSCS ticket, search for it and %RED%subscribe%ENDCOLOR% to it through the [[https://xgus.ggus.eu/ngi_ch/index.php?mode=ticket_search&show_columns_check%5B%5D=GGUS_REQUEST_ID&show_columns_check%5B%5D=AFFECTED_VO&show_columns_check%5B%5D=AFFECTED_SITE&show_columns_check%5B%5D=CMS_SITE&show_columns_check%5B%5D=PRIORITY&show_columns_check%5B%5D=RESPONSIBLE_UNIT&show_columns_check%5B%5D=CMS_SU&show_columns_check%5B%5D=STATUS&show_columns_check%5B%5D=DATE_OF_CHANGE&show_columns_check%5B%5D=SHORT_DESCRIPTION&ticket_id=&ggus_ticket_id=&supportunit=&vo=cms&user=&keyword=&involvedsupporter=&assignedto=&affectedsite=&status=open&priority=&typeofproblem=all&ticket_category=all&date_type=creation+date&tf_radio=1&timeframe=any&from_date=29+Apr+2011&to_date=30+Apr+2096&untouched_date=&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO%21][NGI_CH Tickets]] portal ; there is no way to be notified directly by GGUS :(
   * [[https://xgus.ggus.eu/ngi_ch/index.php?mode=ticket_search&show_columns_check%5B%5D=GGUS_REQUEST_ID&show_columns_check%5B%5D=AFFECTED_VO&show_columns_check%5B%5D=AFFECTED_SITE&show_columns_check%5B%5D=PRIORITY&show_columns_check%5B%5D=RESPONSIBLE_UNIT&show_columns_check%5B%5D=STATUS&show_columns_check%5B%5D=DATE_OF_CHANGE&show_columns_check%5B%5D=SHORT_DESCRIPTION&ticket_id=&ggus_ticket_id=&supportunit=&vo=cms&user=&keyword=&involvedsupporter=&assignedto=&affectedsite=CSCS-LCG2&status=open&priority=&typeofproblem=all&ticket_category=all&date_type=creation+date&tf_radio=1&timeframe=any&from_date=18+Apr+2016&to_date=19+Apr+2016&untouched_date=&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO%21][NGI_CH]]
   * [[https://webrt.cscs.ch/][CSCS ( Their Local Ticketing System )]]

---+ GlideInWMS Jobs
---++ Doc
   * https://twiki.cern.ch/twiki/bin/viewauth/CMS/GlideinWMSFrontendOps
   * Job Exit Codes: https://twiki.cern.ch/twiki/bin/view/CMSPublic/JobExitCodes

---++ Global Pool
   * http://submit-3.t2.ucsd.edu/CSstoragePath/Monitor/latest.txt _look for both T2_CH_CSCS and T2_CH_CSCS_HPC_

---++ cms-gwmsmon website requires your X509 in the browser since 09/11/2016
   * https://hypernews.cern.ch/HyperNews/CMS/get/comp-ops/3272.html
   * <pre>
lxplus109 ~]$ cern-get-sso-cookie --krb -r -u https://gwmsmon-development.cern.ch -o ~/private/ssocookie.txt
lxplus109 ~]$ wget -q --load-cookies ~/private/ssocookie.txt https://gwmsmon-development.cern.ch/totalview/json/maxused
lxplus109 ~]$ curl -L --cookie ~/private/ssocookie.txt --cookie-jar ~/private/ssocookie.txt https://gwmsmon-development.cern.ch/totalview/json/maxused
</pre>

---++ Jobs in the global pool
These pages show the total number of jobs available to run at our sites; the total can be further split into Analysis and Production.
   * https://cms-gwmsmon.cern.ch/totalview/T2_CH_CSCS
   * https://cms-gwmsmon.cern.ch/totalview/T2_CH_CSCS_HPC

---++ Debugging The CRAB3 Jobs Logs
---+++ By cms-gwmsmon
The 'User Web Directories' links published on https://cms-gwmsmon.cern.ch/analysisview/T2_CH_CSCS make it possible to debug the CRAB3 job logs in the *greatest detail*. Regrettably, not all of these 'User Web Directories' are directly accessible from the Internet because of the CERN firewall rules, and the links list ALL the job logs with no way to filter only the T2_CH_CSCS jobs. On 23rd May 2016 the accessible/blocked table was :
| *From Internet* | *Only from lxplus by lynx / elinks / firefox / ...* |
| http://submit-5.t2.ucsd.edu/CSstoragePath/?C=M;O=D | |
| | http://vocms0109.cern.ch/?C=M;O=D= |
| http://submit-4.t2.ucsd.edu/CSstoragePath/?C=M;O=D | |
| | http://vocms066.cern.ch/?C=M;O=D= |
| | http://vocms059.cern.ch/?C=M;O=D= |
| http://vocms0114.cern.ch/?C=M;O=D | CERN FW misconfigured! |
| | http://vocms095.cern.ch/?C=M;O=D |
| | http://vocms021.cern.ch/?C=M;O=D |

---++++ By SSH / curl
A trick to browse the hidden 'User Web Directories' links above is to open 2 terminals on a UI: in the 1st terminal log in at CERN with =ssh -D 12345 YOURACCOUNT@lxplus.cern.ch= , then in the 2nd terminal run :
%TWISTY%
<pre>
$ curl --socks5 localhost:12345 --silent --stderr - http://vocms0109.cern.ch/cmsprd/%BLUE%160424_091722:sciaba_crab_HC-98-T2_CH_CSCS-27569-20160423050904%ENDCOLOR%/%ORANGE%job_out.1.0.txt%ENDCOLOR% | head
======== gWMS-CMSRunAnalysis.sh STARTING at Sun Apr 24 10:40:03 GMT 2016 on %BLUE%wn84.lcg.cscs.ch%ENDCOLOR% ========
Local time : Sun Apr 24 12:40:03 CEST 2016
Current system : Linux wn84.lcg.cscs.ch 2.6.32-573.12.1.el6.x86_64 #1 SMP Tue Dec 15 08:24:23 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
Arguments are -a sandbox.tar.gz --sourceURL=https://cmsweb.cern.ch/crabcache ...
</pre>
%ENDTWISTY%

---++++ By the CMS Dashboard
Given the CRAB3 jobs that ran at CSCS in a certain period ( http://dashb-cms-job.cern.ch/dashboard/templates/web-job2/#user=&refresh=0&table=Jobs&p=1&records=500&activemenu=1&usr=&site=T2_CH_CSCS&submissiontool=crab3 ) we can retrieve the https://cmsweb.cern.ch/scheddmon links and read them again with =curl= ; internally these links are ordinary symbolic links to the 'User Web Directories' links cited in the previous section, with the difference that they sit behind a common portal https://cmsweb.cern.ch/scheddmon/ AND they require your %BLUE%X509%ENDCOLOR% :
<pre>
$ curl --socks5 localhost:12345 --stderr - --capath /etc/grid-security/certificates -E %BLUE%$X509_USER_PROXY%ENDCOLOR% --cacert %BLUE%$X509_USER_PROXY%ENDCOLOR% https://cmsweb.cern.ch/scheddmon/0114/cms1702/160524_010723:zhangj_crab_l1-integration-v58p0_MC2015__SingleNeutrino_25nsPU10/job_out.1.2.txt | less
</pre>

---+++ HammerCloud ( CRAB3 tests )
   * [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CMSHammerCloud][The CMS HammerCloud]]
   * Its 
[[http://hc-ai-core.cern.ch/hc/app/cms/][WebInterface]] ( search for CSCS ) ---++ arc0[1-3] + arcbrisi status/stats split by Factory Reported here as a reference, consult it *just if it's really needed* : | *CERN* | *Plot* | *GOC* | *Plot* | *UCSD* | *Plot* | | [[http://vocms0305.cern.ch/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][arc01]] | [[http://vocms0305.cern.ch/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][plot]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][arc01]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][plot]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][arc01]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc01_multicore][plot]] | | [[http://vocms0305.cern.ch/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][arc02]] | [[http://vocms0305.cern.ch/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][plot]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][arc02]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][plot]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][arc02]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc02_multicore][plot]] | | [[http://vocms0305.cern.ch/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][arc03]] | [[http://vocms0305.cern.ch/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][plot]] | 
[[http://glidein.grid.iu.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][arc03]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][plot]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][arc03]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arc03_multicore][plot]] | | [[http://vocms0305.cern.ch/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][arcbrisi]] | [[http://vocms0305.cern.ch/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][plot]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][arcbrisi]] | [[http://glidein.grid.iu.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][plot]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryEntryStatusNow.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][arcbrisi]] | [[http://gfactory-1.t2.ucsd.edu/factory/monitor//factoryStatus.html?entry=CMSHTPC_T2_CH_CSCS_arcbrisi][plot]] | <br /> ---+ CMS Nagios Monitoring the CMS Nagios is useful to check the %RED%Failures History%ENDCOLOR% ; [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsSAMTests][CMS Nagios Checks Logic]] / [[https://git.cern.ch/web/cmssam.git][CMS Checks Source]] : | *Nagios old style ( minimalist )* | *Nagios Check_mk style ( cluttered )* | *%RED%Failures History%ENDCOLOR%* | *JSON* | *Python* | | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/status.cgi?host=arc01.lcg.cscs.ch][arc01]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc01.lcg.cscs.ch][arc01]] | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/notifications.cgi?host=arc01.lcg.cscs.ch][arc01]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc01.lcg.cscs.ch&output_format=json][arc01]] | 
[[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc01.lcg.cscs.ch&output_format=python][arc01]] | | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/status.cgi?host=arc02.lcg.cscs.ch][arc02]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc02.lcg.cscs.ch][arc02]] | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/notifications.cgi?host=arc02.lcg.cscs.ch][arc02]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc02.lcg.cscs.ch&output_format=json][arc02]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc02.lcg.cscs.ch&output_format=python][arc02]] | | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/status.cgi?host=arc03.lcg.cscs.ch][arc03]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc03.lcg.cscs.ch][arc03]] | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/notifications.cgi?host=arc03.lcg.cscs.ch][arc03]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc03.lcg.cscs.ch&output_format=json][arc03]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arc03.lcg.cscs.ch&output_format=python][arc03]] | | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/status.cgi?host=arcbrisi.cscs.ch][arcbrisi]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arcbrisi.cscs.ch][arcbrisi]] | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/notifications.cgi?host=arcbrisi.cscs.ch][arcbrisi]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arcbrisi.cscs.ch&output_format=json][arcbrisi]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=arcbrisi.cscs.ch&output_format=python][arcbrisi]] | | [[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/status.cgi?host=storage01.lcg.cscs.ch][storage01]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=storage01.lcg.cscs.ch][storage01]] | 
[[https://etf-cms-prod.cern.ch/etf/nagios/cgi-bin/notifications.cgi?host=storage01.lcg.cscs.ch][storage01]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=storage01.lcg.cscs.ch&output_format=json][storage01]] | [[https://etf-cms-prod.cern.ch/etf/check_mk/view.py?view_name=host&host=storage01.lcg.cscs.ch&output_format=python][storage01]] |

---+ Storage
   * [[http://ganglia.lcg.cscs.ch/ganglia/files_cms.html][dir list]] ( big file ! )
   * [[http://hep.kbfi.ee/~joosep/StorageChart/chart.html][storage visualization]]

The free space should be at least =300TB=.

<img width="300" alt="" src="http://ganglia.lcg.cscs.ch/ganglia3/graph.php?c=PHOENIX-services&h=storage01.lcg.cscs.ch&m=storage01_free_cms&r=week&z=medium&jr=&js=&vl=GB&st=now" height="200" />

Here you can check the dCache CMS allocation directly:
<pre>
ssh ui.lcg.cscs.ch ./cms_space.sh
</pre>

---++ PhEDEx
---+++ Doc
   * https://twiki.cern.ch/twiki/bin/view/CMS/PhedexDraftDocumentation
   * [[https://twiki.cern.ch/twiki/bin/viewauth/CMS/DMWMPG_Namespace][Meaning of the /store/ subdirs]] ( check =/store/temp/= and =/store/temp/user/= )

---+++ Stats
   * [[https://cmsweb.cern.ch/phedex/prod/Activity::QualityPlots?graph=quality_done&entity=src&src_filter=&dest_filter=T2_CH_CSCS&no_mss=true&period=l12h&upto=][Last 12h, OKs]] | [[https://cmsweb.cern.ch/phedex/prod/Activity::QualityPlots?graph=quality_all&entity=src&src_filter=&dest_filter=T2_CH_CSCS&no_mss=true&period=l12h&upto=][Last 12h, Quality Map]]
   * [[http://ganglia.lcg.cscs.ch/ganglia/phedex/?C=M;O=D][Local error stats]]
   * [[https://cmsweb.cern.ch/phedex/prod/Activity::ErrorInfo?tofilter=T2_CH_CSCS&fromfilter=.*&report_code=.*&xfer_code=.*&to_pfn=.*&from_pfn=.*&log_detail=.*&log_validate=.*&.submit=Update#][Errors at CSCS]]
   * [[https://cmsweb.cern.ch/phedex/prod/Components::Links?from_filter=.*&andor=and&to_filter=T2_CH_CSCS&Update=Update#][transfer matrix]]

---+++ Debugging the FTS3 logs
PhEDEx copies the data to CSCS using [[http://fts3-docs.web.cern.ch/fts3-docs/fts-rest/docs/api-curl.html][FTS3]] jobs ; a job moves one or more files ; if there are errors at CSCS, the detailed per-file transfer logs are available on the portal : https://fts3.cern.ch:8449/fts3/ftsmon/#/

An example of a detailed file transfer log is : https://fts412.cern.ch:8449/var/log/fts3/2016-05-24/cmsrm-se01.roma1.infn.it__storage01.lcg.cscs.ch/2016-05-24-0632__cmsrm-se01.roma1.infn.it__storage01.lcg.cscs.ch__803810223__3b06ed4e-2179-11e6-a787-02163e010724

To list the recently completed FTS3 job IDs *ordered by time* :
<pre>
$ LONGOUTPUT=" -l "    # <-- if you don't want to see the long outputs then make it empty by LONGOUTPUT=""
$ cd /lhome/phedex/state/Prod/incoming/download-cms02/archive
$ export X509_USER_PROXY=/lhome/phedex/gridcert/proxy.cert
$ find . -printf "%T@ %Tc %p\n" | sort -n | grep xferinfo | cut -d'/' -f2,3 | xargs -iI grep status ./I | sed "s#glite-transfer-status -l #glite-transfer-status $LONGOUTPUT#" | uniq | bash -x
</pre>

To list *your current* FTS3 jobs :
%TWISTY%
<pre>
$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY https://fts3.cern.ch:8446/whoami
{"dn": ["/DC=EU/DC=EGI/C=CH/O=People/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli", "/DC=EU/DC=EGI/C=CH/O=People/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli/CN=proxy"], "vos_id": ["d5bdc1ae-600f-58dd-a94f-5c16b07974fd", "fb4bc86a-6738-5c53-bb11-206717a994e7"], "roles": [], "delegation_id": "%BLUE%5075946ec4d75f8c%ENDCOLOR%", "user_dn": "/DC=EU/DC=EGI/C=CH/O=People/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli", "level": {"transfer": "vo"}, "is_root": false, "base_id": "01874efb-4735-4595-bc9c-591aef8240c9", "vos": ["cms", "cms/chcms"], "voms_cred": ["/cms/Role=NULL/Capability=NULL", "/cms/chcms/Role=NULL/Capability=NULL"], "method": "certificate"}

$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://fts3.cern.ch:8446/jobs?dlg_id=%BLUE%5075946ec4d75f8c%ENDCOLOR%"
[{"cred_id": "5075946ec4d75f8c", "user_dn": "/DC=EU/DC=EGI/C=CH/O=People/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli", "retry": 0, "job_id": "1cc653f4-bf27-412b-82b0-138505f5c98e", "cancel_job": false, "job_state": "ACTIVE", "submit_host": "fts410.cern.ch", "priority": 1, "source_space_token": "", "reuse_job": "N", "job_metadata": "", "source_se": "srm://cms-se0.kipt.kharkov.ua", "user_cred": "", "max_time_in_queue": null, "source_token_description": null, "job_params": "", "bring_online": -1, "reason": null, "space_token": "", "submit_time": "2016-05-26T14:01:54", "retry_delay": 0, "dest_se": "srm://storage01.lcg.cscs.ch", "internal_job_params": "", "finish_time": null, "verify_checksum": false, "vo_name": "cms", "copy_pin_lifetime": -1, "agent_dn": null, "job_finished": null, "overwrite_flag": false},{"cred_id": "5075946ec4d75f8c", ...

$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY https://fts3.cern.ch:8446/jobs | sed -e 's/[{}]/''/g' | awk -v k="text" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}'| grep --color %BLUE%mangano%ENDCOLOR% -A 28 -B 1
</pre>
%ENDTWISTY%

---+++ From / To Links Status
CLI
   * <pre>$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/json/prod/links?from=T2_CH_CSCS" 2>/dev/null | python -m json.tool | egrep --color 'status|$'</pre>
   * <pre>$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/json/prod/links?to=T2_CH_CSCS" 2>/dev/null | python -m json.tool | egrep --color 'status|$'</pre>

---+++ Datasets Transfer Requests
WEB
   * https://cmsweb.cern.ch/phedex/datasvc/xml/prod/transferrequests?node=T2_CH_CSCS
   * https://cmsweb.cern.ch/phedex/datasvc/json/prod/transferrequests?node=T2_CH_CSCS

CLI
   * <pre>$ curl --capath /etc/grid-security/certificates -E \
$X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/xml/prod/transferrequests?node=T2_CH_CSCS" 2>/dev/null | xmllint --format - </pre>
   * <pre>$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/json/prod/transferrequests?node=T2_CH_CSCS" 2>/dev/null | python -m json.tool</pre>

---+++ Datasets Deployed
WEB
   * https://cmsweb.cern.ch/phedex/datasvc/xml/prod/blockreplicasummary?node=T2_CH_CSCS
   * https://cmsweb.cern.ch/phedex/datasvc/json/prod/blockreplicasummary?node=T2_CH_CSCS

CLI
   * <pre>$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/xml/prod/blockreplicasummary?node=T2_CH_CSCS" 2>/dev/null | xmllint --format - </pre>
   * <pre>$ curl --capath /etc/grid-security/certificates -E $X509_USER_PROXY --cacert $X509_USER_PROXY "https://cmsweb.cern.ch/phedex/datasvc/json/prod/blockreplicasummary?node=T2_CH_CSCS" 2>/dev/null | python -m json.tool </pre>

---+++ Datasets Removal
To be checked *seldom* <br />
The CMS datasets requested by the Swiss users have to be regularly deleted both at CSCS and at PSI, especially when the =/pnfs= space runs low : [[CmsTier3/DataSetCleaningQuery]] <br />
Another way to identify what PhEDEx left behind or could not download is http://t3serv001.mit.edu/~cmsprod/ConsistencyChecks/home.html

---+++ Dark Data
To be checked *seldom* <br />
Often there are files in =/store/= not known to PhEDEx ; they have to be identified with the tool [[https://twiki.cern.ch/twiki/bin/view/CMS/StorageConsistencyCheck][StorageConsistencyCheck]] and probably deleted :<br />
%TWISTY%
<pre>
[root@storage02:~]# psql -U postgres -d chimera -c " select path from v_pnfs where path like '/pnfs/lcg.cscs.ch/cms%' ; " -t -q -o ./CSCS.txt
[root@storage02:~]# scp -p CSCS.txt phedex@cms02:
[phedex@cms0%RED%2%ENDCOLOR% ]$ source /lhome/phedex/PHEDEX/etc/profile.d/init.sh
[phedex@cms0%RED%2%ENDCOLOR% ]$ /lhome/phedex/sw/slc6_amd64_gcc481/cms/PHEDEX/4.1.7/Utilities/StorageConsistencyCheck -db /lhome/phedex/config/DBParam.CSCS:Prod/CSCS -lfnlist /lhome/phedex/CSCS.txt -node T2_CH_CSCS > %RED%CSCS.txt.StorageConsistencyCheck.out%ENDCOLOR% 2>&1
</pre>
%ENDTWISTY%<br />
The output =%RED%CSCS.txt.StorageConsistencyCheck.out%ENDCOLOR%= is a list of files known to the SE but not to PhEDEx<br />
%TWISTY%
<pre>
[phedex@cms02 ]$ egrep "\.root$" %RED%CSCS.txt.StorageConsistencyCheck.out%ENDCOLOR% | egrep -v "%BLUE%/store/(user|group)/%ENDCOLOR%"
/store/CSA07/skim/2007/11/15/CSA07-CSA07JetMET-Gumbo-B1-PDJetMET_Skims1/0007/06660BB3-159B-DC11-8323-001A92971AAA.root
/store/CSA07/skim/2007/11/15/CSA07-CSA07JetMET-Gumbo-B1-PDJetMET_Skims1/0007/0A82A9B3-159B-DC11-B3D3-001A92810ADE.root
...
[phedex@cms02 ]$ egrep "\.root$" %RED%CSCS.txt.StorageConsistencyCheck.out%ENDCOLOR% | egrep -v "%BLUE%/store/(user|group)/%ENDCOLOR%" -c
15645
</pre>
%ENDTWISTY%

---+++ Dynamic Data Management ( DDM ) stats
   * http://t3serv001.mit.edu/~cmsprod/IntelROCCS/Detox/result/T2_CH_CSCS/
   * http://t3serv001.mit.edu/~cmsprod/IntelROCCS/Detox/result/T2_CH_CSCS/Summary.txt
   * https://indico.cern.ch/event/304944/contributions/1672716/attachments/578895/797102/chep-dyndata-1.pdf ( DDM presentation )

---+++ PSI Proxy renewal once every year
%TWISTY%
<pre>
# On a T3 UI, upload the proxy to the myproxy.cern.ch server and check that it is really there
t3ui12> myproxy-init -s myproxy.cern.ch -l psi_t3cmsvobox_phedex_joosep_2016 -x -k renewable -R "*CN=t3cmsvobox.psi.ch" -v -c 8700
t3ui12> myproxy-info -v -s myproxy.cern.ch --username psi_t3cmsvobox_phedex_joosep_2016 -k renewable
# On PSI vobox
t3cmsvobox> /home/phedex/gridcert/proxy.cert # <-- copy Joosep's proxy here by scp or simply copy/paste
</pre>
%ENDTWISTY%

---+++ CSCS Proxy renewal once every year
%TWISTY%
<pre>
lxplus> voms-proxy-init -voms cms -valid 192:00
lxplus> voms-proxy-info
lxplus> myproxy-init -s myproxy.cern.ch -l cscs_cms02_phedex_jpata_2017 -x -k \
renewable -R "*CN=cms02.lcg.cscs.ch" -v -c 8700 lxplus> myproxy-info -v -s myproxy.cern.ch --username cscs_cms02_phedex_jpata_2017 -k renewable lxplus> cp `voms-proxy-info | grep path | awk '{print $3}'` ~/x509_cms02 cms02> rsync jpata@lxplus.cern.ch:~/x509_cms02 /home/phedex/gridcert/x509_new </pre> %ENDTWISTY% <br /> ---+ XROOTD *Availability monitoring* * is CSCS included in the [[https://cmssst.web.cern.ch/cmssst/aaa/T2_CH_CSCS_report.html][CMS AAA federation]] ? if so then stop here. * Is it not ? then check the [[https://meter.cern.ch/public/_plugin/kibana/#/dashboard/temp/CMS::XrootD][Global CMS Xrootd services]], maybe there is a global issue *Transfers monitoring* * [[http://xrootd.t2.ucsd.edu/display?SITE=CH+CSCS&imgsize=1024x600&interval.max=0&interval.min=604800000&modules=xrd_report%2Flink_in_R_chart&modules=xrd_report%2Flink_out_R_chart&page=xrd_report%2Flink_traffic_by_site&plot_series=cms01.lcg.cscs.ch&plot_series=cms02.lcg.cscs.ch][MonALISA]] , [[http://dashb-cms-xrootd-transfers.cern.ch/ui/#access_type=()&ctr.site=(T2_CH_CSCS)&date.interval=1440][CMS dashboard]] %TWISTY%<br /> *Low level debugging* * Is CSCS in the Prod Fed ? <pre>[cms02] xrdmapc --list all xrdcmsglobal01.cern.ch:1094 2>&1 | grep cscs Srv cms01.lcg.cscs.ch:1094 Srv cms02.lcg.cscs.ch:1094 Srv cms01.lcg.cscs.ch:1094 Srv cms02.lcg.cscs.ch:1094 Srv cms01.lcg.cscs.ch:1094 Srv cms02.lcg.cscs.ch:1094 [cms02] xrdmapc --list all cms-xrd-transit.cern.ch:1094 2>&1 | grep cscs [cms02] echo $? 
1 <-- %GREEN%OK!!!%ENDCOLOR% </pre>
*xrootd tests:*
   * Browsing
      * =$ xrdfs cms0%BLUE%1%ENDCOLOR%.lcg.cscs.ch ls -l -u /store/mc/RunIIFall15MiniAODv2/=
      * =$ xrdfs cms0%RED%2%ENDCOLOR%.lcg.cscs.ch ls -l -u /store/mc/RunIIFall15MiniAODv2/=
   * Downloading
      * =$ xrdcp --debug 1 -f root://cms0%BLUE%1%ENDCOLOR%.lcg.cscs.ch//store/data/Run2015D/Charmonium/AOD/16Dec2015-v1/50000/8672E121-8CAE-E511-8B85-0025905C42FE.root /dev/null=
      * =$ xrdcp --debug 1 -f root://cms0%RED%2%ENDCOLOR%.lcg.cscs.ch//store/data/Run2015D/Charmonium/AOD/16Dec2015-v1/50000/8672E121-8CAE-E511-8B85-0025905C42FE.root /dev/null=
   * Other, simpler [[https://en.wikipedia.org/wiki/Netcat][netcat ( nc ) ]] checks that have to succeed from any network ( try them only if the previous tests failed ) :
      * =$ nc -w 5 -z cms0%BLUE%1%ENDCOLOR%.lcg.cscs.ch 1094= Expected output: =Connection to cms0%BLUE%1%ENDCOLOR%.lcg.cscs.ch 1094 port [tcp/rootd] succeeded!=
      * =$ nc -w 5 -z cms0%RED%2%ENDCOLOR%.lcg.cscs.ch 1094= Expected output: =Connection to cms0%RED%2%ENDCOLOR%.lcg.cscs.ch 1094 port [tcp/rootd] succeeded!=
      * =$ nc -w 5 -z storage01.lcg.cscs.ch 1095= Expected output: =Connection to storage01.lcg.cscs.ch 1095 port [tcp/nicelink] succeeded!=
      * These checks prove that the server firewalls are not blocking the =xrootd= connections *AND* that a service is really listening on those server:port pairs.
%ENDTWISTY%
<br />

---+ SQUID
   * =cms0%RED%2%ENDCOLOR%= [[http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgcms/cscs2/proxy-hit.html][SQUID traffic plots]]
<img width="300" alt="" src="http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgcms/cscs/proxy-hit-week.png" height="150" /><br />
<img width="300" alt="" src="http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgcms/cscs2/proxy-hit-week.png" height="150" />

*Low level debugging*
   * [[https://en.wikipedia.org/wiki/Netcat][netcat ( nc ) ]] check to verify that SQUID =cms0%RED%2%ENDCOLOR%= can be [[https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid#Enabling_monitoring][monitored by SNMP]] :
      * =$ nc -u -z cms0%RED%2%ENDCOLOR%.lcg.cscs.ch 3401= Expected output: =Connection to cms0%RED%2%ENDCOLOR%.lcg.cscs.ch 3401 port [udp/filecast] succeeded!=
   * Cms0X#Test_squid_proxy

---+ VOfeed
%N% CSCS ARC CEs + SE have to be present on http://dashb-cms-vo-feed.cern.ch/dashboard/request.py/cmssitemapbdii <br />
Reference : https://twiki.cern.ch/twiki/bin/view/EGEE/VOTagsVal <br />

---+ Grid services have to be available in the Top BDII
CSCS ARC CEs + SE have to be present on =bdii-fzk.gridka.de= ; to check :
<pre>ldapsearch -x -H ldap://bdii-fzk.gridka.de:2170 -b Mds-Vo-name=CSCS-LCG2,Mds-Vo-name=local,o=grid</pre>
<br />

---+ CMS Central Services Status
To be checked if something is wrong at our site : <br />
https://meter.cern.ch/public/_plugin/kibana/#/dashboard/temp/CMS::CMS <br />

---+ SiteDB info
To be checked *seldom* :
   * https://cmsweb.cern.ch/sitedb/prod/sites/T2_CH_CSCS
   * https://cmsweb.cern.ch/sitedb/prod/pledges/T2_CH_CSCS ( CPU / Storage pledged to CMS )
   * https://espace.cern.ch/WLCG-document-repository/Accounting/Forms/AllItems.aspx?RootFolder=%2fWLCG-document-repository%2fAccounting%2fTier-2&FolderCTID=0x01200021FF24A680BF724BBBB5F470355FAD5F ( monthly accounting vs pledges )
   * http://dashb-cms-jobsmry.cern.ch/dashboard/request.py/dailysummary#button=resourceutil&sites%5B%5D=T2_CH_CSCS&sitesSort=2&start=null&end=null&timerange=lastMonth&granularity=Daily&generic=0&sortby=0&series=All ( monthly accounting vs pledges )

---+ Explanation of the logics
To be checked *seldom* :
   * [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WaitingRoomMorgueAndSiteReadiness][Site Readiness logic]]
   * [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/UsableSitesForAnalysis][Site Readiness logic for CRAB3]]
   * [[https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid][Generic Squid Installation]] + [[https://twiki.cern.ch/twiki/bin/view/CMS/SquidForCMS][CMS Squid Tweaking]]
   * Site Readiness logic for CRAB3 - files to be checked
      * https://cmst1.web.cern.ch/CMST1/SST/analysis/usableSites.txt
      * https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/ASCii/UsableSites.txt

---+ GGUS CMS ticket creation
   * [[https://ggus.eu/index.php?mode=ticket_cms][GGUS CMS ticket creation]]

---+ T2 cms02 VOBox installation doc
Nowadays the CMS VO-box is managed through Puppet by the CSCS admin team. The old recipe is here:
   * [[LCGTier2/Cms0X][cms02 installation page]] | https://gitlab.cern.ch/SITECONF/T2_CH_CSCS | https://gitlab.cern.ch/SITECONF/T2_CH_CSCS_HPC | [[https://git.psi.ch/pata_j/cmsops_ch_scripts][Git repo with cms02 confs]] _to be updated by Joosep_
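
The TCP netcat checks listed under XROOTD can be run in one pass instead of one by one. The following is only a minimal sketch: =check_tcp= and =check_all= are hypothetical helper names introduced here for illustration, not part of any CMS or CSCS tooling, and it assumes an =nc= that supports the =-w= and =-z= options used above.

```shell
#!/bin/sh
# Sketch only: check_tcp and check_all are hypothetical helpers,
# not part of any CMS or CSCS tooling.

# check_tcp HOST PORT
# Succeeds if a TCP connection to HOST:PORT opens within 5 seconds.
check_tcp() {
    nc -w 5 -z "$1" "$2" >/dev/null 2>&1
}

# check_all HOST:PORT [HOST:PORT ...]
# Prints one OK/FAIL line per endpoint and returns non-zero
# if at least one endpoint could not be reached.
check_all() {
    failed=0
    for endpoint in "$@"; do
        host=${endpoint%:*}    # text before the last colon
        port=${endpoint##*:}   # text after the last colon
        if check_tcp "$host" "$port"; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

After sourcing the functions, a possible call is =check_all cms01.lcg.cscs.ch:1094 cms02.lcg.cscs.ch:1094 storage01.lcg.cscs.ch:1095= . The SQUID SNMP check is UDP ( =nc -u= ) and is deliberately left out of this sketch.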
Topic revision: r95 - 2017-02-22 - JoosepPata
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.