<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->
---+ CSCS Operations Meeting on 2016-09-27

   * *Date and time*:
   * *Place*:
   * *External link / EVO*:

---++ Agenda
   * First meeting's overview
   * Issue list review
   * Ticket review
   * Maintenances
   * Other/AOB

---++ Attendants
   * Fabio, Roland, Dino, Dario, Gianni, Miguel, Stefano, Pablo, Luis
   * Gianfranco sent his apologies but provided feedback by email

---++ Minutes
On the task list inside the first (priority) block (before the end of October):
   * Requests for offers to SNF have been sent; waiting for input from vendors. SNF info input is in progress.
   * The efficiency problems with scratch are currently treated as an incident that needs to be solved with high priority. CMS confirms there is no problem, but confirmation from LHCb and ATLAS is still needed.
   * Regarding the ARC config, ATLAS asked (via email) for the configuration of arc01-03 and the history of changes, if available.
   * We need to see both the ATLAS and the CMS efficiency plots and compare them in order to better understand what happened in the last year(s). Fabio should send CSCS the graph, and CSCS will try to compare both and find commonalities and differences.
   * Work on authentication in Kibana with grid certificates is ongoing. Fabio would like temporary access through an ssh tunnel, and Dino will explain how on the chat.
   * Joining the VOs: CMS is ready, ATLAS is waiting for two names. LHCb reported (before the meeting) that this is not needed. The rest of the task (becoming familiar with the VO dashboards) is rescheduled for the Hackathon period (end of October).
   * Gianfranco sent two links for identifying A/R metrics: http://wlcg-sam-atlas.cern.ch/templates/ember/#/historicalsmry/heatMap?group=ATLAS_Cloud_DE&profile=ATLAS_AnalysisAvailability&time=1m&type=Availability%20Ranking%20Plot and http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=562&view=Shifter%20view#time=720&start_date=&end_date=&use_downtimes=false&merge_colors=false&sites=multiple&clouds=DE&site=CSCS-LCG2,CSCS-LCG2_MCORE,ANALY_CSCS,CSCS-LCG2-HPC,CSCS-LCG2-HPC_MCORE,ANALY_CSCS-HPC

Dashboard Hackathon:
   * We should not hold the hackathon without a clear plan of what we want to get done. In the absence of a better plan, Pablo asks everyone to provide, within the next two weeks, a list of TWO lights/metrics (the ones they consider most important) that they would like to see in the dashboard. Stefano will coordinate the Hackathon.

Regarding the rest of the task list (to be addressed starting in November):
   * We need to keep an eye (statistics) on memory utilization and on problems derived from memory abuse (e.g. swapping) before imposing limits on jobs. Fabio suggests imposing a maximum of 2xRequiredMem, but Pablo insists that this could cause other problems and that we should not try to solve problems that don't exist (Dino reports nodes are not swapping). This might be a problem, though, for LHConCRAY, so the issue will be referred to that project instead.
   * The VO-Box discussion (and related tickets) is still waiting for Derek's input. This is getting increasingly important since Fabio is leaving, because the continuity of the CMS vobox needs to be guaranteed. It was agreed to increase the priority of this task by moving the deadline to November instead of December (which would be too tight for Fabio).
   * The Slurm reports might be easy to fix: CSCS will re-assess the work involved and see whether it can be done easily/quickly.
   * The "Finalize BDII" task is clarified: it involves changing the HA setting from lbcd to keepalive.
   * All the other tasks received no input and their priority is unchanged.

Regarding open tickets:
   * #22368 Check CSCS status on CMS dashboard. This is top priority for Fabio, but it depends on the decision from Derek (another reason to have a final word).
   * #24193 Stalled jobs at CSCS. This is an old ticket from Vladimir and can be closed.
   * Time ran out and we could not go through all tickets in detail, but the VO Reps report there is no burning issue at the moment.

Next meeting in two weeks, same day and same time (11th of October at 14:00).

---++ Action items
   * ATLAS and LHCb to confirm whether the efficiency problems are still there
   * CSCS to send Gianfranco the ARC config and the history of changes, if available
   * Fabio to send CSCS a graph with efficiency plots covering up to 4 years of history
   * Dino to help Fabio use Kibana via ssh tunnel
   * CSCS to send Gianfranco the two names to include in the VO
   * Derek to send CSCS a reply to the VO-Box proposal. Final implementation to be finished before the end of November
   * CSCS to reassess the work involved in fixing the Slurm reports
   * CSCS to update the task list with the proposed changes
   * EVERYONE to provide a list of two lights/metrics that they would like to see in the dashboard before the next meeting
Topic revision: r1 - 2016-09-28 - PabloFernandez
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.