<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->

---+ Swiss Grid Operations Meeting on 2013-03-07

   * *Date and time*: First Thursday of the month, at 14:00
   * *Place*: Vidyo (room: Swiss_Grid_Operations_Meeting, extension: 9227296)
   * *External link*: http://vidyoportal.cern.ch/flex.html?roomdirect.html&key=Nrq24qRR4V1u
   * *Phone gate*: From Switzerland: 0225330322 (portal) + 9227296 (extension) + # (pound sign)
   * *IRC chat*: irc:gridchat.cscs.ch:994#lcg (ask pw via email)

---++ Agenda

Status
   * CSCS (reports Pablo):
      * Site entered Phase G:
         * Substitution of all Phase C storage (Thors, and their temporary replacement lent by CSCS) and further extension by six IBM DCS3700 controllers full of 3 TB disks, increasing the available permanent storage from 1.1 to 1.6 Petabytes.
         * Installation of 20 new SandyBridge @ 2.6 GHz compute nodes, increasing the number of available job slots from 1792 to 2432, and the computing capacity from 17500 to 24200 HepSpec06 (1 HepSpec06 = 1 GigaFlop).
         * Installation of 2 virtualization hosts, with 1 TB of space on SSD drives and 96 GB RAM each, that will host all production virtual machines that currently reside on Phase C service nodes.
         * Installation of an authenticated xRootd door for dCache, plus two special services for the CMS XROOTD Federation:
            * xrootd+cmsd in cmsvobox, to act as the CSCS redirector that publishes files to the regional one.
            * dCache xrootd door, chrooted to /pnfs/lcg.cscs.ch/cms/trivcat, on storage01 on a special port. To be authenticated in a couple of months, when we upgrade to 2.2.
      * UMD-2 upgrade status:
         * ARC-CE. Upgrade done, on SL5.
         * Site-BDII. Upgrade done, on SL6.
         * UI. Ongoing, ready soon, but not urgent, since it is only used internally.
         * APEL. Ongoing. We are setting up a parallel instance and trying to reproduce normal behavior.
         * WNs. Waiting for ATLAS validation of SL6. A mixed SL5/SL6 setup is only possible by splitting the cluster, with a separate queue for atlas+sl6.
         * CreamCE. Installing in cream03 next week, the other two in April with new hardware.
         * dCache. Installing in April with new hardware.
   * PSI (reports Fabio):
      * Constantly hunting for old /pnfs user files that push our dCache usage above 90% daily.
      * Preparing the dCache 2.2 migration foreseen for March 28th. So far the setup is:
         * BDII : VM SL6, UMD2, bdii-5.2.12-1, you can observe it by running: =ldapsearch -x -H ldap://t3bdii01.psi.ch:2170 -b o=grid | less=
         * SE : VM SL6, dcache-2.2.8-1, bdii-5.2.12-1, dCache services: =dcap gridftp gsidcap srm spacemanager transfermanagers httpd billing srm-loginbroker pinmanager dir info poolmanager broadcast loginbroker topo=
         * DB : VM SL6, dcache-2.2.8-1, Postgresql 9.2.3, dCache services: =gplazma pnfsmanager cleaner acl admin nfsv3=
      * Evaluating Xrootd and/or WebDav as a new local service for users.
      * Once the migration is done I can provide the configuration to CSCS; not great science there, but it took me a while to write it.
   * UNIBE (reports Gianfranco):
      * Xxx
   * UNIGE (reports Szymon):
      * Xxx
   * UZH (reports Sergio):
      * Xxx
   * Switch (reports Alessandro):
      * UMD1 deprecated after April 2013: tickets have been opened to sites with UMD1 services
      * EMI Nagios tests to replace SAM ones (forwarded to Sigve and CSCS): anyone interested in this?
      * Problems with the WN tar ball distro and the Nagios version test: NGI_DE affected, NGI_CH not (we do not use the tar ball distro)
      * After the end of EMI the UMD repository will be overhauled -> necessary to change the repo details, ongoing
      * How about checking the values in https://accounting.egi.eu/repcountry.php? Gianfranco suggested selecting a time slice, running the accounting script -> un-normalized numbers -> normalize them and compare with the portal
      * ggus/sympa problem: one ggus test ticket was overlooked, but CCing operations@swing-grid.ch somehow creates spurious emails
      * UNIBE problem with srm/gridftp -> bug: the grid manager stops when the transfer fails (weird). It seems solved now by disabling the (deprecated) gridftp test (UNIBE-ID was not affected)
      * On March 11th EMI3 will be announced (Monte Bianco)
      * EGI asked for feedback on the (needed?) support for Debian: anyone?
      * OMB discussed the ARGUS use case: not compulsory; do we want it? Note that for ARC a text file is used instead.
      * New EGI data retention policy: retention period of 12 months -> 18 months (proposed date of enforcement July 1st) -> this requires a change on the SGAS server at SWITCH (which serves SWING/SMSCG)
      * Sigve circulated the agenda for the INSPIRE meeting in March/Lugano (CSCS)

Other topics
   * HammerCloud Jobs and Site Readiness
      * % of successful jobs generally low (between 80% and 90%) for CMS HC jobs; comparable sites usually have >99%
      * CMS jobs < 80% for the last 2 days (so site in status '[[https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_CH_CSCS][Not Ready]]' at the moment)
      * Is the problem understood (is it really an ARGUS 1.4.1 problem that will be solved by upgrading to UMD2)?
      * Does ATLAS see similar problems with HC jobs at CSCS?
   * Next Face to Face meeting in two weeks, in two parts:
      * CHIPP, from 10.15h to 14.15h
      * EGI, from 14.30h to 16.30h
   * Topic3

Next meeting date:

AOB

---++ Attendants

   * CSCS: George, Pablo
   * CMS: Daniel, Fabio
   * ATLAS:
   * LHCb: Roland
   * EGI:

---++ Action items

   * Item1
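
Note on the accounting cross-check raised in the Switch report (select a time slice, get un-normalized numbers from the accounting script, normalize them, compare with the portal): the arithmetic could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual accounting script. It assumes normalization means multiplying raw CPU hours by an average HepSpec06-per-slot factor, which is derived here from the Phase G figures in the CSCS report (24200 HepSpec06 over 2432 job slots). The function names and the 5% tolerance are hypothetical, and real sites may publish a different per-node factor.

```python
# Sketch of the accounting cross-check; all names here are hypothetical.

# Phase G figures quoted in the CSCS report above.
TOTAL_HS06 = 24200
JOB_SLOTS = 2432


def hs06_per_slot(total_hs06=TOTAL_HS06, slots=JOB_SLOTS):
    """Average HepSpec06 rating of one job slot."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    return total_hs06 / slots


def normalize(raw_cpu_hours, factor):
    """Convert un-normalized CPU hours into HS06-hours (assumed convention)."""
    return raw_cpu_hours * factor


def relative_difference(local_value, portal_value):
    """Signed relative deviation of the local number from the portal number."""
    if portal_value == 0:
        raise ValueError("portal value must be non-zero")
    return (local_value - portal_value) / portal_value


def agrees(local_value, portal_value, tolerance=0.05):
    """True if both numbers match within the (assumed) tolerance."""
    return abs(relative_difference(local_value, portal_value)) <= tolerance


if __name__ == "__main__":
    factor = hs06_per_slot()
    # Placeholder inputs for illustration only, not real accounting data.
    raw_hours = 1000.0
    portal_hs06_hours = 10000.0
    local_hs06_hours = normalize(raw_hours, factor)
    print("factor:", round(factor, 3))
    print("local HS06-hours:", round(local_hs06_hours, 1))
    print("within tolerance:", agrees(local_hs06_hours, portal_hs06_hours))
```

The comparison is kept relative rather than absolute so that the same tolerance can be applied to time slices of different lengths.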
Topic revision: r11 - 2013-04-04 - PabloFernandez