<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->
---+ Swiss Grid Operations Meeting on 2013-10-10

   * *Date and time*: Thursday 10 October 2013, at 14:00
   * *Place*: Vidyo (room: Swiss_Grid_Operations_Meeting, extension: 9227296)
      * *External link*: http://vidyoportal.cern.ch/flex.html?roomdirect.html&key=Nrq24qRR4V1u
      * *Phone gate*: From Switzerland: 0225330322 (portal) + 9227296 (extension) + # (pound sign)
   * *IRC chat*: irc:gridchat.cscs.ch:994#lcg (ask pw via email)

---++ Agenda

Status
   * *CSCS* (reports Miguel):
      * SLURM migration status: all going according to plan for CREAM-CE (cream02). We still need to finish tuning ARC-CE (arc02) so that it publishes correct accounting data. APEL seems to be OK as well, but we are not publishing accounting with the new APEL (apel02) until the APEL development team gives us the green light (GGUS #97623).
         * Further tuning is needed on the information system of CREAM-CE (especially with regard to GLUE2).
         * Plan to open the firewall today for cream02 and tomorrow for arc02. Submission to cream04 has been disabled, as we need to move the gridmapdir to the NAS (same location as the other SLURM CREAM-CEs).
         * The queues have changed names, as already mentioned in Digest #1.
      * New storage arrived, installed in the racks, RAIDs tested, but SEs not ready yet. It will be added to dCache once we are done with SLURM.
      * New IB and ETH switches purchased as planned (IB arrived, ETH not yet).
      * According to the last EGI Operations meeting, we need to upgrade dCache to version 2.6 in order to be compatible with SHA-2. The deadline is the end of November. We suggest doing this on 6 November, during the CSCS standard maintenance day. TBC.
   * *PSI* (reports Fabio):
      * Finally dropped all the constantly failing 1TB Seagate disks from the 5 * =SUN X4540= !
      * Upgraded the 5 * =SUN X4540 ILOM FW=
      * Upgraded the 5 * =SUN X4540 LSI HBA FW= by using the [[http://www.oracle.com/technetwork/server-storage/servermgmt/tech/hardware-installation-assistant/index.html][Oracle Hardware Installation Assistant CD]]
      * Installed a [[http://www.oracle.com/technetwork/server-storage/solaris/overview/solaris-latest-version-170418.html][Solaris 10 1/13]] [[https://wiki.chipp.ch/twiki/bin/view/CmsTier3/NodeTypeJumpStart][Jumpstart Enterprise Toolkit]] VM =t3jumpstart01=
      * Automatically reinstalled the 5 * =SUN X4540= with *[* [[http://www.oracle.com/technetwork/server-storage/solaris/overview/solaris-latest-version-170418.html][Solaris 10 1/13]] + dCache 2.9 + JDK 7 *]* by using *[* =t3jumpstart01= + Puppet + http://pkgutil.net/ *]*
      * So 3 * =SUN X4540= are used in production; for the remaining 2 * =SUN X4540=, I'm waiting for the additional 1TB Hitachi disks that are going to be sent by CSCS.
      * Preparing the [[http://www.dcache.org/manuals/upgrade/upgrade-2.2-to-2.6.html][dCache 2.2 to 2.6 upgrade]]
   * *UNIBE* (reports Gianfranco):
      * ce01.lhep cluster gained stability (SunBlades+thumpers). The only remaining issue is the cvmfs partition filling up from time to time. Need to follow up and/or move to the NFS shared cache. Still running production only (and local users); commissioning for analysis is still pending.
      * ce.lhep cluster (older hardware) has been shut down and the ce.lhep service has been decommissioned in GOCDB
         * cluster now expanded by 55 nodes inherited from CERN (we should have ~850 cores on this)
         * completely re-cabled (power+network)
         * could only use 2 Force10 switches for now (problem re-configuring more of them)
         * ce.lhep now rebuilt as ce02.lhep: ROCKS 6.1 with SLC6.4
         * working right now on the images for WN with ATLAS customisation, Lustre MDS and OSS
         * expect to have part (or all?) of it online by the end of next week
   * *UNIGE* (reports Szymon):
      * A script to check if "everything" is OK on a batch or login machine
         * checks: all NFS file systems, /cvmfs, AFS, / and /var and /tmp <90% full, /tmp writing, pbs_mom
         * running in an hourly cron
         * if one of the checks fails, or the script blocks, e.g. on df: email and 'offline' the host
         * automatic elimination of "black holes", for the 1st time
      * Our own Nagios, initial setup, pings all machines
         * thanks to Fabio for a few useful hints
      * Work still going on in the machine room
         * New disk servers (4 x IBM x3630 M4, 43 TB each for data) physically mounted, in racks
         * New infrastructure to install OS via network boot
            * Yann is working on how to install OS and get console
         * Free CPU, inherited from ATLAS Trigger (35 x DELL 8 core) will wait more
   * *UZH* (reports Sergio):
      * Xxx
   * *Switch* (reports Alessandro):
      * Xxx

Other topics
   * Sysadmin training to be done at the beginning of November. Exact dates to be defined. Candidate dates are 4-5 OR (ideally) 11-12 November. Other suggestions?
   * Fabio: I'd like to connect by SSH and read the [[https://ngi-de-nagios.gridka.de/nagios/][German Nagios logs]]. If you want to request the same access, please send me the following (your silence => I don't need it):
      * your desired %BLUE%account name%ENDCOLOR%
      * the %BLUE%IPs%ENDCOLOR% from where you are going to connect
      * your %BLUE%SSH Pub key%ENDCOLOR%.
      * *16th Oct*: so far Miguel and I have sent our keys to A.Usai, who forwarded them to KIT; waiting for their feedback.

Next meeting date: Suggesting October 31.

AOB

---++ Attendants
   * CSCS: Miguel Gila, Gianni Ricciardi
   * CMS: Fabio, Daniel
   * ATLAS:
   * LHCb: Roland
   * EGI:

---++ Action items
   * Item1
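
For reference, the node health check described in the UNIGE report could look roughly like the sketch below. It is not the UNIGE script: the function names, the path list and the use of =pgrep= are assumptions made for illustration. Only the 90% limit, the /tmp write test and the =pbs_mom= check come from the report. Detecting a check that blocks (e.g. on a stale NFS mount), sending the email and offlining the host are left out; those would need an external watchdog and site-specific commands.

```python
import shutil
import subprocess
import tempfile

USAGE_LIMIT = 0.90  # "<90% full" threshold taken from the UNIGE report


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the file system holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some pseudo file systems report a size of zero
        return 0.0
    return usage.used / usage.total


def can_write(directory):
    """Return True if a small temporary file can be created in directory."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"ok")
            handle.flush()
        return True
    except OSError:
        return False


def process_running(name):
    """Return True if a process with exactly this name exists (assumes pgrep is installed)."""
    try:
        result = subprocess.run(["pgrep", "-x", name],
                                stdout=subprocess.DEVNULL, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_host(paths, tmp_dir="/tmp", daemon="pbs_mom",
               usage=usage_fraction, running=process_running):
    """Run all checks and return a list of failure messages (empty list = healthy)."""
    failures = []
    for path in paths:
        try:
            fraction = usage(path)
        except OSError as exc:
            failures.append(f"{path}: not accessible ({exc})")
            continue
        if fraction >= USAGE_LIMIT:
            failures.append(f"{path}: {fraction:.0%} full")
    if not can_write(tmp_dir):
        failures.append(f"{tmp_dir}: not writable")
    if not running(daemon):
        failures.append(f"{daemon}: not running")
    return failures
```

An hourly cron job could call =check_host(["/", "/var", "/tmp", "/cvmfs"])= (a hypothetical path list) and, when the returned list is not empty, send the email and mark the node offline in the batch system. The =usage= and =running= parameters exist only so that the checks can be exercised without touching the real file systems or process table.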
Topic revision: r10 - 2013-10-16 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.