T3 Downtime due to PSI yearly Compute Center Maintenance -- 05. 01. 2022 DerekFeichtinger Downtime will last from 16:00h Fri, 7. Jan until 10:00h on Mon, 10. Jan |
Major downtime Wed, 26. 5. 2021 to Thu, 27. 5. 2021 -- 20. 04. 2021 DerekFeichtinger Downtime due to construction work for upgrading the PSI compute center's power and cooling capacities. |
New worker nodes t3wn70-73 adding 512 new slots -- 12. 04. 2021 DerekFeichtinger 4 powerful worker nodes based on AMD EPYC were added to the Tier-3: t3wn[70-73] |
New NFS /work area -- 12. 03. 2021 DerekFeichtinger A more performant and backed-up /work NFS service. You can read up on the changes in accessing snapshots by following the links. |
June 05-07 Downtime for SE reconfiguration and upgrade -- 02. 06. 2020 DerekFeichtinger The SE will be reconfigured to allow additional forms of access to its files. NFS4.1 will allow direct access to numpy data sets for your Python-based analysis. The downtime will start on Fri 8AM and last until Mon 8AM. |
17-18 Feb 2020 T3 DOWNTIME -- 10. 02. 2020 NinaLoktionova Due to a dCache upgrade, no services will be available on 17-18 February. |
T3 shutdown on 10-13 January 2020 -- 09. 12. 2019 NinaLoktionova In January 2020 the annual system test (maintenance day) will be conducted across the entire PSI. IT services will be unavailable or only partially available from Friday, 10 January 2020, 20:00 until Sunday, 12 January 2020, 20:00. |
Shutdown of Tier3 on 4-6 January 2019 -- 21. 12. 2018 NinaLoktionova Because of the central PSI shutdown, the T3 will be unavailable from the afternoon of 4.01 until the afternoon of 7.01. |
Upgrade of T3 Storage -- 13. 09. 2018 NinaLoktionova Since August we have put 4 new storage servers into production to replace old hardware and expand the dCache storage. The /pnfs space is now about 1.2PB. |
T3 will NOT be available because of dCache upgrade on Apr. 19 -- 09. 04. 2018 NinaLoktionova The scheduled time for this intervention is from 9:30 to 19:30 |
Tier-3 Security Update Downtime On 25.07 At 10:00 -- 13. 07. 2017 NinaLoktionova Dear all, we expect an interruption of services of about 2 hours. |
Emergency Downtime of T3 due to HW failure -- 08. 06. 2017 DerekFeichtinger Due to a controller failure on the central home file system storage, the T3 was taken offline on 6. 6. 2017. We must wait for delivery of the replacement part (expected on 8. 6. 2017). |
Shutdown of T3 on Mon Apr 3 due to security update -- 30. 03. 2017 DerekFeichtinger We are required to perform an urgent security update to some of the T3 hosts, else we risk suspension from the grid services. |
Snapshots of /shome now taken only on a daily basis -- 26. 02. 2017 DerekFeichtinger Snapshots of the home file system /shome will only be taken on a daily basis, since hourly snapshots led to frequent problems with users not being able to free space. |
Annual power test at PSI on 07/08.01.2017 -- 08. 12. 2016 JoosepPata PSI will perform a power test on the aforementioned dates, meaning that all T3 computing will be unavailable during this time. |
/shome reboot on 27.10.2016, 12:30-13:30 -- 26. 10. 2016 FabioMartinelli /shome reboot on 27.10.2016, 12:30-13:30 in order to upgrade both the RAID FW and the related Linux driver |
New T3 UIs -- 12. 10. 2016 FabioMartinelli Installed t3ui01 (PSI), t3ui02 (ETHZ), t3ui03 (UniZ) |
T3 scheduled downtime on Fri 23.09.2016 - 13:30-18:00 -- 19. 09. 2016 FabioMartinelli In order to upgrade both the /pnfs and the /shome file service the T3 will be in scheduled downtime on Fri 23.09.2016 - 13:30-18:00 |
T2 scheduled downtime - 21/09/2016 from 08:00 till 20:00 -- 13. 09. 2016 FabioMartinelli T2 scheduled downtime - 21/09/2016 from 08:00 till 20:00 |
T3 scheduled downtime - Friday 01/07/2016 from 14:00 till 15:00 -- 16. 06. 2016 FabioMartinelli On Friday 01/07/2016 from 14:00 till 15:00 the /shome file service is going to be updated |
New PSI mailing list service -- 18. 05. 2016 FabioMartinelli The PSI mailing list service has been migrated from mailman to Sympa |
576 new CPU cores available at T3 -- 03. 05. 2016 FabioMartinelli Deployed 9 t3wn* servers featuring 64 CPU cores / 128GB RAM / 10Gbps cards |
T3 scheduled downtime - Friday 06/05/2016 from 10:00 till 17:00 -- 02. 05. 2016 FabioMartinelli On Friday 06/05/2016 from 10:00 till 17:00 we're upgrading both the /pnfs file service and the T3 networking (to 10Gbps) |
New T3 shome & swshare -- 22. 03. 2016 FabioMartinelli New T3 shome & swshare mounted in /mnt/t3nfs01/data01/{shome,swshare} |
T3 downtime 8th/9th Jan 2016 -- 10. 12. 2015 FabioMartinelli PSI is going to perform its yearly electrical tests |
/pnfs scheduled downtime - Friday 25/09/2015 from 13:30 till 14:30 -- 22. 09. 2015 FabioMartinelli The /pnfs file service will be updated from version 2.13.7 to 2.13.9 |
/pnfs scheduled downtime - Friday 11/09/2015 from 13:00 till 20:00 -- 01. 09. 2015 FabioMartinelli The /pnfs file service will be upgraded from version 2.10 to 2.13 |
T3 downtime from 24/07/2015 at 16:00 till 27/07/2015 at 14:00 -- 14. 07. 2015 FabioMartinelli The T3 will be completely stopped for ~3 days |
/pnfs scheduled downtime - Friday 12/06/2015 from 13:00 till 14:00 -- 10. 06. 2015 FabioMartinelli On Friday we're testing if dCache properly reboots from scratch. |
Possible T3 power cut on Thursday 30th April from 19:00 till 19:30 -- 30. 04. 2015 FabioMartinelli Unexpected electrical maintenance at T3 |
T3 is in downtime on Friday 20th March from 12:00 till 22:00 -- 16. 03. 2015 FabioMartinelli We're upgrading dCache from version 2.6 to version 2.10 |
All the SL5 UI servers except t3ui05 will be decommissioned -- 11. 03. 2015 FabioMartinelli On Monday 16th March we're stopping all the old SL5 t3ui0* servers except t3ui05 |
VOID !! T3 is in downtime on Friday 20th Feb from 12:00 till 22:00 -- 16. 02. 2015 FabioMartinelli VOID !! We're upgrading dCache from version 2.6 to version 2.10 |
On 19.12.2014 at 12:30 all the t3ui are going to be rebooted -- 19. 12. 2014 FabioMartinelli Because of a Linux security update all the t3ui* servers will be rebooted |
All the t3wn will be migrated to SL6 by the end of Jan 2015 -- 15. 12. 2014 FabioMartinelli Both CMSSW and WLCG expect an SL6 T3 in order to work properly, so all our SL5 t3wn* will be reinstalled as SL6 t3wn* |
T3 will be in downtime from Fri 9th Jan at 16:00 till Mon 12th Jan at 14:00 -- 08. 12. 2014 FabioMartinelli PSI is performing its yearly electrical tests |
How to use both IPython and CMSSW IPython on SL6 -- 26. 11. 2014 FabioMartinelli HowToWorkInCmsEnv#The_CMS_Environment_and_IPython |
/cvmfs/cms.cern.ch will replace /swshare/cms on Fri 21-11-2014 at 16:00 -- 17. 11. 2014 FabioMartinelli /cvmfs is the standard technology used in the CMS Grid to distribute each CMSSW release around the world. |
gfalFS -- 14. 11. 2014 FabioMartinelli On the SL6 t3ui* servers the gfalFS tool can be used to mount one or more CMS SEs as local directories. |
/pnfs downtime on Friday 12th Sep from 13:00 until ~19:00 -- 03. 09. 2014 FabioMartinelli The /pnfs file service will be in downtime to be updated to its latest version |
New SL6 WNs -- 15. 08. 2014 FabioMartinelli The T3 batch queues also use the new SL6 WNs t3wn[41,43,44,50] (100GB RAM, 32 cores) |
New SL6 UIs -- 16. 07. 2014 FabioMartinelli New SL6 t3ui[12,15,16,17,18,19] providing 1.7TB /scratch RAID10 |
Scheduled t3ui02,03,06,07 downtime on Fri 20th Jun from 14:00 to 16:00 -- 18. 06. 2014 FabioMartinelli Next Friday 20th June at 14:00 we're going to enlarge the t3ui0[2,3,6,7]:/scratch by ~100GB |
New tool 'dc_find' to quickly search /pnfs -- 04. 04. 2014 FabioMartinelli Using the tool dc_find, T3 users can easily list their own, their group's, or all /pnfs files and propose to the admins what should be deleted. |
CERN has introduced the new gfal CLIs and APIs to interact with the Grid SEs -- 25. 03. 2014 FabioMartinelli During 2014 Grid users have to replace their use of the lcg-* commands with the new gfal-* commands |
Scheduled downtime from Friday 10th January 08:00 am to Monday 13th January 11:00 am -- 17. 12. 2013 FabioMartinelli As every year, PSI has a forced shutdown of most IT services because of its general electrical maintenance. |
Scheduled network downtime on Wed 18th Dec 2013 - from 9:00 am to 10:00 am -- 13. 12. 2013 FabioMartinelli The T3 will be unavailable on Wed 18th Dec 2013 from 9:00 am to 10:00 am because of network recabling. |
Scheduled downtime to upgrade dCache from version 2.2 to 2.6 -- 8. 11. 2013 FabioMartinelli On Friday 8th Nov at 12:00 we're going to upgrade dCache from version 2.2 to 2.6 |
Network slowness on the nodes t3wn30-40 -- 03. 10. 2013 FabioMartinelli Today the nodes t3wn[30-40] are limited to 10MB/s; the PSI network team will check our switches. |
1h downtime on Friday 13 Sep 2013 9:30 am -- 10. 09. 2013 FabioMartinelli Because of a HW error we need to stop the /pnfs file service for ~1h. |
devtools-1.1 utilities installed on the t3ui,t3wn servers -- 04. 09. 2013 FabioMartinelli With these utilities users can choose between the 2012 gcc compilers (ver. 4.7.2) and the default 2008 gcc compilers (ver. 4.1.2) |
Scheduled downtime to introduce the new Storage System -- 15. 08. 2013 FabioMartinelli On Friday 16th morning we'll stop the /pnfs file services to quickly introduce the new Storage System. |
New WNs Grid middleware to be tested before 1st June. -- 06. 05. 2013 FabioMartinelli A temporary batch queue short.q.validation has been created to validate the new WNs Grid middleware. |
T3 Scheduled Downtime on Monday 6th May ~9:30 am -- 29. 04. 2013 FabioMartinelli We need to stop several VMs for ~2h so that they can be migrated into the new PSI VMware cluster. |
T3 Scheduled Downtime on March 28th -- 27. 02. 2013 FabioMartinelli We're going to upgrade both dCache to ver. 2.2 and PostgreSQL to ver. 9.2.3 |
Doodle about the next T3 downtime in March -- 18. 02. 2013 FabioMartinelli Doodle opened; it will be closed on Feb 25th |
/pnfs/psi.ch/cms NFS Read-only mounted on each UI -- 18. 02. 2013 FabioMartinelli /pnfs/psi.ch/cms is now mounted read-only on each UI to give users handy navigation of /pnfs. |
Reinstallation of t3ui02 t3ui03 t3ui04 on Feb 8th -- 07. 02. 2013 FabioMartinelli After this final reinstallation all our UIs t3ui0[1-9] will offer 261GB /scratch. |
Reinstallation of t3ui05 t3ui06 t3ui07 on Feb 6th -- 04. 02. 2013 FabioMartinelli We're going to reinstall t3ui05 t3ui06 t3ui07 to increase both the size and the speed of their /scratch. |
Downtime for the annual PSI IT maintenance on Fri, Jan 11th, 16h until Mon 14th, evening -- 17. 12. 2012 FabioMartinelli Annual PSI IT maintenance, which we will also use to migrate dCache from 1.9.5 to 1.9.12. |
Downtime for major upgrade of the SE on Thu Nov 29th - Fri 30th -- 02. 11. 2012 DerekFeichtinger dCache, the Storage Element SW, will be upgraded. The upgrade involves a complete migration of the underlying database to a new format (chimera). Therefore, all operations involving access to the SE must be stopped for the duration of the upgrade. |
Enforced flexible Job RAM limits -- 26. 07. 2012 FabioMartinelli Each job will request 3GB of RAM by default, but it is permitted to explicitly request up to 6GB. |
Added new WNs t3wn[30-40] -- 05. 06. 2012 FabioMartinelli Introduced 176 additional job slots, each ~1.2 times faster than the previous 160 slots. |
Major Downtime March 14/15 for T3 upgrades -- 09. 02. 2012 DerekFeichtinger Major upgrades to the storage and compute infrastructure of the T3 require a complete exchange of the current network switching. |
Introduced /tmp and /scratch disk quotas on UIs and WNs -- 13. 01. 2012 FabioMartinelli To prevent any single user from filling a shared partition, and to observe the others' space usage. |
Downtime Jan 6-8 -- 23. 12. 2011 DerekFeichtinger Due to PSI computing center maintenance the Tier-3 will go on downtime from Fri Jan 6, 15h in the afternoon until Monday Jan 9 in the early morning. |
Introduction of new walltime limits for all.q, long.q. New interactive debugging queue -- 23. 11. 2011 DerekFeichtinger On Nov 28, based on the agreed policies, we will introduce a limit of 10h for jobs on all.q and 96h for jobs on long.q. The new interactive debugging queue is now also accessible to users. |
Testing of new batch system policies -- 11. 10. 2011 DerekFeichtinger In order to keep some fast turnaround resources free during normal work hours, we are testing out a number of new batch system policies. Submit to the short.q (up to 90 min jobs) to benefit from the free slots. |
Downtime Jan 7-9 for PSI compute center maintenance -- 05. 01. 2011 DerekFeichtinger Due to maintenance and systems testing in the PSI compute centers we must shut down the Tier-3 for the weekend of Jan 8/9. The downtime will begin on Friday evening, 17h. The system will be brought up again on Sunday morning. |
Maintenance downtime Tue, Dec 21st 2010 FINISHED Need to do some end of the year maintenance + enable shome quotas. |
SE impaired due to fileserver problem - RESOLVED Problem with one fileserver (t3fs07) where the disk failover and spare replacement did not work correctly. Solved by replacement parts from SUN/Oracle support on Oct 22. |
Short downtime on Mon, June 14, 9h-10h for NFS server reboot -- 14. 06. 2010 DerekFeichtinger The management processor on the NFS home server is in an unresponsive state. A total reboot + firmware upgrade is needed (as announced on t3 user mailing list). |
RAM upgrade on Fri, May 21 -- 17. 05. 2010 DerekFeichtinger Upgrade of the PSI Tier-3 worker nodes to 24 GB RAM per node (3GB per core). |
Downtime Thu Mar 11th (+ 12th in case of problems) -- 09. 03. 2010 DerekFeichtinger Upgrade of dCache to 'golden' production release. Pool migrations |
Downtime Mon Jan 25 - Tue Jan 26 -- 20. 01. 2010 DerekFeichtinger Upgrade of WNs to SL5, reinstallation of batch system, another try at upgrading the dCache storage manager to 1.9.5-11 |
Downtime 8th to 11th Jan 2010 -- 16. 12. 2009 DerekFeichtinger Due to a power shutdown at PSI on Sat 9th, all systems need to go down. |
Emergency downtime Thu Dec 11 2009 - Sun Dec 13 -- 11. 12. 2009 DerekFeichtinger Due to a repeated failure of dCache on file server t3fs05 we had to take a downtime. The fix requires a Solaris OS update (reinstallation). The UI will also be unavailable from Friday noon. |
Downtime July 30/31 2009 (finished) -- 21. 07. 2009 DerekFeichtinger Downtime for a number of upgrades and the introduction of a larger NFS area |
Downtime May 8th - 10th (finished) -- 05. 05. 2009 DerekFeichtinger Basic OS and MW updates and adaptation of the SE information system to get us correctly registered |
Quotas for /shome filesystem -- 10. 01. 2009 DerekFeichtinger To protect the system from filling up we enforced user quotas on /shome (15GB soft / 25GB hard limit) |
PSI CMS Tier-3 cluster is now online -- 03. 11. 2008 DerekFeichtinger The PSI CMS Tier-3 cluster is now online. This is the common cluster for CMS members of ETHZ, University of Zurich and PSI. |
Tier-3 users' test phase -- 05. 10. 2008 DerekFeichtinger The cluster is ready for test users. CMSSW jobs run fine. Data can be ordered with PhEDEx. CRAB jobs work ok except for some options like -resubmit. User feedback is found on the CMSTier3Log1 page |
Tier-3 Installation phase (2) -- 30. 08. 2008 DerekFeichtinger Most low level problems solved. Registering to Grid and CMS services. Todo list. |
Tier-3 Installation Phase -- 18. 08. 2008 DerekFeichtinger The Tier-3 is in the install phase. First successful tests with the storage have been done, but a major OpenSolaris problem is slowing us down |