<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

---+!! %TOPIC%
%TOC%

---++ Symptoms

Summary: %FORMFIELD{"Symptom summary"}%

The SE cannot be accessed through SRM.
   * In t3se01:catalina.out: =createConnection(): Got exception org.postgresql.util.PSQLException, SQLState: 53300=
   * In t3dcachedb01:pg_sql: =FATAL: connection limit exceeded for non-superusers=

---++ Occurrences

At what times did this problem occur (used to estimate frequency):
| 2011-11-27 ~ 22:40 according to the [[https://t3nagios.psi.ch/nagios/cgi-bin/status.cgi?host=t3vm01][Nagios t3vm01 SRM checks]] |

---++ Observations
<!--
#collect here the information which may help to better understand the state of the system or services, e.g.
#log excerpts, strace output, etc.
#this also may help to identify the problem if similar conditions arise again
-->

The maximum number of connections is set by =max_connections= in =/var/lib/pgsql/data/postgresql.conf=. According to the [[http://www.postgresql.org/docs/8.2/static/runtime-config-connection.html][manual]], raising this number may also require increasing the SysV parameter =SEMMNI=.

It is not clear why the limit of 100 connections is reached. Probably a number of transfers fail and leave hanging connections at the DB level, which pile up until the limit is reached. A Nagios plot will be created (Fabio) to monitor this pile-up effect.

---++ Solution or Workaround

A clean restart of dcache on se01, following the instructions in StartStopDcache. This has cured the issue every time so far.

---++ Monitoring for this condition
<!--
#how can this condition be recognized automatically, if at all?
-->

A check on the number of DB connections is needed in Nagios, so I have implemented one with [[http://bucardo.org/check_postgres/check_postgres.pl.html#backends][check_postgres]]; see the [[https://t3nagios.psi.ch/nagios/cgi-bin/extinfo.cgi?type=2&host=t3dcachedb01&service=PostgreSQL+number+of+connections+per+DB][Nagios check deployed on t3dcachedb01]].

<pre>
[root@t3dcachedb01 ~]# rpm -ql check_postgres
/usr/bin/check_postgres.pl
/usr/share/doc/check_postgres-2.12.0/check_postgres.pl.html
[root@t3dcachedb01 ~]# grep postgres /etc/nagios/nrpe.cfg
command[check_postgres_backends]=/usr/bin/check_postgres.pl --action=backends
</pre>

-- Main.LeonardoSala - 2011-11-28
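For a quick manual cross-check of the Nagios result, the comparison between the number of open connections and =max_connections= can be sketched in shell as below. This is only an illustration, not the deployed check: the function names =get_max_connections= and =nagios_status= and the 80%/90% thresholds are assumptions made for this example, and the fallback of 100 assumes the stock PostgreSQL default when the setting is commented out.

```shell
# Sketch only: function names and thresholds are illustrative assumptions,
# not part of check_postgres or of the deployed Nagios configuration.

# Print the effective max_connections from a postgresql.conf file.
# Commented-out lines are ignored; if the setting appears several times,
# the last one wins. Falls back to 100 (assumed default) if it is absent.
get_max_connections() {
    local conf="$1" value
    value=$(sed -n -E "s/^[[:space:]]*max_connections[[:space:]]*=?[[:space:]]*'?([0-9]+).*/\1/p" "$conf" | tail -n 1)
    echo "${value:-100}"
}

# Map a connection count to a Nagios status code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL.
# warn_pct and crit_pct are hypothetical thresholds (percent of the limit).
nagios_status() {
    local used="$1" limit="$2" warn_pct="${3:-80}" crit_pct="${4:-90}"
    local pct=$(( used * 100 / limit ))
    if [ "$pct" -ge "$crit_pct" ]; then
        echo 2
    elif [ "$pct" -ge "$warn_pct" ]; then
        echo 1
    else
        echo 0
    fi
}

# Possible use on the DB host (assumes the postgres superuser can connect locally):
#   used=$(psql -U postgres -At -c "SELECT count(*) FROM pg_stat_activity")
#   limit=$(get_max_connections /var/lib/pgsql/data/postgresql.conf)
#   nagios_status "$used" "$limit"
```

Connecting as a superuser matters here: once the error =connection limit exceeded for non-superusers= appears, ordinary users can no longer connect, while the slots reserved by =superuser_reserved_connections= should still allow the =postgres= user to inspect =pg_stat_activity=. Grouping that query by =datname= would show which database the hanging connections belong to.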
IssueForm
| *Affected Service* | dcache, srm |
| *Symptom summary* | too many pgsql connections, error 53300 |
| *Reason Understood* | no |
| *Solution Exists* | workaround |
| *Obsolete* | no |
Topic revision: r2 - 2011-11-28 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.