<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

---+!! %TOPIC%
%TOC%

---++ Symptoms
Summary: %FORMFIELD{"Symptom summary"}%

---++ Occurrences
At what times did this problem occur (used to estimate frequency):
| 2009-06-17 |

---++ Observations
The nscd does not seem to cache some host entries. Setting the debug level above 1 in =/etc/nscd.conf= shows that lookups for some hostnames always result in cache misses, while caching works correctly for others.

Experimentation shows that caching systematically fails for hosts that resolve to multiple IP addresses. To make matters worse, the nscd does cache lookup failures correctly, so once a lookup fails, all subsequent requests to resolve that host also fail until the negative cache entry expires (_negative-time-to-live_ config parameter, 20s by default).

This situation is extremely bad on our T3, because the DMZ nameserver that we use is protected against too many requests from the same host within short time spans. Since the affected hosts are never served from the cache, every request goes to the nameserver, and we get host lookup failures for these cases.

The problem was noted with CRAB jobs trying to resolve cmsdbsprod for registering data sets.

---+++ Test example: cmsdbsprod resolves to two IP addresses
<pre>
host cmsdbsprod.cern.ch
cmsdbsprod.cern.ch has address 128.142.142.178
cmsdbsprod.cern.ch has address 128.142.142.133
</pre>

A little stress test:
<pre>
for ((n=1; n<200; n++)); do gethostip cmsdbsprod.cern.ch; done
</pre>

nscd.log entry example:
<pre>
...
19128: handle_request: request received (Version = 2) from PID 26386
19128: GETHOSTBYNAME (cmsdbsprod.cern.ch)
19128: Haven't found "cmsdbsprod.cern.ch" in hosts cache!
19128: handle_request: request received (Version = 2) from PID 26386
19128: GETHOSTBYNAME (cmsdbsprod.cern.ch)
19128: Haven't found "cmsdbsprod.cern.ch" in hosts cache!
...
</pre>

---++ Solution or Workaround
Googling turned up only one reference to this problem (http://bugs.gentoo.org/196241). There, upgrading glibc to 2.8 was recommended, but there is no reply from the submitter. We currently run glibc-2.3.4 on our SL4 installations.

An ugly workaround is to hardcode the few hosts that give problems in the =/etc/hosts= files. This has been done for the moment.

---++ Monitoring for this condition
<!-- #how can this condition be recognized automatically, if at all? -->

-- Main.DerekFeichtinger - 17 Jun 2009
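
One possible way to recognize this condition automatically, offered only as a sketch: count the "Haven't found ... in hosts cache!" lines per hostname in the nscd debug log and flag hostnames that miss unusually often. The line format is taken from the log excerpt on this page. The function names and the =threshold= parameter are hypothetical and not part of nscd, and the check assumes that nscd runs with a debug level above 1 so that these lines are written at all.

```python
import re
from collections import Counter

# Matches the miss lines from the nscd.log excerpt on this page, e.g.
#   19128: Haven't found "cmsdbsprod.cern.ch" in hosts cache!
MISS_RE = re.compile(r'Haven\'t found "([^"]+)" in hosts cache!')


def count_host_cache_misses(lines):
    """Count hosts-cache misses per hostname in nscd debug log lines."""
    misses = Counter()
    for line in lines:
        # findall also copes with several log records on one physical line
        for host in MISS_RE.findall(line):
            misses[host] += 1
    return misses


def suspicious_hosts(lines, threshold=10):
    """Return hostnames with at least `threshold` misses, most frequent first.

    `threshold` is a hypothetical tuning knob for this sketch, not an
    nscd setting; it has to be chosen for the time span the log covers.
    """
    counts = count_host_cache_misses(lines)
    return [host for host, n in counts.most_common() if n >= threshold]
```

A correctly cached host also produces one miss each time its entry expires, so the threshold should be well above the number of expiries expected in the inspected time span. To use the sketch, pass an open log file to =suspicious_hosts()=; the location of the log file depends on the =logfile= setting in =/etc/nscd.conf=.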
| *IssueForm* ||
| Affected Service | all host resolution dependent services |
| Symptom summary | The nscd does not cache hosts which resolve to multiple IP addresses |
| Reason Understood | yes |
| Solution Exists | workaround |
| Obsolete | no |
Topic revision: r3 - 2009-06-19 - DerekFeichtinger