

16. 10. 2008 Out of memory problem impacting t3wn06

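Ganglia and other services on t3wn06 have been impaired for a few days. The likely cause is an out-of-memory condition triggered by a user process: with swap exhausted, the kernel's OOM killer terminated the user job and then system daemons, including gmond (Ganglia) and rpc.statd.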

  • Need to shield the system better against runaway user jobs (queue limits)
  • Need to establish a daily check of the sensors by the admins
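
A daily check could include a scan of the system log for OOM kills. The following is a minimal sketch, not an existing tool at this site: the function name find_oom_kills is hypothetical, and the regular expression assumes the message format seen in the excerpts from this incident ("Out of Memory: Killed process PID (name)"), which may differ on other kernel versions.

```python
import re

# Pattern modelled on the kernel messages logged on t3wn06, e.g.
# "Oct 14 12:53:58 t3wn06 kernel: Out of Memory: Killed process 3503 (gmond)."
# Assumption: other kernels may word this message differently.
OOM_KILL = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"Out of Memory: Killed process (?P<pid>\d+) \((?P<name>[^)]*)\)"
)


def find_oom_kills(lines):
    """Return (timestamp, host, pid, process name) for every OOM kill line."""
    kills = []
    for line in lines:
        match = OOM_KILL.match(line)
        if match:
            kills.append(
                (
                    match.group("stamp"),
                    match.group("host"),
                    int(match.group("pid")),
                    match.group("name"),
                )
            )
    return kills


def format_report(kills):
    """Build one human-readable line per kill, suitable for a daily mail."""
    return [
        "%s %s: OOM killer terminated %s (pid %d)" % (stamp, host, name, pid)
        for stamp, host, pid, name in kills
    ]
```

To use it, pass any iterable of log lines, for example an open file handle on /var/log/messages, and mail or print the result of format_report. A kill of a daemon such as gmond or rpc.statd would then show up the next morning instead of several days later.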

Excerpts from /var/log/messages:

Oct 14 12:53:55 t3wn06 kernel: Node 0 HighMem: empty
Oct 14 12:53:55 t3wn06 kernel: Swap cache: add 510281, delete 510031, find 23/45, race 0+0
Oct 14 12:53:55 t3wn06 kernel: Free swap:            0kB
Oct 14 12:53:55 t3wn06 kernel: 4325376 pages of RAM
Oct 14 12:53:55 t3wn06 kernel: 220750 reserved pages
Oct 14 12:53:55 t3wn06 kernel: 67679 pages shared
Oct 14 12:53:55 t3wn06 kernel: 251 pages swap cached
Oct 14 12:53:55 t3wn06 kernel: Out of Memory: Killed process 18107 (MarkovChains.ex).
Oct 14 12:53:55 t3wn06 kernel: oom-killer: gfp_mask=0xd2
Oct 14 12:53:55 t3wn06 kernel: Mem-info:
.....
Oct 14 12:53:56 t3wn06 kernel: Free pages:       28232kB (0kB HighMem)
Oct 14 12:53:56 t3wn06 kernel: Active:2248797 inactive:1829427 dirty:0 writeback:0 unstable:0 free:7058 slab:4306 mapped:4077596 pagetables:9275
Oct 14 12:53:56 t3wn06 kernel: Node 0 DMA free:11648kB min:12kB low:24kB high:36kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Oct 14 12:53:56 t3wn06 kernel: protections[]: 0 0 0
Oct 14 12:53:56 t3wn06 kernel: Node 0 Normal free:16584kB min:16612kB low:33224kB high:49836kB active:8995752kB inactive:7317196kB present:17285120kB pages_scanned:25577838 all_unreclaimable? yes
Oct 14 12:53:56 t3wn06 kernel: protections[]: 0 0 0
Oct 14 12:53:56 t3wn06 kernel: Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Oct 14 12:53:56 t3wn06 kernel: protections[]: 0 0 0
Oct 14 12:53:56 t3wn06 kernel: Node 0 DMA: 2*4kB 1*8kB 1*16kB 5*32kB 3*64kB 2*128kB 3*256kB 0*512kB 0*1024kB 1*2048kB 2*4096kB = 11648kB
Oct 14 12:53:56 t3wn06 kernel: Node 0 Normal: 0*4kB 1*8kB 0*16kB 2*32kB 2*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 4*4096kB = 16584kB
Oct 14 12:53:56 t3wn06 kernel: Node 0 HighMem: empty
Oct 14 12:53:56 t3wn06 kernel: Swap cache: add 510281, delete 510031, find 23/46, race 0+0
Oct 14 12:53:56 t3wn06 kernel: Free swap:            0kB
Oct 14 12:53:56 t3wn06 kernel: 4325376 pages of RAM
Oct 14 12:53:56 t3wn06 kernel: 220750 reserved pages
Oct 14 12:53:56 t3wn06 kernel: 67642 pages shared
Oct 14 12:53:56 t3wn06 kernel: 251 pages swap cached
Oct 14 12:53:56 t3wn06 kernel: Out of Memory: Killed process 18090 (489).
...
Oct 14 12:53:58 t3wn06 kernel: Out of Memory: Killed process 3503 (gmond).
...
Oct 14 12:53:59 t3wn06 kernel: Out of Memory: Killed process 9646 (rpc.statd).
...
Oct 16 08:29:43 t3wn06 kernel: statd: server localhost not responding, timed out
Oct 16 08:29:43 t3wn06 kernel: lockd: cannot monitor 192.33.123.26
Oct 16 08:29:43 t3wn06 kernel: lockd: failed to monitor 192.33.123.26
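
The page counts in the excerpt can be converted into sizes to see how full the node was. The sketch below assumes a 4 kB page size, which the log does not state explicitly; the assumption is consistent with the log itself, since free:7058 pages times 4 kB gives the reported 28232kB of free pages. The helper names are hypothetical.

```python
PAGE_KB = 4  # assumption: 4 kB pages, consistent with free:7058 -> 28232kB


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count from the kernel memory dump into kB."""
    return pages * page_kb


def kb_to_gib(kb):
    """Convert kB (1024 bytes) into GiB."""
    return kb / (1024.0 * 1024.0)


# Values copied from the Oct 14 12:53:56 messages above.
ram_pages = 4325376       # "4325376 pages of RAM"
free_pages = 7058         # "free:7058"
active_pages = 2248797    # "Active:2248797"
inactive_pages = 1829427  # "inactive:1829427"

total_gib = kb_to_gib(pages_to_kb(ram_pages))
free_kb = pages_to_kb(free_pages)
used_fraction = (active_pages + inactive_pages) / float(ram_pages)

print("RAM: %.1f GiB" % total_gib)                      # RAM: 16.5 GiB
print("Free: %d kB" % free_kb)                          # Free: 28232 kB
print("Active + inactive: %.1f%% of RAM" % (100 * used_fraction))
```

Under that assumption the node had 16.5 GiB of RAM with only about 28 MB free and, per the "Free swap: 0kB" line, no swap left, which is the state in which the OOM killer starts terminating processes.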

-- DerekFeichtinger - 16 Oct 2008



Topic revision: r1 - 2008-10-16 - DerekFeichtinger
 