Details
Description
Was running a clustered load test against zmemcached r12 this evening. The environment looked something like the following:
- There were 10 memcached server nodes running on the CentOS image with ~9GB of swap enabled (see MB-992 for details on how swap was set up)
- The 10 nodes were broken up into 5 Master/Slave pairs such that each master has one slave
- There were 4 "client nodes" running mc-hammer against the 5 master nodes (no traffic was sent directly to any of the slave nodes). Each client node ran the following:
nohup ./hammer -n 500000 ec2-184-73-1-84.compute-1.amazonaws.com:11211,ec2-174-129-65-164.compute-1.amazonaws.com:11211,ec2-75-101-229-73.compute-1.amazonaws.com:11211,ec2-204-236-199-120.compute-1.amazonaws.com:11211,ec2-184-73-49-108.compute-1.amazonaws.com:11211 > hammer.out 2>&1 < /dev/null &
Within about two minutes of starting the load, the memcached process on two of the slaves crashed. /var/log/messages shows the following:
Jun 8 21:39:08 domU-12-31-39-13-C9-21 kernel: [213091.092202] memcached[4100]: segfault at 000000000c9a2000 rip 00002b449219f383 rsp 0000000044006ad8 error 4
Jun 8 21:39:18 domU-12-31-39-0C-19-D1 kernel: [2441151.882791] memcached[4224]: segfault at 000000000cea2000 rip 00002b88aeba9383 rsp 0000000044006ad8 error 4
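For reading the two kernel lines above: the trailing "error 4" is the x86 page-fault error code. A minimal sketch for decoding its low three bits follows. The function name decode_segv_error is made up for this note and is not part of any existing tool.

```shell
# Decode the low three bits of the x86 page-fault error code that the
# kernel prints as "error N" in a segfault line.
# decode_segv_error is a hypothetical helper name, not an existing tool.
decode_segv_error() {
    code=$1
    # bit 0: 0 = page not present, 1 = protection violation
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    # bit 1: 0 = read access, 1 = write access
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    # bit 2: 0 = kernel mode, 1 = user mode
    if [ $((code & 4)) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    echo "$mode-mode $access, $cause"
}

decode_segv_error 4
```

For error 4 this reports a user-mode read of a page that is not present, i.e. both processes read from an unmapped address. The identical rsp and the matching low 12 bits of rip (0x383) on both hosts suggest the same instruction faulted in both cases, though that is only a guess without a core.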
Sadly, no cores were generated.
Will respin the test in order to get core files.
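Before the respin, it is probably worth checking the core limit on each slave, since a soft limit of 0 would explain the missing cores. A rough sketch, assuming memcached is started from an interactive shell on a Linux node (how the slaves are actually launched is not stated here):

```shell
# Run in the shell that will start memcached on each slave, before starting it.
# Raise the soft core-size limit; this fails if the hard limit is lower.
ulimit -c unlimited 2>/dev/null || echo "could not raise core limit; check the hard limit" >&2
echo "core size limit: $(ulimit -c)"

# Show where the kernel will write cores
# (a relative pattern means the crashing process's working directory).
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
fi
```

Stock memcached also has a -r option that raises the core file limit at startup. Whether zmemcached r12 keeps that option has not been confirmed here.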