Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
6.5.0
-
Enterprise Edition 6.5.0 build 4724
-
Untriaged
-
Centos 64-bit
-
-
Unknown
-
KV-Engine Mad-Hatter GA
Description
Steps to Reproduce:
- Create a 4 node cluster.
+-------------+----------+--------------+
| Nodes | Services | Status |
+-------------+----------+--------------+
| 172.23.97.3 | [u'kv'] | Cluster node |
| 172.23.97.4 | None | <--- IN --- |
| 172.23.97.5 | None | <--- IN --- |
| 172.23.97.6 | None | <--- IN --- |
+-------------+----------+--------------+
- Create a bucket with compression=off, eviction policy = valueOnly, replicas = 1.
- Load 50M docs in the bucket with durability=MAJORITY. This step was successful.
+----------------+---------+----------+-----+----------+--------------+--------------+-------------+
| Bucket | Type | Replicas | TTL | Items | RAM Quota | RAM Used | Disk Used |
+----------------+---------+----------+-----+----------+--------------+--------------+-------------+
| GleamBookUsers | membase | 1 | 0 | 50000000 | 431270920192 | 136477644848 | 40167732516 |
+----------------+---------+----------+-----+----------+--------------+--------------+-------------+
- Rebalance in 1 node (172.23.97.10) while running another 20M updates and 10M creates with durability=MAJORITY in parallel.
These are performance-testing boxes, each with a RAM quota of 101 GB allocated to the data service.
The rebalance did not start for 380 seconds. After it reached 1%, memcached crashed on the node being rebalanced in (172.23.97.10).
Timestamp of the rebalance failure:
ns_server.info.log:[ns_server:info,2019-10-30T22:41:35.212-07:00,ns_1@172.23.97.3:rebalance_agent<0.648.0>:rebalance_agent:handle_down:296]Rebalancer process <0.27979.5> died (reason {mover_crashed,
This corresponds to the following memcached crash on 172.23.97.10, which is why the rebalance failed:
memcached.log:2019-10-30T22:41:35.131875-07:00 CRITICAL Breakpad caught a crash (Couchbase version 6.5.0-4724). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/237c4e35-8628-fd84-65d6db96-7f49e040.dmp before terminating.
memcached.log:2019-10-30T22:41:35.131909-07:00 CRITICAL Stack backtrace of crashed thread:
memcached.log:2019-10-30T22:41:35.132140-07:00 CRITICAL /opt/couchbase/bin/memcached() [0x400000+0x13138d]
memcached.log:2019-10-30T22:41:35.132166-07:00 CRITICAL /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ce) [0x400000+0x1491ee]
memcached.log:2019-10-30T22:41:35.132184-07:00 CRITICAL /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0x94) [0x400000+0x149504]
memcached.log:2019-10-30T22:41:35.132203-07:00 CRITICAL /lib64/libpthread.so.0() [0x7fc02423e000+0xf370]
memcached.log:2019-10-30T22:41:35.132258-07:00 CRITICAL /lib64/libc.so.6(gsignal+0x37) [0x7fc023e7d000+0x351d7]
memcached.log:2019-10-30T22:41:35.132321-07:00 CRITICAL /lib64/libc.so.6(abort+0x148) [0x7fc023e7d000+0x368c8]
memcached.log:2019-10-30T22:41:35.132387-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7fc024973000+0x91195]
memcached.log:2019-10-30T22:41:35.132407-07:00 CRITICAL /opt/couchbase/bin/memcached() [0x400000+0x144cc2]
memcached.log:2019-10-30T22:41:35.132447-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc024973000+0x8ef86]
memcached.log:2019-10-30T22:41:35.132485-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc024973000+0x8efd1]
memcached.log:2019-10-30T22:41:35.132520-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fc024973000+0x8f213]
memcached.log:2019-10-30T22:41:35.132549-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0x58746]
memcached.log:2019-10-30T22:41:35.132566-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0xd95d3]
memcached.log:2019-10-30T22:41:35.132579-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0xdadc8]
memcached.log:2019-10-30T22:41:35.132597-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0x1915c7]
memcached.log:2019-10-30T22:41:35.132610-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0xe5915]
memcached.log:2019-10-30T22:41:35.132623-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0x13745e]
memcached.log:2019-10-30T22:41:35.132635-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0x137a31]
memcached.log:2019-10-30T22:41:35.132646-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fc01e204000+0x131594]
memcached.log:2019-10-30T22:41:35.132657-07:00 CRITICAL /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7fc02681c000+0x8f27]
memcached.log:2019-10-30T22:41:35.132670-07:00 CRITICAL /lib64/libpthread.so.0() [0x7fc02423e000+0x7dc5]
memcached.log:2019-10-30T22:41:35.132734-07:00 CRITICAL /lib64/libc.so.6(clone+0x6d) [0x7fc023e7d000+0xf776d]
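As a rough cross-check of the correlation claimed above, the two timestamps quoted in this report can be compared directly. This is a minimal sketch: the helper names (parse_ts, gap_ms) are made up for illustration, and the comparison assumes the clocks on 172.23.97.3 and 172.23.97.10 are reasonably in sync.

```python
from datetime import datetime

# Timestamps copied verbatim from the two log lines quoted in this report.
NS_SERVER_TS = "2019-10-30T22:41:35.212-07:00"      # "Rebalancer process ... died"
MEMCACHED_TS = "2019-10-30T22:41:35.131875-07:00"   # "Breakpad caught a crash"


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a numeric UTC offset."""
    return datetime.fromisoformat(ts)


def gap_ms(earlier: str, later: str) -> float:
    """Return the time from `earlier` to `later` in milliseconds."""
    return (parse_ts(later) - parse_ts(earlier)).total_seconds() * 1000.0


delta = gap_ms(MEMCACHED_TS, NS_SERVER_TS)
print(f"rebalancer died {delta:.3f} ms after the memcached crash")
```

The gap comes out at roughly 80 ms, with the memcached crash first. That ordering fits the reading that the crash closed the connection and the mover then failed.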
Error Messages:
Rebalance exited with reason {mover_crashed,
  {unexpected_exit,
    {'EXIT',<0.26837.6>,
      {{{badmatch,{error,closed}},
        [{mc_client_binary,cmd_vocal_recv,5,
          [{file,"src/mc_client_binary.erl"},{line,155}]},
         {mc_client_binary,wait_for_seqno_persistence,3,
          [{file,"src/mc_client_binary.erl"},{line,696}]},
         {ns_memcached,'-wait_for_seqno_persistence/3-fun-0-',3,
          [{file,"src/ns_memcached.erl"},{line,1272}]},
         {ns_memcached,'-perform_very_long_call/3-fun-0-',2,
          [{file,"src/ns_memcached.erl"},{line,344}]},
         {ns_memcached_sockets_pool,'-executing_on_socket/3-fun-0-',3,
          [{file,"src/ns_memcached_sockets_pool.erl"},{line,92}]},
         {async,'-async_init/4-fun-1-',3,
          [{file,"src/async.erl"},{line,197}]}]},
       {gen_server,call,
        [{'janitor_agent-GleamBookUsers','ns_1@172.23.97.10'},
         {if_rebalance,<0.28325.5>,
          {update_vbucket_state,511,active,undefined,undefined,undefined}},
         infinity]}}}}}.
Rebalance Operation Id = a7371ab853216cbc3dd2f8df81a329da
Worker <0.32290.5> (for action {move,{511,
    ['ns_1@172.23.97.4','ns_1@172.23.97.6'],
    ['ns_1@172.23.97.10','ns_1@172.23.97.6'],
    []}}) exited with reason {unexpected_exit,
  {'EXIT',<0.26837.6>,
    {{{badmatch,{error,closed}},
      [{mc_client_binary,cmd_vocal_recv,5,
        [{file,"src/mc_client_binary.erl"},{line,155}]},
       {mc_client_binary,wait_for_seqno_persistence,3,
        [{file,"src/mc_client_binary.erl"},{line,696}]},
       {ns_memcached,'-wait_for_seqno_persistence/3-fun-0-',3,
        [{file,"src/ns_memcached.erl"},{line,1272}]},
       {ns_memcached,'-perform_very_long_call/3-fun-0-',2,
        [{file,"src/ns_memcached.erl"},{line,344}]},
       {ns_memcached_sockets_pool,'-executing_on_socket/3-fun-0-',3,
        [{file,"src/ns_memcached_sockets_pool.erl"},{line,92}]},
       {async,'-async_init/4-fun-1-',3,
        [{file,"src/async.erl"},{line,197}]}]},
     {gen_server,call,
      [{'janitor_agent-GleamBookUsers','ns_1@172.23.97.10'},
       {if_rebalance,<0.28325.5>,
        {update_vbucket_state,511,active,undefined,undefined,undefined}},
       infinity]}}}}
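The worker message above names the vBucket move that was in flight when the crash happened. The sketch below pulls the vBucket id and the two node lists out of that message. Reading the two lists as the old and new replication chains, with the first entry being the active copy, is an assumption. It is consistent with the update_vbucket_state,511,active call against ns_1@172.23.97.10 in the same trace. MOVE_RE and parse_move are illustrative names, not part of any Couchbase tool.

```python
import re

# Head of the worker message from this report, with whitespace collapsed
# and plain ASCII quotes. The exit reason is left out on purpose.
WORKER_MSG = (
    "Worker <0.32290.5> (for action {move,{511,"
    "['ns_1@172.23.97.4','ns_1@172.23.97.6'],"
    "['ns_1@172.23.97.10','ns_1@172.23.97.6'],"
    "[]}}) exited with reason"
)

# Captures: vBucket id, first node list, second node list.
MOVE_RE = re.compile(r"\{move,\{(\d+),\s*\[([^\]]*)\],\s*\[([^\]]*)\]")


def parse_move(msg: str):
    """Return (vbucket, first_chain, second_chain) from a move action."""
    match = MOVE_RE.search(msg)
    if match is None:
        raise ValueError("no move action found in message")

    def nodes(chain: str):
        # Keep only the address part of each ns_1@<address> node name.
        return re.findall(r"ns_1@([\d.]+)", chain)

    return int(match.group(1)), nodes(match.group(2)), nodes(match.group(3))


vbucket, old_chain, new_chain = parse_move(WORKER_MSG)
print(f"vb {vbucket}: {old_chain} -> {new_chain}")
```

Under that assumption, vBucket 511 was moving its active copy from 172.23.97.4 to the incoming node 172.23.97.10, with the replica staying on 172.23.97.6.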
Issue Links
- duplicates: MB-36720 memcached crashed during rebalance - monotonic invariant exception on the PassiveDM HPS (Closed)