Details
- Bug
- Resolution: Fixed
- Critical
- Cheshire-Cat
- 7.0.0-4574
- Triaged
- Centos 64-bit
- 1
- Yes
- KV-Engine 2021-March
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini rerun=False,get-cbcollect-info=True,quota_percent=95,crash_warning=True,rebalance_moves_per_node=64 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=before,scrape_interval=5,rebalance_moves_per_node=32,quota_percent=80,skip_validations=False,GROUP=rebalance_with_collection_crud'
Steps to Repro
1) Create a 3-node cluster
2021-03-02 00:17:36,871 | test | INFO | pool-1-thread-6 | [table_view:display:72] Rebalance Overview
---------------------------------------------------------------------
Nodes | Services | Version | CPU | Status |
---------------------------------------------------------------------
172.23.98.196 | kv | 7.0.0-4574-enterprise | 10.3232272613 | Cluster node |
172.23.98.195 | None | | | <--- IN --- |
172.23.121.10 | None | | | <--- IN --- |
---------------------------------------------------------------------
2) Create buckets/scopes/collections/data
2021-03-02 00:22:36,851 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
-------------------------------------------------------------------------
Bucket | Type | Replicas | Durability | TTL | Items | RAM Quota | RAM Used | Disk Used |
-------------------------------------------------------------------------
bucket1 | couchbase | 3 | none | 0 | 3000 | 629145600 | 143988160 | 329478219 |
bucket2 | ephemeral | 3 | none | 0 | 3000 | 629145600 | 313926472 | 102 |
default | couchbase | 3 | none | 0 | 500000 | 6291456000 | 524102632 | 596750032 |
-------------------------------------------------------------------------
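For readability, the byte counts in the table above can be turned into a share of each bucket's RAM quota. This is only a sketch that restates the table; the helper name ram_used_percent is made up for illustration, and nothing here establishes a link between these numbers and the crash reported below.

```python
# Values copied from the "Bucket statistics" table above (all in bytes).
buckets = {
    "bucket1": {"ram_quota": 629145600, "ram_used": 143988160},
    "bucket2": {"ram_quota": 629145600, "ram_used": 313926472},
    "default": {"ram_quota": 6291456000, "ram_used": 524102632},
}


def ram_used_percent(stats):
    """Return RAM used as a percentage of the bucket's RAM quota."""
    return 100.0 * stats["ram_used"] / stats["ram_quota"]


# Hypothetical helper output: one percentage per bucket.
usage = {name: ram_used_percent(stats) for name, stats in buckets.items()}

for name, percent in usage.items():
    print(f"{name}: {percent:.1f}% of RAM quota used")
```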
3) Adding a node fails as shown below; this failure is caused by MB-44012.
2021-03-02 00:22:49,375 | test | INFO | MainThread | [cluster_ready_functions:set_rebalance_moves_per_nodes:119] Changed Rebalance settings: {u'rebalanceMovesPerNode': 64}
2021-03-02 00:23:26,615 | test | ERROR | pool-1-thread-24 | [rest_client:_http_request:747] POST http://172.23.98.196:8091/controller/addNode body: hostname=http%3A%2F%2F172.23.104.186%3A8091&password=password&user=Administrator headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n', 'Content-Type': 'application/x-www-form-urlencoded'} error: 400 reason: unknown ["Join completion call failed. Got HTTP status 500 from REST call post to http://172.23.104.186:8091/completeJoin. Body was: \"[\\\"Unexpected server error, request logged.\\\"]\""] auth: Administrator:password
2021-03-02 00:23:26,858 | test | ERROR | pool-1-thread-24 | [task:call:242] Error adding node: 172.23.104.186 to the cluster:172.23.98.196 - ["Join completion call failed. Got HTTP status 500 from REST call post to http://172.23.104.186:8091/completeJoin. Body was: \"[\\\"Unexpected server error, request logged.\\\"]\""]
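To see which node the failing addNode call targeted, the URL-encoded request body from the log above can be decoded. This is a minimal sketch using only the Python standard library; the body string is copied from the log, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# Request body copied verbatim from the failing POST to /controller/addNode.
body = (
    "hostname=http%3A%2F%2F172.23.104.186%3A8091"
    "&password=password&user=Administrator"
)

# parse_qs percent-decodes values and returns a list per key;
# each key appears once here, so keep the first value.
fields = {key: values[0] for key, values in parse_qs(body).items()}

target = urlsplit(fields["hostname"])
node_host = target.hostname
node_port = target.port

print(f"addNode target: {node_host}:{node_port}")
```

The decoded target is the same 172.23.104.186 named in the error message; that address does not appear in the Rebalance Overview table from step 1.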
While CRUD operations on collections continued, we saw a crash dump, ae8d6778-2a62-426c-7fdd22bd-95275336.dmp, on 172.23.121.10.
Output of grep CRITICAL on 172.23.121.10:
[root@localhost logs]# grep CRITICAL memcached.log.0000*
memcached.log.000016.txt:2021-03-02T00:21:57.107143-08:00 CRITICAL *** Fatal error encountered during exception handling ***
memcached.log.000016.txt:2021-03-02T00:21:57.107243-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): std::bad_alloc
memcached.log.000016.txt:2021-03-02T00:21:57.737236-08:00 CRITICAL Breakpad caught a crash (Couchbase version 7.0.0-4574). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/ae8d6778-2a62-426c-7fdd22bd-95275336.dmp before terminating.
memcached.log.000016.txt:2021-03-02T00:21:57.737279-08:00 CRITICAL Stack backtrace of crashed thread:
memcached.log.000016.txt:2021-03-02T00:21:57.737590-08:00 CRITICAL #0 /opt/couchbase/bin/memcached() [0x400000+0x14cc4d]
memcached.log.000016.txt:2021-03-02T00:21:57.737617-08:00 CRITICAL #1 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x16304a]
memcached.log.000016.txt:2021-03-02T00:21:57.737640-08:00 CRITICAL #2 /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x163388]
memcached.log.000016.txt:2021-03-02T00:21:57.737735-08:00 CRITICAL #3 /lib64/libpthread.so.0() [0x7f516953b000+0xf630]
memcached.log.000016.txt:2021-03-02T00:21:57.737786-08:00 CRITICAL #4 /lib64/libc.so.6(gsignal+0x37) [0x7f516916d000+0x36387]
memcached.log.000016.txt:2021-03-02T00:21:57.737828-08:00 CRITICAL #5 /lib64/libc.so.6(abort+0x148) [0x7f516916d000+0x37a78]
memcached.log.000016.txt:2021-03-02T00:21:57.737886-08:00 CRITICAL #6 /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7f5169c70000+0x91195]
memcached.log.000016.txt:2021-03-02T00:21:57.737912-08:00 CRITICAL #7 /opt/couchbase/bin/memcached() [0x400000+0x15c972]
memcached.log.000016.txt:2021-03-02T00:21:57.737960-08:00 CRITICAL #8 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f5169c70000+0x8ef86]
memcached.log.000016.txt:2021-03-02T00:21:57.738014-08:00 CRITICAL #9 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f5169c70000+0x8efd1]
memcached.log.000016.txt:2021-03-02T00:21:57.738081-08:00 CRITICAL #10 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f5169c70000+0xb9dfe]
memcached.log.000016.txt:2021-03-02T00:21:57.738106-08:00 CRITICAL #11 /lib64/libpthread.so.0() [0x7f516953b000+0x7ea5]
memcached.log.000016.txt:2021-03-02T00:21:57.738161-08:00 CRITICAL #12 /lib64/libc.so.6(clone+0x6d) [0x7f516916d000+0xfe8dd]
cbcollect_info attached. This was not seen on 7.0.0-4554.
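For triage, the backtrace lines in the grep output above can be reduced to (frame, module, symbol, offset) records. The following is a rough sketch, not part of any Couchbase tooling: the regular expression is inferred only from the line layout shown in this ticket, and the names FRAME_RE and parse_frames are made up for illustration. The sample input is a subset of the lines above.

```python
import re

# Pattern inferred from the log lines in this ticket (an assumption, not a
# documented format): "CRITICAL #<n> <module>(<symbol+off>) [<base>+<offset>]"
FRAME_RE = re.compile(
    r"CRITICAL\s+#(?P<index>\d+)\s+"
    r"(?P<module>[^\s(]+)\((?P<symbol>[^)]*)\)\s+"
    r"\[(?P<base>0x[0-9a-fA-F]+)\+(?P<offset>0x[0-9a-fA-F]+)\]"
)


def parse_frames(lines):
    """Extract backtrace frames; lines that are not frames are skipped."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue
        # The symbol field is either empty or "<name>+<offset in function>".
        name, _, _ = match.group("symbol").partition("+")
        frames.append({
            "index": int(match.group("index")),
            "module": match.group("module"),
            "symbol": name or None,
            "offset": int(match.group("offset"), 16),
        })
    return frames


# A few lines copied from the grep output above.
sample = [
    "memcached.log.000016.txt:2021-03-02T00:21:57.107243-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): std::bad_alloc",
    "memcached.log.000016.txt:2021-03-02T00:21:57.737590-08:00 CRITICAL #0 /opt/couchbase/bin/memcached() [0x400000+0x14cc4d]",
    "memcached.log.000016.txt:2021-03-02T00:21:57.737786-08:00 CRITICAL #4 /lib64/libc.so.6(gsignal+0x37) [0x7f516916d000+0x36387]",
    "memcached.log.000016.txt:2021-03-02T00:21:57.738081-08:00 CRITICAL #10 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f5169c70000+0xb9dfe]",
]

frames = parse_frames(sample)
for frame in frames:
    print(frame["index"], frame["module"], frame["symbol"], hex(frame["offset"]))
```

Frames with an empty symbol field, such as #0 and #7 through #10, carry only a module and an offset, so resolving them would need the matching binaries or the minidump itself.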
Issue Links
- is duplicated by
  - MB-44688 [cbstats] checkpoint point stats missing for few vbuckets for sometime (Closed)