Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec-temp_win_private_job_3.ini rerun=False,get-cbcollect-info=True,quota_percent=95,crash_warning=True -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=1,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=before,scrape_interval=5,rebalance_moves_per_node=32,quota_percent=80,skip_validations=False'
|
Automation Log : consoleText_MB-46023.txt
Steps to Repro
1. Create a 3 node cluster.
2021-04-30 05:24:21,211 | test | INFO | pool-2-thread-5 | [table_view:display:72] Rebalance Overview
----------------------------------------------------------------------------------
Nodes |
Services |
Version |
CPU |
Status |
----------------------------------------------------------------------------------
172.23.107.126 |
fts, index, kv, n1ql |
7.0.0-5071-enterprise |
8.90529684323 |
Cluster node |
172.23.107.127 |
None |
|
|
<--- IN — |
172.23.107.129 |
None |
|
|
<--- IN — |
----------------------------------------------------------------------------------
2. Create required buckets
2021-04-30 05:24:31,767 | test | INFO | MainThread | [bucket_ready_functions:create_buckets_using_json_data:1799] Creating required buckets from template
|
3. Then we try to create scopes/collections we notice some of the collection creation fails. It should be noted that very small amount of collection creation actually fails. Others seems to have gone through fine.
2021-04-30 05:24:45,117 | infra | ERROR | Thread-257 | [Rest_Connection:_http_request:253] POST http://172.23.107.126:8091/pools/default/buckets/default/scopes/scope1/collections/ body: maxTTL=0&name=collections_2 headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: unknown ["Unexpected server error, request logged."] auth: Administrator:password
|
Exception in thread Thread-257:Traceback (most recent call last):
|
File "/opt/jython/Lib/threading.py", line 222, in _Thread__bootstrap
|
self.run()
|
File "/opt/jython/Lib/threading.py", line 213, in run
|
self._target(*self._args, **self._kwargs)
|
File "lib/Jython_tasks/task.py", line 3746, in create_collection_from_spec
|
self.set_exception("Create collection failed for "
|
File "lib/Jython_tasks/task.py", line 92, in set_exception
|
raise BaseException(self.exception)
|
BaseException: Create collection failed for default:scope1:collections_2, Reason - ["Unexpected server error, request logged."]
|
2021-04-30 05:24:45,144 | infra | ERROR | Thread-259 | [Rest_Connection:_http_request:253] POST http://172.23.107.126:8091/pools/default/buckets/bucket1/scopes/F7u9BJ6GmPYi77cd-31-941000/collections/ body: name=u15coYm%25Fsu-a-31-948000 headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: unknown ["Unexpected server error, request logged."] auth: Administrator:password
|
2021-04-30 05:24:45,144 | infra | ERROR | Thread-258 | [Rest_Connection:_http_request:253] POST http://172.23.107.126:8091/pools/default/buckets/bucket2/scopes/scope1/collections/ body: name=RkMf0LngvXklUe-31-863000 headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: unknown ["Unexpected server error, request logged."] auth: Administrator:password
|
Exception in thread Thread-259:Traceback (most recent call last):
|
File "/opt/jython/Lib/threading.py", line 222, in _Thread__bootstrap
|
Exception in thread Thread-258:Traceback (most recent call last):
|
File "/opt/jython/Lib/threading.py", line 222, in _Thread__bootstrap
|
self.run()
|
File "/opt/jython/Lib/threading.py", line 213, in run
|
self.run()
|
File "/opt/jython/Lib/threading.py", line 213, in run
|
self._target(*self._args, **self._kwargs)
|
File "lib/Jython_tasks/task.py", line 3746, in create_collection_from_spec
|
self._target(*self._args, **self._kwargs)
|
File "lib/Jython_tasks/task.py", line 3746, in create_collection_from_spec
|
self.set_exception("Create collection failed for "
|
File "lib/Jython_tasks/task.py", line 92, in set_exception
|
self.set_exception("Create collection failed for "
|
File "lib/Jython_tasks/task.py", line 92, in set_exception
|
raise BaseException(self.exception)
|
raise BaseException(self.exception)
|
BaseException: Create collection failed for bucket1:F7u9BJ6GmPYi77cd-31-941000:u15coYm%Fsu-a-31-948000, Reason - ["Unexpected server error, request logged."]
|
BaseException: Create collection failed for bucket2:scope1:RkMf0LngvXklUe-31-863000, Reason - ["Unexpected server error, request logged."]
|
In info.log we see corresponding errors.
[ns_server:error,2021-04-30T05:24:45.083-07:00,ns_1@172.23.107.126:<0.11204.0>:menelaus_util:reply_server_error:206]Server error during processing: ["web request failed",
|
{path,
|
"/pools/default/buckets/default/scopes/scope1/collections/"},
|
{method,'POST'},
|
{type,exit},
|
{what,
|
{shutdown,
|
{gen_server,call,
|
[collections,#Fun<collections.6.51386363>,
|
infinity]}}},
|
{trace,
|
[{gen_server,call,3,
|
[{file,"gen_server.erl"},{line,223}]},
|
{menelaus_web_collections,
|
'-handle_post_collection/3-fun-1-',4,
|
[{file,"src/menelaus_web_collections.erl"},
|
{line,73}]},
|
{request_throttler,do_request,3,
|
[{file,"src/request_throttler.erl"},
|
{line,58}]},
|
{menelaus_util,handle_request,2,
|
[{file,"src/menelaus_util.erl"},
|
{line,217}]},
|
{mochiweb_http,headers,6,
|
[{file,
|
"c:/Jenkins/workspace/couchbase-server-windows/couchdb/src/mochiweb/mochiweb_http.erl"},
|
{line,150}]},
|
{proc_lib,init_p_do_apply,3,
|
[{file,"proc_lib.erl"},{line,249}]}]}]
|
[ns_server:error,2021-04-30T05:24:45.107-07:00,ns_1@172.23.107.126:<0.11225.0>:menelaus_util:reply_server_error:206]Server error during processing: ["web request failed",
|
{path,
|
"/pools/default/buckets/bucket1/scopes/F7u9BJ6GmPYi77cd-31-941000/collections/"},
|
{method,'POST'},
|
{type,exit},
|
{what,
|
{noproc,
|
{gen_server,call,
|
[collections,#Fun<collections.6.51386363>,
|
infinity]}}},
|
{trace,
|
[{gen_server,call,3,
|
[{file,"gen_server.erl"},{line,223}]},
|
{menelaus_web_collections,
|
'-handle_post_collection/3-fun-1-',4,
|
[{file,"src/menelaus_web_collections.erl"},
|
{line,73}]},
|
{request_throttler,do_request,3,
|
[{file,"src/request_throttler.erl"},
|
{line,58}]},
|
{menelaus_util,handle_request,2,
|
[{file,"src/menelaus_util.erl"},
|
{line,217}]},
|
{mochiweb_http,headers,6,
|
[{file,
|
"c:/Jenkins/workspace/couchbase-server-windows/couchdb/src/mochiweb/mochiweb_http.erl"},
|
{line,150}]},
|
{proc_lib,init_p_do_apply,3,
|
[{file,"proc_lib.erl"},{line,249}]}]}]
|
|
See the UI logs.
We see messages like the following on the UI during the same time frame.
172.23.107.126
Control connection to memcached on 'ns_1@172.23.107.126' disconnected. Check logs for details.
|
Somebody thinks we're master. Not forcing mastership takover over ourselves
|
Haven't heard from a higher priority node or a master, so I'm taking over.
|
cbcollect_info attached. It should be noted that this happens intermittently and typically passes in reruns and we have not seen this in linux before.