Details
Bug
Resolution: Cannot Reproduce
Critical
None
6.0.0
Centos cluster 1
Untriaged
Unknown
Description
Build: 6.0.0 build 1529
Test Job: http://qa.sc.couchbase.com/job/centos-systest-launcher/1580/console
Cluster: http://172.23.108.103:8091/
The CentOS longevity test runs the following steps:
- Create 22 node cluster (9 KV, 5 index, 2 query, 2 fts, 2 eventing, 2 cbas)
- Create 10 buckets (default bucket with Active compression)
- Create views
- Load data
- Remove kv node
- Deploy eventing functions
- Create dataset on analytics on 4 buckets
- Create index on 2 datasets
- Create 2i index
- Load more data
- Run queries on 2i
- Swap a KV node
- Run 240 queries per second on analytics
- Connect link Local
- Load more data to default bucket
- Add eventing node
- Remove eventing node
- Swap eventing node
- Disconnect link Local
- Add analytics node
- Connect link Local
- Disconnect link Local
- Remove analytics node
- Connect link Local
- Swap analytics node
- Kill analytics nodes
- Run views
- Create fts indexes
- Regex search on FTS
- XDCR replication
- Add rbac users
- Undeploy eventing handlers
- Load 1M doc
- Create 2i indexes
- Rebalance in index
- Rebalance out index
- Swap index node
- Rebalance in 2 index nodes
- Rebalance out 2 index nodes
- Rebalance out 1 KV
- Rebalance in 1 KV
- Failover -> Full recovery index node
- Failover -> Rebalance out index node
- Add index node
- Redeploy eventing handlers
- Run Tpcc
- Update Doc
- Add a KV node, failover KV node -> rebalance
- Swap hard failover -> Add 1 KV, remove 2 KV as soft and hard failover
- Multinode autofailover -> failover 3 KV nodes and rebalance
The rebalance operation fails because of a “badarg” exception. This issue was seen while debugging MB-30967:
[user:error,2018-09-11T20:27:39.219-07:00,ns_1@172.23.108.103:<0.8021.0>:ns_orchestrator:do_log_rebalance_completion:1117]Rebalance exited with reason {mover_crashed,
    {unexpected_exit,
        {'EXIT',<0.684.1355>,
            {badarg,
                {gen_server,call,
                    [{'janitor_agent-DISTRICT',
                      'ns_1@172.23.108.104'},
                     {if_rebalance,<0.17033.1318>,
                      {dcp_takeover,'ns_1@172.23.99.21',679}},
                     infinity]}}}}}
The following crash can be seen on node 172.23.108.104:
=========================CRASH REPORT=========================
  crasher:
    initial call: janitor_agent:-spawn_rebalance_subprocess/3-fun-0-/0
    pid: <0.18154.445>
    registered_name: []
    exception error: bad argument
      in function link/1
         called as link(undefined)
      in call from janitor_agent:'-handle_call/3-fun-5-'/3 (src/janitor_agent.erl, line 721)
      in call from janitor_agent:'-spawn_rebalance_subprocess/3-fun-0-'/3 (src/janitor_agent.erl, line 896)
    ancestors: ['janitor_agent-DISTRICT','janitor_agent_sup-DISTRICT',
                'single_bucket_kv_sup-DISTRICT',ns_bucket_sup,
                ns_bucket_worker_sup,ns_server_sup,ns_server_nodes_sup,
                <0.5984.169>,ns_server_cluster_sup,<0.89.0>]
    messages: []
    links: [<0.25295.403>,<0.25397.403>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 1004
The following are the logs:
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.61.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.67.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.69.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.70.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.87.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.104.88.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.106.188.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.108.103.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.108.104.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.96.145.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.96.148.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.96.168.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.96.56.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.96.95.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.97.239.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.97.242.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.98.135.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.99.11.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.99.20.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.99.21.zip
https://s3.amazonaws.com/cb-engineering/mb30967/collectinfo-2018-09-12T191303-ns_1@172.23.99.25.zip
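With 21 archives listed, it may help to start from the nodes named in the error: the node that logged the rebalance failure (172.23.108.103), the node where the janitor_agent subprocess crashed (172.23.108.104), and the node named in the dcp_takeover request (172.23.99.21). The following is a minimal Python sketch for picking those archives out of the list. The helper names (archives_by_node, pick) are hypothetical, ARCHIVES holds only a subset of the URLs above for brevity, and the regular expression assumes every archive name ends in ns_1@<IPv4 address>.zip, as the ones in this ticket do.

```python
import re

BASE = "https://s3.amazonaws.com/cb-engineering/mb30967/"

# Subset of the collectinfo archives listed in this ticket (shortened for brevity).
ARCHIVES = [
    BASE + "collectinfo-2018-09-12T191303-ns_1@172.23.104.61.zip",
    BASE + "collectinfo-2018-09-12T191303-ns_1@172.23.108.103.zip",
    BASE + "collectinfo-2018-09-12T191303-ns_1@172.23.108.104.zip",
    BASE + "collectinfo-2018-09-12T191303-ns_1@172.23.99.21.zip",
]

# Assumption: each archive name ends with "ns_1@<IPv4>.zip".
NODE_RE = re.compile(r"ns_1@(\d+\.\d+\.\d+\.\d+)\.zip$")


def archives_by_node(urls):
    """Map each node address to the URL of its collectinfo archive."""
    by_node = {}
    for url in urls:
        match = NODE_RE.search(url)
        if match:
            by_node[match.group(1)] = url
    return by_node


def pick(urls, nodes):
    """Return the archive URLs for the given node addresses, in the order asked."""
    by_node = archives_by_node(urls)
    return [by_node[node] for node in nodes if node in by_node]


if __name__ == "__main__":
    # Nodes named in the rebalance error and crash report.
    for url in pick(ARCHIVES, ["172.23.108.103", "172.23.108.104", "172.23.99.21"]):
        print(url)
```

Nodes that have no matching archive are skipped silently, so the result can be shorter than the list of nodes requested.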