Couchbase Server / MB-33142

[high-bucket] - Swap rebalance for kv failed with reason mover_crashed

Details

    • Untriaged
    • Unknown

    Description

      Build 6.5.0-2360

      A swap rebalance of a KV node failed with reason mover_crashed, unexpected_exit. This occurred in the high bucket density test case with 30 buckets.
      The failure is not consistently reproducible: the previous run on the same build did not show this error.
      Note that the cluster has all services running on different nodes.

      The UI showed the following error:

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.14273.482>,
      {{bulk_set_vbucket_state_failed,
      [{'ns_1@172.23.97.15',
      {'EXIT',
      {{{{{case_clause,
      {error,
      {{{badmatch,
      {error,
      {{badmatch,{error,timeout}},
      [{mc_client_binary,
      cmd_vocal_recv,5,
      [{file,
      "src/mc_client_binary.erl"},
      {line,154}]},
      {mc_client_binary,cmd_vocal,
      3,
      [{file,
      "src/mc_client_binary.erl"},
      {line,139}]},
      {dcp_commands,
      open_connection,5,
      [{file,
      "src/dcp_commands.erl"},
      {line,74}]},
      {dcp_proxy,connect,5,
      [{file,"src/dcp_proxy.erl"},
      {line,252}]},
      {dcp_proxy,maybe_connect,2,
      [{file,"src/dcp_proxy.erl"},
      {line,209}]},
      {dcp_consumer_conn,init,2,
      [{file,
      "src/dcp_consumer_conn.erl"},
      {line,57}]},
      {dcp_proxy,init,1,
      [{file,"src/dcp_proxy.erl"},
      {line,59}]},
      {gen_server,init_it,2,
      [{file,"gen_server.erl"},
      {line,365}]}]}}},
      [{dcp_replicator,init,1,
      [{file,
      "src/dcp_replicator.erl"},
      {line,48}]},
      {gen_server,init_it,2,
      [{file,"gen_server.erl"},
      {line,365}]},
      {gen_server,init_it,6,
      [{file,"gen_server.erl"},
      {line,333}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},
      {line,247}]}]},
      {child,undefined,
      {'ns_1@172.23.97.14',
      [collections,del_times,snappy,
      xattr]},
      {dcp_replicator,start_link,
      ['ns_1@172.23.97.14',"eventing",
      [collections,del_times,snappy,
      xattr]]},
      temporary,60000,worker,
      [dcp_replicator]}}}},
      [{dcp_sup,start_replicator,2,
      [{file,"src/dcp_sup.erl"},
      {line,57}]},
      {dcp_sup,
      '-manage_replicators/2-lc$^3/1-3-',
      2,
      [{file,"src/dcp_sup.erl"},
      {line,98}]},
      {dcp_replication_manager,
      handle_call,3,
      [{file,
      "src/dcp_replication_manager.erl"},
      {line,89}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},
      {line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},
      {line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},
      {line,247}]}]},
      {gen_server,call,
      ['dcp_replication_manager-eventing',
      {manage_replicators,
      ['ns_1@172.23.97.12',
      'ns_1@172.23.97.13',
      'ns_1@172.23.97.14']},
      infinity]}},
      {gen_server,call,
      ['replication_manager-eventing',
      {change_vbucket_replication,265,
      'ns_1@172.23.97.14'},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-eventing',
      'ns_1@172.23.97.15'},
      {if_rebalance,<0.30809.476>,
      {update_vbucket_state,263,pending,
      passive,'ns_1@172.23.97.14'}},
      infinity]}}}}]},
      [{janitor_agent,bulk_set_vbucket_state,4,
      [{file,"src/janitor_agent.erl"},
      {line,400}]},
      {proc_lib,init_p,3,
      [{file,"proc_lib.erl"},{line,232}]}]}}}}
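
The crash term above is deeply nested, so the call chain is easier to read once the stack frames are pulled out. The following is a minimal sketch in Python, assuming every frame has the shape {Module, Function, Arity, [{file, "..."}, {line, N}]} as it does in this report. The names FRAME_RE and extract_frames are hypothetical helpers written for this illustration and are not part of any Couchbase tooling. The sample is the innermost portion of the trace, copied verbatim.

```python
import re

# One Erlang stack frame: {Module, Function, Arity, [{file, "..."}, {line, N}]}
# Whitespace (including newlines) is allowed between tokens because the
# UI wraps the term across many lines.
FRAME_RE = re.compile(
    r"\{\s*([a-z_0-9]+)\s*,\s*"                   # module
    r"('[^']*'|[a-z_0-9]+)\s*,\s*"                # function, possibly quoted
    r"(\d+)\s*,\s*"                               # arity
    r'\[\s*\{\s*file\s*,\s*"([^"]+)"\s*\}\s*,\s*' # source file
    r"\{\s*line\s*,\s*(\d+)\s*\}\s*\]\s*\}"       # line number
)


def extract_frames(term_text):
    """Return (module, function, arity, file, line) tuples in order of appearance."""
    return [
        (m.group(1), m.group(2).strip("'"), int(m.group(3)),
         m.group(4), int(m.group(5)))
        for m in FRAME_RE.finditer(term_text)
    ]


# Innermost part of the trace from this report.
sample = """
{{badmatch,{error,timeout}},
[{mc_client_binary,
cmd_vocal_recv,5,
[{file,
"src/mc_client_binary.erl"},
{line,154}]},
{mc_client_binary,cmd_vocal,
3,
[{file,
"src/mc_client_binary.erl"},
{line,139}]},
{dcp_commands,
open_connection,5,
[{file,
"src/dcp_commands.erl"},
{line,74}]}]}
"""

frames = extract_frames(sample)
for module, function, arity, path, line in frames:
    print(f"{module}:{function}/{arity} ({path}:{line})")
```

Frames are listed in the order they appear in the term, so the first one printed is the frame closest to the {error,timeout} badmatch. Running the same function over the full term would also list the dcp_proxy, dcp_replicator, dcp_sup and janitor_agent frames.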
      

      Logs:
      KV nodes:
      https://s3.amazonaws.com/bugdb/jira/kv_reb_failure_hbd/collectinfo-2019-02-22T032653-ns_1%40172.23.97.12.zip
      https://s3.amazonaws.com/bugdb/jira/kv_reb_failure_hbd/collectinfo-2019-02-22T032653-ns_1%40172.23.97.13.zip
      https://s3.amazonaws.com/bugdb/jira/kv_reb_failure_hbd/collectinfo-2019-02-22T032653-ns_1%40172.23.97.14.zip
      https://s3.amazonaws.com/bugdb/jira/kv_reb_failure_hbd/collectinfo-2019-02-22T032653-ns_1%40172.23.97.15.zip

      Other nodes:
      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-288/172.23.96.20.zip
      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-288/172.23.96.23.zip
      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-288/172.23.97.177.zip
      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-288/172.23.97.19.zip
      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-288/172.23.97.20.zip

          People

            mahesh.mandhare Mahesh Mandhare (Inactive)
            mahesh.mandhare Mahesh Mandhare (Inactive)