  1. Couchbase Server
  2. MB-10286

Rebalance out crashes instantly

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Affects Version/s: 3.0
    • Fix Version/s: 3.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels: None
    • Triage: Triaged
    • Operating System: Centos 64-bit

    Description

      Build Tested : 3.0.0-387-rel

      Test to reproduce:
      ./testrunner -i /tmp/new-viewtests-all.ini get-cbcollect-info=True,get-delays=True -t view.createdeleteview.CreateDeleteViewTests.ddoc_ops_removing_master,ddoc_ops=create,test_with_view=True,num_ddocs=4,num_views_per_ddoc=3,items=200000

      The test rebalances out the master node while concurrently performing ddoc operations (create/update/delete).

      Time Stamp :

      2014-02-23 13:06:35 | ERROR | MainProcess | Cluster_Thread | [rest_client._rebalance_progress] {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
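
The failure condition in the payload logged above can be checked programmatically. This is a minimal sketch, assuming a failed rebalance is always reported as `status` equal to `none` together with an `errorMessage` key, as in this log line. The function name `rebalance_failed` is hypothetical and is not part of testrunner.

```python
def rebalance_failed(progress):
    """Return True when a rebalance-progress payload reports a failure.

    Assumption: a failure is reported as status 'none' plus an
    'errorMessage' key, matching the payload shown in this report.
    """
    return progress.get("status") == "none" and "errorMessage" in progress


# Payload copied from the log line in this report.
payload = {
    "status": "none",
    "errorMessage": "Rebalance failed. See logs for detailed reason. "
                    "You can try rebalance again.",
}

print(rebalance_failed(payload))
# Hypothetical in-progress payload, used only for contrast.
print(rebalance_failed({"status": "running"}))
```

A bare `status` of `none` without an error message is treated as "not failed" here, on the assumption that it can also mean no rebalance is running.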

      Tests Failed:
      ddoc_ops_removing_master,ddoc_ops=update,test_with_view=True,num_ddocs=4,num_views_per_ddoc=3,items=200000
      ddoc_ops_removing_master,ddoc_ops=delete,test_with_view=True,num_ddocs=3,num_views_per_ddoc=4,items=200000
      ddoc_ops_removing_master,ddoc_ops=create,test_with_view=False,num_ddocs=4,num_views_per_ddoc=3,items=200000
      ddoc_ops_removing_master,ddoc_ops=update,test_with_view=False,num_ddocs=4,num_views_per_ddoc=3,items=200000
      ddoc_ops_removing_master,ddoc_ops=delete,test_with_view=False,num_ddocs=3,num_views_per_ddoc=4,items=200000
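
To see which parameter combinations the failures cover, the test specs listed above can be split into a name and a parameter dict. This is a rough sketch; `parse_test_spec` is a hypothetical helper and only assumes the `name,key=value,...` layout visible in the list.

```python
def parse_test_spec(spec):
    """Split 'name,key=value,...' into (name, params)."""
    name, *pairs = spec.split(",")
    params = dict(pair.split("=", 1) for pair in pairs)
    return name, params


# The five failing specs listed in this report.
failed_specs = [
    "ddoc_ops_removing_master,ddoc_ops=update,test_with_view=True,num_ddocs=4,num_views_per_ddoc=3,items=200000",
    "ddoc_ops_removing_master,ddoc_ops=delete,test_with_view=True,num_ddocs=3,num_views_per_ddoc=4,items=200000",
    "ddoc_ops_removing_master,ddoc_ops=create,test_with_view=False,num_ddocs=4,num_views_per_ddoc=3,items=200000",
    "ddoc_ops_removing_master,ddoc_ops=update,test_with_view=False,num_ddocs=4,num_views_per_ddoc=3,items=200000",
    "ddoc_ops_removing_master,ddoc_ops=delete,test_with_view=False,num_ddocs=3,num_views_per_ddoc=4,items=200000",
]

parsed = [parse_test_spec(spec)[1] for spec in failed_specs]
ops = sorted({p["ddoc_ops"] for p in parsed})
views = sorted({p["test_with_view"] for p in parsed})
print(ops, views)
```

Together with the `ddoc_ops=create,test_with_view=True` run given as the reproduction command, the failures span every `ddoc_ops` value and both `test_with_view` settings.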

      Logs :
      [user:info,2014-02-23T13:07:00.635,ns_1@10.1.3.74:<0.4652.0>:ns_orchestrator:handle_info:460]Rebalance exited with reason {{badmatch,
      {error,
      {{badmatch,
      {failed,[{'ns_1@10.1.3.74',no_reply}]}},
      [{ns_vbucket_mover,init,1},
      {gen_server,init_it,6},
      {proc_lib,init_p_do_apply,3}]}}},
      [{ns_rebalancer,run_mover,7},
      {ns_rebalancer,rebalance_one_bucket,5},
      {ns_rebalancer,rebalance_one_bucket,7},
      {lists,foreach,2},
      {ns_rebalancer,rebalance,3}]}

      [ns_server:debug,2014-02-23T13:07:00.635,ns_1@10.1.3.74:<0.17769.1>:ns_pubsub:do_subscribe_link:136]Parent process of subscription {master_activity_events,<0.17765.1>} exited with reason {{badmatch,
      {error,
      {{badmatch,
      {failed,
      [{'ns_1@10.1.3.74', no_reply}]}},
      [{ns_vbucket_mover, init, 1},
      {gen_server, init_it, 6},
      {proc_lib, init_p_do_apply, 3}]}}},
      [{ns_rebalancer, run_mover, 7},
      {ns_rebalancer, rebalance_one_bucket, 5},
      {ns_rebalancer, rebalance_one_bucket, 7},
      {lists, foreach, 2},
      {ns_rebalancer, rebalance, 3}]}
      [ns_server:debug,2014-02-23T13:07:00.635,ns_1@10.1.3.74:<0.17846.1>:ns_pubsub:do_subscribe_link:136]Parent process of subscription {ns_node_disco_events,<0.17838.1>} exited with reason {{badmatch,
      {failed,
      [{'ns_1@10.1.3.74', no_reply}]}},
      [{ns_vbucket_mover, init, 1},
      {gen_server, init_it, 6},
      {proc_lib, init_p_do_apply, 3}]}
      [error_logger:error,2014-02-23T13:07:00.636,ns_1@10.1.3.74:error_logger<0.6.0>:ale_error_logger_handler:log_report:115]
      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_vbucket_mover:init/1
      pid: <0.17838.1>
      registered_name: []
      exception exit: {{badmatch,{failed,[{'ns_1@10.1.3.74',no_reply}]}},
      [{ns_vbucket_mover,init,1},
      {gen_server,init_it,6},
      {proc_lib,init_p_do_apply,3}]}
      in function gen_server:init_it/6
      ancestors: [<0.17736.1>]
      messages: [spawn_initial]
      links: [<0.17736.1>,<0.17846.1>,<0.301.0>]
      dictionary: [{bucket_name,"default"},
      {i_am_master_mover,true},
      {child_processes,[]}]
      trap_exit: true
      status: running
      heap_size: 10946
      stack_size: 24
      reductions: 210114
      neighbours:

      [error_logger:error,2014-02-23T13:07:00.636,ns_1@10.1.3.74:error_logger<0.6.0>:ale_error_logger_handler:log_msg:119]Error in process <0.17736.1> on node 'ns_1@10.1.3.74' with exit value: badmatch,{error,{{badmatch,{failed,[{'ns_1@10.1.3.74',no_reply}],[{ns_vbucket_mover,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}}},[{ns_rebalancer,run_mover,7},{ns_rebalancer,rebalance_one_bucket,5},...
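
Every trace above bottoms out in the same `{failed,[{Node,no_reply}]}` term raised from `ns_vbucket_mover:init/1`. When scanning larger log bundles, the nodes named in such terms can be pulled out with a regular expression. This is a minimal sketch; the helper name `nodes_without_reply` is hypothetical, and the pattern only assumes the `{'node',no_reply}` shape seen in these excerpts.

```python
import re

# Matches {'ns_1@host',no_reply} with or without a space after the comma.
NO_REPLY = re.compile(r"\{'([^']+)',\s*no_reply\}")


def nodes_without_reply(log_text):
    """Return the sorted, de-duplicated node names tagged no_reply."""
    return sorted(set(NO_REPLY.findall(log_text)))


# Two fragments copied from the log excerpts in this report.
sample = (
    "exception exit: {{badmatch,{failed,[{'ns_1@10.1.3.74',no_reply}]}},\n"
    "[{'ns_1@10.1.3.74', no_reply}]}},\n"
)
print(nodes_without_reply(sample))  # ['ns_1@10.1.3.74']
```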

      Jenkins Jobs:
      http://qa.hq.northscale.net/view/3.0.0/job/centos_x64--29_01--new_view_all-P1/20/consoleFull
      http://qa.hq.northscale.net/view/3.0.0/job/centos_x64--29_01--new_view_all-P1/21/consoleFull

      Notes:
      Uploading Logs


          People

            Assignee: Artem Stemkovski
            Reporter: Meenakshi Goel