Couchbase Server
MB-8314

rebalance exited ns_vbucket_mover failed to initiate_indexing


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Affects Version: 2.1.0
    • Fix Version: 2.1.0
    • Component: ns_server
    • Security Level: Public
    • Labels: None

    Description

      Have a 5-node cluster that was upgraded from 181 -> 202 (build 805).
      After the upgrade, attempted to swap out the orchestrator (10.3.121.69) and add in a new 202 node (10.3.3.131).

      Rebalance fails with ns_vbucket_mover000 on the orchestrator reporting:

      <0.11037.1> exited with {noproc,
          {gen_server,call,
           [{'janitor_agent-saslbucket','ns_1@10.3.3.131'},
            {if_rebalance,<0.1.1>,initiate_indexing},
            infinity]}}

      On the new 202 node, couchdb went down and an Erlang crash dump was generated (attached).

      Saw these errors from mccouch, which may explain why vbuckets could not be moved to this node:
      Fri May 17 07:27:36.981472 PDT 3: (saslbucket) Trying to connect to mccouch: "127.0.0.1:11213"
      Fri May 17 07:27:36.981615 PDT 3: (saslbucket) Connected to mccouch: "127.0.0.1:11213"
      Fri May 17 07:27:37.019496 PDT 3: (saslbucket) Connection closed by mccouch
      Fri May 17 07:27:37.019527 PDT 3: (saslbucket) Resetting connection to mccouch, lastReceivedCommand = notify_vbucket_update lastSentCommand = notify_vbucket_update currentCommand =unknown
      Fri May 17 07:27:37.019595 PDT 3: (saslbucket) Trying to connect to mccouch: "127.0.0.1:11213"
      Fri May 17 07:27:37.019730 PDT 3: (saslbucket) Connected to mccouch: "127.0.0.1:11213"
      Fri May 17 07:27:37.021763 PDT 3: (saslbucket) Connection closed by mccouch
      Fri May 17 07:27:37.021788 PDT 3: (saslbucket) Resetting connection to mccouch, lastReceivedCommand = select_bucket lastSentCommand = notify_vbucket_update currentCommand =unknown

      =========================CRASH REPORT=========================
      crasher:
      initial call: mc_connection:init/1
      pid: <0.965.0>
      registered_name: []
      exception error: no case clause matching {error,system_limit}
        in function mc_connection:do_notify_vbucket_update/3
        in call from mc_connection:handle_message/9
        in call from mc_connection:read_full_message/2
        in call from mc_connection:run_loop/2
      ancestors: [mc_conn_sup,mc_sup,ns_server_sup,ns_server_cluster_sup,
      <0.59.0>]
      messages: []
      links: [<0.641.0>,#Port<0.6784>]
      dictionary: []
      trap_exit: false
      status: running
      heap_size: 1597
      stack_size: 24
      reductions: 1094838
      neighbours:

      [error_logger:error,2013-05-17T7:27:36.622,ns_1@10.3.3.131:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
      =========================SUPERVISOR REPORT=========================
      Supervisor: {local,mc_conn_sup}
      Context: child_terminated
      Reason: {case_clause,{error,system_limit}}
      Offender: [{pid,<0.965.0>},
      {name,mc_connection},
      {mfargs,{mc_connection,start_link,undefined}},
      {restart_type,temporary},
      {shutdown,brutal_kill},
      {child_type,worker}]

      =========================CRASH REPORT=========================
      crasher:
      initial call: couch_file:spawn_reader/2
      pid: <0.652.0>
      registered_name: []
      exception exit: {problem_reopening_file,
                       {error,system_limit},
                       {set_close_after,infinity,<0.650.0>},
                       <0.652.0>,
                       "/opt/couchbase/var/lib/couchbase/data/_replicator.couch.1",
                       10}
      in function couch_file:reader_loop/3
      ancestors: [<0.650.0>,couch_server,couch_primary_services,
      couch_server_sup,cb_couch_sup,ns_server_cluster_sup,
      <0.59.0>]
      messages: []
      links: [<0.650.0>]
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 377
      stack_size: 24
      reductions: 504
      neighbours:
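
      Both crash reports point at the same underlying condition: Erlang returned {error,system_limit}, which on a file open or port creation most commonly means the node exhausted its file-descriptor (or Erlang port) limit. A minimal sketch for checking this on a suspect node, assuming Linux with /proc and that the Couchbase Erlang VM runs as beam.smp (both are assumptions, not taken from the attached diags):

```shell
# Sketch: see whether the node is close to its file-descriptor limit,
# which would make couch_file reopen attempts fail with {error,system_limit}.

# Soft per-process limit on open files for this shell:
ulimit -n

# Descriptors currently held by the Erlang VM, if one is running
# (beam.smp is the usual Erlang VM process name; assumed here):
pid=$(pgrep -o beam.smp 2>/dev/null || true)
if [ -n "$pid" ]; then
    echo "beam.smp ($pid) open fds: $(ls "/proc/$pid/fd" | wc -l)"
fi
```

      If the fd count is near the ulimit, raising the limit for the couchbase service would be the usual mitigation.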

      Attachments

        1. 10.3.121.69_diag.tar.gz
          1.62 MB
        2. 10.3.3.131_diags.tar.gz
          1.28 MB
        3. erl_crash.dump
          906 kB


          People

            Assignee: tommie Tommie McAfee (Inactive)
            Reporter: tommie Tommie McAfee (Inactive)
            Votes: 0
            Watchers: 4


              Gerrit Reviews

                There are no open Gerrit changes
