Couchbase Server / MB-43773

Rebalance failed during 'update_vbucket_state' followed by ns-server crash

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • 6.6.2
    • 6.6.2
    • qe
    • Enterprise Edition 6.6.2 build 9439

    Description

      Build: 6.6.2-9439

      Scenario:

      1. 9 node cluster (3 kv + 4 fts + 2 n1ql,2i)
      2. 6 couchstore buckets with replica=1
      3. Create 50 GSIs on each bucket
      4. Create 10 FTS indexes with index_partition=6
      5. Create and drop 100 GSIs on 4 buckets (total 400 indexes)
      6. Create and drop 50 FTS indexes on 3 buckets (total 150 FTS indexes)
      7. Rebalance out one KV node (172.23.107.19) (failed: rebalance_failed due to rebalance_stopped by janitor)
      8. Fail over the same node (172.23.107.19) (succeeded)
      9. Rebalance the same node back in (failed)

      Observation:

      The rebalance-in failed during "update_vbucket_state", followed by an ns_server crash on node "172.23.107.19".

      This node has cores and memory as per our requirements:

      [root@s72022 ~]# free -ht
                    total        used        free      shared  buff/cache   available
      Mem:           4.3G        2.7G        1.1G        247M        486M        1.1G
      Swap:            0B          0B          0B
      Total:         4.3G        2.7G        1.1G

       

      Worker <0.26185.48> (for action {move,{554,
      ['ns_1@172.23.107.87', 'ns_1@172.23.107.19'],
      ['ns_1@172.23.107.19', 'ns_1@172.23.107.87'],
      []}}) exited with reason {unexpected_exit,
      {'EXIT', <0.26564.48>,
      {{nodedown, 'ns_1@172.23.107.19'},
      {gen_server, call,
      [{'janitor_agent-bucket-0', 'ns_1@172.23.107.19'},
      {if_rebalance, <0.17463.47>,
      {update_vbucket_state, 554, active, undefined, undefined, undefined}}, infinity]}}}}
       
      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.26564.48>,
      {{nodedown,'ns_1@172.23.107.19'},
      {gen_server,call,
      [{'janitor_agent-bucket-0', 'ns_1@172.23.107.19'},
      {if_rebalance,<0.17463.47>,
      {update_vbucket_state,554,active, undefined,undefined,undefined}}, infinity]}}}}}.
      Rebalance Operation Id = d897d085b8db8df6ec5ec5f8b27d4c28
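
      The move action in the worker message above can be unpacked with a short script. This is only a reading aid: it assumes the tuple layout is {move, {VBucket, OldChain, NewChain, _}}, which is inferred from the log text rather than confirmed in this report, and the helper name parse_move is hypothetical.

```python
import re

# Worker message copied from the report above.
LOG = """Worker <0.26185.48> (for action {move,{554,
['ns_1@172.23.107.87', 'ns_1@172.23.107.19'],
['ns_1@172.23.107.19', 'ns_1@172.23.107.87'],
[]}}) exited with reason {unexpected_exit,"""


def parse_move(text):
    """Return (vbucket, old_chain, new_chain) from a move action in the log.

    Assumes the layout {move,{VBucket, OldChain, NewChain, ...}}.
    """
    m = re.search(r"\{move,\s*\{(\d+),\s*\[(.*?)\],\s*\[(.*?)\],", text, re.S)
    if m is None:
        raise ValueError("no move action found")

    def nodes(chunk):
        # Node names are single-quoted Erlang atoms.
        return re.findall(r"'([^']+)'", chunk)

    return int(m.group(1)), nodes(m.group(2)), nodes(m.group(3))


vbucket, old_chain, new_chain = parse_move(LOG)
print(f"vbucket {vbucket}: {old_chain} -> {new_chain}")
```

      Under that reading, vbucket 554 was being moved so that 172.23.107.19 leads the chain instead of 172.23.107.87, which matches the update_vbucket_state call setting vbucket 554 to active on 172.23.107.19 when the node went down.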
       
      Service 'ns_server' exited with status 137. Restarting. Messages:
      working as port
      8727: Booted. Waiting for shutdown request
      [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
      [os_mon] memory supervisor port (memsup): Erlang has closed
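
      For reading the "exited with status 137" line: on POSIX systems an exit status above 128 conventionally means 128 plus the number of the signal that terminated the process. The sketch below decodes it under that assumption; the function name decode_exit_status is hypothetical, and the report does not say what sent the signal.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (assumes the 128 + signal convention)."""
    if status > 128:
        signo = status - 128
        try:
            name = signal.Signals(signo).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signo}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(decode_exit_status(137))
```

      On Linux this reports SIGKILL (signal 9) for status 137, meaning ns_server was killed from outside rather than exiting on its own.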

       

          People

            ashwin.govindarajulu Ashwin Govindarajulu
            ashwin.govindarajulu Ashwin Govindarajulu