MB-45969

[Windows] Rebalance failed due to service_rebalance_failed, eventing, worker died

Details

    Description

      Build: 7.0.0-5017

      Scenario:

      • Multi-node cluster with index, n1ql, cbas, eventing, and fts services deployed
      • Various rebalances run in parallel with a constant data load
      • Eventing functions are deployed while rebalance_in is being triggered
      • Rebalance failed during the rebalance_in operation of a kv node

        +----------------+---------------+-----------------------+---------------+--------------+
        | Nodes          | Services      | Version               | CPU           | Status       |
        +----------------+---------------+-----------------------+---------------+--------------+
        | 172.23.136.114 | index, n1ql   | 7.0.0-5017-enterprise | 14.5080915318 | Cluster node |
        | 172.23.136.106 | kv            | 7.0.0-5017-enterprise | 95.3309016852 | Cluster node |
        | 172.23.138.127 | cbas          | 7.0.0-5017-enterprise | 23.905        | Cluster node |
        | 172.23.136.108 | kv            | 7.0.0-5017-enterprise | 94.7126321842 | Cluster node |
        | 172.23.136.112 | backup        | 7.0.0-5017-enterprise | 1.21166666667 | Cluster node |
        | 172.23.136.115 | eventing, fts | 7.0.0-5017-enterprise | 8.803946568   | Cluster node |
        | 172.23.136.113 | index, n1ql   | 7.0.0-5017-enterprise | 13.9064348928 | Cluster node |
        | 172.23.136.110 | kv            | 7.0.0-5017-enterprise | 92.4732706106 | Cluster node |
        | 172.23.136.105 | kv            | 7.0.0-5017-enterprise | 94.7908767427 | Cluster node |
        | 172.23.136.107 | ['kv']        |                       |               | <--- IN ---  |
        +----------------+---------------+-----------------------+---------------+--------------+
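
        The scenario above starts rebalance_in while eventing functions are still deploying. As a minimal sketch of how a test harness might detect that overlap before triggering a rebalance, the helper below filters a status payload for functions in a transitional state. The payload shape (an "apps" list with "name" and "composite_status" fields), the state names in TRANSITIONAL, and the function "other_fn" are assumptions made for illustration; they are not taken from this report and should be checked against the Eventing status endpoint of the build under test.

```python
# States treated as "in transition". These names are assumptions modelled on
# the "deploying or resuming" wording of the failure reason in this report.
TRANSITIONAL = {"deploying", "resuming"}


def apps_in_transition(status_payload):
    """Return the sorted names of functions whose state is transitional.

    status_payload is assumed to be a dict shaped like
    {"apps": [{"name": ..., "composite_status": ...}, ...]}.
    """
    apps = status_payload.get("apps") or []
    return sorted(
        app["name"]
        for app in apps
        if app.get("composite_status") in TRANSITIONAL
    )


# Hypothetical payload: "a2_users_search" comes from this report,
# "other_fn" is invented for the example.
sample = {
    "apps": [
        {"name": "a2_users_search", "composite_status": "deploying"},
        {"name": "other_fn", "composite_status": "deployed"},
    ]
}
blocked = apps_in_transition(sample)
print(blocked)
```

        A harness could wait until this list is empty before calling rebalance, or deliberately proceed when it is not in order to reproduce the failure described here.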
        

      Observation:

      The eventing rebalance failed with the following reason:

      "Some apps are deploying or resuming on nodeId: c2e16dfe88967da8a18a2a76462c6b93 Apps: map[a2_users_search:2021-04-27 22:27:20.2836704 -0700 PDT"

      Rebalance exited with reason {service_rebalance_failed,eventing,
       {worker_died,
        {'EXIT',<0.20190.18>,
         {{badmatch,
           {error,
            {bad_nodes,eventing,prepare_rebalance,
             [{'ns_1@172.23.136.115',
               {error,
                {unknown_error,
                 <<"Some apps are deploying or resuming on nodeId: c2e16dfe88967da8a18a2a76462c6b93 Apps: map[a2_users_search:2021-04-27 22:27:20.2836704 -0700 PDT m=+2452.559014101]">>}}}]}}},
          [{service_rebalancer,rebalance_worker,1,
            [{file,"src/service_rebalancer.erl"},
             {line,158}]},
           {proc_lib,init_p,3,
            [{file,"proc_lib.erl"},{line,234}]}]}}}}.
      Rebalance Operation Id = e197b89281485206bb3f29fba4e1f1ca
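
      When triaging many failed runs, it can help to extract the node ID and function names from the eventing error string automatically. The following is a rough sketch only: the regular expressions are inferred from the single message quoted above, not from a documented format, and they assume function names contain only letters, digits, and underscores.

```python
import re

# Pattern inferred from the one message quoted in this report (an assumption,
# not a documented format).
_MSG_RE = re.compile(
    r"Some apps are deploying or resuming on nodeId: (?P<node>[0-9a-f]+) "
    r"Apps: map\[(?P<apps>.*)\]"
)
# Each map entry looks like "<app>:<YYYY-MM-DD ...>"; capture the app name only.
_APP_RE = re.compile(r"(\w+):\d{4}-\d{2}-\d{2}")


def parse_blocked_rebalance(message):
    """Return (node_id, [app names]), or None if the message does not match."""
    match = _MSG_RE.search(message)
    if match is None:
        return None
    return match.group("node"), _APP_RE.findall(match.group("apps"))


message = (
    "Some apps are deploying or resuming on nodeId: "
    "c2e16dfe88967da8a18a2a76462c6b93 Apps: "
    "map[a2_users_search:2021-04-27 22:27:20.2836704 -0700 PDT m=+2452.559014101]"
)
print(parse_blocked_rebalance(message))
# prints ('c2e16dfe88967da8a18a2a76462c6b93', ['a2_users_search'])
```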

       

          People

            ashwin.govindarajulu Ashwin Govindarajulu