Couchbase Server / MB-61243

Eventing requires unnecessary rebalance

Details

    • Untriaged
    • 0
    • Unknown

    Description

      TL;DR: the eventing service doesn't appear to keep track of which nodes in the cluster provide the eventing service, so a rebalance is required after a cluster reboot.

      When a cluster is rebooted, the eventing service on each node returns just itself in "nodes", with "isBalanced" set to true. Because the number of "nodes" doesn't match what ns_server has eventing configured for, ns_server requires a rebalance. To reproduce:

      • cluster_run -n 2 --dont-rename
      • cluster_connect -n 2 -s 1024 -I 512 -M plasma -T n0:kv+index+n1ql+fts+eventing+cbas+backup,n1:kv+index+n1ql+fts+eventing+cbas+backup
      • Log in to the UI and confirm that the rebalance completes
      • Press Ctrl+C in the window where cluster_run is running
      • cluster_run -n 2 --dont-rename

      At this point the /pools/default endpoint returns that "eventing" and "backup" require a rebalance.

        "balanced": false,
        "servicesNeedRebalance": [
          {
            "code": "service_not_balanced",
            "description": "Service needs rebalance.",
            "services": [
              "eventing",
              "backup"
            ]
          }
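
As a quick way to check for this condition, here is a minimal Python sketch that pulls the affected service names out of a /pools/default response. The function name `services_needing_rebalance` is hypothetical, and the sample is the excerpt above with its brackets closed so that it parses as JSON; a real response has many more fields.

```python
import json


def services_needing_rebalance(pools_default: dict) -> list[str]:
    """Collect the service names listed under "servicesNeedRebalance"."""
    names: list[str] = []
    for entry in pools_default.get("servicesNeedRebalance", []):
        names.extend(entry.get("services", []))
    return names


# The excerpt from this report, closed off so that it is valid JSON.
sample = json.loads("""
{
  "balanced": false,
  "servicesNeedRebalance": [
    {
      "code": "service_not_balanced",
      "description": "Service needs rebalance.",
      "services": ["eventing", "backup"]
    }
  ]
}
""")

print(services_needing_rebalance(sample))  # ['eventing', 'backup']
```

The function takes an already parsed dict, so it can be fed from a saved response or from whatever HTTP client is used to query the endpoint.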
      

      ns_server believes "eventing" needs a rebalance because the GetTopology response from each of the two nodes includes just that node and indicates isBalanced is true. Here are the entries for the two nodes on my run:

      [json_rpc:debug,2024-03-20T14:13:38.757-07:00,n_0@127.0.0.1:json_rpc_connection-eventing-service_api<0.1232.0>:json_rpc_connection:handle_info:107]got response: [{<<"id">>,2},
                     {<<"result">>,
                      {[{<<"rev">>,<<"AAAAAAAAAAA=">>},
                        {<<"nodes">>,[<<"1080b788c0e8115ce25ff93ed60cd4f1">>]},
                        {<<"isBalanced">>,true}]}},
                     {<<"error">>,null}]
      

      and the other node

      [json_rpc:debug,2024-03-20T14:13:38.767-07:00,n_1@127.0.0.1:json_rpc_connection-eventing-service_api<0.1200.0>:json_rpc_connection:handle_info:107]got response: [{<<"id">>,2},
                     {<<"result">>,
                      {[{<<"rev">>,<<"AAAAAAAAAAA=">>},
                        {<<"nodes">>,[<<"16745cea9a733708f49fa44e1def4528">>]},
                        {<<"isBalanced">>,true}]}},
                     {<<"error">>,null}]
      

      As an example of what would be expected, this is the response after doing a rebalance from the UI:

      [json_rpc:debug,2024-03-20T14:19:06.583-07:00,n_0@127.0.0.1:json_rpc_connection-eventing-service_api<0.1232.0>:json_rpc_connection:handle_info:107]got response: [{<<"id">>,40},
                     {<<"result">>,
                      {[{<<"rev">>,<<"AAAAAAAAAAQ=">>},
                        {<<"nodes">>,
                         [<<"1080b788c0e8115ce25ff93ed60cd4f1">>,
                          <<"16745cea9a733708f49fa44e1def4528">>]},
                        {<<"isBalanced">>,true}]}},
                     {<<"error">>,null}]
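
To make the difference between the three responses concrete, the following Python sketch re-expresses them as dicts and applies a simplified check. This is an illustration, not ns_server's actual logic: `looks_unbalanced` is a hypothetical helper that compares the reported node set against the expected one, and `decode_rev` assumes "rev" is a base64-encoded big-endian integer, which the logged values are consistent with but which this report does not confirm.

```python
import base64


def decode_rev(rev_b64: str) -> int:
    """Decode "rev", assuming it is a base64-encoded big-endian integer."""
    return int.from_bytes(base64.b64decode(rev_b64), "big")


def looks_unbalanced(topology: dict, expected_nodes: set[str]) -> bool:
    """Simplified check: unbalanced if the service says so, or if the nodes
    it reports differ from the nodes expected to run the service."""
    return (not topology["isBalanced"]) or set(topology["nodes"]) != expected_nodes


# Node UUIDs copied from the log entries in this report.
N0 = "1080b788c0e8115ce25ff93ed60cd4f1"
N1 = "16745cea9a733708f49fa44e1def4528"
expected = {N0, N1}

after_restart_n0 = {"rev": "AAAAAAAAAAA=", "nodes": [N0], "isBalanced": True}
after_restart_n1 = {"rev": "AAAAAAAAAAA=", "nodes": [N1], "isBalanced": True}
after_rebalance = {"rev": "AAAAAAAAAAQ=", "nodes": [N0, N1], "isBalanced": True}

for label, topo in [
    ("n_0 after restart", after_restart_n0),
    ("n_1 after restart", after_restart_n1),
    ("n_0 after rebalance", after_rebalance),
]:
    print(label, decode_rev(topo["rev"]), looks_unbalanced(topo, expected))
# n_0 after restart 0 True
# n_1 after restart 0 True
# n_0 after rebalance 4 False
```

Under these assumptions, both post-restart responses decode to revision 0 and list a single node, while the post-rebalance response is at revision 4 and lists both nodes.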
      

            People

              sujay.gad Sujay Gad
              steve.watanabe Steve Watanabe