  1. Couchbase Server
  2. MB-30947

Rebalance in extraneously exited, reported as successful


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 6.0.0
    • Affects Version/s: 6.0.0
    • Component/s: ns_server
    • Triage: Untriaged
    • Is this a Regression?: Unknown

    Description

      (observed in a recent Jenkins dev test run)
      During a rebalance-in of two Analytics nodes, the rebalance exits almost immediately, logging that it received an exit signal (nothing external was sent), and yet the rebalance is reported as successful. The Analytics coordinator was never called to perform the rebalance.

      2018-08-15T19:57:50.291-07:00 INFO ClusterExecutionITBase [main] Rebalancing from [0, 1] to [0, 1, 2, 3]
      2018-08-15T19:57:54.947-07:00 INFO ClusterExecutionITBase [main] Running cli: [server-add, -c, 127.0.0.1:9000, -u, couchbase, -p, couchbase, --server-add, 127.0.0.1:9003,127.0.0.1:9002, --server-add-username, couchbase, --server-add-password, couchbase, --services, analytics]
      2018-08-15T19:58:02.792-07:00 INFO ClusterExecutionITBase [main+] >> SUCCESS: Server added
      2018-08-15T19:58:02.794-07:00 INFO ClusterExecutionITBase [main]   calling rebalance with following args: [rebalance, -c, 127.0.0.1:9000, -u, couchbase, -p, couchbase]
      2018-08-15T19:58:04.073-07:00 INFO ClusterExecutionITBase [main+] >> Unable to display progress bar on this os
      2018-08-15T19:58:04.073-07:00 INFO ClusterExecutionITBase [main+] >> SUCCESS: Rebalance complete
      

      [ns_server:debug,2018-08-15T19:58:03.505-07:00,n_0@127.0.0.1:service_rebalancer-cbas<0.3419.0>:service_agent:wait_for_agents:73]Waiting for the service agents for service cbas to come up on nodes:
      ['n_1@172.17.0.2','n_2@127.0.0.1','n_3@127.0.0.1']
      [ns_server:info,2018-08-15T19:58:03.508-07:00,n_0@127.0.0.1:leader_registry<0.951.0>:leader_registry_server:handle_down:253]Process <0.3322.0> registered as 'ns_rebalance_observer' terminated.
      [ns_server:debug,2018-08-15T19:58:03.508-07:00,n_0@127.0.0.1:<0.3323.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {master_activity_events,<0.3322.0>} exited with reason shutdown
      [ns_server:debug,2018-08-15T19:58:03.514-07:00,n_0@127.0.0.1:<0.3223.0>:ns_rebalancer:rebalance_topology_aware_service:621]Got an exit signal while waiting for the service rebalance to complete. Service: cbas. Exit message: {'EXIT',
                                                                                                            <0.3417.0>,
                                                                                                            normal}
      [ns_server:debug,2018-08-15T19:58:03.522-07:00,n_0@127.0.0.1:service_rebalancer-cbas<0.3419.0>:service_agent:wait_for_agents_loop:91]All service agents are ready for cbas
      [ns_server:error,2018-08-15T19:58:03.545-07:00,n_0@127.0.0.1:service_rebalancer-cbas<0.3419.0>:service_rebalancer:run_rebalance:74]Got exit message from parent: {'EXIT',<0.3223.0>,shutdown}
      [rebalance:info,2018-08-15T19:58:03.545-07:00,n_0@127.0.0.1:service_rebalancer-cbas-worker<0.3452.0>:service_rebalancer:rebalance:110]Rebalancing service cbas with id <<"456544653348b373270e9ca6ff57f78c">>.
      KeepNodes: ['n_1@172.17.0.2','n_2@127.0.0.1','n_3@127.0.0.1']
      EjectNodes: []
      DeltaNodes: []
      [ns_server:debug,2018-08-15T19:58:03.554-07:00,n_0@127.0.0.1:leader_activities<0.947.0>:leader_activities:handle_activity_down:526]Activity terminated with reason {raised,
                                       {exit,normal,
                                        [{ns_rebalancer,
                                          '-rebalance_topology_aware_service/4-fun-1-',
                                          5,
                                          [{file,"src/ns_rebalancer.erl"},
                                           {line,627}]},
                                         {misc,with_trap_exit,1,
                                          [{file,"src/misc.erl"},{line,2307}]},
                                         {ns_rebalancer,
                                          '-rebalance_topology_aware_services/4-fun-0-',
                                          4,
                                          [{file,"src/ns_rebalancer.erl"},
                                           {line,600}]},
                                         {lists,filtermap,2,
                                          [{file,"lists.erl"},{line,1302}]},
                                         {ns_rebalancer,rebalance_services,2,
                                          [{file,"src/ns_rebalancer.erl"},
                                           {line,534}]},
                                         {ns_rebalancer,rebalance_body,6,
                                          [{file,"src/ns_rebalancer.erl"},
                                           {line,724}]},
                                         {async,'-async_init/4-fun-2-',3,
                                          [{file,"src/async.erl"},{line,208}]}]}}. Activity:
      {activity,<0.3221.0>,#Ref<0.0.0.23433>,default,
                <<"55e356fed96b76c319ae67931025b0ad">>,
                [rebalance],
                majority,[]}
      [user:info,2018-08-15T19:58:03.554-07:00,n_0@127.0.0.1:<0.974.0>:ns_orchestrator:do_log_rebalance_completion:1111]Rebalance completed successfully.
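
The symptom in the log above (an exit signal during the service rebalance, followed shortly by a success message) can be searched for mechanically. The following is a minimal sketch, assuming the entry header layout seen in this excerpt (`[category:level,timestamp,origin]message`). The function names and the two marker strings are illustrative choices taken from the log lines quoted here, not part of ns_server.

```python
import re

# Header of an ns_server log entry as it appears in this report, e.g.
# [ns_server:debug,2018-08-15T19:58:03.514-07:00,n_0@127.0.0.1:<0.3223.0>:...:621]Got an exit ...
# Assumption: every entry starts with this bracketed header; continuation
# lines (stack traces, node lists) do not match and are skipped.
ENTRY_RE = re.compile(
    r"^\s*\[(?P<category>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<origin>[^\]]+)\]"
    r"(?P<message>.*)$"
)

# Marker strings copied from the log excerpt in this report.
EXIT_MARKER = "Got an exit signal while waiting for the service rebalance to complete"
SUCCESS_MARKER = "Rebalance completed successfully"


def parse_entries(lines):
    """Yield (timestamp, origin, message) for each line that starts a log entry."""
    for line in lines:
        match = ENTRY_RE.match(line)
        if match:
            yield match["timestamp"], match["origin"], match["message"]


def find_suspicious_successes(lines):
    """Return (exit_timestamp, success_timestamp) pairs where a success
    report follows an exit-signal message with no other success in between."""
    pairs = []
    pending_exit = None
    for timestamp, _origin, message in parse_entries(lines):
        if EXIT_MARKER in message:
            pending_exit = timestamp
        elif SUCCESS_MARKER in message and pending_exit is not None:
            pairs.append((pending_exit, timestamp))
            pending_exit = None
    return pairs
```

Passing the lines of a debug log to `find_suspicious_successes` returns one timestamp pair per occurrence of the pattern. The check is deliberately loose: it does not confirm that both messages belong to the same rebalance, so a hit is only a pointer for manual inspection.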
      


          People

            tanzeem.ahmed Tanzeem Ahmed (Inactive)
            michael.blow Michael Blow