Couchbase Server / MB-36785

[System Test] Rebalance exited with reason - mover_crashed - unexpected_exit

Details

    • Untriaged
    • Unknown

    Description

      Build : 6.5.0-4758

      Test: -test tests/eventing/test_eventing_volume.yml -scope tests/eventing/scope_eventing_volume.yml

      Test step:

      [2019-11-06T12:15:23-08:00, sequoiatools/couchbase-cli:6.5:d92759] rebalance -c 172.23.104.16:8091 --server-remove 172.23.104.17,172.23.104.18 -u Administrator -p password
       
      Error occurred on container - appropriate/curl:[-u Administrator:password -s 172.23.104.16:8091/pools/default/rebalanceProgress]
       
      docker logs f037f5
      docker start f037f5
       
      warning using 'json' filter:  unexpected end of JSON input []
       
      Error occurred on container - sequoiatools/couchbase-cli:6.5:[rebalance -c 172.23.104.16:8091 --server-remove 172.23.104.17,172.23.104.18 -u Administrator -p password]
       
      docker logs d92759
      docker start d92759
       
      *Unable to display progress bar on this os
      >ERROR: Unable to connect to host at http://172.23.104.16:8091
      [2019-11-06T12:19:12-08:00, sequoiatools/cmd:3ac2d0] 60
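
The "unexpected end of JSON input" warning above suggests the progress poll received an empty or truncated body once the host stopped answering. A minimal sketch of a tolerant parser for such a poll follows. The function name `parse_rebalance_progress` is hypothetical, and the sample body in the usage note is an assumption for illustration, not output captured from this cluster.

```python
import json


def parse_rebalance_progress(body):
    """Decode a rebalanceProgress response body.

    Returns the decoded JSON value, or None when the body is empty,
    whitespace-only, or not valid JSON (for example, a truncated reply).
    """
    if not body or not body.strip():
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # A truncated or non-JSON reply is treated as "no data yet".
        return None
```

A caller could treat `None` as "retry later" instead of failing the step, e.g. `parse_rebalance_progress('{"status": "none"}')` returns a dict, while `parse_rebalance_progress("")` returns `None`.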

      Error in 172.23.104.16:

      2019-11-06T12:18:43.654-08:00, auto_failover:0:info:message(ns_1@172.23.104.25) - Enabled auto-failover with timeout 120 and max count 1
      2019-11-06T12:19:03.801-08:00, ns_vbucket_mover:0:critical:message(ns_1@172.23.104.16) - Worker <0.23290.73> (for action {move,{444,
                                             ['ns_1@172.23.104.17'],
                                             ['ns_1@172.23.104.16'],
                                             []}}) exited with reason {unexpected_exit,
                                                                       {'EXIT',
                                                                        <0.23875.73>,
                                                                        {failed_to_update_vbucket_map,
                                                                         "n1ql_op_dst",
                                                                         444,
                                                                         {error,
                                                                          [{'ns_1@172.23.104.25',
                                                                            timeout},
                                                                           {'ns_1@172.23.104.93',
                                                                            timeout},
                                                                           {'ns_1@172.23.104.94',
                                                                            timeout},
                                                                           {'ns_1@172.23.96.96',
                                                                            timeout}]}}}}
      2019-11-06T12:19:03.806-08:00, ns_orchestrator:0:critical:message(ns_1@172.23.104.16) - Rebalance exited with reason {mover_crashed,
                                    {unexpected_exit,
                                     {'EXIT',<0.23875.73>,
                                      {failed_to_update_vbucket_map,"n1ql_op_dst",
                                       444,
                                       {error,
                                        [{'ns_1@172.23.104.25',timeout},
                                         {'ns_1@172.23.104.93',timeout},
                                         {'ns_1@172.23.104.94',timeout},
                                         {'ns_1@172.23.96.96',timeout}]}}}}}.
      Rebalance Operation Id = 0ef740d5584d2b3c365cd9fcd893e4b5
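
To see at a glance which nodes failed to acknowledge the vbucket map update, the `{Node, timeout}` pairs can be pulled out of the error text. This is a rough sketch based only on the log format shown above; the helper name `timed_out_nodes` is hypothetical, and the regular expression is an assumption about how these tuples are printed.

```python
import re

# Matches entries such as {'ns_1@172.23.104.25',timeout}, allowing
# whitespace or a line break between the node name and the atom.
TIMEOUT_ENTRY = re.compile(r"\{'([^']+)',\s*timeout\}")


def timed_out_nodes(error_text):
    """Return the node names reported with `timeout`, in first-seen order."""
    return list(dict.fromkeys(TIMEOUT_ENTRY.findall(error_text)))
```

Applied to the two messages above, this would list the same four nodes once each, even though both the ns_vbucket_mover and ns_orchestrator entries repeat them.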
      

      cbcollect logs:

      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.16.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.17.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.18.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.19.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.21.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.23.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.25.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.93.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.104.94.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1573071847/collectinfo-2019-11-06T202409-ns_1%40172.23.96.96.zip

        Activity

          The orchestrator node got disconnected from the rest of the cluster:

          2019-11-06T12:19:32.342-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.93) - Node 'ns_1@172.23.104.93' saw that node 'ns_1@172.23.104.21' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:32.342-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.93) - Node 'ns_1@172.23.104.93' saw that node 'ns_1@172.23.104.16' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:32.342-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.93) - Node 'ns_1@172.23.104.93' saw that node 'ns_1@172.23.104.18' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:32.342-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.93) - Node 'ns_1@172.23.104.93' saw that node 'ns_1@172.23.104.17' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:35.254-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.93) - Node 'ns_1@172.23.104.93' saw that node 'ns_1@172.23.104.23' went down. Details: [{nodedown_reason,
                                                                                             connection_closed}]
          2019-11-06T12:19:35.254-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.23) - Node 'ns_1@172.23.104.23' saw that node 'ns_1@172.23.96.96' went down. Details: [{nodedown_reason,
                                                                                            net_tick_timeout}]
          2019-11-06T12:19:35.255-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.23) - Node 'ns_1@172.23.104.23' saw that node 'ns_1@172.23.104.94' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:35.255-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.23) - Node 'ns_1@172.23.104.23' saw that node 'ns_1@172.23.104.25' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          2019-11-06T12:19:35.256-08:00, ns_node_disco:5:warning:node down(ns_1@172.23.104.23) - Node 'ns_1@172.23.104.23' saw that node 'ns_1@172.23.104.93' went down. Details: [{nodedown_reason,
                                                                                             net_tick_timeout}]
          

          So it's expected that rebalance will fail in such circumstances.

          Comment added by Aliaksey Artamonau.
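
The node-down warnings quoted in this comment are easier to read when grouped by the node that reported them. The sketch below does that grouping. The helper name `group_node_down` is hypothetical, and the pattern is an assumption derived from the ns_node_disco lines quoted in this ticket, so it may not match other log formats.

```python
import re
from collections import defaultdict

# Captures: observing node, node seen going down, nodedown_reason atom.
NODE_DOWN = re.compile(
    r"Node '([^']+)' saw that node '([^']+)' went down\.\s+"
    r"Details: \[\{nodedown_reason,\s*(\w+)\}\]"
)


def group_node_down(log_text):
    """Map each observing node to a list of (down_node, reason) tuples."""
    seen = defaultdict(list)
    for observer, down, reason in NODE_DOWN.findall(log_text):
        seen[observer].append((down, reason))
    return dict(seen)
```

Grouping this way makes it easier to check which nodes lost contact with the orchestrator (ns_1@172.23.104.16) and whether the reason was `net_tick_timeout` or `connection_closed`.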

          It's either an environment issue or an issue with the test. Either way, the server behaved correctly.

          Comment added by Aliaksey Artamonau.

          Bulk closing invalid, won't-fix, and duplicate bugs.

          Comment added by Raju Suravarjjala.

          People

            girish.benakappa Girish Benakappa
            girish.benakappa Girish Benakappa