Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-20079

If system time is moved backwards, ep-engine task scheduling breaks - CentOS6 + Suse

    XMLWordPrintable

Details

    • Untriaged
    • No

    Description

      ./testrunner -i INI_FILE.ini ,GROUP=ALL -t xdcr.lww.Lww.test_seq_upd_on_bi_with_target_clock_faster,rdirection=bidirection,ctopology=chain,skip_ntp=True

      test cleanup failed with these errors:

      [2016-07-05 01:58:54,518] - [xdcrnewbasetests:2407] INFO - ====  XDCRNewbasetests cleanup is started for test #28 test_seq_upd_on_bi_with_target_clock_faster ====
      [2016-07-05 01:58:54,519] - [xdcrnewbasetests:1101] INFO - removing xdcr/nodes settings
      [2016-07-05 01:58:54,535] - [rest_client:1038] INFO - Deleting replication /controller/cancelXDCR/cda479cc6b22a20a718654fd277a3e83%2Flww%2Flww
      [2016-07-05 01:58:54,547] - [rest_client:1038] INFO - Deleting replication /controller/cancelXDCR/cda479cc6b22a20a718654fd277a3e83%2Fnolww%2Fnolww
      [2016-07-05 01:58:54,592] - [rest_client:986] INFO - removing remote cluster name:remote_cluster_C1-C2
      [2016-07-05 01:58:54,618] - [xdcrnewbasetests:1111] INFO - cleanup [ip:172.23.107.118 port:8091 ssh_username:root, ip:172.23.106.227 port:8091 ssh_username:root]
      [2016-07-05 01:58:54,656] - [bucket_helper:138] INFO - deleting existing buckets [u'lww', u'nolww'] on 172.23.107.118
      [2016-07-05 01:58:54,656] - [bucket_helper:140] INFO - remove bucket lww ...
      [2016-07-05 01:58:57,508] - [bucket_helper:154] INFO - deleted bucket : lww from 172.23.107.118
      [2016-07-05 01:58:57,508] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
      [2016-07-05 01:58:57,521] - [rest_client:133] INFO - node 172.23.107.118 existing buckets : [u'nolww']
      [2016-07-05 01:58:57,521] - [bucket_helper:140] INFO - remove bucket nolww ...
      [2016-07-05 01:58:59,774] - [bucket_helper:154] INFO - deleted bucket : nolww from 172.23.107.118
      [2016-07-05 01:58:59,774] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
      [2016-07-05 01:58:59,777] - [rest_client:133] INFO - node 172.23.107.118 existing buckets : []
      [2016-07-05 01:58:59,800] - [cluster_helper:255] INFO - rebalancing all nodes in order to remove nodes
      [2016-07-05 01:58:59,807] - [rest_client:1283] INFO - rebalance params : password=password&ejectedNodes=ns_1%40172.23.106.227&user=Administrator&knownNodes=ns_1%40172.23.107.118%2Cns_1%40172.23.106.227
      [2016-07-05 01:58:59,813] - [rest_client:1287] INFO - rebalance operation started
      [2016-07-05 01:58:59,824] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 01:59:19,855] - [rest_client:1373] INFO - rebalance progress took 20.0412900448 seconds 
      [2016-07-05 01:59:19,855] - [rest_client:1374] INFO - sleep for 10 seconds after rebalance...
      [2016-07-05 01:59:29,873] - [cluster_helper:295] INFO - removed all the nodes from cluster associated with ip:172.23.107.118 port:8091 ssh_username:root ? [(u'ns_1@172.23.106.227', 8091)]
      [2016-07-05 01:59:29,879] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.107.118:8091
      [2016-07-05 01:59:29,883] - [cluster_helper:80] INFO - ns_server @ 172.23.107.118:8091 is running
      [2016-07-05 01:59:29,896] - [bucket_helper:138] INFO - deleting existing buckets [] on 172.23.106.227
      [2016-07-05 01:59:29,911] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.106.227:8091
      [2016-07-05 01:59:29,913] - [cluster_helper:80] INFO - ns_server @ 172.23.106.227:8091 is running
      Cluster instance shutdown with force
      [2016-07-05 01:59:29,914] - [xdcrnewbasetests:1101] INFO - removing xdcr/nodes settings
      [2016-07-05 01:59:29,948] - [rest_client:986] INFO - removing remote cluster name:remote_cluster_C2-C1
      [2016-07-05 01:59:29,966] - [xdcrnewbasetests:1111] INFO - cleanup [ip:172.23.106.201 port:8091 ssh_username:root, ip:172.23.106.32 port:8091 ssh_username:root]
      [2016-07-05 01:59:30,009] - [bucket_helper:138] INFO - deleting existing buckets [u'lww', u'nolww'] on 172.23.106.201
      [2016-07-05 01:59:30,009] - [bucket_helper:140] INFO - remove bucket lww ...
      [2016-07-05 02:00:00,320] - [rest_client:778] ERROR - http://172.23.106.201:8091/pools/default/buckets/lww error 500 reason: unknown {"_":"Bucket deletion not yet complete, but will continue.\r\n"}
      [2016-07-05 02:00:00,320] - [rest_client:1840] WARNING - Bucket deletion timed out waiting for all nodes
      [2016-07-05 02:00:00,352] - [bucket_helper:153] ERROR - Unable to get timings for bucket
      [2016-07-05 02:00:00,352] - [bucket_helper:154] INFO - deleted bucket : lww from 172.23.106.201
      [2016-07-05 02:00:00,352] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
      [2016-07-05 02:00:00,387] - [rest_client:133] INFO - node 172.23.106.201 existing buckets : [u'nolww']
      [2016-07-05 02:00:00,387] - [bucket_helper:140] INFO - remove bucket nolww ...
      [2016-07-05 02:00:30,698] - [rest_client:778] ERROR - http://172.23.106.201:8091/pools/default/buckets/nolww error 500 reason: unknown {"_":"Bucket deletion not yet complete, but will continue.\r\n"}
      [2016-07-05 02:00:30,699] - [rest_client:1840] WARNING - Bucket deletion timed out waiting for all nodes
      [2016-07-05 02:00:30,708] - [bucket_helper:153] ERROR - Unable to get timings for bucket
      [2016-07-05 02:00:30,708] - [bucket_helper:154] INFO - deleted bucket : nolww from 172.23.106.201
      [2016-07-05 02:00:30,708] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
      [2016-07-05 02:00:30,711] - [rest_client:133] INFO - node 172.23.106.201 existing buckets : []
      [2016-07-05 02:00:30,736] - [cluster_helper:255] INFO - rebalancing all nodes in order to remove nodes
      [2016-07-05 02:00:30,741] - [rest_client:1283] INFO - rebalance params : password=password&ejectedNodes=ns_1%40172.23.106.32&user=Administrator&knownNodes=ns_1%40172.23.106.201%2Cns_1%40172.23.106.32
      [2016-07-05 02:00:30,745] - [rest_client:1287] INFO - rebalance operation started
      [2016-07-05 02:00:30,759] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 02:00:40,777] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 02:00:50,793] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 02:01:00,808] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 02:01:10,822] - [rest_client:1418] INFO - rebalance percentage : 0.00 %
      [2016-07-05 02:01:20,837] - [rest_client:1401] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
      [2016-07-05 02:01:20,992] - [rest_client:2605] INFO - Latest logs from UI on 172.23.106.201:
      [2016-07-05 02:01:20,992] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Replication from bucket "nolww" to bucket "nolww" on cluster "remote_cluster_C2-C1" created.', u'shortText': u'message', u'serverTime': u'2016-07-05T02:55:51.117Z', u'module': u'xdcr', u'tstamp': 1467712551117, u'type': u'info'}
      [2016-07-05 02:01:20,992] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Replication from bucket "lww" to bucket "lww" on cluster "remote_cluster_C2-C1" created.', u'shortText': u'message', u'serverTime': u'2016-07-05T02:55:50.864Z', u'module': u'xdcr', u'tstamp': 1467712550864, u'type': u'info'}
      [2016-07-05 02:01:20,992] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Created remote cluster reference "remote_cluster_C2-C1" via 172.23.107.118:8091.', u'shortText': u'message', u'serverTime': u'2016-07-05T02:55:50.179Z', u'module': u'xdcr', u'tstamp': 1467712550179, u'type': u'info'}
      [2016-07-05 02:01:20,992] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 2, u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n                              [{\'ns_1@172.23.106.201\',\n                                {\'EXIT\',\n                                 {old_buckets_shutdown_wait_failed,\n                                  ["nolww","lww"]}}}]}\n', u'shortText': u'message', u'serverTime': u'2016-07-05T02:01:11.152Z', u'module': u'ns_orchestrator', u'tstamp': 1467709271152, u'type': u'info'}
      [2016-07-05 02:01:20,992] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@172.23.106.201\',\n                                                         {\'EXIT\',\n                                                          {old_buckets_shutdown_wait_failed,\n                                                           ["nolww","lww"]}}}]\n', u'shortText': u'message', u'serverTime': u'2016-07-05T02:01:11.151Z', u'module': u'ns_rebalancer', u'tstamp': 1467709271151, u'type': u'critical'}
      [2016-07-05 02:01:20,993] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.106.201'], EjectNodes = ['ns_1@172.23.106.32'], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2016-07-05T02:00:30.733Z', u'module': u'ns_orchestrator', u'tstamp': 1467709230733, u'type': u'info'}
      [2016-07-05 02:01:20,993] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 102, u'text': u'Client-side error-report for user "Administrator" on node \'ns_1@172.23.106.201\':\nUser-Agent:Python-httplib2/$Rev: 259 $\nStarting rebalance from test, ejected nodes [u\'ns_1@172.23.106.32\']\n', u'shortText': u'client-side error report', u'serverTime': u'2016-07-05T02:00:30.729Z', u'module': u'menelaus_web', u'tstamp': 1467709230729, u'type': u'warning'}
      [2016-07-05 02:01:20,993] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Shutting down bucket "lww" on \'ns_1@172.23.106.201\' for deletion', u'shortText': u'message', u'serverTime': u'2016-07-05T01:59:30.362Z', u'module': u'ns_memcached', u'tstamp': 1467709170362, u'type': u'info'}
      [2016-07-05 02:01:20,993] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.32', u'code': 0, u'text': u'Shutting down bucket "lww" on \'ns_1@172.23.106.32\' for deletion', u'shortText': u'message', u'serverTime': u'2016-07-05T01:59:30.316Z', u'module': u'ns_memcached', u'tstamp': 1467709170316, u'type': u'info'}
      [2016-07-05 02:01:20,993] - [rest_client:2606] ERROR - {u'node': u'ns_1@172.23.106.201', u'code': 0, u'text': u'Remote cluster reference "remote_cluster_C2-C1" known via 172.23.107.118:8091 removed.', u'shortText': u'message', u'serverTime': u'2016-07-05T01:59:29.945Z', u'module': u'xdcr', u'tstamp': 1467709169945, u'type': u'info'}
      ERROR
       
      ======================================================================
      ERROR: test_seq_upd_on_bi_with_target_clock_faster (xdcr.lww.Lww)
      ----------------------------------------------------------------------
      Traceback (most recent call last):
        File "pytests/xdcr/lww.py", line 27, in tearDown
          super(Lww, self).tearDown()
        File "pytests/xdcr/xdcrnewbasetests.py", line 2409, in tearDown
          cb_cluster.cleanup_cluster(self)
        File "pytests/xdcr/xdcrnewbasetests.py", line 1126, in cleanup_cluster
          ClusterOperationHelper.cleanup_cluster([node])
        File "lib/membase/helper/cluster_helper.py", line 260, in cleanup_cluster
          wait_for_rebalance=wait_for_rebalance)
        File "lib/membase/api/rest_client.py", line 112, in remove_nodes
          return self.rest.monitorRebalance()
        File "lib/membase/api/rest_client.py", line 1343, in monitorRebalance
          progress = self._rebalance_progress()
        File "lib/membase/api/rest_client.py", line 1403, in _rebalance_progress
          raise RebalanceFailedException(msg)
      RebalanceFailedException: Rebalance Failed: {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
       
      ----------------------------------------------------------------------
      Ran 1 test in 374.973s
      

      All the consecutive tests failed with same rebalance issue - attaching logs

      Attachments

        Issue Links

          Activity

            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              arunkumar Arunkumar Senthilnathan (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                PagerDuty