Couchbase Server / MB-6760

swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance gets stuck on build 1776 while rebalancing

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 2.0
    • None
    • ns_server
    • Security Level: Public
    • None
    • Build 1776

    Description

      Rebalance hangs in the following scenario:
      In an 8-node cluster
      Load 100000 data items
      Remove two nodes and then add them back, and run rebalance while data access runs in parallel
      Once rebalance reaches 60%, kill one node (the master) and wait for it to warm up
      Once warmup has completed, start rebalance; the rebalance hangs
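
A test harness could flag this kind of hang with a stall check instead of polling indefinitely. The following is only a minimal sketch: `is_stalled` and its `window_s` threshold are hypothetical and not part of the test framework, and the samples are assumed to be (timestamp in seconds, percent) pairs collected by whatever polls the rebalance progress.

```python
from typing import Iterable, Tuple


def is_stalled(samples: Iterable[Tuple[float, float]], window_s: float = 60.0) -> bool:
    """Return True if progress has not increased for at least window_s seconds.

    samples: (timestamp_seconds, percent) pairs in chronological order.
    Hypothetical helper for illustration only.
    """
    best = None            # highest percentage seen so far
    last_change_ts = None  # timestamp at which the percentage last increased
    latest_ts = None       # timestamp of the most recent sample
    for ts, pct in samples:
        latest_ts = ts
        if best is None or pct > best:
            best = pct
            last_change_ts = ts
    if latest_ts is None:
        return False  # no samples yet, nothing to judge
    return latest_ts - last_change_ts >= window_s
```

For example, a run that reports 0 % at every poll for more than a minute would be treated as stalled, while a run whose percentage rose at the latest poll would not.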

      2012-09-27 02:17:05,948 - root - INFO - rebalance reached >60.0512448634% in 145.019459963 seconds
      2012-09-27 02:17:06,273 - root - INFO - creating direct client 10.3.121.92:11210 default
      2012-09-27 02:17:07,074 - root - INFO - os:cmd("kill -9 11660 ")
      2012-09-27 02:17:07,087 - root - INFO - /diag/eval status: True content: [] command: os:cmd("kill -9 11660 ")
      2012-09-27 02:17:07,088 - root - INFO - killed 10.3.121.92:8091?? (True, '[]')
      2012-09-27 02:17:07,117 - root - INFO - creating direct client 10.3.121.92:11210 default
      2012-09-27 02:17:07,835 - root - ERROR - Could not get warmup_time stats from server 10.3.121.92:8091, exception 'ep_warmup_time'
      2012-09-27 02:17:07,871 - root - INFO - creating direct client 10.3.121.92:11210 default
      2012-09-27 02:17:08,591 - root - ERROR - Could not get warmup_time stats from server 10.3.121.92:8091, exception 'ep_warmup_time'
      2012-09-27 02:17:08,623 - root - INFO - creating direct client 10.3.121.92:11210 default
      2012-09-27 02:17:09,334 - root - INFO - ep_warmup_time is 1943373
      2012-09-27 02:17:09,335 - root - INFO - Collected the stats 1943373 for server 10.3.121.92:8091
      2012-09-27 02:17:09,362 - root - INFO - warmup completed, awesome!!! Warmed up. 0 items
      2012-09-27 02:17:14,380 - root - INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.3.121.93%2Cns_1%4010.3.121.94&user=Administrator&knownNodes=ns_1%4010.3.121.93%2Cns_1%4010.3.121.94%2Cns_1%4010.3.121.92%2Cns_1%4010.3.121.95%2Cns_1%4010.3.121.96
      2012-09-27 02:17:14,395 - root - INFO - rebalance operation started
      2012-09-27 02:17:14,417 - root - INFO - rebalance percentage : 0 %
      2012-09-27 02:17:16,442 - root - INFO - rebalance percentage : 0.0 %
      2012-09-27 02:17:18,498 - root - INFO - rebalance percentage : 0.0 %
      2012-09-27 02:17:20,510 - root - INFO - rebalance percentage : 0.0 %
      2012-09-27 02:17:22,523 - root - INFO - rebalance percentage : 0.0 %
      .....
      .....
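
The "rebalance params" line in the log is a URL-encoded form body, which is hard to read at a glance. As a sketch only, the snippet below rebuilds that string and decodes it again using the Python standard library. The parameter names and values are taken from the log line; the helper name `build_rebalance_params` is hypothetical, and this is not necessarily how the test framework constructs the request.

```python
from urllib.parse import parse_qs, urlencode


def build_rebalance_params(known_nodes, ejected_nodes, user, password):
    """Encode rebalance parameters as a form body (hypothetical helper).

    Node lists are joined with commas before encoding, so "@" becomes
    "%40" and "," becomes "%2C", as in the logged string.
    """
    return urlencode({
        "password": password,
        "ejectedNodes": ",".join(ejected_nodes),
        "user": user,
        "knownNodes": ",".join(known_nodes),
    })


# Node names as they appear (URL-encoded) in the log line.
known = ["ns_1@10.3.121.93", "ns_1@10.3.121.94", "ns_1@10.3.121.92",
         "ns_1@10.3.121.95", "ns_1@10.3.121.96"]
ejected = ["ns_1@10.3.121.93", "ns_1@10.3.121.94"]

params = build_rebalance_params(known, ejected, "Administrator", "password")

# Decode the body back into readable node lists.
decoded = parse_qs(params)
known_decoded = decoded["knownNodes"][0].split(",")
ejected_decoded = decoded["ejectedNodes"][0].split(",")
```

Decoded this way, the logged request lists five known nodes and two ejected nodes (10.3.121.93 and 10.3.121.94).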

      Attachments

        1. 10.3.121.92-8091-diag.txt.gz
          5.93 MB
        2. 10.3.121.92-8091-diag.txt.gz
          6.57 MB
        3. 10.3.121.95-8091-diag.txt.gz
          18.12 MB
        4. 10.3.121.96-8091-diag.txt.gz
          16.39 MB
        5. 10.3.121.97-8091-diag.txt.gz
          145 kB
        6. 10.3.121.98-8091-diag.txt.gz
          145 kB

          People

            Rohit Sinha (Inactive)