  1. Couchbase Server
  2. MB-6711

size of one vBucket is 10 GB - Rebalance exited with reason replicator_died( many retries to notify CouchDB of update : vbid=294 rev=1)

    Details

      Description

      2.0.0-1751-rel

      testrunner -i /tmp/rebalance_regression.ini get-logs=True,disabled_consistent_view=False -t swaprebalance.SwapRebalanceFailedTests.test_failed_swap_rebalance,replica=2,num-buckets=1,num-swap=2,swap-orchestrator=False
      http://qa.hq.northscale.net/job/centos-64-2.0-rebalance-regressions/47/consoleFull

      2012-09-21 06:26:25.276 menelaus_web_alerts_srv:1:info:message(ns_1@10.3.121.94) - Approaching full disk warning. Usage of disk "/" on node "10.3.121.94" is around 91%.
      2012-09-21 06:43:15.462 supervisor_cushion:1:warning:port exited too soon after restart(ns_1@10.3.121.94) - Service memcached exited on node 'ns_1@10.3.121.94' in 0.58s

      2012-09-21 06:44:30.881 ns_vbucket_mover:0:critical:message(ns_1@10.3.121.92) - <0.17933.33> exited with {exited,
      {'EXIT',<0.17965.33>,
      {replicator_died,
      {'EXIT',<19812.318.4>,downstream_closed}}}}
      2012-09-21 06:44:30.892 ns_memcached:2:info:message(ns_1@10.3.121.96) - Shutting down bucket "bucket-0" on 'ns_1@10.3.121.96' for deletion
      2012-09-21 06:44:31.274 ns_orchestrator:2:info:message(ns_1@10.3.121.92) - Rebalance exited with reason {exited,
      {'EXIT',<0.17965.33>,
      {replicator_died,
      {'EXIT',<19812.318.4>,downstream_closed}}}}

      Logs on VM 10.3.121.94 contain many retries to notify CouchDB of an update (vbid=294 rev=1):

      memcached<0.31491.3>: Fri Sep 21 06:12:14.929498 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
      memcached<0.31491.3>: Fri Sep 21 06:12:14.931793 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
      root@ubuntu1104-64:/opt/couchbase/var/lib/couchbase/logs# more info.5
      [ns_server:info,2012-09-21T6:06:34.551,ns_1@10.3.121.94:ns_port_memcached:ns_port_server:log:169]memcached<0.31491.3>: Fri Sep 21 06:06:34.350714 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
      memcached<0.31491.3>: Fri Sep 21 06:06:34.352976 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
      memcached<0.31491.3>: Fri Sep 21 06:06:34.355496 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
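
To gauge how widespread the retries are, the messages can be counted per vBucket and revision. The following is a minimal sketch, assuming the log text has already been read into a string; the function name `count_retries` and the `sample` variable are illustrative and not part of any Couchbase tool.

```python
import re
from collections import Counter

# Pattern for the memcached message quoted in this report.
RETRY_RE = re.compile(r"Retry notify CouchDB of update, vbid=(\d+) rev=(\d+)")

def count_retries(log_text):
    """Return a Counter mapping (vbid, rev) to the number of retry messages."""
    counts = Counter()
    for match in RETRY_RE.finditer(log_text):
        counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts

# Two lines copied from the log excerpt in this report.
sample = """\
memcached<0.31491.3>: Fri Sep 21 06:12:14.929498 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
memcached<0.31491.3>: Fri Sep 21 06:12:14.931793 PDT 3: Retry notify CouchDB of update, vbid=294 rev=1
"""

for (vbid, rev), n in count_retries(sample).most_common():
    print(f"vbid={vbid} rev={rev}: {n} retries")  # vbid=294 rev=1: 2 retries
```

Run against a full log file, a single (vbid, rev) pair dominating the counts would point at one stuck vBucket rather than a general notification problem.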

      Is this why the file for vBucket 294 has grown to roughly 9.3 GB? See 294.couch.1 in the listing below.

      root@ubuntu1104-64:/opt/couchbase/var/lib/couchbase/data/bucket-2# ls -la
      total 9337196
      drwxr-xr-x 2 couchbase couchbase 4096 2012-09-21 06:44 .
      drwxr-xr-x 4 couchbase couchbase 4096 2012-09-21 03:54 ..
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 0.couch.16
      -rw-r--r-- 1 couchbase couchbase 397403 2012-09-21 04:11 147.couch.14
      -rw-r--r-- 1 couchbase couchbase 438363 2012-09-21 06:44 148.couch.14
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 149.couch.14
      -rw-r--r-- 1 couchbase couchbase 417883 2012-09-21 06:44 150.couch.14
      -rw-r--r-- 1 couchbase couchbase 454747 2012-09-21 06:44 151.couch.13
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 152.couch.13
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 153.couch.13
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 06:44 154.couch.13
      -rw-r--r-- 1 couchbase couchbase 409691 2012-09-21 04:11 172.couch.12
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 173.couch.12
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 174.couch.12
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 175.couch.12
      -rw-r--r-- 1 couchbase couchbase 397403 2012-09-21 04:11 176.couch.12
      -rw-r--r-- 1 couchbase couchbase 438363 2012-09-21 04:11 177.couch.12
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 178.couch.11
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 179.couch.11
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 1.couch.16
      -rw-r--r-- 1 couchbase couchbase 1355867 2012-09-21 03:54 25.couch.1
      -rw-r--r-- 1 couchbase couchbase 9494917305 2012-09-21 06:43 294.couch.1
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 294.couch.11
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 295.couch.11
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 296.couch.11
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 297.couch.10
      -rw-r--r-- 1 couchbase couchbase 409691 2012-09-21 04:11 298.couch.10
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 299.couch.10
      -rw-r--r-- 1 couchbase couchbase 389211 2012-09-21 04:11 2.couch.16
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 300.couch.10
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 301.couch.9
      -rw-r--r-- 1 couchbase couchbase 594011 2012-09-21 03:54 33.couch.1
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 343.couch.9
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 344.couch.9
      -rw-r--r-- 1 couchbase couchbase 438363 2012-09-21 04:11 345.couch.8
      -rw-r--r-- 1 couchbase couchbase 454747 2012-09-21 04:11 346.couch.8
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 347.couch.8
      -rw-r--r-- 1 couchbase couchbase 401499 2012-09-21 04:11 348.couch.8
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 349.couch.7
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 350.couch.7
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 351.couch.7
      -rw-r--r-- 1 couchbase couchbase 409691 2012-09-21 04:11 352.couch.7
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 353.couch.7
      -rw-r--r-- 1 couchbase couchbase 401499 2012-09-21 04:11 354.couch.6
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 355.couch.6
      -rw-r--r-- 1 couchbase couchbase 397403 2012-09-21 04:11 356.couch.6
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 357.couch.5
      -rw-r--r-- 1 couchbase couchbase 438363 2012-09-21 04:11 358.couch.5
      -rw-r--r-- 1 couchbase couchbase 450651 2012-09-21 04:11 359.couch.5
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 360.couch.5
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 361.couch.5
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 362.couch.5
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 363.couch.4
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 364.couch.4
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 365.couch.4
      -rw-r--r-- 1 couchbase couchbase 450651 2012-09-21 04:11 366.couch.4
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 367.couch.4
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 392.couch.4
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 393.couch.4
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 394.couch.4
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 395.couch.4
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 396.couch.4
      -rw-r--r-- 1 couchbase couchbase 417883 2012-09-21 04:11 397.couch.4
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 398.couch.4
      -rw-r--r-- 1 couchbase couchbase 23777465 2012-09-21 06:44 399.couch.1
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 399.couch.4
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 3.couch.16
      -rw-r--r-- 1 couchbase couchbase 999515 2012-09-21 03:54 42.couch.1
      -rw-r--r-- 1 couchbase couchbase 450651 2012-09-21 04:11 4.couch.15
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 594.couch.3
      -rw-r--r-- 1 couchbase couchbase 450651 2012-09-21 04:11 595.couch.3
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 596.couch.3
      -rw-r--r-- 1 couchbase couchbase 401499 2012-09-21 04:11 597.couch.3
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 598.couch.3
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 599.couch.3
      -rw-r--r-- 1 couchbase couchbase 401499 2012-09-21 04:11 5.couch.15
      -rw-r--r-- 1 couchbase couchbase 454747 2012-09-21 04:11 600.couch.3
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 601.couch.2
      -rw-r--r-- 1 couchbase couchbase 409691 2012-09-21 04:11 634.couch.2
      -rw-r--r-- 1 couchbase couchbase 417883 2012-09-21 04:11 635.couch.2
      -rw-r--r-- 1 couchbase couchbase 405595 2012-09-21 04:11 636.couch.2
      -rw-r--r-- 1 couchbase couchbase 442459 2012-09-21 04:11 637.couch.2
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 638.couch.2
      -rw-r--r-- 1 couchbase couchbase 421979 2012-09-21 04:11 639.couch.2
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 640.couch.2
      -rw-r--r-- 1 couchbase couchbase 434267 2012-09-21 04:11 641.couch.2
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 642.couch.2
      -rw-r--r-- 1 couchbase couchbase 430171 2012-09-21 04:11 643.couch.2
      -rw-r--r-- 1 couchbase couchbase 434267 2012-09-21 04:11 644.couch.2
      -rw-r--r-- 1 couchbase couchbase 401499 2012-09-21 04:11 645.couch.2
      -rw-r--r-- 1 couchbase couchbase 438363 2012-09-21 04:11 646.couch.2
      -rw-r--r-- 1 couchbase couchbase 446555 2012-09-21 04:11 647.couch.2
      -rw-r--r-- 1 couchbase couchbase 413787 2012-09-21 04:11 648.couch.2
      -rw-r--r-- 1 couchbase couchbase 446555 2012-09-21 04:11 649.couch.2
      -rw-r--r-- 1 couchbase couchbase 417883 2012-09-21 04:11 6.couch.15
      -rw-r--r-- 1 couchbase couchbase 426075 2012-09-21 04:11 7.couch.15
      -rw-r--r-- 1 couchbase couchbase 417883 2012-09-21 04:11 8.couch.15
      -rw-r--r-- 1 couchbase couchbase 4130 2012-09-21 04:11 master.couch.16
      -rw-r--r-- 1 couchbase couchbase 18915 2012-09-21 04:29 stats.json
      -rw-r--r-- 1 couchbase couchbase 0 2012-09-21 06:44 stats.json.new
      -rw-r--r-- 1 couchbase couchbase 18915 2012-09-21 04:28 stats.json.old
      root@ubuntu1104-64:/opt/couchbase/var/lib/couchbase/data/bucket-2# ^C
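
In a listing this long, the suspicious entries are easier to spot mechanically. The sketch below assumes the file names follow the `<vbid>.couch.<revision>` pattern seen above and reports every vBucket that has more than one revision file on disk. The helper name `stale_revisions` is hypothetical, and `entries` is a small subset of (name, size) pairs copied from the listing.

```python
import re
from collections import defaultdict

# File names in the listing follow <vbid>.couch.<revision>.
COUCH_RE = re.compile(r"^(\d+)\.couch\.(\d+)$")

def stale_revisions(entries):
    """Given (name, size) pairs, return {vbid: [(rev, size), ...]} for every
    vBucket that has more than one revision file, sorted by revision."""
    by_vbid = defaultdict(list)
    for name, size in entries:
        m = COUCH_RE.match(name)
        if m:  # skips master.couch.16, stats.json, and so on
            by_vbid[int(m.group(1))].append((int(m.group(2)), size))
    return {vb: sorted(revs) for vb, revs in by_vbid.items() if len(revs) > 1}

# Subset of the directory listing in this report.
entries = [
    ("25.couch.1", 1355867),
    ("294.couch.1", 9494917305),
    ("294.couch.11", 430171),
    ("399.couch.1", 23777465),
    ("399.couch.4", 421979),
    ("master.couch.16", 4130),
]

for vbid, revs in sorted(stale_revisions(entries).items()):
    print(vbid, revs)
# 294 [(1, 9494917305), (11, 430171)]
# 399 [(1, 23777465), (4, 421979)]
```

On this subset both vBucket 294 and vBucket 399 show a revision 1 file sitting next to a higher revision, and in both cases the revision 1 file is the larger one.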

      1. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.92-diag.txt.gz
        6.74 MB
        Andrei Baranouski
      2. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.93-diag.txt.gz
        8.99 MB
        Andrei Baranouski
      3. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.95-diag.txt.gz
        7.90 MB
        Andrei Baranouski
      4. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.96-diag.txt.gz
        7.51 MB
        Andrei Baranouski
      5. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.97-diag.txt.gz
        7.69 MB
        Andrei Baranouski
      6. 3a0ad78b-cf67-4e70-8d24-31eba49e4a41-10.3.121.98-diag.txt.gz
        8.21 MB
        Andrei Baranouski
      7. logs_94_1.tar.gz
        15.14 MB
        Andrei Baranouski
      8. logs_94.tar.gz
        14.93 MB
        Andrei Baranouski

        Activity

        andreibaranouski Andrei Baranouski created issue -
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        This looks like the directory listing race that we fixed in the capi view manager, but seen from ep-engine's side.

        I.e. you can see that vbucket 294 is at file revision 11, while ep-engine is apparently trying to write to revision 1.

        My guess is that ep-engine did not see 294.couch.11 because of a race between the vbucket listing and the end of compaction, and started writing revision 1 on the assumption that no vbucket file existed. Perhaps something is then also broken in the retry logic.

        alkondratenko Aleksey Kondratenko (Inactive) made changes -
        Field Original Value New Value
        Assignee Aleksey Kondratenko [ alkondratenko ] Chiyoung Seo [ chiyoung ]
        farshid Farshid Ghods (Inactive) added a comment -

        I think this is due to persistence not working on the latest builds

        farshid Farshid Ghods (Inactive) made changes -
        Assignee Chiyoung Seo [ chiyoung ] Aleksey Kondratenko [ alkondratenko ]
        Component/s couchbase-bucket [ 10173 ]
        farshid Farshid Ghods (Inactive) added a comment -

        Assigning to the ep-engine team.

        farshid Farshid Ghods (Inactive) made changes -
        Assignee Aleksey Kondratenko [ alkondratenko ] Chiyoung Seo [ chiyoung ]
        andreibaranouski Andrei Baranouski added a comment -

        I set this bug to critical because many tests are failing and the issue has led to new bugs being opened:
        MB-6710 Deleting bucket on a cluster gives error "Some nodes are still deleting bucket"
        MB-6712 API pools/default/buckets/ doesn't return any buckets but attempt to create bucket gives: Bucket with given name still exists

        tests:
        1)
        http://qa.hq.northscale.net/job/centos-64-2.0-view-query-extended-tests/70/consoleFull

        [2012-09-22 17:39:44,219] - [rest_client:96] INFO - existing buckets : []
        [2012-09-22 17:39:44,226] - [rest_client:1234] INFO - http://10.2.2.60:8091/pools/default/buckets with param: proxyPort=11211&bucketType=membase&authType=sasl&replicaIndex=1&name=default&saslPassword=&replicaNumber=1&ramQuotaMB=1456
        [2012-09-22 17:39:44,236] - [rest_client:582] ERROR - http://10.2.2.60:8091/pools/default/buckets error 503 reason: unknown

        {"_":"Bucket with given name still exists"}

        [2012-09-22 17:39:44,246] - [bucket_helper:124] INFO - deleting existing buckets on [ip:10.2.2.60 port:8091 ssh_username:root, ip:10.2.2.108 port:8091 ssh_username:root, ip:10.2.2.63 port:8091 ssh_username:root, ip:10.2.2.64 port:8091 ssh_username:root, ip:10.2.2.65 port:8091 ssh_username:root]
        [2012-09-22 17:39:44,321] - [cluster_helper:199] INFO - rebalancing all nodes in order to remove nodes
        [2012-09-22 17:39:44,326] - [rest_client:826] INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.2.2.108&user=Administrator&knownNodes=ns_1%4010.2.2.60%2Cns_1%4010.2.2.108
        [2012-09-22 17:39:44,331] - [rest_client:833] INFO - rebalance operation started
        [2012-09-22 17:39:44,336] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:46,341] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:48,345] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:50,350] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:52,354] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:54,359] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:56,368] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:39:58,372] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:40:00,377] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:40:02,381] - [rest_client:929] INFO - rebalance percentage : 0 %
        [2012-09-22 17:40:04,386] - [rest_client:914] ERROR -

        {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'}

        - rebalance failed
        ERROR

        This shows that a user can trigger a rebalance (which will fail) while a 'deleted' bucket has not actually been deleted (separate bug?).

        2) http://qa.hq.northscale.net/job/centos-64-2.0-new-rebalance/77/consoleFull

        Rebalance hangs at the same progress value while disk usage keeps growing, so the rebalance will eventually fail for lack of space.

        andrei ~/repository/testrunner $ scripts/ssh.py -i andrei_rebalance.ini "ls -la /opt/couchbase/var/lib/couchbase/data/default"
        10.3.3.91
        total 29165028
        drwxr-xr-x 2 couchbase couchbase 4096 Sep 22 17:25 .
        drwxr-xr-x 4 couchbase couchbase 4096 Sep 22 16:55 ..
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:56 56.couch.2
        ...
        -rw-r--r-- 1 couchbase couchbase 29833224378 Sep 23 01:30 95.couch.1
        -rw-r--r-- 1 couchbase couchbase 77915 Sep 22 16:56 95.couch.2
        ....

        ls: /opt/couchbase/var/lib/couchbase/data/default: No such file or directory
        10.3.3.99
        total 32114992
        drwxr-xr-x 2 couchbase couchbase 4096 Sep 22 17:24 .
        drwxr-xr-x 4 couchbase couchbase 4096 Sep 22 16:54 ..
        -rw-r--r-- 1 couchbase couchbase 77915 Sep 22 16:57 103.couch.2
        ....
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:57 69.couch.2
        -rw-r--r-- 1 couchbase couchbase 32847953966 Sep 23 01:30 70.couch.1
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:57 70.couch.2
        .......

        10.3.3.82
        total 29765164
        drwxr-xr-x 2 couchbase couchbase 4096 Sep 22 17:25 .
        drwxr-xr-x 4 couchbase couchbase 4096 Sep 22 16:55 ..
        -rw-r--r-- 1 couchbase couchbase 77915 Sep 22 16:56 100.couch.2
        .....
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:56 94.couch.2
        -rw-r--r-- 1 couchbase couchbase 30447755449 Sep 23 01:30 95.couch.1
        -rw-r--r-- 1 couchbase couchbase 77915 Sep 22 16:56 95.couch.2
        ......

        10.3.3.93
        total 35250148
        drwxr-xr-x 2 couchbase couchbase 4096 Sep 22 16:55 .
        drwxr-xr-x 4 couchbase couchbase 4096 Sep 22 16:54 ..
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:55 0.couch.3
        .....
        -rw-r--r-- 1 couchbase couchbase 77915 Sep 22 16:55 5.couch.3
        -rw-r--r-- 1 couchbase couchbase 36052365358 Sep 23 01:30 50.couch.1
        -rw-r--r-- 1 couchbase couchbase 82011 Sep 22 16:54 50.couch.2
        ..

        andreibaranouski Andrei Baranouski made changes -
        Priority Major [ 3 ] Blocker [ 1 ]
        andreibaranouski Andrei Baranouski added a comment -

        Raising the priority further, to Blocker.

        chiyoung Chiyoung Seo added a comment -

        Jin,

        I think this is a regression from our recent changes that fixed the Windows issue.

        chiyoung Chiyoung Seo made changes -
        Assignee Chiyoung Seo [ chiyoung ] Jin Lim [ jin ]
        chiyoung Chiyoung Seo made changes -
        Sprint Status Current Sprint
        farshid Farshid Ghods (Inactive) made changes -
        Labels trunk-green-blockers
        farshid Farshid Ghods (Inactive) added a comment -

        http://review.couchbase.org/#/c/21056/ fix was merged
        farshid Farshid Ghods (Inactive) made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        thuan Thuan Nguyen added a comment -

        Integrated in github-ep-engine-2-0 #433 (See http://qa.hq.northscale.net/job/github-ep-engine-2-0/433/)
        MB-6711 Do not create new db file with old revision number (Revision cbc03de3b0cda2b7d7bb8bbddcdf161e4b6c0f84)

        Result = SUCCESS
        Jin Lim :
        Files :

        • src/couch-kvstore/couch-kvstore.cc
        chiyoung Chiyoung Seo made changes -
        Sprint Status Current Sprint
        farshid Farshid Ghods (Inactive) added a comment -

        Andrei
        Please close the issue if it's resolved in the latest builds

        andreibaranouski Andrei Baranouski added a comment -

        Yes, it was fixed in build 1757.

        andreibaranouski Andrei Baranouski made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        kzeller kzeller added a comment -

        RN: Replication had exited with a replicator_died message after
        multiple attempts to update. The problem was caused by using
        old revision numbers for new database files. New database
        files now use new revision numbers, which resolves the problem.
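
The rule in the release note can be illustrated with a short sketch. This is not the actual change in src/couch-kvstore/couch-kvstore.cc (which is C++); it only shows the idea under the assumption that files are named `<vbid>.couch.<revision>`. The function names `highest_revision` and `revision_for_new_file`, and the `cached_rev` parameter, are hypothetical.

```python
import re

def highest_revision(filenames, vbid):
    """Return the highest revision on disk for vbid, or 0 if there is none."""
    pattern = re.compile(rf"^{vbid}\.couch\.(\d+)$")
    revs = [int(m.group(1)) for m in map(pattern.match, filenames) if m]
    return max(revs, default=0)

def revision_for_new_file(filenames, vbid, cached_rev):
    """Pick a revision for a new file that is always above what is on disk,
    even if the caller's cached revision is out of date."""
    return max(cached_rev, highest_revision(filenames, vbid) + 1)

files = ["294.couch.11", "295.couch.11"]
print(revision_for_new_file(files, 294, cached_rev=1))  # 12
print(revision_for_new_file(files, 300, cached_rev=1))  # 1
```

With a stale cached revision of 1 and 294.couch.11 on disk, the sketch picks revision 12 instead of recreating 294.couch.1, which is the situation visible in the directory listing in this report.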


          People

          • Assignee:
            jin Jin Lim
            Reporter:
            andreibaranouski Andrei Baranouski