Couchbase Server / MB-7919

[windows] Rebalance exited with reason {bulk_set_vbucket_state_failed, {timeout, when data mutation is in progress

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Incomplete
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.1.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment:
      2.0.1-179-rel on EC2

      Description

      Rebalance exited with the reason below while data mutation was in progress during the rebalance:

      {{bulk_set_vbucket_state_failed,
      [{'ns_1@10.142.85.118',
      {'EXIT',
      {{{timeout,

      Test to repro:
      nohup ./testrunner -i win-aws.ini -t rebalance.rebalancein.RebalanceInTests.incremental_rebalance_in_with_ops,replicas=2,items=100000,doc_ops=update &

      The test failed at 2013-03-15 20:53:08; the info.18 log file on the orchestrator node has the logs. The following crash reports appear there.

      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_single_vbucket_mover:mover/6
      pid: <0.32365.663>
      registered_name: []
      exception exit: {{{timeout,
      {gen_server,call,
      [<21714.30742.56>,
      {start_vbucket_filter_change,

      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_single_vbucket_mover:mover/6
      pid: <0.32167.663>
      registered_name: []
      exception exit: {{{timeout,
      {gen_server,call,
      [<21714.30742.56>,
      {start_vbucket_filter_change,

      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_single_vbucket_mover:mover/6
      pid: <0.31877.663>
      registered_name: []
      exception exit: {unexpected_exit,
      {'EXIT',<0.48.664>,
      {{wait_checkpoint_persisted_failed,"default",853,4,
      [{'ns_1@10.142.85.118',
      {'EXIT',
      {{{timeout,
      {gen_server,call,
      [<21714.30742.56>,
      {start_vbucket_filter_change,

      Attaching the testrunner logs and collect_info.zip.

      I ran the same test again and it passed, so this is not consistently reproducible.
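
      For anyone triaging a log such as info.18, here is a minimal sketch (not part of ns_server or testrunner; the function names are hypothetical) that splits log text into crash-report blocks and tallies them by the first atom of the exit reason. It assumes the report layout matches the excerpts above, i.e. a `=...=CRASH REPORT=...=` header line followed by `initial call:`, `pid:` and `exception exit:` fields.

```python
import re
from collections import Counter

# Header line as it appears in the excerpts above; indentation is tolerated.
CRASH_HEADER = re.compile(r"^[ \t]*=+CRASH REPORT=+[ \t]*$", re.MULTILINE)


def split_crash_reports(log_text):
    """Return the body of each crash report found in log_text."""
    parts = CRASH_HEADER.split(log_text)
    # parts[0] is whatever preceded the first header, so it is dropped.
    return [part.strip() for part in parts[1:] if part.strip()]


def summarize(report):
    """Extract the initial call, the pid, and the first atom of the exit reason."""
    call = re.search(r"initial call:\s*(\S+)", report)
    pid = re.search(r"pid:\s*(<[\d.]+>)", report)
    # Skip the opening braces/brackets of the Erlang term, then take the atom.
    reason = re.search(r"exception exit:[\s{\[]*([a-z_]\w*)", report)
    return {
        "initial_call": call.group(1) if call else None,
        "pid": pid.group(1) if pid else None,
        "reason": reason.group(1) if reason else None,
    }


def tally_reasons(log_text):
    """Count crash reports per exit reason (None if no reason was parsed)."""
    return Counter(summarize(r)["reason"] for r in split_crash_reports(log_text))
```

      Running `tally_reasons(open("info.18").read())` on a failing and a passing run gives a quick way to compare how many mover crashes were timeouts; the file name here is taken from this ticket and the real path on the node may differ.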

      Attachments

      1. ebucketmigrator_srv.beam (39 kB, Aleksey Kondratenko)
      2. nohup.out.rebalance_timeout (1.25 MB, Deepkaran Salooja)
      3. ns_memcached.beam (50 kB, Aleksey Kondratenko)
      4. ns_replicas_builder.beam (9 kB, Aleksey Kondratenko)

        Activity

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        See above.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Meanwhile (for ticket watchers), the code for this potential fix can be found at https://github.com/alk/ns_server/tree/wip-windows-connections-fix

        deepkaran.salooja Deepkaran Salooja added a comment -

        Thanks, Alk. I replaced the 3 .beam files at the location below, as per the instructions (Couchbase service stopped and restarted):
        C:\Program Files\Couchbase\Server\lib\ns_server-2.0.1_macosx_228_gc20ba21\ebin

        Rerunning the tests. Will update when I see failures.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Closing this ticket as "cannot reproduce", because this ticket is about timeouts on Windows.

        I am aware that some recent comments on this ticket are actually related to a different bug (http://www.couchbase.com/issues/browse/MB-7902). I'll make sure useful information from this ticket is not lost.

        maria Maria McDuff (Inactive) added a comment -

        Tracking MB-7902


          People

          • Assignee: Aleksey Kondratenko (Inactive)
          • Reporter: Deepkaran Salooja
          • Votes: 0
          • Watchers: 6

              Gerrit Reviews

              There are no open Gerrit changes