Couchbase Server / MB-40370

[Doc_Isolation]: xattr::utils::check_len(2634346613) exceeds 287

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Description

       

      Build: 6.6.0-7861

      Scenario:

      1. Single node cluster, Couchbase bucket (replica=0)
      2. Start two parallel transactions (one succeeds, the other rolls back)
      3. Rebalance 3 nodes into the cluster while the transactions are running

      Observation:

      Rebalance fails with reason "mover crashed - bulk_set_vbucket_state_failed".

      cbcollect logs:
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T180702-ns_1%40172.23.105.155.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T180702-ns_1%40172.23.105.159.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T180702-ns_1%40172.23.105.205.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T180702-ns_1%40172.23.105.206.zip
      Rebalance failure prints:

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.23194.0>,
      {{bulk_set_vbucket_state_failed,
      [{'ns_1@172.23.105.206',
      {'EXIT',
      {{{{{badmatch, [{<17504.5562.0>,
      {done,exit, {socket_closed,
      {gen_server,call,
      [<17504.5215.0>,
      {maybe_close_stream,905}, infinity]}},
      [{gen_server,call,3,
      [{file,"gen_server.erl"}, {line,214}]},
      {dcp_replicator, '-handle_call/3-fun-1-',2, 
      [{file, "src/dcp_replicator.erl"}, {line,128}]},
      {dcp_replicator, '-spawn_and_wait/1-fun-0-',1,
      [{file, "src/dcp_replicator.erl"}, {line,243}]}]}}]},
      [{misc, sync_shutdown_many_i_am_trapping_exits, 1,
      [{file,"src/misc.erl"}, {line,1374}]},
      {dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"}, {line,265}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"}, {line,127}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"}, {line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"}, {line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"}, {line,247}]}]},
      {gen_server,call,
      [<17504.5214.0>,get_partitions, infinity]}},
      {gen_server,call,
      ['dcp_replication_manager-default',
      {get_replicator_pid,903},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-default',
      'ns_1@172.23.105.206'},
      {if_rebalance,<0.3370.0>,
      {update_vbucket_state,900,pending,
      passive,'ns_1@172.23.105.155'}},
      infinity]}}}}]},
      [{janitor_agent,bulk_set_vbucket_state,4,
      [{file,"src/janitor_agent.erl"}, {line,403}]},
      {proc_lib,init_p,3,
      [{file,"proc_lib.erl"},{line,232}]}]}}}}.
      Rebalance Operation Id = 7a1aaa2b7b2c47644b91cb745f74d3be
       
      Worker <0.16520.0> (for action 
      {move,{433,
      ['ns_1@172.23.105.155'],
      ['ns_1@172.23.105.159'],
      []}}) 
      exited with reason {unexpected_exit,
      {'EXIT', <0.17484.0>,
      {{{{{badmatch, [{<25299.4924.0>,
      {done, exit,
      {socket_closed,
      {gen_server, call,
      [<25299.4462.0>, {maybe_close_stream, 433}, infinity]}},
      [{gen_server, call, 3,
      [{file, "gen_server.erl"}, {line, 214}]},
      {dcp_replicator, '-handle_call/3-fun-1-', 2,
      [{file, "src/dcp_replicator.erl"}, {line, 128}]},
      {dcp_replicator, '-spawn_and_wait/1-fun-0-', 1,
      [{file, "src/dcp_replicator.erl"}, {line, 243}]}]}}]},
      [{misc, sync_shutdown_many_i_am_trapping_exits, 1,
      [{file, "src/misc.erl"}, {line, 1374}]},
      {dcp_replicator, spawn_and_wait, 1,
      [{file, "src/dcp_replicator.erl"}, {line, 265}]},
      {dcp_replicator, handle_call, 3,
      [{file, "src/dcp_replicator.erl"}, {line, 127}]},
      {gen_server, try_handle_call, 4,
      [{file, "gen_server.erl"}, {line, 636}]},
      {gen_server, handle_msg, 6,
      [{file, "gen_server.erl"}, {line, 665}]},
      {proc_lib, init_p_do_apply, 3,
      [{file, "proc_lib.erl"}, {line, 247}]}]},
      {gen_server, call,
      [<25299.4461.0>,
      get_partitions, infinity]}},
      {gen_server, call,
      ['dcp_replication_manager-default',
      {get_replicator_pid, 431}, infinity]}},
      {gen_server, call,
      [{'janitor_agent-default', 'ns_1@172.23.105.159'},
      {if_rebalance, <0.4111.0>,
      {dcp_takeover, 'ns_1@172.23.105.155', 433}}, infinity]}}}} 

      Testcase:

      ./testrunner -i /tmp/5-centos-nodes-jython.ini rerun=False,get-cbcollect-info=False -t Atomicity.doc_isolation.IsolationDocTest.test_transaction_with_rebalance,nodes_init=1,replicas=0,rebalance_type=in,nodes_in=3,doc_op=create,GROUP=P1
      

       

       

      Attachments

        1. __dcp_prepare.png
          355 kB
          Paolo Cocchi
        2. __subdoc.png
          298 kB
          Paolo Cocchi


          Activity

            Ashwin Govindarajulu (ashwin.govindarajulu) added a comment:

            Not seeing this issue on latest MH build.

            Verified using 6.6.0-7880. Closing this ticket.

            Couchbase Build Team (build-team) added a comment:

            Build couchbase-server-7.0.0-2647 contains kv_engine commit 5c64d40 with commit message:
            MB-40370: Remove unused code in xattr/utils.cc

            Couchbase Build Team (build-team) added a comment:

            Build couchbase-server-7.0.0-2647 contains kv_engine commit e31f273 with commit message:
            MB-40370: Make cb::xattr::get_body_size resilient to compressed payloads
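
            Note on the error in the ticket title: the value 2634346613 is 0x9D04F075. One plausible reading, in line with the commit message "Make cb::xattr::get_body_size resilient to compressed payloads", is that the first four bytes of a compressed value were interpreted as an xattr length prefix. The sketch below only illustrates that reading. The 4-byte big-endian length prefix, the helper name read_xattr_len, and the zero padding are assumptions made for illustration and are not taken from the kv_engine source.

```python
import struct


def read_xattr_len(blob: bytes) -> int:
    """Read an assumed 4-byte big-endian length prefix and bounds-check it.

    Hypothetical helper for illustration only; not the kv_engine API.
    """
    if len(blob) < 4:
        raise ValueError("blob too short for a length prefix")
    (length,) = struct.unpack(">I", blob[:4])
    # A declared length larger than the buffer cannot be valid.
    if length > len(blob):
        raise ValueError(f"check_len({length}) exceeds {len(blob)}")
    return length


# Well-formed case: the prefix declares 5 bytes of xattr data.
good = struct.pack(">I", 5) + b"hello" + b"body"
assert read_xattr_len(good) == 5

# Misread case: a 287-byte buffer whose first four bytes are
# 0x9D 0x04 0xF0 0x75, i.e. not a length prefix at all.
# The remaining 283 bytes are zero padding (an assumption).
bad = bytes([0x9D, 0x04, 0xF0, 0x75]) + bytes(283)

try:
    read_xattr_len(bad)
except ValueError as err:
    print(err)  # check_len(2634346613) exceeds 287
```

            Under these assumptions, a reader that first checks whether the value is compressed, and decompresses it before looking for a length prefix, would not trip the bounds check.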

            Mihir Kamdar (mihir.kamdar, Inactive) added a comment (edited):

            Reopening and resolving the issue again to update the Fix Version (to include 7.0 as well).

            Ashwin Govindarajulu, please verify the fix on 7.0 as well.

            Ashwin Govindarajulu (ashwin.govindarajulu) added a comment:

            Works fine on latest builds. Verified using 6.6.0-7883.

            Closing this ticket.

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              ashwin.govindarajulu Ashwin Govindarajulu