Couchbase Server / MB-40371

[Doc_Isolation] xattr::utils::check_len(2617569397) exceeds 286

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Description

       

      Build: 6.6.0-7861

      Scenario:

      • 4 node cluster, Couchbase bucket (replicas=2)
      • Start two transactions in parallel (one succeeds and the other rolls back)
      • Rebalance out 1 node (172.23.105.206) from the cluster:

        +----------------+----------------------+--------------+
        | Nodes          | Services             | Status       |
        +----------------+----------------------+--------------+
        | 172.23.105.205 | kv                   | Cluster node |
        | 172.23.105.155 | fts, index, kv, n1ql | Cluster node |
        | 172.23.105.206 | kv                   | --- OUT ---> |
        | 172.23.105.159 | kv                   | Cluster node |
        +----------------+----------------------+--------------+
        

      Observation:

      During the rebalance-out, the rebalance operation fails with reason mover_crashed (child_interrupted, socket_closed).

      Failure logs:

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.15736.1>,
      {{{{{child_interrupted,
      {'EXIT',<0.3124.0>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"}, {line,266}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"}, {line,121}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,247}]}]},
      {gen_server,call, [<0.3123.0>,get_partitions,infinity]}},
      {gen_server,call, ['dcp_replication_manager-default',
      {get_replicator_pid,418}, infinity]}},
      {gen_server,call,
      [{'janitor_agent-default', 'ns_1@172.23.105.155'},
      {if_rebalance,<0.3785.0>,
      {update_vbucket_state,779,active,
      undefined,undefined,undefined}}, infinity]}}}}}.
      Rebalance Operation Id = 03fc94d6190f5b191aa426459e553a9c
       
      Worker <0.14842.1> (for action {move,{779,
      ['ns_1@172.23.105.206',
      'ns_1@172.23.105.155',
      'ns_1@172.23.105.159'],
      ['ns_1@172.23.105.155',
      'ns_1@172.23.105.159',
      'ns_1@172.23.105.205'],
      []}}) exited with reason {unexpected_exit,
      {'EXIT', <0.15736.1>,
      {{{{{child_interrupted,
      {'EXIT', <0.3124.0>,
      socket_closed}},
      [{dcp_replicator, spawn_and_wait, 1,
      [{file, "src/dcp_replicator.erl"}, {line, 266}]},
      {dcp_replicator, handle_call, 3,
      [{file, "src/dcp_replicator.erl"}, {line, 121}]},
      {gen_server, try_handle_call, 4,
      [{file, "gen_server.erl"}, {line, 636}]},
      {gen_server, handle_msg, 6,
      [{file, "gen_server.erl"}, {line, 665}]},
      {proc_lib, init_p_do_apply, 3,
      [{file, "proc_lib.erl"}, {line, 247}]}]},
      {gen_server, call, [<0.3123.0>,
      get_partitions, infinity]}},
      {gen_server, call,
      ['dcp_replication_manager-default',
      {get_replicator_pid, 418}, infinity]}},
      {gen_server, call,
      [{'janitor_agent-default',
      'ns_1@172.23.105.155'},
      {if_rebalance, <0.3785.0>,
      {update_vbucket_state, 779, active, undefined, undefined, undefined}},
      infinity]}}}} 
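For reading the worker entry above, here is a minimal sketch that compares the two node lists in the move action for vBucket 779. It assumes the usual chain convention (first entry is the active node, the rest are replicas); the helper name `summarize_move` is hypothetical and not part of ns_server or testrunner.

```python
# Chains copied from the "Worker <0.14842.1>" log entry for vBucket 779.
old_chain = ["ns_1@172.23.105.206", "ns_1@172.23.105.155", "ns_1@172.23.105.159"]
new_chain = ["ns_1@172.23.105.155", "ns_1@172.23.105.159", "ns_1@172.23.105.205"]


def summarize_move(old, new):
    """Describe a vBucket move.

    Assumption: chain[0] is the active node and the remaining entries
    are replicas, in order.
    """
    return {
        "old_active": old[0],
        "new_active": new[0],
        "nodes_leaving": sorted(set(old) - set(new)),
        "nodes_joining": sorted(set(new) - set(old)),
    }


summary = summarize_move(old_chain, new_chain)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under that assumption, the new active node is 172.23.105.155, which matches the node targeted by the failing `update_vbucket_state,779,active` call in the trace, and the only node leaving the chain is 172.23.105.206, the node being rebalanced out.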

      cbcollect logs:

       
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T182212-ns_1%40172.23.105.155.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T182212-ns_1%40172.23.105.159.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T182212-ns_1%40172.23.105.205.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/rebalance_failure/collectinfo-2020-07-09T182212-ns_1%40172.23.105.206.zip

      Testcase:

      ./testrunner -i /tmp/5-centos-nodes-jython.ini rerun=False,get-cbcollect-info=False -t Atomicity.doc_isolation.IsolationDocTest.test_transaction_with_rebalance,nodes_init=4,replicas=2,rebalance_type=out,nodes_out=1,doc_op=create,GROUP=P1
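To make the test parameters easier to compare with the scenario, the `-t` argument can be split into the test path and its key=value settings. This is only an illustrative sketch of the string layout visible in the command above; `parse_test_spec` is a hypothetical helper and not how testrunner itself parses the argument.

```python
# The -t argument copied from the testrunner command above.
spec = (
    "Atomicity.doc_isolation.IsolationDocTest.test_transaction_with_rebalance,"
    "nodes_init=4,replicas=2,rebalance_type=out,nodes_out=1,doc_op=create,GROUP=P1"
)


def parse_test_spec(text):
    """Split a comma-separated test spec into (test_path, params).

    Assumption: the first field is the test path and every remaining
    field has the form key=value. Values are kept as strings.
    """
    test_path, *pairs = text.split(",")
    params = dict(pair.split("=", 1) for pair in pairs)
    return test_path, params


test_path, params = parse_test_spec(spec)
print(test_path)
for key, value in params.items():
    print(f"  {key} = {value}")
```

The parsed values line up with the scenario: `nodes_init=4` and `replicas=2` describe the starting cluster, and `rebalance_type=out` with `nodes_out=1` describes the single-node rebalance-out.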
      

            People

              ashwin.govindarajulu Ashwin Govindarajulu