Couchbase Server / MB-40480

Non-complete, unpersisted, "deleted" prepare can be removed from HashTable by the persistence of previous abort


    Details

      Description

      Build: 6.6.0-7880-enterprise

      Scenario:

      • 4 node cluster, Couchbase bucket (replica=2)
      • Rebalance out 1 node from the cluster
      • Initiate transaction in parallel to rebalance_out operation

        +----------------+-----------------+--------------+
        | Nodes          | Services        | Status       |
        +----------------+-----------------+--------------+
        | 172.23.107.52  | index, kv, n1ql | Cluster node |
        | 172.23.123.101 | kv              | --- OUT ---> |
        | 172.23.123.102 | kv              | Cluster node |
        | 172.23.123.100 | kv              | Cluster node |
        +----------------+-----------------+--------------+

      Observation:

      Rebalance failure followed by a memcached crash on the master node (172.23.107.52):

      Service 'memcached' exited with status 134. Restarting. Messages:
      2020-07-14T23:30:54.406403-07:00 CRITICAL /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f2e4bcfd000+0x8f213]
      2020-07-14T23:30:54.406414-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0xccc10]
      2020-07-14T23:30:54.406426-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0xc805a]
      2020-07-14T23:30:54.406434-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0xca463]
      2020-07-14T23:30:54.406441-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0x18f5a0]
      2020-07-14T23:30:54.406447-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0xcf98d]
      2020-07-14T23:30:54.406454-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0x12b864]
      2020-07-14T23:30:54.406459-07:00 CRITICAL /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7f2e4ddac000+0x8f17]
      2020-07-14T23:30:54.406467-07:00 CRITICAL /lib64/libpthread.so.0() [0x7f2e4b5c8000+0x7dd5]
      2020-07-14T23:30:54.406499-07:00 CRITICAL /lib64/libc.so.6(clone+0x6d) [0x7f2e4b1fb000+0xfdead]
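
      Note on reading the crash output above: by the common shell convention an exit status above 128 means "killed by signal (status - 128)", so 134 corresponds to signal 6, which is SIGABRT on Linux. That is consistent with an abort() after an unhandled C++ exception (the top frame is in libstdc++), although the log itself does not state the cause. The frames are unsymbolised `[base+offset]` pairs. The sketch below is a hypothetical helper, not part of any Couchbase tooling: it decodes the status and pulls out the library path and offset of each frame so they can be handed to a symbolizer.

```python
import re
import signal

# Exit status reported for memcached in the log above.
EXIT_STATUS = 134

# Two frames copied verbatim from the backtrace above.
FRAMES = """\
2020-07-14T23:30:54.406414-07:00 CRITICAL /opt/couchbase/bin/../lib/../lib/ep.so() [0x7f2e466a5000+0xccc10]
2020-07-14T23:30:54.406459-07:00 CRITICAL /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7f2e4ddac000+0x8f17]
"""

# library path, optional symbol in parentheses, then [load base + offset]
FRAME_RE = re.compile(
    r"CRITICAL\s+(?P<lib>\S+?)\((?P<sym>[^)]*)\)\s+"
    r"\[(?P<base>0x[0-9a-f]+)\+(?P<off>0x[0-9a-f]+)\]"
)


def decode_exit_status(status):
    """Return the signal name for a shell-style 128+N exit status, else None."""
    if status > 128:
        return signal.Signals(status - 128).name
    return None


def parse_frames(text):
    """Yield (library path, offset inside that library) for each frame line."""
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            yield match.group("lib"), int(match.group("off"), 16)


if __name__ == "__main__":
    print(EXIT_STATUS, "->", decode_exit_status(EXIT_STATUS))
    for lib, offset in parse_frames(FRAMES):
        print(f"{lib} +{offset:#x}")
```

      With debug symbols for the matching build, each offset could then be passed to a tool such as `addr2line -e <library> <offset>`. Whether the offsets resolve depends on having the exact 6.6.0-7880 binaries.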
       
      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.6670.0>,
      {{{{{child_interrupted,
      {'EXIT',<17502.2478.0>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"}, {line,266}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"}, {line,121}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,247}]}]},
      {gen_server,call,
      [<17502.2476.0>,get_partitions,infinity]}},
      {gen_server,call,
      ['dcp_replication_manager-default',
      {get_replicator_pid,543}, infinity]}},
      {gen_server,call,
      [{'janitor_agent-default',
      'ns_1@172.23.123.102'},
      {if_rebalance,<0.3620.0>,
      {update_vbucket_state,979,active,paused, undefined,
      [['ns_1@172.23.123.102', 'ns_1@172.23.123.101', 'ns_1@172.23.107.52']]}},
      infinity]}}}}}.
      Rebalance Operation Id = 322d92a2335598e144eb0bb97f14f1a3 
       
      Worker <0.6325.0> (for action {move,{979,
      ['ns_1@172.23.123.102',
      'ns_1@172.23.123.101',
      'ns_1@172.23.107.52'],
      ['ns_1@172.23.107.52',
      'ns_1@172.23.123.100',
      'ns_1@172.23.123.102'],
      []}}) exited with reason {unexpected_exit,
      {'EXIT', <0.6670.0>,
      {{{{{child_interrupted,
      {'EXIT', <17502.2478.0>, socket_closed}},
      [{dcp_replicator, spawn_and_wait, 1,
      [{file, "src/dcp_replicator.erl"}, {line, 266}]},
      {dcp_replicator, handle_call, 3,
      [{file, "src/dcp_replicator.erl"}, {line, 121}]},
      {gen_server, try_handle_call, 4,
      [{file, "gen_server.erl"}, {line, 636}]},
      {gen_server, handle_msg, 6,
      [{file, "gen_server.erl"}, {line, 665}]},
      {proc_lib, init_p_do_apply, 3,
      [{file, "proc_lib.erl"}, {line, 247}]}]},
      {gen_server, call,
      [<17502.2476.0>,
      get_partitions, infinity]}},
      {gen_server, call,
      ['dcp_replication_manager-default',
      {get_replicator_pid, 543}, infinity]}},
      {gen_server, call,
      [{'janitor_agent-default',
      'ns_1@172.23.123.102'},
      {if_rebalance, <0.3620.0>,
      {update_vbucket_state,
      979, active, paused, undefined,
      [['ns_1@172.23.123.102',
      'ns_1@172.23.123.101',
      'ns_1@172.23.107.52']]}},
      infinity]}}}}
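
      To make the failing move easier to read: the worker was moving vBucket 979 from the first node list to the second. The sketch below compares the two lists as they appear in the `move` action above. It assumes the first entry of each list is the active copy and the rest are replicas; `describe_move` is a hypothetical helper written for this note.

```python
# Replication chains for vBucket 979, copied from the "move" action above.
OLD_CHAIN = ["ns_1@172.23.123.102", "ns_1@172.23.123.101", "ns_1@172.23.107.52"]
NEW_CHAIN = ["ns_1@172.23.107.52", "ns_1@172.23.123.100", "ns_1@172.23.123.102"]


def describe_move(old, new):
    """Summarise a chain change; index 0 is assumed to be the active copy."""
    return {
        "old_active": old[0],
        "new_active": new[0],
        "removed": [node for node in old if node not in new],
        "added": [node for node in new if node not in old],
    }


if __name__ == "__main__":
    for field, value in describe_move(OLD_CHAIN, NEW_CHAIN).items():
        print(f"{field}: {value}")
```

      Under that assumption, the move drops 172.23.123.101 (the node being rebalanced out), adds 172.23.123.100, and makes 172.23.107.52 the new active, which is the node where memcached crashed.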

      Test to run:

      Atomicity.doc_isolation.IsolationDocTest.test_transaction_with_rebalance,nodes_init=4,replicas=2,num_items=20000,rebalance_type=out,nodes_out=1,doc_op=create,durability=PERSIST_TO_MAJORITY,services_init=kv;n1ql;index,rerun=False
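
      For readers unfamiliar with the test line format: it is a test path followed by comma-separated `key=value` parameters, with `;` separating the services in `services_init`. The sketch below splits the line into a name and a parameter dictionary. It is an illustration only, not the test framework's own parser, and `parse_test_line` is a hypothetical name.

```python
# Test line copied from the report above.
TEST_LINE = (
    "Atomicity.doc_isolation.IsolationDocTest.test_transaction_with_rebalance,"
    "nodes_init=4,replicas=2,num_items=20000,rebalance_type=out,nodes_out=1,"
    "doc_op=create,durability=PERSIST_TO_MAJORITY,services_init=kv;n1ql;index,"
    "rerun=False"
)


def parse_test_line(line):
    """Split a test line into (test path, {parameter: value}); values stay strings."""
    test_name, *pairs = line.split(",")
    params = dict(pair.split("=", 1) for pair in pairs)
    return test_name, params


if __name__ == "__main__":
    name, params = parse_test_line(TEST_LINE)
    print(name)
    for key, value in params.items():
        print(f"  {key} = {value}")
```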
      


          Activity

          ashwin.govindarajulu Ashwin Govindarajulu created issue -
          ritam.sharma Ritam Sharma made changes -
          Field Original Value New Value
          Description: appended the master node address ("- 172.23.107.52") to the Observation line; both values otherwise repeat the Build, Scenario and Observation text of the Description above.
          owend Daniel Owen made changes -
          Assignee: Daniel Owen [ owend ] -> Ben Huddleston [ ben.huddleston ]
          owend Daniel Owen made changes -
          Due Date 20/Jul/20
          ashwin.govindarajulu Ashwin Govindarajulu made changes -
          Description: added the "Test to run" section; both values otherwise repeat the Build, Scenario and Observation text of the Description above.
          owend Daniel Owen made changes -
          Summary: [Doc isolation] Seeing rebalance failure with reason "mover crashed" followed by memcached crash -> [Doc isolation] failed as no HashTable item found with key:<ud>cid:0x0:test_docs-00020907</ud> prepare_seqno:29, commit_seqno: 30
          owend Daniel Owen made changes -
          Summary: [Doc isolation] failed as no HashTable item found with key:<ud>cid:0x0:test_docs-00020907</ud> prepare_seqno:29, commit_seqno: 30 -> [Doc isolation] failed as no HashTable item found with key:.... prepare_seqno:29, commit_seqno: 30
          owend Daniel Owen made changes -
          Due Date: 20/Jul/20 -> 24/Jul/20
          lynn.straus Lynn Straus made changes -
          Labels: 6.6.0 Transactions functional-test -> 6.6.0 Transactions approved-for-6.6.0 functional-test
          lynn.straus Lynn Straus made changes -
          Link This issue blocks MB-38724 [ MB-38724 ]
          ben.huddleston Ben Huddleston made changes -
          Affects Version/s 6.5.0 [ 15037 ]
          Affects Version/s 6.5.1 [ 16622 ]
          ben.huddleston Ben Huddleston made changes -
          Summary: [Doc isolation] failed as no HashTable item found with key:.... prepare_seqno:29, commit_seqno: 30 -> Non-complete, unpersisted, "deleted" prepare can be removed from HashTable by the persistence of previous abort
          drigby Dave Rigby made changes -
          Assignee: Ben Huddleston [ ben.huddleston ] -> Ashwin Govindarajulu [ ashwin.govindarajulu ]
          Resolution: Fixed [ 1 ]
          Status: Open [ 1 ] -> Resolved [ 5 ]
          ashwin.govindarajulu Ashwin Govindarajulu made changes -
          VERIFICATION STEPS: Ran the same test a few times and did not hit this issue.
          Job: http://qa.sc.couchbase.com/job/oel6-4node-rebalance_in_jython/1128/console
          Status: Resolved [ 5 ] -> Closed [ 6 ]
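
          The final summary (a non-complete, unpersisted, "deleted" prepare removed from the HashTable when a previous abort is persisted) and the earlier one ("no HashTable item found ... prepare_seqno:29, commit_seqno: 30") can be pictured with a toy model. Everything in the sketch below is an assumption made for illustration: the class and method names, the single-slot-per-key table, the abort seqno of 28, and the seqno guard are hypothetical and are not taken from the Couchbase source or from the actual fix.

```python
from dataclasses import dataclass


@dataclass
class StoredValue:
    key: str
    seqno: int
    kind: str       # "abort", "prepare" or "committed" in this toy model
    deleted: bool


class ToyHashTable:
    """Hypothetical in-memory table with one slot per key."""

    def __init__(self):
        self.items = {}

    def store(self, value):
        self.items[value.key] = value

    def on_persisted(self, key, persisted_seqno, check_seqno):
        """Toy persistence callback for a deleted item (here, the abort)."""
        value = self.items.get(key)
        if value is None or not value.deleted:
            return
        if check_seqno and value.seqno != persisted_seqno:
            return  # a newer item occupies the slot; leave it alone
        del self.items[key]

    def commit(self, key, prepare_seqno):
        value = self.items.get(key)
        if value is None or value.kind != "prepare" or value.seqno != prepare_seqno:
            raise LookupError(
                f"no HashTable item found, prepare_seqno:{prepare_seqno}")
        value.kind = "committed"
        return value


def run(check_seqno):
    table = ToyHashTable()
    # Assumed seqno 28: an abort still waiting to be persisted.
    table.store(StoredValue("k", 28, "abort", deleted=True))
    # A new "deleted" prepare for the same key, not yet complete or persisted.
    table.store(StoredValue("k", 29, "prepare", deleted=True))
    # Persistence of the earlier abort completes.
    table.on_persisted("k", 28, check_seqno)
    try:
        table.commit("k", 29)
        return "committed"
    except LookupError as err:
        return f"failed: {err}"


if __name__ == "__main__":
    print("without seqno guard:", run(check_seqno=False))
    print("with seqno guard:   ", run(check_seqno=True))
```

          In this model the unguarded callback removes the newer prepare, so the later commit cannot find it. The guarded variant is only one way such a race could be avoided in the toy; the ticket does not describe how the real fix works.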

            People

            Assignee:
            ashwin.govindarajulu Ashwin Govindarajulu
            Reporter:
            ashwin.govindarajulu Ashwin Govindarajulu