Couchbase Server / MB-5343

memcached dropped connections: Rebalance failed due to replicator_died: exited (ns_single_vbucket_mover)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.8.1-release-candidate
    • Fix Version/s: 2.0
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Environment:
      Ubuntu 64-bit
      181-831-rel

      Description

      Failing test:
      rebalancetests.RebalanceInOutWithParallelLoad.test_load,get-logs:True,replica:2,num_nodes:7

      [ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15105.17>:ns_replicas_builder:kill_a_bunch_of_tap_names:209] Killed the following tap names on 'ns_1@10.1.3.109': [<<"replication_building_62_'ns_1@10.1.3.112'">>]
      [ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_single_vbucket_mover:mover_inner:88] Got exit message (parent is <0.13796.17>). Exiting...
      {'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}
      [ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:sync_shutdown_many:147] Shutdown of the following failed: [{<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}]
      [error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
      =========================CRASH REPORT=========================
      crasher:
      initial call: erlang:apply/2
      pid: <0.15105.17>
      registered_name: []
      exception exit: {replicator_died,{'EXIT',<16541.21868.10>,normal}}
      in function ns_replicas_builder:'-build_replicas_main/6-fun-0-'/1
      in call from ns_replicas_builder:observe_wait_all_done_tail/5
      in call from ns_replicas_builder:observe_wait_all_done/5
      in call from ns_replicas_builder:'-build_replicas_main/6-fun-1-'/8
      in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
      in call from ns_replicas_builder:build_replicas_main/6
      ancestors: [<0.15104.17>,<0.13796.17>,<0.13763.17>]
      messages: [{'EXIT',<16541.21868.10>,normal}]
      links: [<0.15104.17>]
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 121393
      stack_size: 24
      reductions: 12423
      neighbours:

      [ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:try_with_maybe_ignorant_after:68] Eating exception from ignorant after-block:
      {error,{badmatch,[{<0.15105.17>,
                         {replicator_died,{'EXIT',<16541.21868.10>,normal}}}]},
             [{ns_replicas_builder,sync_shutdown_many,1},
              {ns_replicas_builder,try_with_maybe_ignorant_after,2},
              {ns_single_vbucket_mover,mover,6},
              {proc_lib,init_p_do_apply,3}]}
      [rebalance:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.13796.17>:ns_vbucket_mover:handle_info:158] <0.15104.17> exited with {exited,{'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}}
      [ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.9348.1>:ns_port_server:log:161] memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Schedule the backfill for vbucket 61
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "pending"
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 61
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Backfill is completed with VBuckets 61,
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "close_backfill" and vbucket 61
      memcached<0.9348.1>: Vbucket <61> is going dead.
      memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "active"
      memcached<0.9348.1>: TAP takeover is completed. Disconnecting tap stream <eq_tapq:rebalance_61>
      memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Schedule the backfill for vbucket 62
      memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
      memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
      memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 62

      [error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_single_vbucket_mover:mover/6
      pid: <0.15104.17>
      registered_name: []
      exception exit: {exited,{'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}}
      in function ns_single_vbucket_mover:mover_inner/6
      in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
      in call from ns_single_vbucket_mover:mover/6
      ancestors: [<0.13796.17>,<0.13763.17>]
      messages: []
      links: [<0.13796.17>]
      dictionary: [{cleanup_list,[<0.15105.17>]}]
      trap_exit: true
      status: running
      heap_size: 987
      stack_size: 24
      reductions: 4014
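
      What the two crash reports above add up to: a replicator process on the remote node exited with reason normal (consistent with the dropped memcached connections in the summary), ns_replicas_builder escalated that 'EXIT' to replicator_died, and the exit propagated through ns_single_vbucket_mover, failing the rebalance. Below is a minimal sketch of that exit-trapping pattern; it is illustrative only, not the actual ns_replicas_builder source, and the module, function, and message names are assumptions:

      %% Illustrative sketch only -- not the real ns_replicas_builder code.
      %% The builder links to its replicator processes and traps exits
      %% ('trap_exit: true' in the crash report), so any 'EXIT' message
      %% from a replicator -- even with reason 'normal' -- is escalated.
      -module(replicas_builder_sketch).
      -export([observe_replicators/1]).

      %% Pids are linked replicator processes; the caller is assumed to
      %% have called process_flag(trap_exit, true) already.
      observe_replicators([]) ->
          ok;                                  % every replicator finished cleanly
      observe_replicators(Pids) ->
          receive
              {'EXIT', Pid, Reason} ->
                  case lists:member(Pid, Pids) of
                      true ->
                          %% A replicator must not exit while replicas are
                          %% still being built, so even Reason =:= normal is
                          %% fatal; this is the {replicator_died, ...} exit
                          %% that ns_single_vbucket_mover then reports as
                          %% {exited, {'EXIT', Pid, ...}}.
                          exit({replicator_died, {'EXIT', Pid, Reason}});
                      false ->
                          observe_replicators(Pids)
                  end;
              {replicator_done, Pid} ->        % hypothetical completion signal
                  observe_replicators(lists:delete(Pid, Pids))
          end.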

      Attachments

      1. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.121-diag.gz
        4.30 MB
        Karan Kumar
      2. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.160-diag.gz
        2.63 MB
        Karan Kumar
      3. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.109-diag.gz
        7.07 MB
        Karan Kumar
      4. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.111-diag.gz
        3.69 MB
        Karan Kumar
      5. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.112-diag.gz
        4.41 MB
        Karan Kumar
      6. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.119-diag.gz
        3.93 MB
        Karan Kumar
      7. f9bf72eb-4450-44a3-ae0e-f6328be40592-10.1.3.120-diag.gz
        2.55 MB
        Karan Kumar
      8. 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.85-diag.gz
        17.31 MB
        Karan Kumar
      9. 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.82-diag.gz
        17.11 MB
        Karan Kumar
      10. 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.83-diag.gz
        14.09 MB
        Karan Kumar
      11. 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.84-diag.gz
        18.24 MB
        Karan Kumar
      12. 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.86-diag.gz
        19.52 MB
        Karan Kumar
      13. 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.111-diag.gz
        2.46 MB
        Karan Kumar
      14. 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.120-diag.gz
        884 kB
        Karan Kumar
      15. 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.119-diag.gz
        1.21 MB
        Karan Kumar
      16. 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.112-diag.gz
        6.90 MB
        Karan Kumar
      17. 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.121-diag.gz
        1.64 MB
        Karan Kumar
      18. 10.5.2.13-8091-diag.txt.gz
        2.10 MB
        Andrei Baranouski
      19. 10.5.2.14-8091-diag.txt.gz
        950 kB
        Andrei Baranouski
      20. 10.5.2.15-8091-diag.txt.gz
        955 kB
        Andrei Baranouski
      21. 10.5.2.16-8091-diag.txt.gz
        954 kB
        Andrei Baranouski
      22. 10.5.2.18-8091-diag.txt.gz
        977 kB
        Andrei Baranouski
      23. 10.5.2.19-8091-diag.txt.gz
        1.73 MB
        Andrei Baranouski

        Activity

        Ketaki Gangal added a comment -

        jira-error: Unable to delete the above comment. Please ignore.
        peter added a comment -

        Let's see if we catch it in the 2.0 regression runs.
        Andrei Baranouski added a comment -

        Got the same error on build-1554 for the test:
        ./testrunner -i andrei.ini -t view.createdeleteview.CreateDeleteViewTests.rebalance_in_and_out_with_ddoc_ops,ddoc_ops=delete,test_with_view=True,num_ddocs=2,num_views_per_ddoc=3,items=2000

        [error_logger:error] [2012-08-08 12:54:03] [ns_1@10.5.2.11:error_logger:ale_error_logger_handler:log_report:72]
        =========================CRASH REPORT=========================
        crasher:
        initial call: ns_single_vbucket_mover:mover/6
        pid: <0.17517.2>
        registered_name: []
        exception error: {bulk_set_vbucket_state_failed,
                          [{'ns_1@10.5.2.19',
                            {'EXIT',
                             {{{normal,
                                {gen_server,call,
                                 [<20615.11732.0>,
                                  {set_state,[],"k",
                                   [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]},
                                  infinity]}},
                               {gen_server,call,

        Attached logs.
        Andrei Baranouski added a comment -

        It seems like it's another issue; I filed a separate bug: http://www.couchbase.com/issues/browse/MB-6160. Please ignore the previous comment.
        Aleksey Kondratenko (Inactive) added a comment -

        Hm, less than a year has passed, but I'm failing to locate the relevant commits.

          People

          • Assignee: Frank Weigel (Inactive)
          • Reporter: Karan Kumar (Inactive)
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
            • Updated:
            • Resolved:

              Gerrit Reviews

              There are no open Gerrit changes