Couchbase Server / MB-44516

Rebalance failed due to a mover crash during setup_replication.


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Affects Version: 7.0.0
    • Release: Cheshire-Cat
    • Component: couchbase-bucket
    • Build: 7.0.0-4509

    Description

      1. Create a 2 node cluster
      2. Create required buckets and collections: num_buckets=10, num_scopes/bucket=5, num_collections/scope=5
      3. Create 100000000 items sequentially
      4. Rebalance in with Loading of docs
      5. Rebalance reached >20.0% in 123.189999819 seconds. Abort and Restart.
      6. Rebalance reached >40.4007759236% in 92.4159998894 seconds. Abort and Restart.
      7. Rebalance reached >60.0% in 285.265000105 seconds. Abort and Restart.
      8. Rebalance reached >80.0586510638% in 641.375999928 seconds. Abort and Restart.
      9. Rebalance completed with progress: 100% in 2480.62899995 sec
      10. Sleep 61 seconds. Reason: Iteration:0 waiting to kill memc on all nodes
      11. Rebalance Out with Loading of docs
      12. Rebalance reached >20.0% in 1275.79499984 seconds. Abort and Restart.
      13. Rebalance reached >40.0% in 878.126000166 seconds. Abort and Restart.
      14. Rebalance reached >60.0% in 632.766000032 seconds. Abort and Restart.
      15. Rebalance reached >80.0% in 1003.15499997 seconds. Abort and Restart.
      16. Rebalance completed with progress: 100% in 779.811000109 sec
      17. Sleep 111 seconds. Reason: Iteration:0 waiting to kill memc on all nodes
      18. Rebalance In_Out with Loading of docs
      19. Rebalance reached >20.0% in 3003.11400008 seconds. Abort and Restart.
      20. Rebalance reached >40.0% in 2779.15100002 seconds. Abort and Restart.
      21. Rebalance reached >60.0% in 793.217999935 seconds. Abort and Restart.
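
      The abort-and-restart pattern driven by the steps above can be sketched roughly as follows. This is a minimal Python illustration, not the actual test driver (which lives in the volumetests.Magma suite); the callback names are hypothetical:

      ```python
      import time

      def run_rebalance_with_restarts(start_rebalance, get_progress,
                                      stop_rebalance, thresholds=(20, 40, 60, 80)):
          """Abort the rebalance once each progress threshold is crossed,
          restart it, and let the final pass run to 100%."""
          timings = []
          for threshold in thresholds:
              start = time.time()
              start_rebalance()
              while get_progress() < threshold:
                  time.sleep(0)  # poll; a real driver would sleep between checks
              stop_rebalance()  # abort, then restart on the next iteration
              timings.append((threshold, time.time() - start))
          # final pass: run to completion
          start = time.time()
          start_rebalance()
          while get_progress() < 100:
              time.sleep(0)
          timings.append((100, time.time() - start))
          return timings
      ```

      Each abort forces ns_server to re-plan vBucket moves on the next start, which is what repeatedly exercises the setup_replication path that crashed here.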

      The rebalance failed after the restart in this last step.

      Rebalance Failed

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.18933.114>,
      {{{{{child_interrupted,
      {'EXIT',<16395.11880.74>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"},
      {line,265}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"},
      {line,121}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,661}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,690}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,249}]}]},
      {gen_server,call,
      [<16395.11885.74>,
      {setup_replication,
      [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
      16,17,18,19,20,21,22,23,24,25,26,27,28,
      29,30,31,32,33,34,35,36,37,38,39,40,41,
      42,43,44,45,46,47,48,49,50,51,52,53,54,
      55,56,57,58,59,60,61,62,63,64,65,66,67,
      68,69,70,71,72,73,74,75,76,77,78,79,80,
      81,82,83,84,85,86,87,88,89,90,91,92,93,
      94,95,96,97,98,99,100,101,102,103,104,
      105,106,107,108,109,110,111,112,113,
      114,115,116,117,118,119,120,121,122,
      123,124,125,126,127,128,129,130,131,
      132,341,683,684,685,686,687,688,689,
      690,691,692,693,694,695,696,697,698,
      699,700,701,702,703,704,705,706,707,
      708,709,710,711,712,713,714,715,716,
      717,718,719,720,721,722,723,724,725,
      726,727,728,729,730,731,732,733,734,
      735,736,737,738,739,740,741,742,743,
      744,745,746,747,748,749,750,751,752,
      753,754,755,756,757,758,759,760,761,
      762,763,764,765,766,767,768,769,770,
      771,772,773,774,775,776,777,778,779,
      780,781,782,783,784,785,786,787,788,
      789,790,791,792,793,794,795,796,797,
      798,799,800,801,802,803,804,805,806,
      807,808,809,810,811,812,813,814,815,
      816,817,818,819,820,821,822,823,824,
      825,826,827,828,829,830,831,832,833,
      834,835,836,837,838,839,840,841,842,
      843,844,845,846,847,848,849,850,851,
      852,854,855,856,857,858,859,860,861,
      862,863,864,865,866,867,868,869,870,
      871,872,873,874,875,876,877,878,879,
      880,881,882,883,884,885,886,887,888,
      889,890,891,892,893,894,895,896,897,
      898,899,900,901,902,903,904,905,906,
      907,908,909,910,911,912,913,914,915,
      916,917,918,919,920,921,922,923,924,
      925,926,927,928,929,930,931,932,933,
      934,935,936,937,938,939,940,941,942,
      943,944,945,946,947,948,949,950,951,
      952,953,954,955,956,957,958,959,960,
      961,962,963,964,965,966,967,968,969,
      970,971,972,973,974,975,976,977,978,
      979,980,981,982,983,984,985,986,987,
      988,989,990,991,992,993,994,995,996,
      997,998,999,1000,1001,1002,1003,1004,
      1005,1006,1007,1008,1009,1010,1011,
      1012,1013,1014,1015,1016,1017,1018,
      1019,1020,1021,1022,1023]},
      infinity]}},
      {gen_server,call,
      ['replication_manager-GleamBookUsers6',
      {change_vbucket_replication,133,undefined},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-GleamBookUsers6',
      'ns_1@172.23.97.38'},
      {if_rebalance,<0.2575.111>,
      {update_vbucket_state,639,active,paused,
      undefined,
      [['ns_1@172.23.97.38',
      'ns_1@172.23.97.37'],
      ['ns_1@172.23.97.39',
      'ns_1@172.23.97.37']]}},
      infinity]}}}}}.
      Rebalance Operation Id = 82785d3beda4c65cc2689a157e38e786
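
      Reading the trace bottom-up: the vbucket mover asked janitor_agent to update the state of vBucket 639, which synchronously called replication_manager (change_vbucket_replication for vBucket 133), which called the dcp_replicator gen_server (setup_replication for the listed vBuckets). Inside dcp_replicator:spawn_and_wait/1, the linked DCP connection process exited with socket_closed, so the exit propagated back up the whole chain of infinity-timeout gen_server:call/3s and crashed the mover. A Python analogy of that propagation (hypothetical names mirroring the modules in the trace, not real Couchbase APIs):

      ```python
      # Each layer makes a blocking call into the next; when the innermost
      # DCP connection dies (socket_closed), the error propagates all the
      # way up and crashes the vbucket mover.

      class SocketClosed(Exception):
          pass

      def spawn_and_wait():                 # dcp_replicator:spawn_and_wait/1
          # the linked child (DCP connection) exits while we wait on it
          raise SocketClosed("child_interrupted: socket_closed")

      def setup_replication(vbuckets):      # dcp_replicator gen_server call
          spawn_and_wait()

      def change_vbucket_replication(vb):   # replication_manager gen_server call
          setup_replication([vb])

      def update_vbucket_state(vb):         # janitor_agent, invoked by the mover
          change_vbucket_replication(vb)

      try:
          update_vbucket_state(639)         # vBucket being moved in the trace
          crash_reason = None
      except SocketClosed as exc:
          crash_reason = f"mover_crashed: {exc}"
      ```

      In Erlang there is no exception handler at the top of this chain, so the mover process itself terminates and the rebalance exits with mover_crashed.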
      

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P "args=-i /tmp/magma_temp_job1.ini -p bucket_storage=couchstore,bucket_eviction_policy=fullEviction -t volumetests.Magma.volume.SystemTestMagma,nodes_init=2,replicas=1,skip_cleanup=True,num_items=100000000,doc_size=256,bucket_type=membase,compression_mode=off,iterations=10,crashes=0,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=info,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,maxttl=1200,num_buckets=10,num_scopes=5,num_collections=5,doc_ops=create:update:delete:expiry,durability=Majority,sdk_client_pool=True,pc=1 -m rest"
       
      Cluster:
      [_1]
      ip:172.23.97.37
      services:kv
       
      [_2]
      ip:172.23.97.38
       
      [_3]
      ip:172.23.97.39
       
      [_4]
      ip:172.23.97.40
      


            People

              Assignee: Ritesh Agarwal
              Reporter: Ritesh Agarwal
