  1. Couchbase Server
  2. MB-6562

rebalance constantly failed with pre_rebalance_config_synchronization_failed after warmup

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 2.0-beta
    • 2.0-beta
    • ns_server
    • Security Level: Public
    • None
    • CentOS 5.8

    Description

      build-1693
      4GB RAM, 4 core
      1:10.3.3.6
      2:10.3.3.7
      3:10.3.3.8
      4:10.3.3.9
      5:10.3.3.10
      6:10.3.3.15

      1 1:10.3.3.6, 3 buckets (default, sasl, standart), start data loading
      2 "+" 2:10.3.3.7
      3 "+" 10.3.3.8, 10.3.3.9; "-" 10.3.3.7
      4 failover 10.3.3.8, "+" 10.3.3.7
      5 add 3 views in each bucket (default: 2295598 docs, active resident ratio 24%; sasl: 1660742 docs, 68%; standart: 1658229 docs, 58%)
      6 reboot 10.3.3.9
      7 after #6 "+" 10.3.3.8 & 10.3.3.10
      10:50:17 - Fri Sep 7, 2012 Rebalance exited with reason {not_all_nodes_are_ready_yet,['ns_1@10.3.3.9']}

      - I guess this is expected, but it would be good to check the status of the nodes before rebalance and tell the user which nodes are not ready
      8 rebalance with the same set of nodes (did not wait for warmup to complete)
      9 wait until warmup completes, then rebalance
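
      The pre-rebalance check suggested in the note under step 7 could look roughly like the sketch below. It is only an illustration, not ns_server code: it assumes the cluster's /pools/default REST response has already been fetched and parsed into a dict, and that each entry under "nodes" carries "hostname" and "status" fields with "healthy" meaning ready. The function name and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def nodes_not_ready(pool_info: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (hostname, status) for every node whose status is not 'healthy'.

    Assumption: pool_info is shaped like a parsed /pools/default response,
    with a "nodes" list whose items have "hostname" and "status" keys.
    """
    not_ready = []
    for node in pool_info.get("nodes", []):
        status = node.get("status", "unknown")
        if status != "healthy":
            not_ready.append((node.get("hostname", "?"), status))
    return not_ready


# Hypothetical sample data, invented for illustration: 10.3.3.9 is shown
# as still warming up after its reboot in step 6.
sample = {
    "nodes": [
        {"hostname": "10.3.3.6:8091", "status": "healthy"},
        {"hostname": "10.3.3.9:8091", "status": "warmup"},
    ]
}

blocked = nodes_not_ready(sample)
if blocked:
    # Tell the user up front instead of starting a rebalance that will fail.
    print("Rebalance not started; nodes not ready:", blocked)
```

      A caller would run this check before requesting a rebalance and show the returned list to the user when it is not empty.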

      Result:

      [couchdb:info,2012-09-07T11:52:35.259,ns_1@10.3.3.6:<0.28603.100>:couch_log:info:39]Updater reading changes from active partitions to update main set view group `_design/dev_1` from set `default`
      [ns_server:debug,2012-09-07T11:52:35.322,ns_1@10.3.3.6:'couch_stats_reader-sasl':couch_stats_reader:vbuckets_aggregation_loop:117]Failed to open vbucket: 306 ({not_found,no_db_file}). Ignoring
      [couchdb:info,2012-09-07T11:52:35.323,ns_1@10.3.3.6:<0.27920.85>:couch_log:info:39]Stopping updater for set view `default`, main group `_design/dev_1` (doing initial index build)
      [ns_server:debug,2012-09-07T11:52:35.371,ns_1@10.3.3.6:'couch_stats_reader-sasl':couch_stats_reader:vbuckets_aggregation_loop:117]Failed to open vbucket: 307 ({not_found,no_db_file}). Ignoring
      [ns_server:error,2012-09-07T11:52:35.374,ns_1@10.3.3.6:<0.27765.100>:ns_config_rep:synchronize_remote:287]Failed to synchronize config to some nodes:
      ['ns_1@10.3.3.7']
      [user:info,2012-09-07T11:52:35.375,ns_1@10.3.3.6:<0.7585.8>:ns_orchestrator:handle_info:295]Rebalance exited with reason {pre_rebalance_config_synchronization_failed, ['ns_1@10.3.3.7']}

      [ns_server:debug,2012-09-07T11:52:35.403,ns_1@10.3.3.6:'couch_stats_reader-sasl':couch_stats_reader:vbuckets_aggregation_loop:117]Failed to open vbucket: 308 ({not_found,no_db_file}). Ignoring
      [couchdb:info,2012-09-07T11:52:35.409,ns_1@10.3.3.6:<0.27920.85>:couch_log:info:39]Set view `default`, main group `_design/dev_1`, partition states updated
      active partitions before: [512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853]
      active partitions after:



      [ns_server:debug,2012-09-07T11:52:35.538,ns_1@10.3.3.6:'couch_stats_reader-sasl':couch_stats_reader:vbuckets_aggregation_loop:117]Failed to open vbucket: 313 ({not_found,no_db_file}). Ignoring
      [ns_server:debug,2012-09-07T11:52:35.542,ns_1@10.3.3.6:'capi_set_view_manager-default':capi_set_view_manager:handle_info:330]doing replicate_newnodes_docs
      [ns_server:debug,2012-09-07T11:52:35.542,ns_1@10.3.3.6:ns_config_log:ns_config_log:log_common:111]config change:
      counters ->
      [{rebalance_fail,3},
       {rebalance_start,6},
       {rebalance_success,3},
       {failover_node,1}]
      [ns_server:debug,2012-09-07T11:52:35.542,ns_1@10.3.3.6:'capi_set_view_manager-sasl':capi_set_view_manager:handle_info:330]doing replicate_newnodes_docs
      [couchdb:info,2012-09-07T11:52:35.556,ns_1@10.3.3.6:<0.29402.85>:couch_log:info:39]Stopping updater for set view `standart`, main group `_design/dev_1` (doing initial index build)
      [ns_server:debug,2012-09-07T11:52:35.607,ns_1@10.3.3.6:ns_config_log:ns_config_log:log_common:111]config change:
      rebalancer_pid ->
      undefined
      [ns_server:debug,2012-09-07T11:52:35.615,ns_1@10.3.3.6:ns_config_log:ns_config_log:log_common:111]config change:
      rebalance_status ->
      {none,<<"Rebalance failed. See logs for detailed reason. You can try rebalance again.">>}

      Diags from all nodes and atop info are attached.
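
      When going through the attached diags, the rebalance failure reasons can be pulled out with a small script. The sketch below is only an illustration: it assumes the "Rebalance exited with reason" entries look like the ones quoted in this description (an Erlang tuple holding a reason atom and a list of quoted node names, possibly starting on a later line), and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Matches "Rebalance exited with reason {reason_atom, ['node', ...]}".
# \s* also spans newlines, so the tuple may start on a following line.
REASON_RE = re.compile(
    r"Rebalance exited with reason\s*\{(\w+),\s*\[([^\]]*)\]\}"
)


def rebalance_failures(log_text: str) -> List[Tuple[str, List[str]]]:
    """Return (reason, [node, ...]) for each rebalance failure in log_text."""
    failures = []
    for reason, nodes in REASON_RE.findall(log_text):
        # Node names are quoted Erlang atoms such as 'ns_1@10.3.3.7'.
        names = [n.strip().strip("'") for n in nodes.split(",") if n.strip()]
        failures.append((reason, names))
    return failures


# Excerpt copied from the log in this description.
excerpt = """[user:info,2012-09-07T11:52:35.375,ns_1@10.3.3.6:<0.7585.8>:ns_orchestrator:handle_info:295]Rebalance exited with reason

{pre_rebalance_config_synchronization_failed, ['ns_1@10.3.3.7']}
"""

for reason, nodes in rebalance_failures(excerpt):
    print(reason, nodes)
```

      Failure reasons in other shapes (for example a bare atom without a node list) would not match this pattern and would need their own handling.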

      Attachments

        1. 10.3.3.10-8091-diag.txt.gz
          546 kB
        2. 10.3.3.6-8091-diag.txt.gz
          7.24 MB
        3. 10.3.3.7-8091-diag.txt.gz
          9.17 MB
        4. 10.3.3.8-8091-diag.txt.gz
          16.51 MB
        5. 10.3.3.9-8091-diag.txt.gz
          8.12 MB
        6. atop_info.txt
          14 kB
          People

            farshid Farshid Ghods (Inactive)
            andreibaranouski Andrei Baranouski