MB-6385 (Couchbase Server)

Rebalance 5->4 nodes fails with reason not_all_nodes_are_ready_yet

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0-beta
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment:
      CentOS, 64-bit, 4-core VMs, build #1620

      Description

      1. Rebalance in 1->5 nodes.
      2. Load data (1M); no views or ddocs are created. Each data doc looks like: { "age": some_integer, "first_name": some_string }
      3. Start rebalance out.
      4. Create 2 ddocs, 3 views per ddoc, with map function 'function (doc) { emit(null, doc); }':
      ddoc_40->views0, views1,
      ddoc_41->views0, views1,
      ddoc_42->views0, views1
      5. Rebalance fails.
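
The design documents from step 4 can be sketched as JSON bodies. This is a minimal sketch, assuming the usual design-document layout (`views` -> view name -> `map`). The helper name `build_ddoc` is hypothetical. The ddoc names, view names, and map function are taken from the description above.

```python
import json

# Map function quoted in step 4 of the description.
MAP_FUNCTION = "function (doc) { emit(null, doc); }"


def build_ddoc(view_names):
    """Return a design-document body with one view per name (hypothetical helper)."""
    return {"views": {name: {"map": MAP_FUNCTION} for name in view_names}}


# ddoc and view names exactly as listed in the description.
ddocs = {
    "_design/" + ddoc: build_ddoc(["views0", "views1"])
    for ddoc in ("ddoc_40", "ddoc_41", "ddoc_42")
}

if __name__ == "__main__":
    print(json.dumps(ddocs, indent=2))
```

How these bodies are uploaded to the cluster is not described in this ticket, so the sketch stops at building them.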

      2012-08-22 18:47:09.143 menelaus_sup:1:info:web start ok(ns_1@10.3.3.73) - Couchbase Server has started on web port 8091 on node 'ns_1@10.3.3.73'.
      2012-08-22 18:47:09.143 ns_node_disco:4:info:node up(ns_1@10.3.3.73) - Node 'ns_1@10.3.3.73' saw that node 'ns_1@10.3.3.71' came up.
      2012-08-22 18:47:09.214 ns_cluster:3:info:message(ns_1@10.3.3.73) - Node ns_1@10.3.3.73 joined cluster
      2012-08-22 18:47:09.249 ns_orchestrator:4:info:message(ns_1@10.3.3.58) - Starting rebalance, KeepNodes = ['ns_1@10.3.3.73','ns_1@10.3.3.68','ns_1@10.3.3.58','ns_1@10.3.3.71'], EjectNodes = ['ns_1@10.3.3.64']

      2012-08-22 18:47:09.331 ns_rebalancer:0:info:message(ns_1@10.3.3.58) - Started rebalancing bucket default
      2012-08-22 18:47:09.591 ns_memcached:2:info:message(ns_1@10.3.3.73) - Shutting down bucket "default" on 'ns_1@10.3.3.73' for server shutdown
      2012-08-22 18:48:09.080 ns_memcached:2:info:message(ns_1@10.3.3.73) - Shutting down bucket "default" on 'ns_1@10.3.3.73' for server shutdown (repeated 1544 times)
      2012-08-22 18:48:09.105 ns_memcached:2:info:message(ns_1@10.3.3.73) - Shutting down bucket "default" on 'ns_1@10.3.3.73' for server shutdown
      2012-08-22 18:48:09.418 ns_orchestrator:2:info:message(ns_1@10.3.3.58) - Rebalance exited with reason {not_all_nodes_are_ready_yet,['ns_1@10.3.3.73']}
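
When triaging logs like the ones above, it can help to pull the not-ready nodes out of the exit reason automatically. The following is a minimal sketch, assuming the reason is printed as the Erlang tuple shown in this ticket. The function name `not_ready_nodes` is hypothetical and is not part of ns_server.

```python
import re

# Matches the Erlang-style reason tuple printed by ns_orchestrator in the log above.
REASON_RE = re.compile(r"\{not_all_nodes_are_ready_yet,\s*\[(?P<nodes>[^\]]*)\]\}")


def not_ready_nodes(log_text):
    """Return the node names listed in a not_all_nodes_are_ready_yet reason."""
    match = REASON_RE.search(log_text)
    if match is None:
        return []
    # Node names are single-quoted Erlang atoms inside the list.
    return re.findall(r"'([^']+)'", match.group("nodes"))


sample = "Rebalance exited with reason {not_all_nodes_are_ready_yet,['ns_1@10.3.3.73']}"
print(not_ready_nodes(sample))
```

For the log in this ticket the only node reported is 'ns_1@10.3.3.73', the same node that logs the repeated bucket shutdown messages.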

      Attachments

      1. 10.3.3.58-8091-diag.txt.gz (15.72 MB, Iryna)
      2. 10.3.3.64-8091-diag.txt.gz (6.19 MB, Iryna)
      3. 10.3.3.68-8091-diag.txt.gz (14.42 MB, Iryna)
      4. 10.3.3.71-8091-diag.txt.gz (5.83 MB, Iryna)
      5. 10.3.3.73-8091-diag.txt.gz (3.59 MB, Iryna)

        Activity

        FilipeManana Filipe Manana (Inactive) added a comment -

        Yep, if the master database is deleted and the couch_set_view process is still alive, it will delete the index files.

        What I see in the log of .73 is that the crash happens at line 194890:

        [couchdb:info,2012-08-22T17:46:08.160,ns_1@10.3.3.73:<0.4315.4>:couch_log:error:42]Error opening database set `default`: {db_open_error,<<"default/154">>,{not_found,no_db_file},<<"Couldn't open database `default/154`, reason: {not_found,no_db_file}
        ">>}

        Before the crash several databases are deleted, but the master database is deleted only after the crash, at line 198096:

        [couchdb:info,2012-08-22T17:46:19.251,ns_1@10.3.3.73:couch_server:couch_log:info:39]Deleting database default/master
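
The ordering argument above (crash first, master database deleted afterwards) can be checked mechanically against a diag log. This is a minimal sketch, assuming the bracketed `[couchdb:level,timestamp,...]` prefix seen in the two quoted lines. The helper `first_timestamp` is hypothetical, and the sample lines are abbreviated copies of the ones quoted in this comment.

```python
import re
from datetime import datetime

# Timestamp as it appears inside the bracketed couchdb log prefix.
TS_RE = re.compile(r"\[couchdb:\w+,(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}),")


def first_timestamp(lines, needle):
    """Return the timestamp of the first line containing `needle`, or None."""
    for line in lines:
        if needle in line:
            match = TS_RE.search(line)
            if match:
                return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
    return None


log_lines = [
    "[couchdb:info,2012-08-22T17:46:08.160,ns_1@10.3.3.73:<0.4315.4>:couch_log:error:42]"
    "Error opening database set `default`: {db_open_error,<<\"default/154\">>,",
    "[couchdb:info,2012-08-22T17:46:19.251,ns_1@10.3.3.73:couch_server:couch_log:info:39]"
    "Deleting database default/master",
]

crash = first_timestamp(log_lines, "Error opening database set")
master_deleted = first_timestamp(log_lines, "Deleting database default/master")
print(crash < master_deleted)
```

On the two quoted lines the crash timestamp (17:46:08.160) precedes the master database deletion (17:46:19.251), matching the observation above.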

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        We think we have merged a fix.

        thuan Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #453 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/453/)
        MB-6385: delete remains of index files on bucket deletion (Revision aff9d82954b7accf1a8c4151889d29356ae2718b)

        Result = SUCCESS
        Aliaksey Kandratsenka :
        Files :

        • src/ns_storage_conf.erl
        ketaki Ketaki Gangal added a comment -

        Seeing this rebalance failure on build 1661 (Aug 30) when rebalancing nodes.

        iryna iryna added a comment -

        verified


          People

          • Assignee: iryna
          • Reporter: iryna
          • Votes: 0
          • Watchers: 3
