  1. Couchbase Server
  2. MB-15972

Rebalance fails for swap-in while a client makes View requests


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 3.1.1
    • Affects Version/s: 3.0.3, 3.1.0
    • Component/s: view-engine
    • Security Level: Public
    • None
    • Untriaged
    • Ubuntu 64-bit
    • Unknown

    Description

      When you try to add two additional nodes to a two-node cluster while a client is making view requests, the rebalance fails repeatedly. In version 3.0.3 the client making the view request receives an HTTP 404 with "The remote server returned an error: (404) Not Found."; in version 3.1.0 the client receives an HTTP 503 with "The request was aborted: The operation has timed out."

      The server logs for 3.0.3 are below (I truncated the output; the "missing" and "undefined" entries go on for some time):

      Rebalance exited with reason {unexpected_exit,
      {'EXIT',<0.7543.4>,
      {{{{badmatch,{error,{missing_partition,922}}},
      [{capi_set_view_manager,handle_call,3,
      [{file,"src/capi_set_view_manager.erl"},
      {line,222}]},
      {gen_server,handle_msg,5,
      [{file,"gen_server.erl"},{line,585}]},
      {gen_server,init_it,6,
      [{file,"gen_server.erl"},{line,304}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,239}]}]},
      {gen_server,call,
      ['capi_set_view_manager-default',
      {set_vbucket_states,
      [missing,missing,missing,missing,missing,
      missing,missing,missing,missing,missing,
      missing,missing,missing,missing,missing,
       
      ...
       
      undefined,undefined,undefined,undefined,
      undefined,undefined,undefined,undefined,
      undefined,undefined,undefined]},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-default','ns_1@192.168.106.103'},
      {if_rebalance,<0.6559.4>,{wait_index_updated,921}},
      infinity]}}}}
      

      The server logs for 3.1.0 are:

      Rebalance exited with reason {unexpected_exit,
      {'EXIT',<0.16846.8>,
      {{{{badmatch,{error,{missing_partition,441}}},
      [{capi_set_view_manager,handle_call,3,
      [{file,"src/capi_set_view_manager.erl"},
      {line,222}]},
      {gen_server,handle_msg,5,
      [{file,"gen_server.erl"},{line,585}]},
      {gen_server,init_it,6,
      [{file,"gen_server.erl"},{line,304}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,239}]}]},
      {gen_server,call,
      ['capi_set_view_manager-default',
      {wait_index_updated,441},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-default',
      'ns_1@192.168.107.104'},
      {if_rebalance,<0.4363.8>,
      {wait_index_updated,441}},
      infinity]}}}}
       
      and 
       
      <0.16697.8> exited with {unexpected_exit,
      {'EXIT',<0.16846.8>,
      {{{{badmatch,{error,{missing_partition,441}}},
      [{capi_set_view_manager,handle_call,3,
      [{file,"src/capi_set_view_manager.erl"},
      {line,222}]},
      {gen_server,handle_msg,5,
      [{file,"gen_server.erl"},{line,585}]},
      {gen_server,init_it,6,
      [{file,"gen_server.erl"},{line,304}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,239}]}]},
      {gen_server,call,
      ['capi_set_view_manager-default',
      {wait_index_updated,441},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-default','ns_1@192.168.107.104'},
      {if_rebalance,<0.4363.8>,
      {wait_index_updated,441}},
      infinity]}}}
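When comparing the traces, the two useful details are the partition reported as missing and the partition whose index update the janitor agent was waiting on. A minimal sketch for pulling both out of an exit reason follows. The class and method names (RebalanceLogProbe, FirstPartition) are hypothetical and not part of Couchbase, and the input string is a shortened excerpt of the 3.0.3 trace quoted above.

```csharp
using System;
using System.Text.RegularExpressions;

static class RebalanceLogProbe
{
    // Patterns for the two tuples of interest in the Erlang exit reason.
    public static readonly Regex Missing = new Regex(@"\{missing_partition,(\d+)\}");
    public static readonly Regex Waiting = new Regex(@"\{wait_index_updated,(\d+)\}");

    // Returns the first captured partition number, or null when the tuple is absent.
    public static int? FirstPartition(Regex pattern, string log)
    {
        Match m = pattern.Match(log);
        return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
    }

    static void Main()
    {
        // Shortened excerpt of the 3.0.3 exit reason, joined onto one line.
        string log = "{{{{badmatch,{error,{missing_partition,922}}}, ... " +
                     "{if_rebalance,<0.6559.4>,{wait_index_updated,921}},infinity]}}}}";

        Console.WriteLine("missing partition: " + FirstPartition(Missing, log));
        Console.WriteLine("waiting on index:  " + FirstPartition(Waiting, log));
    }
}
```

In the 3.0.3 trace the two values differ (partition 922 is missing while the wait is on 921); in the 3.1.0 traces both are 441.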
      
      

      Steps to reproduce
      1 - Provision four nodes using https://github.com/couchbaselabs/vagrants.
      2 - Create a cluster with two of the nodes using all default settings.
      3 - In the default bucket, add a view called "test_view" in a design doc called "test"; use the default view created.
      4 - Add a single document called "doc-1" to the bucket using the management console; use the default doc created.
      5 - Create a client application that calls the view defined in step 3 in a loop.
      6 - Add the two additional nodes to the cluster and click Rebalance. After a few minutes the rebalance will fail.

      Example client

      using System;
      using System.Collections.Generic;
      using System.Threading;
      using Couchbase;
      using Couchbase.Configuration.Client;

      namespace ConsoleApplication1
      {
          class Program
          {
              static void Main(string[] args)
              {
                  // Bootstrap from one of the two original cluster nodes.
                  var config = new ClientConfiguration
                  {
                      Servers = new List<Uri>
                      {
                          new Uri("http://192.168.107.101:8091")
                      }
                  };
                  using (var cluster = new Cluster(config))
                  {
                      // OpenBucket() with no arguments opens the "default" bucket.
                      using (var bucket = cluster.OpenBucket())
                      {
                          // Query the view every 500 ms while the rebalance runs.
                          for (int i = 0; i < 10000000; i++)
                          {
                              var request = bucket.CreateQuery("test", "test_view");
                              var results = bucket.Query<dynamic>(request);
                              // Print the HTTP status and, on failure, the error message.
                              Console.WriteLine(results.StatusCode);
                              if (results.Exception != null)
                              {
                                  Console.WriteLine(results.Exception.Message);
                              }
                              Thread.Sleep(500);
                          }

                          // Keep the console window open once the loop ends.
                          Console.Read();
                      }
                  }
              }
          }
      }
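To check whether the 404 and 503 responses come from the server or from the SDK's own handling, it may help to poll the view without the SDK. The sketch below is only an illustration. It assumes the view REST endpoint is reachable on port 8092 at /default/_design/test/_view/test_view and that the default bucket has no password. The class and method names (ViewRestProbe, ViewUri) are hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Threading;

static class ViewRestProbe
{
    // Builds the view URL. Port 8092 and the path layout are assumptions
    // about a default installation, not taken from the report.
    public static Uri ViewUri(string host, string bucket, string designDoc, string view)
    {
        return new Uri(string.Format("http://{0}:8092/{1}/_design/{2}/_view/{3}",
            host, bucket, designDoc, view));
    }

    static void Main()
    {
        Uri uri = ViewUri("192.168.107.101", "default", "test", "test_view");
        using (var http = new HttpClient())
        {
            for (int i = 0; i < 1000; i++)
            {
                try
                {
                    // Blocking on .Result keeps the loop simple for a console probe.
                    using (HttpResponseMessage response = http.GetAsync(uri).Result)
                    {
                        Console.WriteLine((int)response.StatusCode);
                    }
                }
                catch (AggregateException e)
                {
                    // Connection failures and timeouts surface wrapped in AggregateException.
                    Console.WriteLine(e.InnerException != null ? e.InnerException.Message : e.Message);
                }
                Thread.Sleep(500);
            }
        }
    }
}
```

Running this alongside the SDK client during the rebalance gives the raw status codes the node returns, which can then be compared with what the SDK reports.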
      
      


            People

              ericcooper Eric Cooper (Inactive)
              jmorris Jeff Morris