  1. Couchbase Server
  2. MB-5546

Increasing the default timeouts on ns_server to avoid rebalance failures caused by ep-engine stats timeouts in large clusters or in clusters where some nodes are actively using swap

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.8.1-release-candidate
    • Fix Version/s: 1.8.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
      None
    • Environment:
      Windows small/large cluster
      Linux small/large cluster

      Bucket 1, default
      vbuckets 1024
      RAM 18.7G
      Nodes 4 ( 2 form the base-cluster)
      Items Setup for 20M items

      Description

      Related issues:
      MB-5360
      MB-5352

      We have multiple bugs related to the timeouts we are hitting in ns_server:
      1) When nodes are in swap
      2) On Windows, even on a small cluster.

      This bug is to recommend increasing the default timeouts.

      We used the following timeouts (in milliseconds) for most of the params. It's not an all-in-one solution, but it should hopefully cover the basic scenarios.
      ns_memcached_outer, 60000
      ns_memcached_open_checkpoint, 60000
      ns_memcached_outer_heavy, 60000
      ns_memcached_outer_very_heavy, 120000
      ns_memcached_connected, 10000
      ebucketmigrator_connect, 60000
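
For quick reference, the values above can be tabulated and converted to seconds with a small script. This is only an illustrative sketch: the dict and the helper names (`PROPOSED_TIMEOUTS_MS`, `to_seconds`, `format_timeouts`) are hypothetical and do not represent ns_server's configuration format or the mechanism used to apply the values, which this ticket does not show.

```python
# Timeout values in milliseconds, copied from the list in this ticket.
# The dict is an illustrative structure, not ns_server's config format.
PROPOSED_TIMEOUTS_MS = {
    "ns_memcached_outer": 60000,
    "ns_memcached_open_checkpoint": 60000,
    "ns_memcached_outer_heavy": 60000,
    "ns_memcached_outer_very_heavy": 120000,
    "ns_memcached_connected": 10000,
    "ebucketmigrator_connect": 60000,
}


def to_seconds(timeout_ms):
    """Convert a millisecond timeout to seconds."""
    return timeout_ms / 1000.0


def format_timeouts(timeouts_ms):
    """Return one 'name: N s' line per timeout, in insertion order."""
    return [f"{name}: {to_seconds(ms):g} s" for name, ms in timeouts_ms.items()]


if __name__ == "__main__":
    for line in format_timeouts(PROPOSED_TIMEOUTS_MS):
        print(line)
```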

      Summary of some error messages and the fixes that worked:

      1) Rebalance exited with reason
      {exited,
      {'EXIT',<0.22700.12>,
      {timeout,
      {gen_server,call,
      [{'ns_memcached-default','ns_1@10.3.2.81'},
      {stats,<<"tap">>},
      30000]}}}}
      Fix: Adjust timeout value - 120 sec - ns_memcached_outer_very_heavy

      2) Rebalance exited with reason {exited,
      {replicator_died,
      Fix: Adjust timeout value - 120 sec - ns_memcached_outer_heavy

      3) Rebalance exited with reason
      {exited,
      {'EXIT',<0.24287.15>,
      {timeout,
      {gen_server,call,
      [{'ns_memcached-default','ns_1@10.3.2.81'},
      {stats,<<"tap">>},
      30000]}}}}
      Fix: Adjust timeout to 120 sec

      4) Rebalance exited with reason {{change_filter_failed,
      {'EXIT',
      {timeout,
      Fix: Adjust timeout values -
      ebucketmigrator_connect 120 secs
      ns_memcached_connected 1 sec
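
In the complete error terms above (1 and 3), the trailing integer in `{timeout,{gen_server,call,[...]}}` is the timeout in milliseconds that was passed to the Erlang `gen_server:call`, so these calls gave up after 30000 ms. When triaging logs, that value and the stat group can be pulled out with a small script. This is a rough sketch only: `parse_call_timeout` is a hypothetical helper, and the regular expressions assume the message has the same shape as the ones quoted in this ticket.

```python
import re

# Matches the {timeout,{gen_server,call,[...]}} part of an exit reason.
# DOTALL lets the pattern span messages that are wrapped over several lines.
TIMEOUT_RE = re.compile(
    r"\{timeout,\s*\{gen_server,\s*call,\s*\[(?P<args>.*)\]\s*\}\s*\}",
    re.DOTALL,
)


def parse_call_timeout(message):
    """Return (stat_name, timeout_ms) for a gen_server call timeout, else None.

    stat_name is None when the call arguments contain no {stats,<<"...">>} term.
    """
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    args = match.group("args")
    # The timeout is the last element of the argument list.
    timeout = re.search(r"(\d+)\s*$", args)
    if timeout is None:
        return None
    stat = re.search(r'\{stats,\s*<<"([^"]*)">>\}', args)
    return (stat.group(1) if stat else None, int(timeout.group(1)))


if __name__ == "__main__":
    msg = (
        "{exited,{'EXIT',<0.22700.12>,{timeout,{gen_server,call,["
        "{'ns_memcached-default','ns_1@10.3.2.81'},"
        '{stats,<<"tap">>},30000]}}}}'
    )
    print(parse_call_timeout(msg))  # ('tap', 30000)
```

Truncated messages such as items 2 and 4 do not contain a complete call term, so the helper returns `None` for them.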

          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            karan Karan Kumar (Inactive)