Couchbase Server / MB-43921

[BP 6.5.2] - listReplicaCount is taking more than 10 sec and timing out

Details

    Description

      The listReplicaCount request times out, as shown in the log lines below:
      2020-10-08T15:20:31.868+00:00 [Verbose] GetWithAuth: <url to get listReplicaCount> elapsed 10.000339766s
      2020-10-08T15:20:31.868+00:00 [Error] Planner::getIndexNumReplica: Error from reading index num replica for node <removed>. Error = Get <url to get listReplicaCount> net/http: request canceled (Client.Timeout exceeded while awaiting headers)

      The Index Planner makes this call to compute an optimal load distribution.
      Reproducing the issue locally suggests that the HTTP calls to MetaKv that check for DDL tokens are causing the delay.

          Activity

            Couchbase Build Team (build-team) added a comment:
            Build couchbase-server-6.5.2-6625 contains indexing commit e7559c2 with commit message:
            MB-43921 - [BP] Reduce time taken for listReplicaCount

            Couchbase Build Team (build-team) added a comment:
            Build couchbase-server-6.5.2-6625 contains indexing commit 8db3659 with commit message:
            MB-43921 : [BP] Make Timeout for GetWithCbauth Configurable.

            Girish Benakappa (girish.benakappa) added a comment:
            Verified the scenarios below with 6.5.2-6625:

            • Cluster with 3 KV, 2 index+n1ql, and 4 search nodes
            • 6 buckets with 5000 docs each
            • Built 200 GSI indexes with replica 1 (50 indexes on each of 4 buckets)
            • Created 30 custom FTS indexes (10 indexes on each of 3 buckets), just to add more entries to metakv
            • Created and dropped 100 GSI indexes sequentially on each of 4 buckets (adding metakv entries for the create/drop of 400 indexes)
            • Index creation was initially fast (milliseconds) but eventually slowed to 30 – 40 sec, even with defer_build = true
            • This took around 4 – 5 hrs
            • Then did a rebalance, adding a search node

            Additional testing with cluster operations on FTS and GSI nodes: failover, rebalance in/out, and swap rebalance.
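
The sequential create/drop churn in the scenario above could be scripted along these lines. This is only a sketch that generates the N1QL statements; the bucket names, index names, and indexed field are hypothetical, and the statements would still need to be submitted to the query service one at a time.

```go
package main

import "fmt"

// createStmt builds a deferred CREATE INDEX statement.
// Index, bucket, and field names are hypothetical placeholders.
func createStmt(bucket, index, field string) string {
	return fmt.Sprintf("CREATE INDEX `%s` ON `%s`(`%s`) WITH {\"defer_build\": true}", index, bucket, field)
}

// dropStmt builds the matching DROP INDEX statement.
func dropStmt(bucket, index string) string {
	return fmt.Sprintf("DROP INDEX `%s`.`%s`", bucket, index)
}

// churnStatements returns create/drop pairs, perBucket pairs for each bucket,
// in the order they would be executed sequentially.
func churnStatements(buckets []string, perBucket int, field string) []string {
	stmts := make([]string, 0, 2*len(buckets)*perBucket)
	for _, b := range buckets {
		for i := 0; i < perBucket; i++ {
			name := fmt.Sprintf("churn_idx_%d", i)
			stmts = append(stmts, createStmt(b, name, field), dropStmt(b, name))
		}
	}
	return stmts
}

func main() {
	// 4 buckets x 100 indexes, as in the verification scenario.
	buckets := []string{"bucket1", "bucket2", "bucket3", "bucket4"}
	stmts := churnStatements(buckets, 100, "field1")
	fmt.Println("statements:", len(stmts))
	fmt.Println(stmts[0])
	fmt.Println(stmts[1])
}
```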

            People

              girish.benakappa Girish Benakappa
              jeelan.poola Jeelan Poola