Couchbase Server / MB-50193

Scans failing with timeout and indexes not found with rebalance-in + indexer DGM

    Description

      Build: 7.1.0-1960

      Test: Rebalance-in (min) 3->4, 1 bucket x 55M x 1KB, 100 indexes, 10K KV ops/sec, Scan Workload concurrency=128, Plasma, s=1 c=10 
      2021-12-21T16:09:42 [INFO] Average RR over all Indexes : 0.3339540786186384

      05:36:15 2021-12-21T16:09:06.663-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5002328) response failed `Index scan timed out`
      05:36:15 2021-12-21T16:09:06.663-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 2 
      05:36:15 2021-12-21T16:09:06.768-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5002336) response failed `Index scan timed out`
      05:36:15 2021-12-21T16:09:06.768-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 2 
      05:36:15 2021-12-21T16:09:06.967-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5002338) response failed `Index scan timed out`
      05:36:15 2021-12-21T16:09:06.967-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 1 
      05:36:15 2021-12-21T16:09:07.288-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5002356) response failed `Index scan timed out`
      05:36:15 2021-12-21T16:09:07.288-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 1 
      05:36:16 2021-12-21T16:09:07.814-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5002368) response failed `Index scan timed out`
      05:36:16 2021-12-21T16:09:07.814-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 1 
      05:36:16 2021-12-21T16:09:07.878-08:00 [Error] [GsiScanClient:"172.23.110.55:9101"] Range(./opt/couchbase/bin/cbindexperf5001816) response failed `Index scan timed out`
      05:36:16 2021-12-21T16:09:07.878-08:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 3.  Partition with instances 2  

      Test job: http://perf.jenkins.couchbase.com/job/aether/1291/consoleFull 

      cbmonitor: http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=aether_710-1960_rebalance_571a 

      Also of note: the cluster reports more indexes in total after the rebalance than before it. Counts before the rebalance:
      23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.72 : 215 Indexes
      23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.55 : 187 Indexes
      23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.56 : 198 Indexes

      05:36:50 2021-12-21T16:09:41 [INFO] Indexes after rebalance
      05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.72 : 177 Indexes
      05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.55 : 159 Indexes
      05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.56 : 156 Indexes
      05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.71 : 110 Indexes

       

      In total: 600 indexes before the rebalance (215 + 187 + 198) and 602 after (177 + 159 + 156 + 110).
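      The per-node counts can be re-added mechanically to confirm the mismatch. The following is a minimal sketch, not part of the test harness: the function name `totals_by_snapshot` and the regular expression are illustrative assumptions, and it assumes that all lines of one snapshot share the same log timestamp, as they do in the excerpts above.

```python
import re
from collections import defaultdict

# Per-node index counts copied verbatim from the test output above.
LOG = """\
23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.72 : 215 Indexes
23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.55 : 187 Indexes
23:09:13 2021-12-21T09:41:53 [INFO] 172.23.110.56 : 198 Indexes
05:36:50 2021-12-21T16:09:41 [INFO] Indexes after rebalance
05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.72 : 177 Indexes
05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.55 : 159 Indexes
05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.56 : 156 Indexes
05:36:50 2021-12-21T16:09:41 [INFO] 172.23.110.71 : 110 Indexes
"""

# Hypothetical pattern: "<log timestamp> [INFO] <node ip> : <count> Indexes".
LINE_RE = re.compile(r"(\S+) \[INFO\] (\d+\.\d+\.\d+\.\d+) : (\d+) Indexes")


def totals_by_snapshot(log):
    """Sum the per-node index counts, grouped by log timestamp."""
    totals = defaultdict(int)
    for match in LINE_RE.finditer(log):
        timestamp, _node, count = match.groups()
        totals[timestamp] += int(count)
    return dict(totals)


if __name__ == "__main__":
    for timestamp, total in totals_by_snapshot(LOG).items():
        print(timestamp, total)
```

      Grouping by timestamp keeps the sketch independent of the "Indexes after rebalance" marker line, which has no matching "before" marker in the pasted output.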

            People

              vikas.chaudhary Vikas Chaudhary