Couchbase Server / MB-53172

[6.6.5 build 10104] - Secondary Index rollback to zero after KV node auto failover

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • 6.6.5
    • couchbase-bucket
    • None
    • Enterprise Edition 6.6.5 build 10104
    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown

    Description

      Steps to Repro
      1. Create a 6-node cluster with 3 KV, 2 index, and 1 N1QL (query) nodes.
      2. Create buckets, data, and indexes. Push the buckets into DGM and ensure the indexes are in DGM as well. Start running queries in the background with request_plus consistency.
      3. Run the following script, written to validate MB-53057. In an infinite loop it kills memcached (on 172.23.100.34), waits for auto failover (AF) to kick in, does a full recovery, and then rebalances.

      #!/bin/bash
      # Run on the KV node whose memcached is killed (172.23.100.34).
      # Each iteration: kill memcached, wait for auto failover, do a full
      # recovery of the node, rebalance, then report status.
      while :
      do
          echo "killing memcached..."
          kill -9 $(pidof memcached)
          echo "Waiting for auto failover to kick in..."
          sleep 180
          echo "Listing node status post Auto failover..."
          /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password
          sleep 30
          echo "Starting full recovery..."
          /opt/couchbase/bin/couchbase-cli recovery -c localhost:8091 --username Administrator --password password --server-recovery 172.23.100.34:8091 --recovery-type full
          sleep 30
          echo "Starting Rebalance after recovering a failed over node..."
          /opt/couchbase/bin/couchbase-cli rebalance -c localhost:8091 --username Administrator --password password
          # Fixed wait intended to outlast the rebalance before checking its status.
          sleep 4000
          echo "Listing rebalance status..."
          /opt/couchbase/bin/couchbase-cli rebalance-status -c localhost:8091 --username Administrator --password password
          sleep 30
          echo "Listing node status post rebalance..."
          /opt/couchbase/bin/couchbase-cli server-list -c localhost:8091 --username Administrator --password password
          sleep 300
      done
      

      172.23.106.159

      -bash-4.2# grep rollbackAllToZero *
      indexer.log:2022-07-28T23:21:27.611-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test6
      indexer.log:2022-07-28T23:25:12.092-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test7
      grep: rebalance: Is a directory
      -bash-4.2# 
      -bash-4.2# grep "RESPONSE status:(error" indexer.log 
      2022-07-28T20:52:09.953-07:00 [Info] SCAN##6 RESPONSE status:(error = Indexer rollback), requestId: a2a386e2-c8be-4e09-b3e3-08f7d07b7664
      2022-07-28T21:34:01.142-07:00 [Info] SCAN##33 RESPONSE status:(error = Index scan timed out), requestId: 8176078c-a052-49d2-8ed2-e17c7fc39e11
      2022-07-28T21:37:30.653-07:00 [Info] SCAN##34 RESPONSE status:(error = Index scan timed out), requestId: 06f18bfe-3121-4160-95dc-4910d9044dbf
      2022-07-28T21:43:30.666-07:00 [Info] SCAN##35 RESPONSE status:(error = Index scan timed out), requestId: 06f18bfe-3121-4160-95dc-4910d9044dbf
      2022-07-28T21:56:50.734-07:00 [Info] SCAN##36 RESPONSE status:(error = Index scan timed out), requestId: e2771988-044c-42ee-8be6-dd087450b1f8
      2022-07-28T21:58:50.746-07:00 [Info] SCAN##37 RESPONSE status:(error = Index scan timed out), requestId: e2771988-044c-42ee-8be6-dd087450b1f8
      2022-07-28T22:05:10.760-07:00 [Info] SCAN##38 RESPONSE status:(error = Index scan timed out), requestId: 784fbd84-f159-4c97-83bc-ecea954b0570
      2022-07-28T22:07:10.772-07:00 [Info] SCAN##39 RESPONSE status:(error = Index scan timed out), requestId: 784fbd84-f159-4c97-83bc-ecea954b0570
      2022-07-28T22:13:30.786-07:00 [Info] SCAN##40 RESPONSE status:(error = Index scan timed out), requestId: fb1c5f7d-b40c-4a10-b29f-93d906062552
      2022-07-28T22:17:30.798-07:00 [Info] SCAN##41 RESPONSE status:(error = Index scan timed out), requestId: fb1c5f7d-b40c-4a10-b29f-93d906062552
      2022-07-28T22:19:50.810-07:00 [Info] SCAN##42 RESPONSE status:(error = Index scan timed out), requestId: ff34ed61-1a6d-4b1e-9c9f-4e03392133de
      2022-07-28T22:25:50.823-07:00 [Info] SCAN##43 RESPONSE status:(error = Index scan timed out), requestId: ff34ed61-1a6d-4b1e-9c9f-4e03392133de
      2022-07-28T22:28:30.836-07:00 [Info] SCAN##44 RESPONSE status:(error = Index scan timed out), requestId: 1eedd0ca-72cb-489b-9b69-89cde7a369ec
      2022-07-28T22:55:31.234-07:00 [Info] SCAN##53 RESPONSE status:(error = Index scan timed out), requestId: fe917f09-c561-4796-8b33-567c11c73d1c
      2022-07-28T22:57:31.245-07:00 [Info] SCAN##54 RESPONSE status:(error = Index scan timed out), requestId: fe917f09-c561-4796-8b33-567c11c73d1c
      2022-07-28T23:24:18.134-07:00 [Info] SCAN##68 RESPONSE status:(error = Indexer rollback), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
      2022-07-28T23:28:27.462-07:00 [Info] SCAN##69 RESPONSE status:(error = Index scan timed out), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
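The grep output above is easier to read once it is grouped by error type and by requestId, since several request IDs appear more than once. The following is a minimal sketch, assuming the log line format shown above; the function names count_scan_errors and repeated_request_ids are made up for illustration and are not part of any Couchbase tooling.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, and the parsing assumes the
# "RESPONSE status:(error = <message>), requestId: <id>" line format seen above.

# Print "<count><TAB><error message>" for each distinct scan error in a log file.
count_scan_errors() {
    awk '
        /RESPONSE status:\(error = / {
            msg = $0
            sub(/.*\(error = /, "", msg)   # drop everything up to the message
            sub(/\).*/, "", msg)           # drop the closing paren and the rest
            count[msg]++
        }
        END { for (m in count) printf "%d\t%s\n", count[m], m }
    ' "$1" | LC_ALL=C sort -t $'\t' -k2
}

# Print each requestId that has more than one failed scan response.
repeated_request_ids() {
    awk '
        /RESPONSE status:\(error = / {
            id = $0
            sub(/.*requestId: /, "", id)   # keep only the id
            sub(/[ \t\r]+$/, "", id)       # trim trailing whitespace
            seen[id]++
        }
        END { for (i in seen) if (seen[i] > 1) print i }
    ' "$1" | LC_ALL=C sort
}
```

Running `count_scan_errors indexer.log` and `repeated_request_ids indexer.log` from the log directory gives a per-error tally and the list of requests that failed more than once.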
      

      172.23.106.163

      -bash-4.2# grep rollbackAllToZero *
      indexer.log:2022-07-28T23:24:27.242-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test7
      grep: rebalance: Is a directory
      -bash-4.2# 
      -bash-4.2# grep "RESPONSE status:(error" indexer.log 
      2022-07-28T23:22:18.924-07:00 [Info] SCAN##112 RESPONSE status:(error = Indexer rollback), requestId: aaee71c1-8153-45f2-a10d-5b30b542d882
      2022-07-28T23:24:27.449-07:00 [Info] SCAN##116 RESPONSE status:(error = Indexer rollback), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
      2022-07-28T23:26:27.460-07:00 [Info] SCAN##117 RESPONSE status:(error = Index scan timed out), requestId: f6fe61c3-c77c-4fa5-8852-44c9704f6f1e
      -bash-4.2# 
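To compare rollback-to-zero events across the two index nodes, the matching lines can be reduced to one row per event. This is a rough sketch: the function name rollback_to_zero_events is hypothetical, and it assumes that the last two tokens of each StorageMgr::rollbackAllToZero line are the stream and the bucket, as the lines above suggest.

```shell
#!/bin/bash
# Sketch only: assumes raw indexer.log lines of the form
#   <timestamp> [Info] StorageMgr::rollbackAllToZero <stream> <bucket>
# and treats the final token as the bucket name (an assumption from the output above).

# Print "<bucket><TAB><stream><TAB><timestamp>" for every rollback-to-zero event
# found in the given log files, sorted by bucket and then by timestamp.
rollback_to_zero_events() {
    awk '
        /StorageMgr::rollbackAllToZero/ {
            printf "%s\t%s\t%s\n", $NF, $(NF-1), $1
        }
    ' "$@" | LC_ALL=C sort
}
```

Passing the indexer.log from each node, for example `rollback_to_zero_events node159/indexer.log node163/indexer.log` (hypothetical paths), lists the events from both nodes together, grouped by bucket.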
      

      From the UI: [screenshot not preserved]

      cbcollect_info attached.


            People

              Balakumaran.Gopal Balakumaran Gopal
              Balakumaran.Gopal Balakumaran Gopal