  1. Couchbase Server
  2. MB-53097

Index rollback to zero after KV node is auto-failed over

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • 6.6.6
    • 6.6.2
    • None
    • 6.6.2-9588-enterprise
    • Untriaged
    • 1
    • Unknown

    Description

      Steps to Repro
      1. Create 3 buckets.
      2. Load data into the buckets to push them into DGM of around 10%.
      3. Create indexes (with 1 replica). Ensure the indexes are in DGM as well.
      4. Start a bursty data load using the following command.

      cbc-pillowfight -u Administrator -P password -U couchbase://172.23.100.15,172.23.100.16,172.23.100.17/test -I 200000 -m 144 -M 144 -B 1000 --random-body --json 
      

      5. Run N1QL queries in parallel using the following shell script.

      while true
      do
      	echo "Press [CTRL+C] to stop.."
      	sleep 20
      	curl --location --request POST 'http://172.23.121.215:8093/query/service' --data-urlencode 'statement=select * from test where Field_1 like "1%" limit 1' --data-urlencode 'scan_consistency=request_plus' -u Administrator:password
      	sleep 20
      	curl --location --request POST 'http://172.23.121.215:8093/query/service' --data-urlencode 'statement=select * from test2 where Field_1 like "1%" limit 1' --data-urlencode 'scan_consistency=request_plus' -u Administrator:password
      	sleep 20
      	curl --location --request POST 'http://172.23.121.215:8093/query/service' --data-urlencode 'statement=select * from test3 where Field_1 like "1%" limit 1' --data-urlencode 'scan_consistency=request_plus' -u Administrator:password
      done
      

      6. Set the auto-failover timeout to 5 seconds.
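
Step 6 could be scripted against the cluster manager REST endpoint `/settings/autoFailover` rather than set in the UI. This is a minimal sketch, not the exact procedure used in this test: it assumes the cluster manager is reachable on port 8091 of 172.23.100.19 and that the same Administrator credentials apply. The names `set_autofailover_timeout`, `CB_HOST`, `CB_USER`, `CB_PASSWORD` and `DRY_RUN` are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the auto-failover timeout (in seconds).
# CB_HOST, CB_USER and CB_PASSWORD are assumptions; adjust for your cluster.
CB_HOST="${CB_HOST:-172.23.100.19}"
CB_USER="${CB_USER:-Administrator}"
CB_PASSWORD="${CB_PASSWORD:-password}"

set_autofailover_timeout() {
  local timeout="$1"
  local url="http://${CB_HOST}:8091/settings/autoFailover"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the request instead of sending it.
    echo "curl -u ${CB_USER}:${CB_PASSWORD} -X POST ${url} -d enabled=true -d timeout=${timeout}"
  else
    # Enable auto-failover and apply the requested timeout.
    curl -sS -u "${CB_USER}:${CB_PASSWORD}" -X POST "${url}" \
      -d enabled=true -d "timeout=${timeout}"
  fi
}
```

Calling `DRY_RUN=1 set_autofailover_timeout 5` prints the request without touching the cluster; dropping `DRY_RUN` sends it.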

      A rebalance was running (to increase replicas, I believe) when auto-failover (AF) happened on 172.23.100.15.

      ns_1@172.23.100.19 10:01:21 PM   22 Jul, 2022

      Rebalance interrupted due to auto-failover of nodes ['ns_1@172.23.100.15'].
      Rebalance Operation Id = db0bc9697971d63eeff46fb3d85cfbb9
      

      ns_1@172.23.100.19 10:01:21 PM   22 Jul, 2022

      Starting failing over ['ns_1@172.23.100.15']
      

      172.23.100.19 10:02:44 PM   22 Jul, 2022

      Failover completed successfully.
      Rebalance Operation Id = 93eb3e51ccb22e74bdeda7acdd90e0af
      

      172.23.100.19

      grep rollbackAllToZero cbcollect_info_ns_1@172.23.100.*/* indexer* 
      cbcollect_info_ns_1@172.23.100.19_20220723-050655/ns_server.indexer.log:2022-07-22T22:04:05.873-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test3
      cbcollect_info_ns_1@172.23.100.19_20220723-050655/ns_server.indexer.log:2022-07-22T22:04:13.535-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test2
      

      172.23.100.22

      cbcollect_info_ns_1@172.23.100.22_20220723-050655/ns_server.indexer.log:2022-07-22T22:03:23.992-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test3
      cbcollect_info_ns_1@172.23.100.22_20220723-050655/ns_server.indexer.log:2022-07-22T22:03:33.771-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test2
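
To line the rollbacks up against the failover times above, the grep output can be reduced to one "node, timestamp, bucket" row per event. The following is a rough sketch, assuming the `cbcollect_info_ns_1@<address>_<date>/ns_server.indexer.log` naming seen in the output above; `summarize_rollbacks` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read grep output (file name prefix included) on stdin
# and print "<node> <timestamp> <bucket>" for each rollbackAllToZero event.
summarize_rollbacks() {
  awk '/StorageMgr::rollbackAllToZero/ {
    node = $1
    sub(/^.*ns_1@/, "", node)   # drop everything up to the node address
    sub(/_.*$/, "", node)       # drop the collection date and file name
    ts = $1
    sub(/^.*\.log:/, "", ts)    # the timestamp follows "<file>.log:"
    print node, ts, $NF         # the bucket name is the last field
  }'
}
```

A possible invocation, sorted by timestamp, is `grep -H rollbackAllToZero cbcollect_info_ns_1@172.23.100.*/ns_server.indexer.log | summarize_rollbacks | sort -k2`.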
      

          People

            varun.velamuri Varun Velamuri
            Balakumaran.Gopal Balakumaran Gopal
            Votes: 0
            Watchers: 6