  1. Couchbase Server
  2. MB-54050

Terminate ns_vbucket_mover synchronously when rebalance is stopped.

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 7.6.0
    • None
    • ns_server
    • None
    • 1

    Description

      This code isn't executed when a rebalance is stopped, because we run rebalance_body via an async, and the async code converts the Reason of every terminate request to "shutdown".

      In this particular case, a user-initiated shutdown Reason "{shutdown, stop}" is converted to "shutdown".
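
      The mismatch described above can be sketched as a small model. This is not the ns_server code (which is Erlang): the function names normalize_reason and old_handler are hypothetical, and the tuple ("shutdown", "stop") stands in for the Erlang term {shutdown, stop}. It only illustrates why a clause that matches the original reason never runs once the reason has been rewritten.

```python
# Illustrative model only: the real logic is Erlang code inside ns_server.
# All names here (normalize_reason, old_handler) are hypothetical.

def normalize_reason(reason):
    """Model of the async layer: every terminate reason becomes 'shutdown'."""
    return "shutdown"


def old_handler(reason):
    """Model of a cleanup clause that only matches {shutdown, stop}."""
    if reason == ("shutdown", "stop"):
        return "Got rebalance stop"
    return None


user_stop = ("shutdown", "stop")             # reason sent on a user-initiated stop
seen_by_mover = normalize_reason(user_stop)  # what arrives after the async layer

print(seen_by_mover)               # shutdown
print(old_handler(seen_by_mover))  # None: the stop-specific branch never runs
```

      In this model the handler would only fire if it received the original tuple, which the async layer never passes through.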

      Here are logs from a rebalance stopped on the baseline code; the "Got rebalance stop" log isn't printed (the search below returns nothing):

      fd "debug.log" | xargs -I {} sh -c "rg \"Got rebalance stop\" {}"
       
      The "Rebalance stopped by user" log from ns_orchestrator is printed:
       
      fd "debug.log" | xargs -I {} sh -c "rg \"Rebalance stopped\" {}"
      29150:[user:info,2022-10-07T11:31:04.021-07:00,n_0@192.168.86.150:<0.4997.0>:ns_orchestrator:log_rebalance_completion:1448]Rebalance stopped by user.
      
      

       

      Based on a discussion with Artem: terminating ns_vbucket_mover asynchronously is dangerous and can potentially lead to data loss, so the fix is changed to match on shutdown instead of {shutdown, stop}.

      Artem Stemkovski, 1:03 PM: "The consequences of the orphaned mover doing something could be quite serious. Like data corruption"
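
      A minimal sketch of the revised matching, under the same caveats: this is a model, not the actual Erlang change, and fixed_handler and its return string are hypothetical. It assumes the handler only ever sees the reason the async layer delivers, which is "shutdown".

```python
# Illustrative model only; names are hypothetical, not ns_server functions.

def fixed_handler(reason):
    """Model of the revised clause: match the reason the async layer delivers."""
    if reason == "shutdown":
        return "terminate mover synchronously"
    return None


# After the async layer, a user-initiated stop arrives as plain 'shutdown'.
print(fixed_handler("shutdown"))  # terminate mover synchronously
```

      One consequence of this choice: because the async layer delivers the same reason for every terminate request, a clause written this way cannot tell a user-initiated stop apart from any other shutdown.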

            People

              hareen.kancharla Hareen Kancharla
              hareen.kancharla Hareen Kancharla