Couchbase Server / MB-20852

set_vbucket races with other set_vbucket invocations and with sync vbucket deletions (and likely with a lot of other operations)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: 4.6.0
    • Affects Versions: 3.1.6, 4.0.0, 4.1.2, 4.5.0, 4.5.1
    • Component: couchbase-bucket
    • Labels: None
    • Triage: Untriaged
    • Unknown

    Description

      1. Apply the attached patch to ep-engine. The patch introduces artificial delays; because scheduling may be arbitrary, the code should still behave correctly with them.

      2. Create a bucket, upload some data, and restart the server a couple of times to build up some failover history.

      $ ~/dev/membase/repo-watson/install/bin/cbstats 127.0.0.1:12000 failovers 7
       vb_7:0:id:        215814349521305
       vb_7:0:seq:       160
       vb_7:1:id:        155329856619529
       vb_7:1:seq:       160
       vb_7:2:id:        91747243693536
       vb_7:2:seq:       160
       vb_7:3:id:        128962012805783
       vb_7:3:seq:       160
       vb_7:4:id:        275970465686046
       vb_7:4:seq:       160
       vb_7:5:id:        241099843010628
       vb_7:5:seq:       160
       vb_7:6:id:        50511930730683
       vb_7:6:seq:       160
       vb_7:7:id:        280675982653774
       vb_7:7:seq:       0
       vb_7:num_entries: 8
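
For comparing failover tables across the steps below, the `cbstats ... failovers` output can be turned into a list of `(id, seq)` pairs. This is a minimal sketch that assumes the line layout shown above; `parse_failovers` is a hypothetical helper, not part of cbstats, and the sample is abbreviated to two entries.

```python
import re

# Matches lines such as "vb_7:0:id:        215814349521305".
_LINE = re.compile(r"^\s*vb_(\d+):(\d+):(id|seq):\s+(\d+)\s*$")


def parse_failovers(text):
    """Return {vbucket: [(id, seq), ...]} ordered by entry index."""
    entries = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # skips summary lines such as "vb_7:num_entries: 8"
        vb, idx, field, value = int(m[1]), int(m[2]), m[3], int(m[4])
        entries.setdefault(vb, {}).setdefault(idx, {})[field] = value
    return {
        vb: [(e["id"], e["seq"]) for _, e in sorted(by_idx.items())]
        for vb, by_idx in entries.items()
    }


# Abbreviated sample in the format printed by the command above.
sample = """\
 vb_7:0:id:        215814349521305
 vb_7:0:seq:       160
 vb_7:7:id:        280675982653774
 vb_7:7:seq:       0
 vb_7:num_entries: 8
"""
table = parse_failovers(sample)
```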
      

      3. Run the following via /diag/eval.

      V = 7, ns_memcached:set_vbucket("default", V, dead),  ns_memcached:sync_delete_vbucket("default", V).
      

      This is what happens during bucket flush for all vbuckets.
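
One interleaving that would produce the symptoms in this report can be sketched with a toy model. This is not ep-engine code: `ToyEngine`, its methods, and the idea that the dead-state write is a delayed background write that the deletion does not cancel are all assumptions made for illustration.

```python
class ToyEngine:
    """Toy model with separate in-memory and on-disk vbucket state."""

    def __init__(self):
        self.memory = {}
        self.disk = {}
        self.delayed_writes = []  # writes captured earlier, not yet on disk

    def set_vbucket(self, vb, state, failover_table, delay_persist=False):
        snapshot = {"state": state, "failover_table": list(failover_table)}
        self.memory[vb] = snapshot
        if delay_persist:
            # Assumption: the write is queued with the state captured now.
            self.delayed_writes.append((vb, dict(snapshot)))
        else:
            self.disk[vb] = dict(snapshot)

    def sync_delete_vbucket(self, vb):
        # Assumption: deletion does not cancel writes that are already queued.
        self.memory.pop(vb, None)
        self.disk.pop(vb, None)

    def run_delayed_writes(self):
        for vb, snapshot in self.delayed_writes:
            self.disk[vb] = snapshot
        self.delayed_writes = []

    def restart(self):
        # After a restart, memory is rebuilt from what is on disk.
        self.memory = {vb: dict(s) for vb, s in self.disk.items()}


OLD = [(215814349521305, 160), (280675982653774, 0)]
NEW = [(199363149644834, 0)]

engine = ToyEngine()
engine.set_vbucket(7, "dead", OLD, delay_persist=True)  # step 3, delayed
engine.sync_delete_vbucket(7)                           # step 3
engine.set_vbucket(7, "active", NEW)                    # recreated vbucket
state_after_recreate = engine.disk[7]["state"]
engine.run_delayed_writes()                             # stale write lands
state_after_stale_write = engine.disk[7]["state"]
table_in_memory = engine.memory[7]["failover_table"]
engine.restart()
table_after_restart = engine.memory[7]["failover_table"]
```

In this model the stale write leaves the disk and memory views inconsistent until a restart, at which point the old failover history comes back.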

      4. Let ns_server recreate the vbucket, then wait for the 10-second sleep to expire.

      You can watch the local docs to see when it happens.

      $ ../install/bin/couch_dbdump --local --json data/n_0/data/default/7.couch.1
      Dumping "data/n_0/data/default/7.couch.1":
      {"id":"_local/vbstate",
       "value":"{\"state\": \"active\",
                 \"checkpoint_id\": \"0\",
                 \"max_deleted_seqno\": \"0\",
                 \"failover_table\": [{\"id\":199363149644834,\"seq\":0}],
                 \"snap_start\": \"0\",\"snap_end\": \"0\",
                 \"max_cas\": \"0\",
                 \"drift_counter\": \"-140737488355328\"}"}
       
      Total docs: 1
      

      The vbucket then goes back to the dead state.

      $ ../install/bin/couch_dbdump --local --json data/n_0/data/default/7.couch.1
      Dumping "data/n_0/data/default/7.couch.1":
      {"id":"_local/vbstate",
       "value":"{\"state\": \"dead\",
                \"checkpoint_id\": \"0\",
                \"max_deleted_seqno\": \"0\",
                \"failover_table\": [{\"id\":215814349521305,\"seq\":160},
                                     {\"id\":155329856619529,\"seq\":160},
                                     {\"id\":91747243693536,\"seq\":160},
                                     {\"id\":128962012805783,\"seq\":160},
                                     {\"id\":275970465686046,\"seq\":160},
                                     {\"id\":241099843010628,\"seq\":160},
                                     {\"id\":50511930730683,\"seq\":160},
                                     {\"id\":280675982653774,\"seq\":0}],
                \"snap_start\": \"160\",\"snap_end\": \"160\",
                \"max_cas\": \"1473396979655770112\",
                \"drift_counter\": \"-140737488355328\"}"}
       
      Total docs: 1
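
To watch for the state flip programmatically, the `_local/vbstate` document can be pulled out of the dump text. This is a rough sketch that assumes the dump contains a single local document laid out as shown above, including the line breaks inside the `value` string; `vbstate_from_dump` is a hypothetical helper and the sample is abbreviated.

```python
import json


def vbstate_from_dump(dump_text):
    """Extract and decode the vbstate JSON from a single-document dump."""
    start = dump_text.index("{")
    end = dump_text.rindex("}") + 1
    # strict=False tolerates raw line breaks inside the "value" string.
    doc = json.loads(dump_text[start:end], strict=False)
    # The value is itself JSON, stored as an escaped string.
    return json.loads(doc["value"])


sample = r'''Dumping "data/n_0/data/default/7.couch.1":
{"id":"_local/vbstate",
 "value":"{\"state\": \"active\",
           \"failover_table\": [{\"id\":199363149644834,\"seq\":0}],
           \"snap_start\": \"0\",\"snap_end\": \"0\"}"}

Total docs: 1
'''
state = vbstate_from_dump(sample)
```

Polling this and comparing `state["state"]` and `state["failover_table"]` between runs would show the moment the stale document replaces the fresh one.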
      

      5. At this point the stats show the fresh failover history. Restart the server and observe that the failover history from the deleted vbucket is resurrected.

      $ ~/dev/membase/repo-watson/install/bin/cbstats 127.0.0.1:12000 failovers 7
       vb_7:0:id:        215814349521305
       vb_7:0:seq:       160
       vb_7:1:id:        155329856619529
       vb_7:1:seq:       160
       vb_7:2:id:        91747243693536
       vb_7:2:seq:       160
       vb_7:3:id:        128962012805783
       vb_7:3:seq:       160
       vb_7:4:id:        275970465686046
       vb_7:4:seq:       160
       vb_7:5:id:        241099843010628
       vb_7:5:seq:       160
       vb_7:6:id:        50511930730683
       vb_7:6:seq:       160
       vb_7:7:id:        280675982653774
       vb_7:7:seq:       0
       vb_7:num_entries: 8
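
The resurrection can be checked mechanically by comparing failover ids from before the deletion with those seen after the restart. This sketch assumes that a recreated vbucket starts with a newly generated failover id, as the fresh table in step 4 suggests, so any overlap is suspicious; `resurrected_ids` is a hypothetical helper.

```python
def resurrected_ids(before_delete, after_restart):
    """Return failover ids seen both before deletion and after restart."""
    old_ids = {entry_id for entry_id, _seq in before_delete}
    return [entry_id for entry_id, _seq in after_restart if entry_id in old_ids]


before = [(215814349521305, 160), (280675982653774, 0)]
expected_after = [(199363149644834, 0)]  # fresh table from step 4
observed_after = before                  # what this report observes

clean = resurrected_ids(before, expected_after)
leaked = resurrected_ids(before, observed_after)
```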
      

            People

              Dave Rigby (drigby) (Inactive)
              Aliaksey Artamonau (Inactive)