Couchbase Server / MB-48063

ns_server should do a quorum read on collection manifests before performing checks



    Description

      I'm filing this as a bug because my current thinking is that it's something that needs to be fixed, but it could also be considered some kind of improvement.

      In any case, I hit the issue this ticket describes when I was running some tests that created and dropped scopes and collections in quick succession.

      I tried to create collection c_0 in scope s1 in bucket b_3487 and it failed, stating that c_0 already exists:

      [ns_server:debug,2021-08-19T11:51:57.149-07:00,n_2@127.0.0.1:collections<0.823.0>:collections:do_update:284]Performing operation {create_collection,"s1","c_0",[]} on bucket "b_3487"
      [ns_server:debug,2021-08-19T11:51:57.149-07:00,n_2@127.0.0.1:kv<0.248.0>:collections:do_update_with_manifest:326]Perform operation {create_collection,"s1","c_0",[]} on manifest 34 of bucket "b_3487"
      ...
      [ns_server:debug,2021-08-19T11:51:57.149-07:00,n_2@127.0.0.1:kv<0.248.0>:collections:perform_operations:367]Operation {create_collection,"s1","c_0",[]} failed with error {collection_already_exists,
                                                                     "s1","c_0"}
      

      You can see it's operating on manifest 34.

      However, collection c_0 was dropped from the manifest about 90 milliseconds earlier on n_1. (Note that these nodes are all running on the same machine, so the timestamps are pretty comparable.)

      [ns_server:debug,2021-08-19T11:51:57.059-07:00,n_1@127.0.0.1:kv<0.249.0>:collections:do_update_with_manifest:326]Perform operation {drop_collection,"s1","c_0"} on manifest 34 of bucket "b_3487"
      ...
      [ns_server:debug,2021-08-19T11:51:57.141-07:00,n_1@127.0.0.1:ns_audit<0.619.0>:ns_audit:handle_call:148]Audit drop_collection: [{local,{[{ip,<<"127.0.0.1">>},{port,9001}]}},
      			{remote,{[{ip,<<"127.0.0.1">>},{port,52842}]}},
      			{real_userid,{[{domain,builtin},
                                             {user,<<"<ud>Administrator</ud>">>}]}},
      			{timestamp,<<"2021-08-19T11:51:57.141-07:00">>},
      			{new_manifest_uid,<<"23">>},
      			{collection_name,<<"c_0">>},
      			{scope_name,<<"s1">>},
      			{bucket_name,<<"b_3487">>}]
      

      This was also against manifest 34 and it clearly succeeded.

      This is a 3 node cluster so collection manifest updates only need to reach 2 nodes before the change is considered committed, which means it's possible the third node hasn't received the updates before another manifest update arrives.
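
      To make the race concrete, here is a minimal, purely illustrative Python sketch (not ns_server code; all names are hypothetical) of a three-node cluster where a manifest change counts as committed once a majority has it. A check run purely against the local copy on the lagging node still sees the dropped collection, while a read that contacts a majority does not:

      # Illustrative toy model only; not ns_server code.
      import random

      class Node:
          def __init__(self, name, manifest_uid, collections):
              self.name = name
              self.manifest_uid = manifest_uid
              self.collections = set(collections)

      # The drop of c_0 (bumping the manifest uid) has reached a majority
      # (n_0, n_1); n_2 still has the older manifest that contains c_0.
      n_0 = Node("n_0", 35, {"_default"})
      n_1 = Node("n_1", 35, {"_default"})
      n_2 = Node("n_2", 34, {"_default", "c_0"})
      cluster = [n_0, n_1, n_2]

      def local_read(node):
          # What an existence check done against the local copy alone sees.
          return node.manifest_uid, node.collections

      def quorum_read(nodes):
          # Contact any majority and take the newest manifest among the replies.
          # Because the drop was committed on a majority, any read majority
          # intersects the write majority and therefore sees the newer uid.
          majority = len(nodes) // 2 + 1
          replies = random.sample(nodes, majority)
          newest = max(replies, key=lambda n: n.manifest_uid)
          return newest.manifest_uid, newest.collections

      # A create_collection("c_0") request lands on the lagging node n_2:
      print(local_read(n_2))       # (34, {'_default', 'c_0'}) -> "already exists"
      print(quorum_read(cluster))  # (35, {'_default'})        -> create would be allowed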

      This wouldn't have happened if the client I used (the Java client) had sent the collection changes to the same node every time. However, on occasion the client will need to switch servers for these kinds of requests due to failover, etc., so I don't believe changing the client to always target the same server node is a principled fix to this issue.

      My current view is that the way to address this behavior is to do a quorum read on the manifest before performing checks. I think this is quite a bit nicer for users, and changes to the manifest should in general be infrequent enough that we can afford the quorum read.
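
      As a rough sketch of the suggested ordering on the node handling the request (Python, with hypothetical names; read_quorum and apply_update stand in for whatever ns_server actually uses and are not its real API):

      # Hypothetical sketch of the proposed flow, not the real collections module.

      class CollectionAlreadyExists(Exception):
          pass

      def create_collection(bucket, scope, name, read_quorum, apply_update):
          # 1. Quorum read: refresh the manifest from a majority of nodes so the
          #    check below doesn't run against a stale local copy.
          manifest = read_quorum(bucket)

          # 2. Perform the existence check against the refreshed manifest.
          if name in manifest["scopes"].get(scope, set()):
              raise CollectionAlreadyExists(f"{scope}.{name}")

          # 3. Apply the update as today, against the manifest uid we just read,
          #    so a concurrent change between the read and the write still fails
          #    cleanly rather than being silently clobbered.
          return apply_update(bucket, ("create_collection", scope, name),
                              expected_uid=manifest["uid"])

      The point is only the ordering: the quorum read happens before the duplicate-name check, not after it.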

      Alternatively, we could add some kind of read-consistency option to the collection / scope management REST APIs. Though even in this case, I think the default should be a quorum read.
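
      If we went the REST route, it might look something like the following from a client's point of view (Python; the endpoint is the existing collection-management one, but the consistency parameter is purely hypothetical and does not exist in the current API):

      # Purely hypothetical: sketching a read-consistency option on the
      # collection management REST API.
      import requests

      resp = requests.post(
          "http://127.0.0.1:9001/pools/default/buckets/b_3487/scopes/s1/collections",
          data={"name": "c_0"},
          params={"consistency": "quorum"},  # hypothetical option; default would be quorum
          auth=("Administrator", "password"),
      )
      resp.raise_for_status()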

      I am interested in people's opinions on this topic.

      Attachments

        1. n_0.zip (10.32 MB)
        2. n_1.zip (6.75 MB)
        3. n_2.zip (9.88 MB)


            People

              dfinlay Dave Finlay
