Couchbase Server / MB-22002

Metadata store that is immediately consistent


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version: 5.0.0
    • Fix Version: 7.0.0
    • Component: ns_server
    • Security Level: Public
    • Cluster Setup:
      2 nodes

      Node 1: kv, index, n1ql
      Node 2: index

    Description

      We need a metadata store that is immediately consistent.

      The metadata store should begin with nodes, buckets and collections, and should eventually include things such as GSI indexes. An example indexing use case that fails today, and that this store would fix, is as follows:


      [GSI] Indexes can be created with duplicate index_names when one of the indexer nodes is in a failed-over state

      Steps:
      1. Configure cluster. Create bucket default and load the bucket with 10k items.
      Sample document:

      {
        "name": "pymc0",
        "age": 0,
        "index": 0,
        "body": "VTKGNKUHMP"
      }
      

      2. Create index ind_3 on default(name). It gets created on node 2.
      3. Failover Node 2.
      4. Create index ind_3 on default(age). It gets created on node 1.
      5. Rebalance in Node 2 with full recovery.

      Now there are two indexes with the same name "ind_3" but different index definitions.
      When dropping ind_3, the later one gets dropped; there is no way to control the drop behaviour in such cases.

      The expectation is that indexes with duplicate index_names should not be allowed to be created (see the sketch below).
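
      As a hedged sketch of that expectation (assuming the Couchbase Python SDK 4.x and illustrative localhost Administrator credentials, none of which come from this ticket): the second CREATE INDEX that reuses the name ind_3 should fail rather than create a duplicate.

        # Hedged sketch of the *expected* behaviour, not current behaviour:
        # reusing the index name "ind_3" should be rejected. Host and
        # credentials below are illustrative assumptions.
        from couchbase.auth import PasswordAuthenticator
        from couchbase.cluster import Cluster
        from couchbase.exceptions import CouchbaseException
        from couchbase.options import ClusterOptions

        cluster = Cluster(
            "couchbase://127.0.0.1",
            ClusterOptions(PasswordAuthenticator("Administrator", "password")),
        )

        cluster.query("CREATE INDEX ind_3 ON default(name)").execute()
        try:
            # Same name, different definition: with a consistent metadata
            # store this must fail, even while an indexer node is failed over.
            cluster.query("CREATE INDEX ind_3 ON default(age)").execute()
            print("BUG: duplicate index name accepted")
        except CouchbaseException as err:
            print("OK: duplicate index name rejected:", err)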

      Snapshot attached.
      Log Location:

      https://s3.amazonaws.com/bugdb/jira/duplicate_indexes/collectinfo-2016-12-19T082442-ns_1%4010.111.170.101.zip
      https://s3.amazonaws.com/bugdb/jira/duplicate_indexes/collectinfo-2016-12-19T082442-ns_1%4010.111.170.102.zip
      

      Attachments

        Issue Links


          Activity

            siri Sriram Melkote (Inactive) added a comment -

            This is a known limitation, as we do not have a globally consistent metadata store; please see DOC-664.

            There is a plan for the ns_server team to provide a consistent metadata store. I'm not sure when it will happen.
            siri Sriram Melkote (Inactive) added a comment - edited

            This also causes us problems when trying to operate reasonably after some nodes have failed. Because we do not have leader election, each node is authoritative for (only) the indexes it hosts. When a node fails, this leads to unpredictable (timing-dependent) behaviour in whether other nodes see the indexes it contained as offline (i.e., metadata seen but node absent) or missing (metadata never seen).
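
            A toy illustration of that timing dependence (the data structures and names here are hypothetical, not ns_server's actual protocol): whether a surviving node classifies an index as offline or missing depends only on whether it happened to receive the owner's metadata before the failure.

              # Toy model of per-node-authoritative metadata (hypothetical,
              # for illustration only): each node knows only the metadata it
              # has already received from its peers.
              def classify(view, index_name, live_nodes):
                  """How one node's local view classifies an index after a failure."""
                  owner = view.get(index_name)
                  if owner is None:
                      return "missing"      # never saw the metadata
                  if owner not in live_nodes:
                      return "offline"      # saw metadata, but owning node is gone
                  return "online"

              # Node A exchanged metadata with node2 before it failed; node B never did.
              view_a = {"ind_3": "node2"}
              view_b = {}
              live = {"node1"}              # node2 has failed

              print("A sees ind_3 as", classify(view_a, "ind_3", live))  # offline
              print("B sees ind_3 as", classify(view_b, "ind_3", live))  # missing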

            jliang John Liang added a comment - edited

            There are a couple of things that a consistent store can provide:

            1) Unique constraint checks.

            2) Linearizability. Let's say two requests (A and B) happen concurrently, and those requests are ordered such that request A is performed before B. The same request ordering must be preserved for every peer, so that each peer arrives at the same internal state after processing those requests. In other words, each peer handles request A first, then request B. For indexing, if two ALTER INDEX statements happen concurrently, we should either reject the second statement, or have every indexer execute both statements in the same exact order. No peer may (1) miss a request, (2) execute a request more than once, or (3) execute requests in a different order. A sketch of such a log-driven state machine follows below.

            -------------

            Dave brought up a question before regarding client availability/consistency semantics in case the master is unavailable. We probably should think about this a little more. For a client that is a follower (a peer that does not have voting rights), it could mean no write availability but read availability (on stale metadata).
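
            A minimal sketch of the point-2 properties (all names and shapes here are assumptions for illustration, not an actual ns_server design): every peer applies the same totally ordered log exactly once, and the unique-name constraint from point 1 is enforced inside the state machine.

              # Hypothetical sketch: peers replay one totally ordered log of
              # metadata requests, deduplicated by request id, so no peer can
              # miss a request, apply one twice, or apply them out of order.
              class Peer:
                  def __init__(self):
                      self.indexes = {}      # index name -> definition
                      self.applied = set()   # request ids already executed

                  def apply(self, req):
                      rid, op, name, defn = req
                      if rid in self.applied:          # (2) never execute twice
                          return
                      self.applied.add(rid)
                      if op == "create":
                          if name in self.indexes:     # point 1: unique names
                              print(f"reject {rid}: index {name!r} already exists")
                              return
                          self.indexes[name] = defn
                      elif op == "alter":
                          self.indexes[name] = defn

              # One agreed-upon order; every peer consumes it front to back,
              # so (1) nothing is missed and (3) order is identical everywhere.
              log = [
                  (1, "create", "ind_3", "ON default(name)"),
                  (2, "create", "ind_3", "ON default(age)"),  # duplicate -> rejected
                  (3, "alter",  "ind_3", "ON default(name) WITH ..."),
              ]

              peers = [Peer(), Peer()]
              for req in log:
                  for p in peers:
                      p.apply(req)

              assert peers[0].indexes == peers[1].indexes    # identical end state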

             

             

            dfinlay Dave Finlay added a comment -

            This is a significant feature that we won't get to in Vulcan. I'm moving it to be considered for Mad Hatter, the next Server release.


            siri Sriram Melkote (Inactive) added a comment -

            Hi - can you please involve me in prioritizing this for Mad Hatter? I don't want it to get bumped off without giving me a chance to make a case for this feature.
            dfinlay Dave Finlay added a comment -

            With quorum failover merged, I will resolve this ticket.


            nirvair.singh Nirvair Singh Bhinder added a comment -

            Hi Dave Finlay, please answer my question above when you have a minute.

            hemant.rajput Hemant Rajput added a comment -

            Validated on 7.0.0-5017.

            Duplicate indexes are still getting created.

            Steps to reproduce:

            1. Create a 3-node cluster (kv:n1ql, index, index).
            2. Load 10k docs.
            3. Create index idx:

              CREATE INDEX idx ON test.test_scope_1.test_collection_1(name);

            4. Failover the node on which the index resides.
            5. Create index idx again:

              CREATE INDEX idx ON test.test_scope_1.test_collection_1(join_day);

            6. Add the failed-over node back with full recovery.
            7. Check that both indexes are available in the cluster (one way to check is sketched below).
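
            One way to check step 7 is to look for repeated names in the system:indexes keyspace; the SDK, host, and credentials in this sketch are illustrative assumptions, but system:indexes and the GROUP BY/HAVING form are standard N1QL.

              # Hedged sketch: list index names that appear more than once,
              # via the system:indexes keyspace. Host and credentials are
              # illustrative assumptions.
              from couchbase.auth import PasswordAuthenticator
              from couchbase.cluster import Cluster
              from couchbase.options import ClusterOptions

              cluster = Cluster(
                  "couchbase://127.0.0.1",
                  ClusterOptions(PasswordAuthenticator("Administrator", "password")),
              )

              result = cluster.query(
                  "SELECT name, COUNT(*) AS copies "
                  "FROM system:indexes GROUP BY name HAVING COUNT(*) > 1"
              )
              for row in result.rows():
                  print(f"duplicate index name: {row['name']} x{row['copies']}")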
            dfinlay Dave Finlay added a comment - edited

            Hi Hemant - we have a metadata store that is immediately consistent, but just not for indexes yet. This will happen post CC. I'll clone this ticket for that purpose and then resolve this one.

            To verify this, you can try creating buckets or collections with duplicate names; a hedged sketch of that check follows below.
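
            For example, a minimal sketch of that verification against the cluster REST API (POST /pools/default/buckets is the standard bucket-creation endpoint; host, credentials, and quota are illustrative assumptions): the second create with the same name should be rejected.

              # Hedged sketch: creating the same bucket name twice via the
              # REST API. With a consistent metadata store the second
              # request must fail.
              import requests

              BASE = "http://127.0.0.1:8091"            # illustrative host
              AUTH = ("Administrator", "password")      # illustrative credentials
              params = {"name": "dup_test", "ramQuotaMB": "100",
                        "bucketType": "couchbase"}

              first = requests.post(f"{BASE}/pools/default/buckets",
                                    data=params, auth=AUTH)
              second = requests.post(f"{BASE}/pools/default/buckets",
                                     data=params, auth=AUTH)

              print("first create:", first.status_code)    # expected 202
              print("second create:", second.status_code)  # expected 4xx
              assert second.status_code != 202, "duplicate bucket name accepted"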


            mihir.kamdar Mihir Kamdar added a comment -

            Thanks Dave. Balakumaran Gopal, I think you have tests for bucket and collection duplicate names. Can you please validate the ticket based on those?

            Balakumaran.Gopal Balakumaran Gopal added a comment -

            Marking this closed based on the tests of drop/recreate for collections.

            People

              Assignee:
              Balakumaran.Gopal Balakumaran Gopal
              Reporter:
              prasanna.gholap Prasanna Gholap [X] (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              25

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes
