  1. Couchbase Server
  2. MB-22002

Metadata store that is immediately consistent


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 7.0.0
    • Component/s: ns_server
    • Security Level: Public
    • Cluster Setup:
      2 nodes

      Node 1: kv index n1ql
      Node 2: index

    Description

      We need a metadata store that is immediately consistent.

      The metadata store should begin with nodes, buckets and collections, and should eventually include things such as GSI indexes. The following is an example of an indexing use case that fails today and that this store would fix:


      [GSI] Indexes can be created with duplicate index_names when one of the indexer nodes is in a failed-over state

      Steps:
      1. Configure cluster. Create bucket default and load the bucket with 10k items.
      Sample document:

      {
        "name": "pymc0",
        "age": 0,
        "index": 0,
        "body": "VTKGNKUHMP"
      }
      

      2. Create index ind_3 on default(name). It gets created on node 2.
      3. Failover Node 2.
      4. Create index ind_3 on default(age). It gets created on node 1.
      5. Rebalance in Node 2 with full recovery.

      Now there are 2 indexes with the same name "ind_3" but different index definitions.
      When dropping ind_3, the later one gets dropped. There is no way of controlling drop behaviour in such cases.

      The expectation is that indexes with duplicate index_names shouldn't be allowed to be created.
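
The steps above can be modelled with a small simulation. This is only a sketch under stated assumptions: `IndexerNode`, `create_index` and the "check only the online nodes" rule are hypothetical stand-ins for the behaviour described in this report, not the actual indexer code.

```python
from dataclasses import dataclass, field


@dataclass
class IndexerNode:
    """Toy indexer that only knows about the indexes it hosts itself."""
    name: str
    online: bool = True
    indexes: dict = field(default_factory=dict)  # index name -> indexed field


def visible_names(nodes):
    """Index names a create request can check against: online nodes only."""
    return {n for node in nodes if node.online for n in node.indexes}


def create_index(nodes, target, index_name, indexed_field):
    """Create an index on `target` unless an online node already hosts the name."""
    if index_name in visible_names(nodes):
        raise ValueError(f"index {index_name!r} already exists")
    target.indexes[index_name] = indexed_field


def definitions(nodes, index_name):
    """All definitions of `index_name` found across online nodes."""
    return sorted(node.indexes[index_name]
                  for node in nodes
                  if node.online and index_name in node.indexes)


node1, node2 = IndexerNode("node1"), IndexerNode("node2")
cluster = [node1, node2]

create_index(cluster, node2, "ind_3", "name")  # step 2: lands on node 2
node2.online = False                           # step 3: failover
create_index(cluster, node1, "ind_3", "age")   # step 4: name check passes
node2.online = True                            # step 5: full recovery
duplicates = definitions(cluster, "ind_3")
print(duplicates)
```

In this model the second create succeeds because the only copy of the first definition lives on the failed-over node, so after recovery two definitions share one name.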

      Snapshot attached.
      Log Location:

      https://s3.amazonaws.com/bugdb/jira/duplicate_indexes/collectinfo-2016-12-19T082442-ns_1%4010.111.170.101.zip
      https://s3.amazonaws.com/bugdb/jira/duplicate_indexes/collectinfo-2016-12-19T082442-ns_1%4010.111.170.102.zip
      


          Activity

            prasanna.gholap Prasanna Gholap [X] (Inactive) created issue -
            siri Sriram Melkote (Inactive) made changes -
            Field Original Value New Value
            Fix Version/s Spock [ 12927 ]
            siri Sriram Melkote (Inactive) made changes -
            Priority Major [ 3 ] Critical [ 2 ]

            siri Sriram Melkote (Inactive) added a comment -

            This is a known limitation as we do not have a globally consistent metadata store, please see DOC-664.

            There is a plan for ns_server team to provide a consistent metadata store. I'm not sure when it will happen.
            siri Sriram Melkote (Inactive) made changes -
            Assignee Sriram Melkote [ siri ] Dave Finlay [ dfinlay ]
            siri Sriram Melkote (Inactive) made changes -
            Description (original report text unchanged; new text added above it):
            We need a metadata store that has better ACID properties.

            -----

            Original Problem:
            siri Sriram Melkote (Inactive) made changes -
            Summary "[GSI] Indexes can be created with duplicate index_names when one of the indexer node is in failed over state" "Metadata store that has better ACID properties."
            siri Sriram Melkote (Inactive) made changes -
            Component/s secondary-index [ 11211 ]
            Component/s ns_server [ 10019 ]
            siri Sriram Melkote (Inactive) made changes -
            Fix Version/s Spock [ 12927 ]
            raju Raju Suravarjjala made changes -
            Fix Version/s Spock.Next [ 13624 ]
            siri Sriram Melkote (Inactive) made changes -
            Link This issue blocks MB-24011 [ MB-24011 ]
            siri Sriram Melkote (Inactive) added a comment - - edited

            This also causes us problems when trying to operate reasonably while some nodes have failed. As we do not have leader election, each node is authoritative for (only) the indexes it hosts. When a node fails, this leads to unpredictable (timing-dependent) behavior: other nodes may see the indexes it contained either as offline (i.e., metadata seen but node absent) or as missing (never saw metadata).
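
The offline/missing distinction can be sketched as follows. This is an illustration only: `classify`, `IndexView` and the shape of the metadata cache are assumptions made for the example, not names taken from the indexer.

```python
from enum import Enum


class IndexView(Enum):
    ONLINE = "online"
    OFFLINE = "offline"  # metadata seen, hosting node absent
    MISSING = "missing"  # metadata never seen


def classify(index_name, cached_metadata, live_nodes):
    """How a peer sees an index, given only its local metadata cache.

    cached_metadata: dict mapping index name -> hosting node, as last
        received by this peer.
    live_nodes: set of node names that are currently reachable.
    """
    host = cached_metadata.get(index_name)
    if host is None:
        return IndexView.MISSING
    return IndexView.ONLINE if host in live_nodes else IndexView.OFFLINE


# Peer A received node2's metadata before node2 failed; peer B did not.
peer_a_cache = {"ind_3": "node2"}
peer_b_cache = {}
live = {"node1"}

view_a = classify("ind_3", peer_a_cache, live)
view_b = classify("ind_3", peer_b_cache, live)
print(view_a, view_b)
```

The two peers disagree purely because of when the metadata reached them, which is the timing dependence described in the comment.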

            siri Sriram Melkote (Inactive) made changes -
            Security Private [ 10010 ]
            siri Sriram Melkote (Inactive) made changes -
            Summary "Metadata store that has better ACID properties." "Metadata store that is immediately consistent"
            siri Sriram Melkote (Inactive) made changes -
            Description (original report text unchanged; first line changed): "We need a metadata store that has better ACID properties." "We need a metadata store that is immediately consistent."
            jliang John Liang added a comment - - edited

            There are a couple of things that a consistent store can provide:

            1) Unique constraint check.

            2) Linearizability. Let's say two requests (A and B) happen concurrently, and those requests are ordered such that request A is performed before B. The same request ordering must be preserved for every peer, so that each peer arrives at the same internal state after processing those requests. In other words, each peer would handle request A first, then request B. For indexes, if 2 alter-index statements happen concurrently, we should either reject the second alter-index statement, or have each indexer execute both alter-index statements in exactly the same order. A peer cannot (1) miss a request, (2) execute a request more than once, or (3) execute requests in a different order.
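
Both properties can be illustrated with a single ordered log that every peer replays. The sketch below assumes a hypothetical `MetadataLog` that stands in for whatever consensus mechanism orders the requests; the class names, the request-id scheme and the "alter replaces the definition" rule are illustrative assumptions.

```python
class MetadataLog:
    """Single ordered log of accepted requests (stand-in for a consensus log)."""

    def __init__(self):
        self.entries = []       # (request_id, op, index_name, definition)
        self._names = set()     # index names accepted so far
        self._seen_ids = set()  # request ids accepted so far

    def submit(self, request_id, op, index_name, definition):
        """Append a request once; enforce unique index names at append time."""
        if request_id in self._seen_ids:
            return False  # redelivered request, not appended again
        if op == "create":
            if index_name in self._names:
                raise ValueError(f"index {index_name!r} already exists")
            self._names.add(index_name)
        elif index_name not in self._names:
            raise KeyError(f"index {index_name!r} does not exist")
        self._seen_ids.add(request_id)
        self.entries.append((request_id, op, index_name, definition))
        return True


class Peer:
    """Replays the log strictly in order, from its own position."""

    def __init__(self):
        self.applied = 0  # number of log entries applied so far
        self.state = {}   # index name -> current definition

    def catch_up(self, log):
        for _, _, name, definition in log.entries[self.applied:]:
            self.state[name] = definition
            self.applied += 1


log, fast, slow = MetadataLog(), Peer(), Peer()

log.submit(1, "create", "ind_3", "name")
fast.catch_up(log)
log.submit(2, "alter", "ind_3", "replicas=1")
log.submit(3, "alter", "ind_3", "replicas=2")
redelivered = log.submit(2, "alter", "ind_3", "replicas=1")
fast.catch_up(log)
slow.catch_up(log)  # applies all three entries in one pass

try:
    log.submit(4, "create", "ind_3", "age")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(fast.state, slow.state, redelivered, duplicate_rejected)
```

Because each peer tracks its own position in the same log, it neither skips nor repeats an entry, and a peer that catches up late reaches the same state as one that followed along.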

            -------------

            Dave brought up a question before regarding client availability/consistency semantics (in case the master is unavailable). We should probably think about this a little more. For a client that is a follower (a peer that does not have voting rights), it could mean no write availability but read availability (on stale metadata).
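
One possible reading of these follower semantics, as a sketch: writes go through the leader and fail when it is unreachable, while reads are served from the last synced copy together with its revision. `Leader`, `Follower` and `LeaderUnavailable` are hypothetical names for illustration, and no design decision in this ticket confirms this behaviour.

```python
class LeaderUnavailable(Exception):
    """Raised when a write cannot reach the leader."""


class Leader:
    def __init__(self):
        self.state = {}
        self.revision = 0
        self.reachable = True

    def write(self, key, value):
        self.state[key] = value
        self.revision += 1
        return self.revision


class Follower:
    """Non-voting peer holding a local, possibly stale, copy of the metadata."""

    def __init__(self, leader):
        self.leader = leader
        self.snapshot = {}
        self.revision = 0

    def sync(self):
        """Refresh the local copy when the leader can be reached."""
        if self.leader.reachable:
            self.snapshot = dict(self.leader.state)
            self.revision = self.leader.revision

    def read(self, key):
        """Return (value, revision) so callers can judge staleness."""
        return self.snapshot.get(key), self.revision

    def write(self, key, value):
        if not self.leader.reachable:
            raise LeaderUnavailable("no write availability without the leader")
        return self.leader.write(key, value)


leader = Leader()
follower = Follower(leader)
follower.write("bucket:default", "created")
follower.sync()

leader.reachable = False  # master becomes unavailable
try:
    follower.write("bucket:other", "created")
    write_failed = False
except LeaderUnavailable:
    write_failed = True

stale_read = follower.read("bucket:default")
print(write_failed, stale_read)
```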

             

             

            lynn.straus Lynn Straus made changes -
            Fix Version/s vulcan [ 14610 ]
            dfinlay Dave Finlay made changes -
            Issue Type Bug [ 1 ] Improvement [ 4 ]
            dfinlay Dave Finlay added a comment -

            This is a significant feature that we won't get to in Vulcan. I'm moving it to be considered in Mad Hatter - the next Server release.

            dfinlay Dave Finlay made changes -
            Fix Version/s Mad-Hatter [ 15037 ]
            Fix Version/s Spock.Next [ 13624 ]
            Fix Version/s vulcan [ 14610 ]

            siri Sriram Melkote (Inactive) added a comment -

            Hi - can you please involve me in prioritizing this for Mad Hatter? I don't want it to get bumped off without giving me a chance to make a case for this feature.
            siri Sriram Melkote (Inactive) made changes -
            Link This issue relates to MB-31074 [ MB-31074 ]
            prathibha Prathibha Bisarahalli (Inactive) made changes -
            Link This issue blocks CBSE-5852 [ CBSE-5852 ]
            dfinlay Dave Finlay made changes -
            Fix Version/s Cheshire-Cat [ 15915 ]
            Fix Version/s Mad-Hatter [ 15037 ]
            dfinlay Dave Finlay made changes -
            Security Private [ 10010 ] Public [ 10011 ]
            ajit.yagaty Ajit Yagaty [X] (Inactive) made changes -
            Link This issue relates to CBSE-6979 [ CBSE-6979 ]
            avleen.khanuja Avleen Singh Khanuja (Inactive) made changes -
            Link This issue blocks CBSE-8567 [ CBSE-8567 ]
            dfinlay Dave Finlay made changes -
            Labels functional-test Cheshire-Cat-Committed functional-test
            nirvair.singh Nirvair Singh Bhinder made changes -
            Link This issue relates to CBSE-8879 [ CBSE-8879 ]
            dfinlay Dave Finlay made changes -
            Assignee Dave Finlay [ dfinlay ] Aliaksey Artamonau [ aliaksey artamonau ]
            dfinlay Dave Finlay made changes -
            Link This issue blocks MB-27384 [ MB-27384 ]
            simon.dew Simon Dew made changes -
            Remote Link This issue links to "Page (Couchbase, Inc. Wiki)" [ 20706 ]
            dfinlay Dave Finlay added a comment -

            With quorum failover merged, I will resolve this ticket.

            dfinlay Dave Finlay made changes -
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]
            Balakumaran.Gopal Balakumaran Gopal made changes -
            Assignee Aliaksey Artamonau [ aliaksey artamonau ] Mihir Kamdar [ mihir.kamdar ]

            nirvair.singh Nirvair Singh Bhinder added a comment -

            Hi Dave Finlay, please answer my question above when you have a minute.
            mihir.kamdar Mihir Kamdar made changes -
            Assignee Mihir Kamdar [ mihir.kamdar ] Balakumaran Gopal [ balakumaran.gopal ]
            Balakumaran.Gopal Balakumaran Gopal made changes -
            Assignee Balakumaran Gopal [ balakumaran.gopal ] Mihir Kamdar [ mihir.kamdar ]
            mihir.kamdar Mihir Kamdar made changes -
            Assignee Mihir Kamdar [ mihir.kamdar ] Hemant Rajput [ hemant.rajput ]

            hemant.rajput Hemant Rajput added a comment -

            Validated on 7.0.0-5017

            Duplicate indexes are getting created.

            Steps to reproduce:

            1. Create a 3-node cluster kv:n1ql-index-index.
            2. Load 10k docs.
            3. Create index idx:

              CREATE INDEX idx ON test.test_scope_1.test_collection_1(name);

            4. Fail over the node on which the index resides.
            5. Create index idx again:

              CREATE INDEX idx ON test.test_scope_1.test_collection_1(join_day);

            6. Add back the failed-over node with full recovery.
            7. Check that both indexes are available in the cluster.
            hemant.rajput Hemant Rajput made changes -
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            hemant.rajput Hemant Rajput made changes -
            Attachment node2-cb600-centos7.vagrants.zip [ 137965 ]
            Attachment node3-cb600-centos7.vagrants.zip [ 137966 ]
            hemant.rajput Hemant Rajput made changes -
            Attachment node1-cb600-centos7.vagrants.zip [ 137967 ]
            dfinlay Dave Finlay added a comment - - edited

            Hi Hemant - we have a metadata store that is immediately consistent, but just not for indexes yet. This will happen post CC. I'll clone this ticket for that purpose and then resolve this one.

            To verify this, you can try creating buckets or collections with duplicate names.

            dfinlay Dave Finlay made changes -
            Link This issue is cloned by MB-46007 [ MB-46007 ]
            dfinlay Dave Finlay made changes -
            Description (original report text unchanged; the "Original Problem:" heading replaced by a new paragraph after the first line):
            We need a metadata store that is immediately consistent.

            The metadata store should begin with nodes, buckets and collections and should eventually include things such as GSI indexes. An example use case for indexing that fails today that this would fix would be as follows:
            dfinlay Dave Finlay made changes -
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            mihir.kamdar Mihir Kamdar made changes -
            Assignee Hemant Rajput [ hemant.rajput ] Balakumaran Gopal [ balakumaran.gopal ]

            mihir.kamdar Mihir Kamdar added a comment -

            Thanks Dave. Balakumaran Gopal, I think you have tests for bucket and collection duplicate names. Can you please validate the ticket based on that?

            Balakumaran.Gopal Balakumaran Gopal added a comment -

            Marking this closed based on the tests of drop/recreate for collections.
            Balakumaran.Gopal Balakumaran Gopal made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.0 [ 17233 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s Cheshire-Cat [ 15915 ]

            People

              Assignee: Balakumaran Gopal
              Reporter: Prasanna Gholap [X] (Inactive)
              Votes: 1
              Watchers: 25
