  1. Couchbase Server
  2. MB-4781

start_key_docid returns unexpected (or unintuitive) results

    Details

      Description

      Below are the results from a query that returns multiple rows for a single key. The third document has id "0-8fbe114", but when I apply the start_key_docid filter I get exactly the same results, with the list still starting at doc id "0-14479a7", although the view should have returned a subset. I also tried start_key_docid in combination with start_key and with end_key, but neither returns a subset starting at the requested doc id.

      QUERY WITH KEY = [2008,11,1]
      curl "http://10.1.2.104:8092/default/_design/dev_test_view-ed4bf91/_view/dev_test_view-ed4bf91?full_set=true&key=%5B2008%2C11%2C1%5D&connection_timeout=60000&debug=true" > query_with_key

      {"id":"0-2857e2f","key":[2008,11,1],"value":{"_id":"0-2857e2f","_rev":"1-000040a864a752c20000024c00000000","$flags":0,"$expiration":0,"name":"employee-0-...
      {"id":"0-2857e2f","key":[2008,11,1],"value":{"_id":"0-328e876","_rev":"1-000040a8b60c1357000001b800000000","$flags":0,"$expiration":0,"name":"employee-0-...
      {"id":"0-91f1a76","key":[2008,11,1],"value":{"_id":"0-91f1a76","_rev":"1-000040a6f0ddebd30000023500000000","$flags":0,"$expiration":0,"name":"employee-0-...
      {"id":"1-2857e2f","key":[2008,11,1],"value":{"_id":"1-2857e2f","_rev":"1-000040a86e8ccc2d0000024c00000000","$flags":0,"$expiration":0,"name":"employee-1-...
      {"id":"1-2857e2f","key":[2008,11,1],"value":{"_id":"1-328e876","_rev":"1-000040a8bffda165000001b800000000","$flags":0,"$expiration":0,"name":"employee-1-...
      {"id":"1-91f1a76","key":[2008,11,1],"value":{"_id":"1-91f1a76","_rev":"1-000040a7b360b9840000023500000000","$flags":0,"$expiration":0,"name":"employee-1-...
      {"id":"10-2857e2f","key":[2008,11,1],"value":{"_id":"10-2857e2f","_rev":"1-000040a9848f822f0000024f00000000","$flags":0,"$expiration":0,"name":"employee-10-...

      QUERY WITH KEY = [2008,11,1] and START_KEY_DOCID = "0-8fbe114"
      curl "http://10.1.2.104:8092/default/_design/dev_test_view-ed4bf91/_view/dev_test_view-ed4bf91?full_set=true&key=%5B2008%2C11%2C1%5D&start_key_docid=%220-8fbe114%22&connection_timeout=60000&limit=10&skip=0" > query_with_startkeydocid

      ....results are the same as previous query, although I expected them to start with the requested doc_id...

      Also noticed that "id" and "_id" do not match; not sure whether that has something to do with the behavior of this filter.

      1. query_with_key
        208 kB
        Tommie McAfee
      2. query_with_key_and_startkeydocid
        208 kB
        Tommie McAfee

        Activity

        filipe manana filipe manana added a comment -

        "start_key_doc_id" (same as startkey_docid) is meant to be used together with "start_key" (same as startkey), not "key".

        The _id is there because you're apparently emitting the documents themselves as map values. This is how Couch has always worked. The general rule: meta information in docs has a _ prefix; everywhere else (views, changes feed) it doesn't.
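A minimal sketch of that pairing rule in Python. The helper name `check_docid_params` is hypothetical and not part of any Couchbase client; the `endkey_docid`/`endkey` pairing is an assumption made by analogy with the start parameters.

```python
def check_docid_params(params):
    """Return warnings for docid parameters that lack their companion key.

    Hypothetical helper: it only inspects parameter names and does not
    contact a server.
    """
    warnings = []
    if "startkey_docid" in params and "startkey" not in params:
        warnings.append("startkey_docid is meant to be used together with startkey")
    # Assumption: endkey_docid pairs with endkey in the same way.
    if "endkey_docid" in params and "endkey" not in params:
        warnings.append("endkey_docid is meant to be used together with endkey")
    return warnings
```

Called with the parameters from the description (`key` plus a docid filter and no `startkey`), it returns one warning; called with both `startkey` and `startkey_docid`, it returns an empty list.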

        tommie Tommie McAfee added a comment -

        Right, I was advised to try with start_key, but the results are the same and do not start at the requested id.

        Perhaps only the filters that can be used in conjunction with, say, "key" should be selectable in the UI. Otherwise a non-Couch user may expect these filters to do something.

        filipe manana filipe manana added a comment -

        Tommie, you specified "start_key_docid" - this doesn't exist - use "startkey_docid" or "start_key_doc_id".

        Originally, in couch every name uses the _ logic to separate words - all except startkey, endkey and startkey_docid and endkey_docid. For these 4, the aliases "start_key", "end_key", "start_key_doc_id" and "end_key_doc_id" were added upstream (I did it) and to our codebase.
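The naming rule above can be sketched in Python as follows. The alias table comes from this comment; the function `normalize_params` and the "same letters, different underscores" misspelling check are illustrative assumptions, not server behavior.

```python
# Aliases named in the comment above, mapped to the original Couch names.
ALIASES = {
    "start_key": "startkey",
    "end_key": "endkey",
    "start_key_doc_id": "startkey_docid",
    "end_key_doc_id": "endkey_docid",
}
CANONICAL = set(ALIASES.values())

# Canonical names with underscores removed, used to spot near misses.
_SQUASHED = {name.replace("_", ""): name for name in CANONICAL}


def normalize_params(params):
    """Return (normalized, misspelled) for a dict of query parameters.

    Hypothetical helper: known aliases are renamed to canonical names;
    names that differ from a canonical name only in underscore placement
    (such as "start_key_docid") are reported instead of passed through.
    """
    normalized, misspelled = {}, {}
    for name, value in params.items():
        squashed = name.replace("_", "")
        if name in ALIASES:
            normalized[ALIASES[name]] = value
        elif name not in CANONICAL and squashed in _SQUASHED:
            misspelled[name] = _SQUASHED[squashed]
        else:
            normalized[name] = value
    return normalized, misspelled
```

With `{"start_key": "200", "start_key_docid": "x", "limit": "10"}` this renames `start_key` to `startkey`, leaves `limit` alone, and reports `start_key_docid` as a likely misspelling of `startkey_docid`.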

        tommie Tommie McAfee added a comment -

        There is a UI bug in the parameter naming: this "start_key_docid" was added to the query by the Couchbase 2.0 UI.

        Also, I tried using "startkey_docid" and "start_key_doc_id", but neither seems to filter the results.

        filipe manana filipe manana added a comment -

        Tommie, do you think you can write a simple script to create that dataset and do the 2 queries?
        I would like to try it locally.
        thanks

        farshid Farshid Ghods (Inactive) added a comment -

        Tommie,

        Can you please provide a test case that Filipe can run against cluster_run with one node?

        FilipeManana Filipe Manana (Inactive) added a comment -

        Waiting for the testrunner test or a standalone script to reproduce.

        tommie Tommie McAfee added a comment -

        Filipe, maybe you also have some data to give this a quick try.
        I tried it on a simple set of integers from data loaded in testrunner, and it doesn't look like it's working (startkey_docid = 684a59d-480).

        http://10.17.3.56:9500/default/_design/dev_test_view-684a59d/_view/dev_test_view-684a59d?start_key=200&startkey_docid=%22684a59d-480%22&connection_timeout=60000&limit=10&skip=0

        {"total_rows":61,"rows":[
        {"id":"684a59d-262","key":262,"value":null},
        {"id":"684a59d-480","key":480,"value":null},
        {"id":"684a59d-510","key":510,"value":null},
        .....

        also tried start_key_doc_id and start_key_docid

        farshid Farshid Ghods (Inactive) added a comment -

        I also noticed that debug=true does not return any extra info.

        http://10.17.3.56:9500/default/_design/dev_test_view-684a59d/_view/dev_test_view-684a59d?start_key=200&startkey_docid=%22684a59d-480%22&connection_timeout=60000&limit=10&skip=0&debug=true

        {"total_rows":61,"rows":[
        {"id":"684a59d-262","key":262,"value":null},
        {"id":"684a59d-480","key":480,"value":null},
        {"id":"684a59d-510","key":510,"value":null},
        {"id":"684a59d-661","key":661,"value":null},
        {"id":"684a59d-944","key":944,"value":null},
        {"id":"684a59d-1175","key":1175,"value":null},
        {"id":"684a59d-1204","key":1204,"value":null},
        {"id":"684a59d-1394","key":1394,"value":null},
        {"id":"684a59d-1576","key":1576,"value":null},
        {"id":"684a59d-1607","key":1607,"value":null}
        ]
        }

        tommie Tommie McAfee added a comment -

        Hi Filipe,

        Still not getting the start_key_docid filter to work as expected. There is now a test in testrunner that you can use to reproduce this:

        python testrunner -i <resource_file> -t viewquerytests.ViewQueryTests.test_simple_dataset_startkey_endkey_docid_queries

        2012-02-13 19:41:14,686 - root - INFO - Quering view dev_test_view-11f6a22 with params:

        {'debug': 'true', 'start_key': 5000, 'startkey_docid': '"11f6a22-5100"'}

        2012-02-13 19:41:14,687 - root - INFO - Params

        {'debug': 'true', 'start_key': 5000, 'connection_timeout': 60000, 'startkey_docid': '"11f6a22-5100"', 'full_set': True}

        2012-02-13 19:41:14,687 - root - INFO - index query url: http://10.2.2.10:8091/couchBase/default/_design/dev_test_view-11f6a22/_view/dev_test_view-11f6a22?debug=true&start_key=5000&connection_timeout=60000&startkey_docid="11f6a22-5100"&full_set=true
        2012-02-13 19:41:14,906 - root - INFO - view returned in 0.21882891655 seconds
        2012-02-13 19:41:14,906 - root - INFO - was able to get view results after trying 1 times
        2012-02-13 19:41:14,917 - root - INFO - key_set has 5000 elements
        2012-02-13 19:41:14,917 - root - INFO - retrieved 5000 keys expected: 4900

        tommie Tommie McAfee added a comment -

        Filipe,

        I have this query result from using start_key = 20:

        {"total_rows":30000,"rows":[
        {"id":"3a25fe7-20","key":20,"value":null},
        {"id":"da9d0f6-20","key":20,"value":null},
        {"id":"eda9e3d-20","key":20,"value":null},
        {"id":"3a25fe7-21","key":21,"value":null},
        {"id":"da9d0f6-21","key":21,"value":null},
        {"id":"eda9e3d-21","key":21,"value":null},
        {"id":"3a25fe7-22","key":22,"value":null},
        {"id":"da9d0f6-22","key":22,"value":null},
        {"id":"eda9e3d-22","key":22,"value":null},
        {"id":"3a25fe7-23","key":23,"value":null}
        ]
        }

        Attempting to set start_key_docid to "da9d0f6-20" returns the same number of rows, although the first row with the duplicate key should be skipped:

        http://127.0.0.1:9500/default/_design/dev_test_view-9460592/_view/dev_test_view-9460592?full_set=true&debug=true&start_key=40&start_key_docid=%22a67408a-40%22&connection_timeout=60000&limit=10&skip=0

        tommie Tommie McAfee added a comment -

        This is basically a bug with the UI, because it uses start_key_docid instead of 'startkey_docid'.

        This query works:
        http://127.0.0.1:9500/default/_design/dev_test_view-9460592/_view/dev_test_view-9460592?full_set=true&startkey=40&startkey_docid=a67408a-40&connection_timeout=60000&limit=10&skip=0

        BigBlueHat Benjamin Young added a comment -

        Yeah, looks like it could have also been start_key_doc_id (per Filipe's first comment).

        Will fix.

        BigBlueHat Benjamin Young added a comment -

        Resolved: http://review.couchbase.org/13336

        Feel free to close this ticket (or re-open it) based on the final review/merging.

        thuan Thuan Nguyen added a comment -

        Integrated in github-ns-server-2-0 #303 (See http://qa.hq.northscale.net/job/github-ns-server-2-0/303/)
        removing underscores from startkey/endkey fields. MB-4781 (Revision 9bc2bde6f88d43b273f7278e18a91c3871404cf2)

        Result = SUCCESS
        Aliaksey Kandratsenka :
        Files :

        • priv/public/index.html
        francares francares added a comment -

        Does anyone know if this bug was fixed in DP4?
        I'm using Couchbase version 2.0.0 community edition (build-724) and it still happens. I'm using the startkey_docid and startkey query params.

        tommie Tommie McAfee added a comment -

        Yes, this was fixed in DP4. What do your ids and keys look like?
        Depending on your doc ids, you should not have quotes around the startkey_docid, even if the ids are strings.
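As a rough illustration of the quoting point, here is a Python sketch that builds the query string with the standard library. It assumes that `startkey` is sent as JSON (as in the queries earlier in this ticket) while `startkey_docid` is sent as a raw string; the function name `view_query_string` is hypothetical.

```python
import json
from urllib.parse import urlencode


def view_query_string(startkey, startkey_docid, **extra):
    """Build a view query string (hypothetical helper).

    startkey is JSON-encoded, so a string key gets quotes and a number
    does not. startkey_docid is passed through unquoted.
    """
    params = {
        "startkey": json.dumps(startkey),
        "startkey_docid": startkey_docid,
    }
    params.update(extra)
    return urlencode(params)


print(view_query_string(40, "a67408a-40", limit=10))
```

For the values from the working query in this ticket, this prints `startkey=40&startkey_docid=a67408a-40&limit=10`. A string key such as `"k"` would be emitted as `startkey=%22k%22`, while the doc id stays unquoted.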

        francares francares added a comment -

        They are GUIDs. When I call the view with the following URLs:

        http://10.230.58.221:8092/test/_design/dev_appsByCategory/_view/appsByCategory?startkey_docid=%2200%22&connection_timeout=60000&limit=10&skip=0
        or
        http://10.230.58.221:8092/test/_design/dev_appsByCategory/_view/appsByCategory?startkey_docid=00&connection_timeout=60000&limit=10&skip=0

        it returns keys like 03057CA7-5F27-4364-87FD-892548D8CB43, so the filter is not applied in the view.

        francares francares added a comment -

        Same happens with string type keys.

        tommie Tommie McAfee added a comment -

        Well, a couple of things here, as I also thought this was unintuitive before understanding how this used to work in CouchDB.

        Using startkey_docid requires 2 things:
        1 that the startkey filter is also used in the same query
        2 that the results returned from using startkey contain duplicate keys

        So if I have:

        {
        "key0" : "val0"
        "key1" : "val1" <_id = k1v1>
        "key1" : "val2" <_id = k1v2>
        "key1" : "val3" <_id = k1v3>
        "key2" : "val4"
        }

        I can do something like
        startkey = key1, startkey_docid = k1v2

        and my results would be

        {
        "key1" : "val2" <_id = k1v2>
        "key1" : "val3" <_id = k1v3>
        "key2" : "val4"
        }

        It could be that in your case the only thing you need is startkey, if all your map functions are emitting unique keys.
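The example above can be simulated in a few lines of Python. This is a simplified model, not the server's implementation: it assumes rows are ordered by (key, doc id) and compares them with plain Python tuple ordering rather than Couch collation. The doc ids `k0v0` and `k2v4` are made up here, since the example does not give ids for those rows.

```python
def rows_from(rows, startkey, startkey_docid=None):
    """Return the rows at or after (startkey, startkey_docid).

    Simplified model: rows are sorted by (key, id), and the docid only
    narrows the starting point among rows whose key equals startkey.
    """
    ordered = sorted(rows, key=lambda r: (r["key"], r["id"]))
    start = (startkey, startkey_docid or "")
    return [r for r in ordered if (r["key"], r["id"]) >= start]


rows = [
    {"key": "key0", "id": "k0v0", "value": "val0"},  # id is hypothetical
    {"key": "key1", "id": "k1v1", "value": "val1"},
    {"key": "key1", "id": "k1v2", "value": "val2"},
    {"key": "key1", "id": "k1v3", "value": "val3"},
    {"key": "key2", "id": "k2v4", "value": "val4"},  # id is hypothetical
]

for row in rows_from(rows, "key1", "k1v2"):
    print(row["key"], row["value"], row["id"])
```

In this model, `startkey=key1` alone yields all three `key1` rows plus the `key2` row, and adding `startkey_docid=k1v2` drops only the `k1v1` row. If every key is unique, the docid has nothing to narrow, which matches the closing remark above.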


          People

          • Assignee:
            BigBlueHat Benjamin Young
            Reporter:
            tommie Tommie McAfee