Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Security Level: Public
    • Labels:
      None
    • Environment:
      elasticsearch adapter beta snapshot

      2 couchbase nodes
      2 elasticsearch nodes

      Description

      Applied a 20k op/sec load:
      run workload --get 50 --create 12 --update 30 --delete 8 --ops 20000

      sample 512byte document template:

      {
      "padding": "456GDVRHLDIZA6LB1MS8F90WZRB1CQQ2DKWCOYOTHBYPXAF0NY2ER6GO6OHFO8UI0EHSP7KCKIERSPKRDOUF1MNHPS9XPLW08494SN0XANL2HAGSH2DVI1SBPWBX0LJR4K34IFQHDD74MK5VPYXEEW325L25TC87E44JUXY9RIL6DNF2W1A6JE7N72R58ROG9PB6N6IS5BS6MVIZB8TD1PTNQHHOG0VLH64FO0JZW1FEKU3MY2C4KF1SD12X709BWSRQCZKPKYZ9HSYQKUXCFH6RB2ELU1EM06VE9LU27URVLWO43LL4TF1YUBYJZ84V5DCMGX089YQZHC30CLPDF8MT1423NHUW80F2SJHEA8J5WJ4HJL9EQKD077ET712PYUXGFOR98QZDFPQ2C8FRQC44Z4SIMIBCD03EB6Z6EGYSI10VJBIQCOH024Q1LYEWTABAG74ZQQ22KJW78UPN2IBE4N57B",
      "city": "mateo",
      "list": [
      1,
      2,
      3
      ],
      "hello": "world",
      "map":

      { "sample": "map", "complex": 1 }

      }

      and observing the following errors in logs:

      [2013-01-21 16:05:56,005][ERROR][transport.couchbase.capi ] [Omega Red] Unable to parse decoded base64 data as JSON, indexing stub for id: D963FBCB91C6037390BB6404
      [2013-01-21 16:05:56,006][ERROR][transport.couchbase.capi ] [Omega Red] Body was: �)+k^K^@@@@@@@@@@@@@@@@@@@@@@@^@ Parse error was: org.codehaus.jackson.JsonParseException: Unexpected character ('ð' (code 240)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
      at [Source: [B@42ff1c7f; line: 1, column: 2]
      [2013-01-21 16:05:55,787][TRACE][com.couchbase.capi.servlet.CAPIServlet] Got bulk docs request for default/988;00000000000000000000000000000000
      [2013-01-21 16:05:55,786][DEBUG][transport.couchbase.capi ] [Omega Red] Bulk doc entry is [{meta=

      {id=DB55C88BCF4D5242B5EF15E7, rev=1-00448c25e6d518f20000000000000000, expiration=0, flags=0}

      , base64=eyJwYWRkaW5nIjogIlQ2VDY3RFNUWE9XMU9aMzlPRThaTlRPVzhSUkhaRTM3NEtIWU9LREVQRkQ4VDhTQ1hDTEk2WEc3Q1pCN0xNNTg3TjBCRDI0Rkw5NUJZSzBOTkNGSE5YTERMQjRXVzVVRkxHRDYwVjlMUDVSTEk3VkhKSU9BRFA2U0NGSldNRzQ0MDZaRTVMQjYzRExUM0hFSEZXSTVOSDdOODFWSDIzWFBLU1JNMkRKUE9DM1gzWUxZTzIwODZONkJQMFI1VDMyWlhLMTJESUsxNzc0SjMyVzBRUUlKVE9VUUNNWUpRN1g5SVI0S1hMWk9NM1hJWkVYQjEzNU42WlBPNzkxNkxYQ0U3SjJKVlJCTzBGNkk2TURXOVBLWlVIQ0xKOTZPUzBWTU00MkhBTlpHMVlOME1OMlcyTjdYWTFWNjRBR1ZaT0w4SDdWMVFPVVFGTzVWTkgyWTNOR0o0RkZSRDMxQzFKSDNXVkNJS0QzRUVOQ0FSVlVMQTZWTkNLTVpEQVo0TDBDS003TEFINFNJMk5YMVVBM080UkJMQkVMOEoyTzM5TjFVNjVDVUE2RkNWRExHQk9BTThCVjVKMzIwVU85RkRIVUtaMjRQNk02QzdBSjA2TDkxWjhaSFpUVVA0QlVOTiIsICJjaXR5IjogIm1hdGVvIiwgImxpc3QiOiBbMSwgMiwgM10sICJoZWxsbyI6ICJ3b3JsZCIsICJtYXAiOiB7InNhbXBsZSI6ICJtYXAiLCAiY29tcGxleCI6IDF9fQ==}, {meta=

      {id=A49129193F3E5B5A733D9C1A, rev=4-00448c278e0946e20000000000000000, expiration=0, flags=0}

      , base64=eyJwYWRkaW5nIjogIk5QUlFGOVdNTDBFQkVCUU0zS0NVNVo3VU9ZMkRKVllDWEsyRlFCWTBPREhMUzRPTjJZRk5VSzczNkc4QlRLRFM1T0pQNkZHRjEzQ0FZRFE3MVpLSDJEVzlWTElQUTJWTFFHS1lPTFlQQTZaQUlOTThOUzUxSzdYVFBIT1o2RkxKUThJOFpQV003NE1MSTFYOTkxVENRMTZZUlBZNFlKRTZTN0NZS1czSDZJRzY0SzNVMEw4MjIwSk42R1Y4TjVUTVhDN0lORDBZU1I4SlM2Uk82UkpXRFRaUUNJWUhMRlY4V1U2VjNaU0NZN0lWS0xZVU9WWFFHS0ZXOUU4WVA2Q1RRSDJSTUJVNjI4M0ROMkZSVUJIR1FVV1c2R1NCVExVRzg4MDJKOEszSVcwTEpZREJIQjZYME5VNExKMTRETDNMREQ4SjUwRTVLTEdEOFZMQlk0VkJPOVU2TVhZM085UjFOS1ZMWkdGVFc3MjUxUlU4VEkzSkNCRkI4WkU1NE8yWFkyV0pMV0cxMlFRTEg3WjFQUUhTUDhSTzlYT1hVRFBQU01ZQTYxSUFYQ05ESUVGT1IzNzhaWVVYNlIwU01XSTdIVjAxMVozQ0w5SldDUVZXVEpTS0VMTFVUWFQ2NyIsICJjaXR5IjogIm1hdGVvIiwgImxpc3QiOiBbMSwgMiwgM10sICJoZWxsbyI6ICJ3b3JsZCIsICJtYXAiOiB7InNhbXBsZSI6ICJtYXAiLCAiY29tcGxleCI6IDF9fQ==},

      1. elasticsearch_simple.log.gz
        35 kB
        Tommie McAfee
      2. elasticsearch.log.gz
        845 kB
        Tommie McAfee
      3. xdcr.tar.gz
        1.96 MB
        Tommie McAfee
      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        tommie Tommie McAfee created issue -
        Hide
        tommie Tommie McAfee added a comment -

        full logs available at:

        scp root@10.3.2.4:/tmp/elasticsearch-log-snippit.log.gz

        Show
        tommie Tommie McAfee added a comment - full logs available at: scp root@10.3.2.4:/tmp/elasticsearch-log-snippit.log.gz
        Hide
        mschoch Marty Schoch added a comment -

        I've looked at the logs and in the bulk-docs request before this error was logged, we see the following:

        {meta=

        {id=D963FBCB91C6037390BB6404, rev=5-002e64fdb7fa7f3b0000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}

        , b
        ase64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}

        This is odd for several reasons:

        1. att_reason=non-JSON mode implies that couchbase thinks this is non-JSON, but doing a get request on the server shows it as valid JSON
        2. deleted=true but the document exists in Couchbase server now, Tommie says the keys are random so its not impossible it was recreated, but unlikely

        Partially related... the plug-in was designed to detect non-JSON mode but it contained a bug, and was instead looking for "attr_reason" which is the wrong key. This explains the error message being logged, but not the rest of the behavior.

        Show
        mschoch Marty Schoch added a comment - I've looked at the logs and in the bulk-docs request before this error was logged, we see the following: {meta= {id=D963FBCB91C6037390BB6404, rev=5-002e64fdb7fa7f3b0000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true} , b ase64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=} This is odd for several reasons: 1. att_reason=non-JSON mode implies that couchbase thinks this is non-JSON, but doing a get request on the server shows it as valid JSON 2. deleted=true but the document exists in Couchbase server now, Tommie says the keys are random so its not impossible it was recreated, but unlikely Partially related... the plug-in was designed to detect non-JSON mode but it contained a bug, and was instead looking for "attr_reason" which is the wrong key. This explains the error message being logged, but not the rest of the behavior.
        Hide
        tommie Tommie McAfee added a comment -

        I've reproduced this with following functional test that creates then deletes docs at revision 1, so there are no intermediate mutations like in system testing:

        python26 testrunner -i es-3.ini -t pytests.xdcr.esXDCR.ESTests.load_with_async_ops -p items=10000,rdirection=unidirection,doc-ops=delete,end_replication_flag=1

        In the attached logs, non-JSON mode is there at time of delete. And since we are at rev-1 seems like it should be there at time of creation.

        [2013-01-22 08:51:19,658][TRACE][com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta=

        {id=es_xdcr_docs-6486, rev=1-0044c3018c499ba20000000000000000, expiration=0, flags=0}

        , base64=eyJvcmRlcmluZyI6IDE0ODYsICJtdXRhdGVkIjogMCwgIl9pZCI6ICJlc194ZGNyX2RvY3MtNjQ4NiIsICJzaXRlX25hbWUiOiAieWFob28ifQ==}, {meta={id=es_xdcr_docs-8443, rev=2-0044c3026268e2930000000000000000, att_reason=non-JSON mode,

        full es logs from test attached, gathering couch xdcr as well

        Show
        tommie Tommie McAfee added a comment - I've reproduced this with following functional test that creates then deletes docs at revision 1, so there are no intermediate mutations like in system testing: python26 testrunner -i es-3.ini -t pytests.xdcr.esXDCR.ESTests.load_with_async_ops -p items=10000,rdirection=unidirection,doc-ops=delete,end_replication_flag=1 In the attached logs, non-JSON mode is there at time of delete. And since we are at rev-1 seems like it should be there at time of creation. [2013-01-22 08:51:19,658] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta= {id=es_xdcr_docs-6486, rev=1-0044c3018c499ba20000000000000000, expiration=0, flags=0} , base64=eyJvcmRlcmluZyI6IDE0ODYsICJtdXRhdGVkIjogMCwgIl9pZCI6ICJlc194ZGNyX2RvY3MtNjQ4NiIsICJzaXRlX25hbWUiOiAieWFob28ifQ==}, {meta={id=es_xdcr_docs-8443, rev=2-0044c3026268e2930000000000000000, att_reason=non-JSON mode, full es logs from test attached, gathering couch xdcr as well
        tommie Tommie McAfee made changes -
        Field Original Value New Value
        Attachment elasticsearch.log.gz [ 16254 ]
        tommie Tommie McAfee made changes -
        Attachment xdcr.tar.gz [ 16255 ]
        tommie Tommie McAfee made changes -
        Assignee Marty Schoch [ mschoch ] Junyi Xie [ junyi ]
        Hide
        junyi Junyi Xie (Inactive) added a comment -

        It seems to me the JSON parser called by ep_engine or couchstore incorrectly mark the doc as non-JSON and then propagate the flag to CouchDB and XDCR.

        Talked to Chris briefly, actually neither XCDR and CouchDB would parse the doc content to determine it is a JSON or not. CouchDB (couch_doc module) just takes the flag from upper level (ep_engine or couchstore?) to determine the doc is a JSON or not.

        Chris, please advise and put your comment here so we could know whom should look at this issue. Thank you!

        Show
        junyi Junyi Xie (Inactive) added a comment - It seems to me the JSON parser called by ep_engine or couchstore incorrectly mark the doc as non-JSON and then propagate the flag to CouchDB and XDCR. Talked to Chris briefly, actually neither XCDR and CouchDB would parse the doc content to determine it is a JSON or not. CouchDB (couch_doc module) just takes the flag from upper level (ep_engine or couchstore?) to determine the doc is a JSON or not. Chris, please advise and put your comment here so we could know whom should look at this issue. Thank you!
        junyi Junyi Xie (Inactive) made changes -
        Assignee Junyi Xie [ junyi ] J Chris Anderson [ jchrisa ]
        Hide
        mschoch Marty Schoch added a comment -

        i overheard a bit of the conversation and it was mentioned that it was using some heuristic, not full parsing.

        that may be fine, but we implemented an optimization which will do the wrong thing if this heuristic is ever wrong. so, if we suspect the heuristic may be wrong, then i should remove the optimization and try to parse it anyway.

        Show
        mschoch Marty Schoch added a comment - i overheard a bit of the conversation and it was mentioned that it was using some heuristic, not full parsing. that may be fine, but we implemented an optimization which will do the wrong thing if this heuristic is ever wrong. so, if we suspect the heuristic may be wrong, then i should remove the optimization and try to parse it anyway.
        Hide
        mschoch Marty Schoch added a comment -

        Had a conversation with Chris and Aaron about this. There are a couple of possibilities here:

        1. It's possible that deletes always come through as non-JSON mode because the body is empty?
        2. There were some legacy validation issues around docs containing things like "_id". At least one of the docs above contains this.

        Aaron has asked for a sample document that exhibits the behavior, will wait until tomorrow 1/29 when Tommie will be in the office.

        Also, Aaron has confirmed that the optimization we're using should be safe as the heuristic should only fail the other direction.

        Show
        mschoch Marty Schoch added a comment - Had a conversation with Chris and Aaron about this. There are a couple of possibilities here: 1. It's possible that deletes always come through as non-JSON mode because the body is empty? 2. There were some legacy validation issues around docs containing things like "_id". At least one of the docs above contains this. Aaron has asked for a sample document that exhibits the behavior, will wait until tomorrow 1/29 when Tommie will be in the office. Also, Aaron has confirmed that the optimization we're using should be safe as the heuristic should only fail the other direction.
        Hide
        tommie Tommie McAfee added a comment -

        Just ran a simpler case by creating a sample doc from UI and then deleting it.

        Document body:

        {
        "click": "to edit",
        "new in 2.0": "there are no reserved field names"
        }

        In log stream below, looks like doc is created with rev=1-xxx and at time of delete it's rev=2-xxx. Not sure if a new revision of the document is first sent prior to delete or if this is expected. Additional logs attached(elasticsearch_simple.log.gz).

        [2013-01-29 10:46:27,718][TRACE][com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta=

        {id=newdoc, rev=1-0046efc8ac0d3a230000000000000000, expiration=0, flags=0}, base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}]}
        [2013-01-29 10:46:27,718][DEBUG][transport.couchbase.capi ] [Maestro] Bulk doc entry is [{meta={id=newdoc, rev=1-0046efc8ac0d3a230000000000000000, expiration=0, flags=0}

        , base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}]
        [2013-01-29 10:47:32,743][TRACE][com.couchbase.capi.servlet.CAPIServlet] Got HEAD request for default/262;00000000000000000000000000000000
        [2013-01-29 10:47:32,747][TRACE][com.couchbase.capi.servlet.CAPIServlet] Got HEAD request for default/master;00000000000000000000000000000000
        [2013-01-29 10:47:32,752][TRACE][com.couchbase.capi.servlet.CAPIServlet] Got revs diff request for default/262;00000000000000000000000000000000
        [2013-01-29 10:47:32,752][TRACE][com.couchbase.capi.servlet.CAPIServlet] revs diff request body was

        {"newdoc":"2-0046efc8ac0d3a240000000000000000"}

        [2013-01-29 10:47:32,752][TRACE][com.couchbase.capi.servlet.CAPIServlet] revs diff parsed value is

        {newdoc=2-0046efc8ac0d3a240000000000000000}
        [2013-01-29 10:47:32,752][TRACE][transport.couchbase.capi ] [Maestro] _revs_diff request for default/262;00000000000000000000000000000000 : {newdoc=2-0046efc8ac0d3a240000000000000000}

        [2013-01-29 10:47:32,753][TRACE][transport.couchbase.capi ] [Maestro] _revs_diff response for default/262;00000000000000000000000000000000 is: {newdoc={missing=2-0046efc8ac0d3a240000000000000000}}
        [2013-01-29 10:47:32,755][TRACE][transport.couchbase.capi ] [Maestro] _revs_diff response AFTER conflict resolution {newdoc={missing=2-0046efc8ac0d3a240000000000000000}}
        [2013-01-29 10:47:32,801][TRACE][com.couchbase.capi.servlet.CAPIServlet] Got bulk docs request for default/262;00000000000000000000000000000000
        [2013-01-29 10:47:32,802][TRACE][com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta=

        {id=newdoc, rev=2-0046efc8ac0d3a240000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}, base64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]}
        [2013-01-29 10:47:32,802][DEBUG][transport.couchbase.capi ] [Maestro] Bulk doc entry is [{meta={id=newdoc, rev=2-0046efc8ac0d3a240000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}

        , base64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]

        Show
        tommie Tommie McAfee added a comment - Just ran a simpler case by creating a sample doc from UI and then deleting it. Document body: { "click": "to edit", "new in 2.0": "there are no reserved field names" } In log stream below, looks like doc is created with rev=1-xxx and at time of delete it's rev=2-xxx. Not sure if a new revision of the document is first sent prior to delete or if this is expected. Additional logs attached(elasticsearch_simple.log.gz). [2013-01-29 10:46:27,718] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta= {id=newdoc, rev=1-0046efc8ac0d3a230000000000000000, expiration=0, flags=0}, base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}]} [2013-01-29 10:46:27,718] [DEBUG] [transport.couchbase.capi ] [Maestro] Bulk doc entry is [{meta={id=newdoc, rev=1-0046efc8ac0d3a230000000000000000, expiration=0, flags=0} , base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}] [2013-01-29 10:47:32,743] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] Got HEAD request for default/262;00000000000000000000000000000000 [2013-01-29 10:47:32,747] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] Got HEAD request for default/master;00000000000000000000000000000000 [2013-01-29 10:47:32,752] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] Got revs diff request for default/262;00000000000000000000000000000000 [2013-01-29 10:47:32,752] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] revs diff request body was {"newdoc":"2-0046efc8ac0d3a240000000000000000"} [2013-01-29 10:47:32,752] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] revs diff parsed value is {newdoc=2-0046efc8ac0d3a240000000000000000} [2013-01-29 10:47:32,752] [TRACE] [transport.couchbase.capi ] [Maestro] _revs_diff request for default/262;00000000000000000000000000000000 : {newdoc=2-0046efc8ac0d3a240000000000000000} [2013-01-29 10:47:32,753] [TRACE] [transport.couchbase.capi ] [Maestro] _revs_diff response for default/262;00000000000000000000000000000000 is: {newdoc={missing=2-0046efc8ac0d3a240000000000000000}} [2013-01-29 10:47:32,755] [TRACE] [transport.couchbase.capi ] [Maestro] _revs_diff response AFTER conflict resolution {newdoc={missing=2-0046efc8ac0d3a240000000000000000}} [2013-01-29 10:47:32,801] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] Got bulk docs request for default/262;00000000000000000000000000000000 [2013-01-29 10:47:32,802] [TRACE] [com.couchbase.capi.servlet.CAPIServlet] parsed value is {new_edits=false, docs=[{meta= {id=newdoc, rev=2-0046efc8ac0d3a240000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}, base64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]} [2013-01-29 10:47:32,802] [DEBUG] [transport.couchbase.capi ] [Maestro] Bulk doc entry is [{meta={id=newdoc, rev=2-0046efc8ac0d3a240000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true} , base64=8CkrawsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=}]
        tommie Tommie McAfee made changes -
        Attachment elasticsearch_simple.log.gz [ 16341 ]

          People

          • Assignee:
            jchrisa J Chris Anderson [X] (Inactive)
            Reporter:
            tommie Tommie McAfee
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Gerrit Reviews

              There are no open Gerrit changes