Couchbase Server / MB-23524

Subdoc UTF-8 paths are incorrectly rejected as PathInvalid


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 4.6.0
    • 5.0.0
    • memcached
    • Triaged
    • Yes
    • KV Spock Beta

    Description

      (post: https://forums.couchbase.com/t/subdoc-api-mutatein-does-not-handle-unicode/12210)

      In a recent forum post a user showed that inserting the path

      bucket.mutateIn("root").insert("Fútbol", "value", true).execute();

      raises an error, while

      bucket.mutateIn("root").insert("Futbol", "value", true).execute();

      works fine.

      The failing call raises "com.couchbase.client.java.error.subdoc.PathInvalidException: Path Fútbol ends in an array index in root, expected dictionary", which is misleading because the path contains no array index.

      Could it be that Unicode characters in the path are not handled properly?

       

      I think the client is encoding it properly, see both versions here:

               +-------------------------------------------------+
               |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
      +--------+-------------------------------------------------+----------------+
      |00000000| 80 c7 00 04 03 00 00 34 00 00 00 14 00 00 00 01 |.......4........|
      |00000010| 00 00 00 00 00 00 00 00 00 06 01 72 6f 6f 74    |...........root |
      +--------+-------------------------------------------------+----------------+
      [cb-io-1-1] 2017-03-24 08:22:13 TRACE LoggingHandler:94 - [id: 0xa5c70a21, L:/127.0.0.1:54468 - R:localhost/127.0.0.1:11210] WRITE: 13B
               +-------------------------------------------------+
               |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
      +--------+-------------------------------------------------+----------------+
      |00000000| 46 75 74 62 6f 6c 22 76 61 6c 75 65 22          |Futbol"value"   |
      +--------+-------------------------------------------------+----------------+
       
       
      vs.
       
               +-------------------------------------------------+
               |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
      +--------+-------------------------------------------------+----------------+
      |00000000| 80 c7 00 04 03 00 00 34 00 00 00 15 00 00 00 01 |.......4........|
      |00000010| 00 00 00 00 00 00 00 00 00 07 01 72 6f 6f 74    |...........root |
      +--------+-------------------------------------------------+----------------+
      [cb-io-1-1] 2017-03-24 08:23:07 TRACE LoggingHandler:94 - [id: 0x8459ac46, L:/127.0.0.1:54527 - R:localhost/127.0.0.1:11210] WRITE: 14B
               +-------------------------------------------------+
               |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
      +--------+-------------------------------------------------+----------------+
      |00000000| 46 c3 ba 74 62 6f 6c 22 76 61 6c 75 65 22       |F..tbol"value"  |
      +--------+-------------------------------------------------+----------------+
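
      To cross-check the two dumps, here is a minimal sketch that encodes both paths as UTF-8 and recomputes the lengths seen on the wire. The class and method names are hypothetical, and the 3-byte extras size (2-byte path length plus 1-byte flags) is read off the dumps rather than taken from a spec.

```java
import java.nio.charset.StandardCharsets;

public class SubdocPathBytes {
    // Render bytes as lowercase, space-separated hex, matching the dump style.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Path length in bytes, i.e. the 2-byte field at the start of the extras.
    public static int pathLength(String path) {
        return path.getBytes(StandardCharsets.UTF_8).length;
    }

    // Total body length = extras + key + path + value.
    // Assumption: extras are 3 bytes, as observed in the dumps.
    public static int bodyLength(String key, String path, String jsonValue) {
        int extras = 3;
        return extras
                + key.getBytes(StandardCharsets.UTF_8).length
                + pathLength(path)
                + jsonValue.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "\u00fa" is U+00FA (u with acute); the escape avoids depending on
        // the source file encoding.
        String ascii = "Futbol";
        String utf8 = "F\u00fatbol";
        String value = "\"value\"";

        System.out.println(hex(ascii.getBytes(StandardCharsets.UTF_8)));
        System.out.println(hex(utf8.getBytes(StandardCharsets.UTF_8)));
        System.out.printf("path lengths: %d vs %d%n",
                pathLength(ascii), pathLength(utf8));
        System.out.printf("body lengths: 0x%02x vs 0x%02x%n",
                bodyLength("root", ascii, value),
                bodyLength("root", utf8, value));
    }
}
```

      The non-ASCII path is one byte longer (c3 ba in place of 75), and both the path length (06 vs 07) and the body length (14 vs 15) in the request headers agree with that, which supports the reading that the client encodes the request correctly.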
      
      

      In the error case the server responds with:

       

               +-------------------------------------------------+
               |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
      +--------+-------------------------------------------------+----------------+
      |00000000| 81 c7 00 00 00 00 00 c2 00 00 00 00 00 00 00 01 |................|
      |00000010| 00 00 00 00 00 00 00 00                         |........        |
      +--------+-------------------------------------------------+----------------+
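
      For reference, a small sketch that pulls the status out of the response header above. It assumes the standard 24-byte memcached binary protocol response layout (magic, opcode, key length, extras length, datatype, status, body length, opaque, CAS); the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class SubdocResponseStatus {
    // Status value observed in the dump for the failing request.
    public static final int SUBDOC_PATH_EINVAL = 0x00c2;

    // Opcode is the second byte of the header.
    public static int opcode(byte[] header) {
        return header[1] & 0xff;
    }

    // Status is the big-endian 16-bit field at offset 6.
    public static int status(byte[] header) {
        // ByteBuffer is big-endian by default, matching network byte order.
        return ByteBuffer.wrap(header).getShort(6) & 0xffff;
    }

    // Opaque is the 32-bit field at offset 12; it echoes the request's opaque.
    public static int opaque(byte[] header) {
        return ByteBuffer.wrap(header).getInt(12);
    }

    public static void main(String[] args) {
        // The 24 response bytes from the dump.
        byte[] header = {
            (byte) 0x81, (byte) 0xc7, 0x00, 0x00, 0x00, 0x00, 0x00, (byte) 0xc2,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
        };
        System.out.printf("opcode=0x%02x status=0x%04x opaque=%d%n",
                opcode(header), status(header), opaque(header));
        System.out.println("path invalid: " + (status(header) == SUBDOC_PATH_EINVAL));
    }
}
```

      The status field is 0x00c2 with the same opcode (0xc7) and opaque (1) as the request, so the rejection comes from the server side rather than from client-side validation.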

       

       

       


        Activity

          daschl Michael Nitschinger added a comment -

          Mark Nunberg, after discussing this with Dave Rigby, it seems possible that there is a bug in subjson's path handling of Unicode. Let me know if I can help debug this further.

          trond Trond Norbye added a comment -

          The packet is correctly encoded and PROTOCOL_BINARY_RESPONSE_SUBDOC_PATH_EINVAL is only returned after a mapping from Subdoc::Error::PATH_EINVAL

          drigby Dave Rigby added a comment -

          Mark - could you also add a test in memcached testapp to verify that it works end-to-end?

          (Wouldn't want to close this and then discover some issues in memcached...)


          build-team Couchbase Build Team added a comment -

          Build 5.0.0-2417 contains subjson commit 80a9543e7a6588d6c57d89ca55f6eafae10e0fdc with commit message:
          MB-23524: Allow non-ASCII characters in path
          https://github.com/couchbase/subjson/commit/80a9543e7a6588d6c57d89ca55f6eafae10e0fdc

          drigby Dave Rigby added a comment -

          Oliver Downard See my last comment to Mark - would be good to have an end-to-end (memcached) test for this.


          build-team Couchbase Build Team added a comment -

          Build 5.0.0-2881 contains memcached commit a5fa4ba87b7cee705edd3393420111e93eae4f51 with commit message:
          MB-23524: Add UTF8 subdoc test in testapp
          https://github.com/couchbase/memcached/commit/a5fa4ba87b7cee705edd3393420111e93eae4f51

          People

            daschl Michael Nitschinger
            daschl Michael Nitschinger