  1. Couchbase Java Client
  2. JCBC-1140

Bucket#get (JsonObject) is allocating twice the memory for the document


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.1
    • Fix Version/s: 2.5.2
    • Component/s: Core

    Description

      Motivation

      ----------

      When retrieving a document with Bucket#get, the Netty buffer content is duplicated before Jackson parses it.

       

      Modifications

      -------------

      Wrap the ByteBuf in an InputStream. Jackson now parses the Netty ByteBuf content directly, without an intermediate copy.
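A dependency-free sketch of this idea, not the SDK's actual code. It rests on several assumptions: java.nio.ByteBuffer stands in for Netty's ByteBuf, the hypothetical ByteBufferInputStream stands in for Netty's ByteBufInputStream, and UTF-8 decoding stands in for Jackson's parser. decodeByCopy mirrors the old approach (copy the whole content into a byte[] first); decodeByStream mirrors the new one (the consumer pulls bytes straight from the buffer).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BufferStreamSketch {

    /** Hypothetical read-only InputStream view over a ByteBuffer. */
    static final class ByteBufferInputStream extends InputStream {
        private final ByteBuffer buf;

        ByteBufferInputStream(ByteBuffer source) {
            // duplicate() shares the underlying bytes but keeps its own position,
            // so reading here does not move the caller's position.
            this.buf = source.duplicate();
        }

        @Override
        public int read() {
            return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (!buf.hasRemaining()) {
                return -1;
            }
            int n = Math.min(len, buf.remaining());
            buf.get(dst, off, n);
            return n;
        }
    }

    /** Old style: allocate a byte[] as large as the document, then consume it. */
    static String decodeByCopy(ByteBuffer content) {
        byte[] copy = new byte[content.remaining()];
        content.duplicate().get(copy);
        return new String(copy, StandardCharsets.UTF_8);
    }

    /** New style: the consumer reads through a stream; no document-sized byte[] is allocated. */
    static String decodeByStream(ByteBuffer content) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteBufferInputStream(content), StandardCharsets.UTF_8)) {
            char[] chunk = new char[256];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                out.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Both methods return the same text and leave the caller's buffer position untouched. The difference is only in the temporary allocation made before the consumer runs.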

       

      Result

      ------

      Reduced memory allocations.

      YMMV, but in our test it was a ~10% reduction.


        Activity

          daschl Michael Nitschinger added a comment (edited) -

          Here are the full results with your approach (sharing and reader index reset) for both 1.5k small and 8k large docs:

          MyBenchmark.newLargeDirect                                   thrpt   20   31645.933 ±  1459.743   ops/s
          MyBenchmark.newLargeDirect:·gc.alloc.rate                    thrpt   20     597.478 ±    27.475  MB/sec
          MyBenchmark.newLargeDirect:·gc.alloc.rate.norm               thrpt   20   29736.014 ±     0.001    B/op
          MyBenchmark.newLargeDirect:·gc.churn.PS_Eden_Space           thrpt   20     596.663 ±    26.772  MB/sec
          MyBenchmark.newLargeDirect:·gc.churn.PS_Eden_Space.norm      thrpt   20   29708.606 ±   688.016    B/op
          MyBenchmark.newLargeDirect:·gc.churn.PS_Survivor_Space       thrpt   20       0.249 ±     0.065  MB/sec
          MyBenchmark.newLargeDirect:·gc.churn.PS_Survivor_Space.norm  thrpt   20      12.395 ±     3.247    B/op
          MyBenchmark.newLargeDirect:·gc.count                         thrpt   20     364.000              counts
          MyBenchmark.newLargeDirect:·gc.time                          thrpt   20     196.000                  ms
          MyBenchmark.newLargeHeap                                     thrpt   20   31727.854 ±  1325.760   ops/s
          MyBenchmark.newLargeHeap:·gc.alloc.rate                      thrpt   20     599.022 ±    25.214  MB/sec
          MyBenchmark.newLargeHeap:·gc.alloc.rate.norm                 thrpt   20   29736.014 ±     0.001    B/op
          MyBenchmark.newLargeHeap:·gc.churn.PS_Eden_Space             thrpt   20     599.502 ±    28.201  MB/sec
          MyBenchmark.newLargeHeap:·gc.churn.PS_Eden_Space.norm        thrpt   20   29759.234 ±   644.924    B/op
          MyBenchmark.newLargeHeap:·gc.churn.PS_Survivor_Space         thrpt   20       0.250 ±     0.051  MB/sec
          MyBenchmark.newLargeHeap:·gc.churn.PS_Survivor_Space.norm    thrpt   20      12.443 ±     2.667    B/op
          MyBenchmark.newLargeHeap:·gc.count                           thrpt   20     362.000              counts
          MyBenchmark.newLargeHeap:·gc.time                            thrpt   20     191.000                  ms
          MyBenchmark.newSmallDirect                                   thrpt   20  212878.358 ± 10050.930   ops/s
          MyBenchmark.newSmallDirect:·gc.alloc.rate                    thrpt   20     789.301 ±    37.266  MB/sec
          MyBenchmark.newSmallDirect:·gc.alloc.rate.norm               thrpt   20    5840.002 ±     0.001    B/op
          MyBenchmark.newSmallDirect:·gc.churn.PS_Eden_Space           thrpt   20     789.403 ±    36.655  MB/sec
          MyBenchmark.newSmallDirect:·gc.churn.PS_Eden_Space.norm      thrpt   20    5843.212 ±   133.716    B/op
          MyBenchmark.newSmallDirect:·gc.churn.PS_Survivor_Space       thrpt   20       0.245 ±     0.063  MB/sec
          MyBenchmark.newSmallDirect:·gc.churn.PS_Survivor_Space.norm  thrpt   20       1.811 ±     0.470    B/op
          MyBenchmark.newSmallDirect:·gc.count                         thrpt   20     367.000              counts
          MyBenchmark.newSmallDirect:·gc.time                          thrpt   20     189.000                  ms
          MyBenchmark.newSmallHeap                                     thrpt   20  219323.555 ±  9935.312   ops/s
          MyBenchmark.newSmallHeap:·gc.alloc.rate                      thrpt   20     813.647 ±    36.882  MB/sec
          MyBenchmark.newSmallHeap:·gc.alloc.rate.norm                 thrpt   20    5840.002 ±     0.001    B/op
          MyBenchmark.newSmallHeap:·gc.churn.PS_Eden_Space             thrpt   20     813.845 ±    35.084  MB/sec
          MyBenchmark.newSmallHeap:·gc.churn.PS_Eden_Space.norm        thrpt   20    5843.708 ±   112.396    B/op
          MyBenchmark.newSmallHeap:·gc.churn.PS_Survivor_Space         thrpt   20       0.263 ±     0.048  MB/sec
          MyBenchmark.newSmallHeap:·gc.churn.PS_Survivor_Space.norm    thrpt   20       1.887 ±     0.320    B/op
          MyBenchmark.newSmallHeap:·gc.count                           thrpt   20     360.000              counts
          MyBenchmark.newSmallHeap:·gc.time                            thrpt   20     191.000                  ms
          MyBenchmark.oldLargeDirect                                   thrpt   20   30344.874 ±  1309.899   ops/s
          MyBenchmark.oldLargeDirect:·gc.alloc.rate                    thrpt   20     725.622 ±    31.336  MB/sec
          MyBenchmark.oldLargeDirect:·gc.alloc.rate.norm               thrpt   20   37664.014 ±     0.001    B/op
          MyBenchmark.oldLargeDirect:·gc.churn.PS_Eden_Space           thrpt   20     726.830 ±    32.893  MB/sec
          MyBenchmark.oldLargeDirect:·gc.churn.PS_Eden_Space.norm      thrpt   20   37727.626 ±   531.782    B/op
          MyBenchmark.oldLargeDirect:·gc.churn.PS_Survivor_Space       thrpt   20       0.448 ±     0.095  MB/sec
          MyBenchmark.oldLargeDirect:·gc.churn.PS_Survivor_Space.norm  thrpt   20      23.264 ±     4.863    B/op
          MyBenchmark.oldLargeDirect:·gc.count                         thrpt   20     367.000              counts
          MyBenchmark.oldLargeDirect:·gc.time                          thrpt   20     193.000                  ms
          MyBenchmark.oldLargeHeap                                     thrpt   20   30828.832 ±  1369.576   ops/s
          MyBenchmark.oldLargeHeap:·gc.alloc.rate                      thrpt   20     580.162 ±    25.747  MB/sec
          MyBenchmark.oldLargeHeap:·gc.alloc.rate.norm                 thrpt   20   29632.014 ±     0.001    B/op
          MyBenchmark.oldLargeHeap:·gc.churn.PS_Eden_Space             thrpt   20     580.847 ±    31.633  MB/sec
          MyBenchmark.oldLargeHeap:·gc.churn.PS_Eden_Space.norm        thrpt   20   29658.400 ±   686.091    B/op
          MyBenchmark.oldLargeHeap:·gc.churn.PS_Survivor_Space         thrpt   20       0.259 ±     0.049  MB/sec
          MyBenchmark.oldLargeHeap:·gc.churn.PS_Survivor_Space.norm    thrpt   20      13.222 ±     2.440    B/op
          MyBenchmark.oldLargeHeap:·gc.count                           thrpt   20     363.000              counts
          MyBenchmark.oldLargeHeap:·gc.time                            thrpt   20     192.000                  ms
          MyBenchmark.oldSmallDirect                                   thrpt   20  197502.798 ±  9973.839   ops/s
          MyBenchmark.oldSmallDirect:·gc.alloc.rate                    thrpt   20     915.207 ±    46.391  MB/sec
          MyBenchmark.oldSmallDirect:·gc.alloc.rate.norm               thrpt   20    7296.002 ±     0.001    B/op
          MyBenchmark.oldSmallDirect:·gc.churn.PS_Eden_Space           thrpt   20     915.681 ±    45.549  MB/sec
          MyBenchmark.oldSmallDirect:·gc.churn.PS_Eden_Space.norm      thrpt   20    7302.691 ±   158.591    B/op
          MyBenchmark.oldSmallDirect:·gc.churn.PS_Survivor_Space       thrpt   20       0.305 ±     0.046  MB/sec
          MyBenchmark.oldSmallDirect:·gc.churn.PS_Survivor_Space.norm  thrpt   20       2.429 ±     0.341    B/op
          MyBenchmark.oldSmallDirect:·gc.count                         thrpt   20     368.000              counts
          MyBenchmark.oldSmallDirect:·gc.time                          thrpt   20     191.000                  ms
          MyBenchmark.oldSmallHeap                                     thrpt   20  220111.795 ± 11281.918   ops/s
          MyBenchmark.oldSmallHeap:·gc.alloc.rate                      thrpt   20     801.803 ±    41.178  MB/sec
          MyBenchmark.oldSmallHeap:·gc.alloc.rate.norm                 thrpt   20    5736.002 ±     0.001    B/op
          MyBenchmark.oldSmallHeap:·gc.churn.PS_Eden_Space             thrpt   20     801.690 ±    39.571  MB/sec
          MyBenchmark.oldSmallHeap:·gc.churn.PS_Eden_Space.norm        thrpt   20    5738.177 ±   128.804    B/op
          MyBenchmark.oldSmallHeap:·gc.churn.PS_Survivor_Space         thrpt   20       0.260 ±     0.039  MB/sec
          MyBenchmark.oldSmallHeap:·gc.churn.PS_Survivor_Space.norm    thrpt   20       1.865 ±     0.284    B/op
          MyBenchmark.oldSmallHeap:·gc.count                           thrpt   20     363.000              counts
          MyBenchmark.oldSmallHeap:·gc.time                            thrpt   20     194.000                  ms
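The gc.alloc.rate.norm rows (bytes allocated per operation) can be compared directly between the old and new variants. The sketch below only does that arithmetic on the figures from the table. The class and method names are hypothetical.

```java
final class AllocationDelta {

    /** Percentage of bytes per operation saved by the new variant; negative means it allocates more. */
    static double reductionPercent(double oldBytesPerOp, double newBytesPerOp) {
        return (oldBytesPerOp - newBytesPerOp) / oldBytesPerOp * 100.0;
    }

    public static void main(String[] args) {
        // gc.alloc.rate.norm values copied from the benchmark output above (B/op).
        System.out.printf("large direct: %.2f%%%n", reductionPercent(37664.014, 29736.014));
        System.out.printf("small direct: %.2f%%%n", reductionPercent(7296.002, 5840.002));
        System.out.printf("large heap:   %.2f%%%n", reductionPercent(29632.014, 29736.014));
        System.out.printf("small heap:   %.2f%%%n", reductionPercent(5736.002, 5840.002));
    }
}
```

For direct buffers this works out to roughly 21% (large) and 20% (small) fewer bytes per operation. The absolute savings, 7928 B/op and 1456 B/op, are close to the stated document sizes, which is consistent with one full copy of the document being avoided. For heap buffers the new variant allocates 104 B/op more in both cases, about 0.4% and 1.8% respectively.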
          

          benoit.wiart Benoit Wiart added a comment -

          Thanks for all the benchmarks.

          For direct buffers, the stream-based parsing is better (throughput and memory) for both 1.6k and 8k documents.

          For heap-allocated buffers, the stream-based parsing should theoretically be worse, as it adds a memory allocation (the ByteBufInputStream) and CPU instructions (the bytes are read through an InputStream wrapper).

          Should you keep the old logic?

          You know this driver better than I do...

          Is there any platform where the ByteBuf will be heap-allocated?

          If so, we should test the ByteBuf and use the streaming logic only if it's not backed by an array.

          If not, K.I.S.S.: use only the stream.
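The check proposed here could look roughly like the sketch below. It is a stand-in under stated assumptions, not the change that was merged. java.nio.ByteBuffer replaces Netty's ByteBuf (both expose hasArray(), array() and arrayOffset()), UTF-8 decoding replaces the Jackson parse, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BackingArrayCheck {

    /** Reports which branch decode() would take for this buffer. */
    static String pathFor(ByteBuffer content) {
        return content.hasArray() ? "array" : "direct";
    }

    static String decode(ByteBuffer content) {
        if (content.hasArray()) {
            // Heap buffer: consume the backing array in place (old logic).
            // No copy and no wrapper object are needed.
            int start = content.arrayOffset() + content.position();
            return new String(content.array(), start, content.remaining(),
                    StandardCharsets.UTF_8);
        }
        // Direct buffer: there is no accessible array, so decode straight from
        // the buffer. This stands in for the ByteBufInputStream route and
        // avoids copying the content into a temporary byte[].
        return StandardCharsets.UTF_8.decode(content.duplicate()).toString();
    }
}
```

The condition costs one method call per document, which is why keeping both paths is cheap even if heap buffers turn out to be rare.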

           

           


          daschl Michael Nitschinger added a comment -

          I checked the code paths, and as far as I can see there are many places going down this path (all kinds of decoders in view, query, ... response parsing), and I think they all use direct-backed buffers. That said, since the check is trivial, I'd rather keep it and move forward with a simple conditional.

          I'll move your change into Gerrit and, if that's okay, make the simple change, test it, and get it in.
          daschl Michael Nitschinger added a comment - http://review.couchbase.org/#/c/84255/

          daschl Michael Nitschinger added a comment -

          Merged into master, thanks much! It will be in 2.5.2, to be released in the first week of November. Thanks again!

          People

            daschl Michael Nitschinger
            benoit.wiart Benoit Wiart