  1. Couchbase Server
  2. MB-62017

Investigate reducing Item size by 30%


Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • Morpheus
    • None
    • couchbase-bucket
    • None
    • 0

    Description

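      It looks like we can shrink sizeof(Item) from 120 bytes (which lands in the 128-byte jemalloc bin) to 80 bytes (which fits the 80-byte jemalloc bin exactly). That cuts the allocated size per Item by 48 of 128 bytes, so effectively by more than 30% (37.5%), with moderate tradeoffs.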
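The bin arithmetic can be checked with a small sketch. The size-class table is an assumption: it lists what I understand to be jemalloc's default small size classes on 64-bit platforms (16-byte quantum), up to 256 bytes only. The names `kSizeClasses`, `sizeClassFor` and `allocatedSaving` are hypothetical helpers, not part of jemalloc or kv_engine.

```cpp
#include <cstddef>

// Assumption: default jemalloc small size classes on 64-bit platforms,
// truncated at 256 bytes. Verify against the jemalloc build in use.
constexpr std::size_t kSizeClasses[] = {
        8, 16, 32, 48, 64, 80, 96, 112, 128, 160, 192, 224, 256};

// Returns the smallest listed size class that can hold `bytes`,
// or 0 if the request is larger than the table covers.
constexpr std::size_t sizeClassFor(std::size_t bytes) {
    for (std::size_t cls : kSizeClasses) {
        if (bytes <= cls) {
            return cls;
        }
    }
    return 0;
}

// Fraction of allocated memory saved when an object shrinks from
// `oldSize` to `newSize` bytes, measured in size-class terms.
constexpr double allocatedSaving(std::size_t oldSize, std::size_t newSize) {
    return 1.0 - static_cast<double>(sizeClassFor(newSize)) /
                         static_cast<double>(sizeClassFor(oldSize));
}

static_assert(sizeClassFor(120) == 128, "120-byte Item uses the 128 bin");
static_assert(sizeClassFor(80) == 80, "80-byte Item fits the 80 bin");
static_assert(sizeClassFor(96) == 96, "16 bytes of headroom reach the 96 bin");
```

Under that table, `allocatedSaving(120, 80)` evaluates to 0.375 and `allocatedSaving(120, 96)` to 0.25.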

      This can be worthwhile, as these Item objects are stored in the CheckpointManager and the readyQ, and when the value is resident, the main memory usage in checkpoints and readyQ comes from these Item objects (as the Blob is shared with the HT).

      Current layout

      struct ItemMetaData {
          uint64_t cas; // offset 0, size 8
          cb::uint48_t revSeqno; // offset 8, size 6
          // 2-byte padding
          uint32_t flags; // offset 16, size 4
          // 4-byte padding
          time_t exptime; // offset 24, size 8
          // final size: 32
      };
       
      class Item {
          void* ItemIface_vptr; // offset 0, size 8
          int32_t RCValue_rc_refcount; // offset 8, size 4
          // 4-byte padding
          ItemMetaData metaData; // offset 16, size 32
          value_t value; // offset 48, size 8
          StoredDocKey key; // offset 56, size 32
          int64_t bySeqno; // offset 88, size 8
          cb::uint48_t prepareSeqno; // offset 96, size 6
          Vbid vbucketId; // offset 102, size 2
          queue_op op; // offset 104, size 1
          uint8_t flags_5bits; // offset 105, size 1
          uint8_t datatype_3bits; // offset 106, size 1
          // 1-byte padding
          Requirements durabilityReqs; // offset 108, size 4
          time_point queuedTime; // offset 112, size 8
          // final size: 120
      };
      

      Proposed layout

      struct ItemMetaData_V2 {
          uint64_t cas; // offset 0, size 8
          cb::uint48_t revSeqno; // offset 8, size 6
          cb::uint48_t exptime; // offset 14, size 6 (note 1)
          uint32_t flags; // offset 20, size 4
          // final size: 24
      };
       
      class Item {
          void* ItemIface_vptr; // offset 0, size 8
          int32_t RCValue_rc_refcount; // offset 8, size 4
          Requirements durabilityReqs; // offset 12, size 4 (reuses the 4 bytes of padding)
          ItemMetaData_V2 metaData; // offset 16, size 24
          value_t value; // offset 40, size 8
          char* key; // offset 48, size 8 (note 2)
          int64_t bySeqno; // offset 56, size 8
          time_point<cb::uint48_t> queuedTime; // offset 64, size 6 (note 3)
          cb::uint48_t prepareSeqno; // offset 70, size 6
          Vbid vbucketId; // offset 76, size 2
          queue_op op; // offset 78, size 1
          uint8_t flags_and_datatype_8bits; // offset 79, size 1 (note 4)
          // final size: 80
      };
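The offsets in the proposed layout can be verified by the compiler. The following is a minimal sketch, assuming a 64-bit platform with 8-byte pointers. Every type here is a hypothetical stand-in chosen only to have the same size and alignment as the listed member (`Uint48` for cb::uint48_t, `void*` for the vptr and value_t, `std::uint32_t` for Requirements); none of it is the real kv_engine code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 6-byte, byte-aligned stand-in for cb::uint48_t.
struct Uint48 {
    std::uint8_t bytes[6];
};

struct ItemMetaDataV2 {
    std::uint64_t cas;   // offset 0
    Uint48 revSeqno;     // offset 8
    Uint48 exptime;      // offset 14
    std::uint32_t flags; // offset 20
};

struct ItemV2 {
    void* vptr;                    // stand-in for the virtual table pointer
    std::int32_t refcount;         // offset 8
    std::uint32_t durabilityReqs;  // offset 12, placeholder for Requirements
    ItemMetaDataV2 metaData;       // offset 16
    void* value;                   // offset 40, placeholder for value_t
    char* key;                     // offset 48
    std::int64_t bySeqno;          // offset 56
    Uint48 queuedTime;             // offset 64
    Uint48 prepareSeqno;           // offset 70
    std::uint16_t vbucketId;       // offset 76
    std::uint8_t op;               // offset 78
    std::uint8_t flagsAndDatatype; // offset 79
};

static_assert(sizeof(ItemMetaDataV2) == 24, "metadata should be 24 bytes");
static_assert(offsetof(ItemMetaDataV2, flags) == 20, "flags at offset 20");
static_assert(offsetof(ItemV2, metaData) == 16, "metaData at offset 16");
static_assert(offsetof(ItemV2, key) == 48, "key at offset 48");
static_assert(offsetof(ItemV2, queuedTime) == 64, "queuedTime at offset 64");
static_assert(offsetof(ItemV2, vbucketId) == 76, "vbucketId at offset 76");
static_assert(sizeof(ItemV2) == 80, "Item should fit the 80-byte bin");
```

Keeping asserts like these next to the real class would catch any later member change that pushes Item into the next bin.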
      

      Notes:

      1. This requires us to use a 48-bit integer to represent the expiry time, which is a Unix timestamp in seconds. Today's timestamps take 31 bits to store. This saves us 8 bytes by removing padding. The MCBP protocol seems to use 4 bytes for this field.
      2. The key is currently stored in a std::string, which can use SSO to avoid a heap allocation for strings of up to 15 bytes on GCC (see https://github.com/elliotgoodrich/SSO-23). Including the CollectionID, this only applies to document keys shorter than 13-14 bytes in the best case, which seems perhaps too short to be practical, although we should estimate sizes from existing data. We can store the key as a classic C-style string, or inline in the object, as for StoredValue (which I'm not sure how to display here), and then access it via DocKeyView.
      3. The queuedTime has microsecond granularity on the steady_clock. With 48 bits of microseconds, we get about 8.9 years of maximum duration (2^48 µs ≈ 2.8 × 10^8 s). Implementing offset compression is not difficult to test for correctness, and we only need this field for time measurements (stats).
      4. Finally, we can merge the 5 bits of flags with the 3 bits needed for the datatype. We've already done this for StoredValue, saving on padding, and making us fit perfectly in the 80 byte jemalloc bin.
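Notes 1, 3 and 4 can be sketched together as follows. This is an illustration under stated assumptions, not the kv_engine implementation: `Uint48` is a hypothetical stand-in for cb::uint48_t, the reference point `base` for the offset compression is assumed to be supplied by the caller, and which bits hold the flags versus the datatype is an arbitrary choice made here.

```cpp
#include <chrono>
#include <cstdint>

// Largest value representable in 48 bits.
constexpr std::uint64_t kUint48Max = (std::uint64_t{1} << 48) - 1;

// Hypothetical 6-byte unsigned integer; bits above 48 are dropped.
class Uint48 {
public:
    Uint48() = default;
    explicit Uint48(std::uint64_t v) {
        for (int i = 0; i < 6; ++i) {
            bytes[i] = static_cast<std::uint8_t>(v >> (8 * i));
        }
    }
    std::uint64_t get() const {
        std::uint64_t v = 0;
        for (int i = 0; i < 6; ++i) {
            v |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
        }
        return v;
    }

private:
    std::uint8_t bytes[6] = {};
};
static_assert(sizeof(Uint48) == 6, "Uint48 must occupy exactly 6 bytes");

// Range check: 2^48 microseconds is 78187 whole hours, i.e. 8 full years.
static_assert(std::chrono::duration_cast<std::chrono::hours>(
                      std::chrono::microseconds{
                              static_cast<std::int64_t>(kUint48Max)})
                                      .count() /
                              (24 * 365) ==
                      8,
              "48 bits of microseconds cover roughly 8.9 years");

using Clock = std::chrono::steady_clock;

// Offset compression: store microseconds elapsed since `base`,
// clamped to the representable range [0, kUint48Max].
inline Uint48 compressQueuedTime(Clock::time_point base, Clock::time_point t) {
    const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(t - base)
                    .count();
    if (us < 0) {
        return Uint48(0);
    }
    const auto u = static_cast<std::uint64_t>(us);
    return Uint48(u > kUint48Max ? kUint48Max : u);
}

inline Clock::time_point expandQueuedTime(Clock::time_point base, Uint48 c) {
    return base + std::chrono::duration_cast<Clock::duration>(
                          std::chrono::microseconds(
                                  static_cast<std::int64_t>(c.get())));
}

// Note 4: 5 bits of flags in the high bits, 3 bits of datatype in the low
// bits (the bit assignment is an assumption of this sketch).
constexpr std::uint8_t packFlagsAndDatatype(std::uint8_t flags5,
                                            std::uint8_t datatype3) {
    return static_cast<std::uint8_t>(((flags5 & 0x1f) << 3) |
                                     (datatype3 & 0x07));
}
constexpr std::uint8_t unpackFlags(std::uint8_t packed) {
    return static_cast<std::uint8_t>(packed >> 3);
}
constexpr std::uint8_t unpackDatatype(std::uint8_t packed) {
    return static_cast<std::uint8_t>(packed & 0x07);
}
```

The round trip loses anything finer than a microsecond, which should be acceptable given that the field is only used for stats.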

      Besides the increase in complexity, the only other downside seems to be losing SSO; however, I suspect we do not always benefit from it.
      The space savings seem worth it, and it would be interesting to test with a hacky toybuild before doing the full implementation.
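To estimate how often SSO actually applies to existing keys, a toybuild could count how many key strings are stored inline. The helper below is a hypothetical sketch (`storedInline` is not an existing function): it checks whether a std::string's buffer lies within the string object itself, which indicates that no separate heap allocation was made. The inline capacity depends on the standard library, so the threshold should be measured rather than assumed.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Returns true if the character buffer of `s` lives inside the std::string
// object itself, i.e. the small-string optimisation is in effect.
inline bool storedInline(const std::string& s) {
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    // std::less gives a well-defined total order even for pointers that
    // do not point into the same object.
    const std::less<const char*> before;
    const char* data = s.data();
    return !before(data, objBegin) && before(data, objEnd);
}
```

Applying this to a sample of real document keys (with their collection prefix) would give the fraction of Items that would pay for an extra allocation, or extra inline bytes, once the key leaves std::string.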

      Even if we did have to later add more metadata to Item, we have 16 bytes before the 96 byte jemalloc bin, which would still be a 25% space saving compared to 128 bytes, so I think it is worth investigating.

       

          People

            vesko.karaganev Vesko Karaganev