Couchbase Server / MB-55290

Reduce impact of HashTable resize on front-end ops latency

Details

    • 0
    • March-June 24

    Description

      When the ep-engine HashTables are resized, all front-end operations to that vBucket are blocked until the resize has completed.
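The blocking behaviour described above can be illustrated with a toy model. This is a minimal sketch, not ep-engine code: the class `StripedHashTable`, its methods, and the `during_rehash` hook are hypothetical names, and the design (a few stripe locks guarding many buckets, with resize taking every lock) is an assumption made for illustration.

```python
import threading

class StripedHashTable:
    """Toy hash table with lock striping (hypothetical; NOT ep-engine's implementation)."""

    def __init__(self, num_buckets=4, num_locks=2):
        self.buckets = [[] for _ in range(num_buckets)]
        self.locks = [threading.Lock() for _ in range(num_locks)]

    def _lock_for(self, key):
        # The stripe depends only on the key's hash, so it does not change
        # when the number of buckets changes.
        return self.locks[hash(key) % len(self.locks)]

    def set(self, key, value):
        with self._lock_for(key):
            chain = self.buckets[hash(key) % len(self.buckets)]
            for i, (k, _) in enumerate(chain):
                if k == key:
                    chain[i] = (key, value)
                    return
            chain.append((key, value))

    def get(self, key):
        with self._lock_for(key):
            chain = self.buckets[hash(key) % len(self.buckets)]
            for k, v in chain:
                if k == key:
                    return v
            return None

    def resize(self, new_num_buckets, during_rehash=None):
        # Take every stripe lock: no set()/get() can proceed until resize ends.
        for lock in self.locks:
            lock.acquire()
        try:
            new_buckets = [[] for _ in range(new_num_buckets)]
            # Rehashing visits every item while all locks are held, so the
            # stall grows with the number of items in the table.
            for chain in self.buckets:
                for k, v in chain:
                    new_buckets[hash(k) % new_num_buckets].append((k, v))
            if during_rehash is not None:
                during_rehash(self)  # illustration-only hook for observing lock state
            self.buckets = new_buckets
        finally:
            for lock in reversed(self.locks):
                lock.release()
```

In this model the time front-end callers spend waiting is proportional to the item count, which is consistent with the stall being worse for large vBuckets.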

      For large (v)Buckets, the time taken to resize (and consequently how long front-end operations are blocked) can be significant; we have observed operations taking in excess of 1 second:

      WARNING 190: Slow operation: {"bucket":"bucket","cid":"58C8BF2A0000000A/00000000C8934396/75e3eae","command":"SET","duration":"1633 ms","packet":{"bodylen":258,"cas":0,"datatype":"raw","extlen":8,"key":"<ud>xxxxxxxxxxxxxxxx</ud>","keylen":64,"magic":"ClientRequest","opaque":2923322887,"opcode":"SET","vbucket":247},"peer":{"ip":"10.12.9.16","port":57098},"response":"Success","trace":"request=667162244508816:1632951 json_validate=667162244510168:2 store=667162244515148:1632933 execute=667162244508816:1632951","worker_tid":139878729725696}
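To see where the time went, the `trace` field of the log line can be split into spans. The sketch below assumes each entry has the form `name=start:duration` and that durations are in microseconds (an assumption, though consistent with `request=…:1632951` against the reported "1633 ms"). The helper names `parse_trace` and `spans_from_log` are hypothetical, and the log line is abridged to the fields used.

```python
import json

def parse_trace(trace):
    """Split 'name=start:duration' entries into {name: (start, duration)}."""
    spans = {}
    for entry in trace.split():
        name, _, timing = entry.partition("=")
        start, _, duration = timing.partition(":")
        spans[name] = (int(start), int(duration))
    return spans

def spans_from_log(line):
    """Extract the JSON payload from a 'Slow operation' line and parse its trace."""
    payload = json.loads(line[line.index("{"):])
    return parse_trace(payload["trace"])

# Abridged copy of the log line above (only the fields used here).
LINE = ('WARNING 190: Slow operation: {"command":"SET","duration":"1633 ms",'
        '"trace":"request=667162244508816:1632951 json_validate=667162244510168:2 '
        'store=667162244515148:1632933 execute=667162244508816:1632951"}')

spans = spans_from_log(LINE)
# Fraction of the whole request spent in the 'store' span.
store_share = spans["store"][1] / spans["request"][1]
print(f"store accounts for {store_share:.4%} of the request")
```

Under these assumptions nearly all of the request time falls inside the `store` span, which is where a SET would wait on the HashTable.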
      

      While we only resize one vBucket's HashTable at a time (and hence the majority of operations will be unaffected), this has a significant impact on tail latencies:

      (Adding 80M ~256 B documents to a Bucket that already holds ~1.12B documents, on a 7-node cluster.)

      Note that when the Bucket's vBucket HashTables were resized (highlighted point), the tail latencies increased dramatically. For heavily sequential workloads like the one above, where one operation must complete before the next is issued, this can have a significant impact on throughput.
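The throughput effect on a sequential client can be estimated with simple arithmetic. This is a rough sketch: the 1 ms baseline latency is a hypothetical figure chosen for illustration, not a measurement from this issue; only the 1633 ms stall comes from the log above. The function name `sequential_throughput` is likewise hypothetical.

```python
def sequential_throughput(latencies_s):
    """Ops/second for a client that issues each op only after the previous one returns."""
    return len(latencies_s) / sum(latencies_s)

BASELINE_S = 0.001   # hypothetical per-op latency, not a measured value
STALL_S = 1.633      # duration of the slow SET reported in the log

normal = [BASELINE_S] * 1000
with_stall = [BASELINE_S] * 999 + [STALL_S]   # one op in 1000 hits the resize

baseline_ops = sequential_throughput(normal)
stalled_ops = sequential_throughput(with_stall)
print(f"{baseline_ops:.0f} ops/s without a stall, {stalled_ops:.0f} ops/s with one")
```

With these illustrative numbers, a single stalled operation in a window of 1000 cuts the client's throughput to well under half, because every later operation has to wait behind it.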

            People

              pavlos.georgiou Pavlos Georgiou
              drigby Dave Rigby (Inactive)