  Couchbase Server
  MB-39703

appendArray utility does not scale


Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • None
    • Affects Version/s: 6.5.0
    • Component/s: memcached
    • None
    • Environment: Debian 10, single node with KV and Eventing, 12 physical cores
    • 1

    Description

      I was exploring the suitability of exposing something like the sub-document array append operation (https://docs.couchbase.com/nodejs-sdk/3.0/howtos/subdocument-operations.html#array-append-and-prepend) from one of our SDKs in Eventing.

      Surprisingly, during tests designed to emulate its use in Eventing, I discovered that it does not seem to scale well.

      For example, building an array of 100K small documents started at about 5,000 inserts/sec but slowed down to just 400 inserts/sec by the last documents.

      The Node SDK (and perhaps other SDKs) could be improved here: the array utility appends everything into a single large document. Although it performs well for a small number of appends, it slows down massively as a) the document size and b) the number of elements grow.
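      For reference, this is roughly the single-document pattern being measured when -twolevel is false. It is only a minimal sketch, not the attached test program; the connection settings, the document key "bigarray" and the "items" path are illustrative assumptions (the 500MB bucket "source" is the one mentioned below).

      const couchbase = require('couchbase');

      async function singleDocAppend(count) {
        // Assumed local cluster and the 'source' bucket used by the attached test.
        const cluster = new couchbase.Cluster('couchbase://127.0.0.1',
          { username: 'Administrator', password: 'password' });
        const collection = cluster.bucket('source').defaultCollection();

        // One ever-growing document holds the entire array.
        await collection.upsert('bigarray', { items: [] });

        for (let i = 0; i < count; i++) {
          // Every append targets the same single document, which is why
          // throughput drops as the array (and the document) grows.
          await collection.mutateIn('bigarray', [
            couchbase.MutateInSpec.arrayAppend('items', { id: i, ts: Date.now() }),
          ]);
        }
      }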

      This can be alleviated by implementing a wrapper that uses a two-level array, storing the documents in chunks.

      The same test as above, using a two-level chunked array of 100K small documents, never slowed down; in fact I ran it at 2M documents and it maintained 5,000 inserts/sec or better.

      The attached Node program (it needs a 500MB bucket called source) can be used to experiment with a two-level array; it shows that with almost no overhead we can start to scale to very large item counts.
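      A two-level wrapper along the following lines is a hedged sketch of the idea only, not the attached Node_arrayAppend_test.js; the "::meta"/"::chunk::" key naming and the meta-document layout are assumptions for illustration. It reuses the couchbase module and collection handle from the sketch above, and it keeps every arrayAppend target bounded by the batch size:

      // A small meta document tracks how many chunk documents exist and how full
      // the last one is; each chunk holds at most `batch` items, so no single
      // document (and no single arrayAppend target) ever grows without bound.
      async function chunkedAppend(collection, name, batch, item) {
        let meta;
        try {
          meta = (await collection.get(`${name}::meta`)).content;
        } catch (e) {
          meta = { chunks: 0, inLast: 0 };   // first use: no index document yet
        }

        if (meta.chunks === 0 || meta.inLast >= batch) {
          // Current chunk is full (or none exists): start a new chunk document.
          await collection.upsert(`${name}::chunk::${meta.chunks}`, { items: [item] });
          meta.chunks += 1;
          meta.inLast = 1;
        } else {
          // Append into the small, bounded tail chunk.
          await collection.mutateIn(`${name}::chunk::${meta.chunks - 1}`, [
            couchbase.MutateInSpec.arrayAppend('items', item),
          ]);
          meta.inLast += 1;
        }
        await collection.upsert(`${name}::meta`, meta);
      }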

      # node ./Node_arrayAppend_test.js

      Running with -twolevel false tests the Node SDK performance directly and shows the slowdown; the utility is apparently not really made for large data (it appends to a single document) and as such is not useful for at-scale large queues.
       
       
      Refer to: couchbase.MutateInSpec.arrayAppend, couchbase.MutateInSpec.remove, and couchbase.LookupInSpec.get
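      For the "shift them out" half of the test, those calls combine roughly as follows (again only a sketch; the document key and the items[0] path are illustrative):

      // Read the head element of the array, then remove it: LookupInSpec.get
      // followed by MutateInSpec.remove on the same sub-document path.
      async function shiftOne(collection, key) {
        const res = await collection.lookupIn(key, [
          couchbase.LookupInSpec.get('items[0]'),
        ]);
        const head = res.content[0].value;   // the element just read

        await collection.mutateIn(key, [
          couchbase.MutateInSpec.remove('items[0]'),
        ]);
        return head;
      }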
       
       
      Setting -twolevel true uses a two-level chunked array built in batch-size chunks; this does not slow down. I tested up to 2M items (and if needed a three-level chunked array would support array sizes in the billions).
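      Rough capacity arithmetic behind the "billions" remark, assuming each index level references on the order of fanout chunk documents and each chunk holds batch items (illustrative numbers only):

      const batch  = 1000;  // items per chunk document (the -batch setting)
      const fanout = 1000;  // chunk references one index level can reasonably hold
      console.log('2-level capacity ~', fanout * batch);           // ~1,000,000 items
      console.log('3-level capacity ~', fanout * fanout * batch);  // ~1,000,000,000 items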
       
       
      All operations are synchronous, as I am trying to emulate the addition of an array SDK capability to Eventing.
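      In other words each KV call is awaited before the next one is issued, rather than pipelined. A sketch (inside an async function, with count, key and collection as defined in the surrounding harness):

      // Sequential (awaited) appends, emulating Eventing's synchronous KV access;
      // nothing runs in parallel, so per-operation latency is what the
      // ops/sec numbers in the output below reflect.
      for (let i = 0; i < count; i++) {
        await collection.mutateIn(key, [
          couchbase.MutateInSpec.arrayAppend('items', { id: i }),
        ]);
      }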
       
       
      Tiny tests
       
       
      ./Node_arrayAppend_test.js  -count      20 -batch   10 -verbose 2 -twolevel false
      ./Node_arrayAppend_test.js  -count      20 -batch   10 -verbose 2 -twolevel true
       
       
      Large tests (limit the verbosity)
       
       
      ./Node_arrayAppend_test.js  -count   50000 -batch 1000 -verbose 1 -twolevel false
      ./Node_arrayAppend_test.js  -count  100000 -batch 1000 -verbose 1 -twolevel false
       
       
      ./Node_arrayAppend_test.js  -count   50000 -batch 1000 -verbose 1 -twolevel true
      ./Node_arrayAppend_test.js  -count  100000 -batch 1000 -verbose 1 -twolevel true
      ./Node_arrayAppend_test.js  -count  500000 -batch 1000 -verbose 1 -twolevel true
      ./Node_arrayAppend_test.js  -count 2000000 -batch 1000 -verbose 1 -twolevel true
      

      Example showing the slowdown with the current implementation; it gets much worse as the documents get bigger or the number of appended items grows.

      # node ./Node_arrayAppend_test.js  -count  100000 -batch 1000 -verbose 1 -twolevel false
      will push 100000 documents into double array, then shift them out
       
       
      step 1 make empty array 2020-05-29T19:27:45.332Z
      step 1 batch 0 added 100000 2020-05-29T19:27:45.537Z
         ins ms 193
         ins          ops/sec 5181.347150259067
      step 1 batch 1 added 100000 2020-05-29T19:27:45.759Z
         ins ms 221
         ins          ops/sec 4524.886877828054
       
       
         *** suppressed until last two timings ***
       
       
      step 1 batch 98 added 100000 2020-05-29T19:29:48.383Z
         ins ms 2528
         ins          ops/sec 395.56962025316454
      step 1 batch 99 added 100000 2020-05-29T19:29:50.912Z
         ins ms 2528
         ins          ops/sec 395.56962025316454
       
       
      TOTAL ITEMS inserted 100000
      ins ms 125580
      ins          ops/sec 796.3051441312311
       
       
      step 2 batch 0 removed 100000 2020-05-29T19:29:53.846Z
         del ms 2932
         del          ops/sec 341.06412005457025
      step 2 batch 1 removed 100000 2020-05-29T19:29:56.779Z
         del ms 2932
         del          ops/sec 341.06412005457025
       
       
         *** suppressed until last two timings ***
       
       
      step 2 batch 98 removed 100000 2020-05-29T19:32:14.795Z
         del ms 310
         del          ops/sec 3225.8064516129034
      step 2 batch 99 removed 100000 2020-05-29T19:32:15.088Z
         del ms 289
         del          ops/sec 3460.2076124567475
       
       
      TOTAL ITEMS removed 100000
      del ms 144175
      del          ops/sec 693.601525923357
       
       
       
       
      tot ms 269756
      tot combined ops/sec 370.70537819362687 steady state streaming
      

      Example showing no slowdown with a two-level batched array

      # node ./Node_arrayAppend_test.js  -count  100000 -batch 1000 -verbose 1 -twolevel true
      will push 100000 documents into double array, then shift them out
       
       
      step 1 make empty array 2020-05-29T19:44:18.377Z
      step 2 batch 0 insert 1000 2020-05-29T19:44:18.390Z
         ins ms 188
         ins          ops/sec 5319.148936170212
      step 2 batch 1 insert 1000 2020-05-29T19:44:18.579Z
         ins ms 181
         ins          ops/sec 5524.861878453039
       
       
         *** suppressed until last two timings ***
       
       
      step 2 batch 98 insert 1000 2020-05-29T19:44:35.142Z
         ins ms 164
         ins          ops/sec 6097.560975609756
      step 2 batch 99 insert 1000 2020-05-29T19:44:35.306Z
         ins ms 167
         ins          ops/sec 5988.023952095808
       
       
      TOTAL ITEMS inserted 100000
      ins ms 17096
      ins          ops/sec 5849.32147870847
       
       
      step 2 batch 0 remove 1000 2020-05-29T19:44:35.474Z
         del ms 315
         del          ops/sec 3174.6031746031745
      step 2 batch 1 remove 1000 2020-05-29T19:44:35.790Z
         del ms 316
         del          ops/sec 3164.5569620253164
       
       
         *** suppressed until last two timings ***
       
       
      step 2 batch 98 remove 1000 2020-05-29T19:45:02.954Z
         del ms 290
         del          ops/sec 3448.2758620689656
      step 2 batch 99 remove 1000 2020-05-29T19:45:03.245Z
         del ms 296
         del          ops/sec 3378.3783783783783
       
       
      TOTAL ITEMS removed 100000
      del ms 28068
      del          ops/sec 3562.776115148924
       
       
       
       
      tot ms 45164
      tot combined ops/sec 2214.1528651138074 steady state streaming
      

      The point here is that if we provide a utility for an at-scale database, it should be able to handle large, at-scale data sets.

      Attachments

        Issue Links


          Activity

            People

              Assignee: Unassigned
              Reporter: Jon Strabala (jon.strabala)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:

                Gerrit Reviews

                  There are no open Gerrit changes
