Details
-
Improvement
-
Resolution: Unresolved
-
Major
-
None
-
6.5.0
-
None
-
Debian 10, single node with KV and Eventing, 12 physical cores.
-
1
Description
I was exploring the suitability of exposing something like the https://docs.couchbase.com/nodejs-sdk/3.0/howtos/subdocument-operations.html#array-append-and-prepend functionality from one of our SDKs into Eventing.
Surprisingly, during tests designed to emulate its use in Eventing, I discovered that it does not seem to scale well.
For example, building an array of 100K small documents started at about 5,000 inserts/sec but slowed down to just 400 inserts/sec for the last documents.
The Node SDK (and perhaps other SDKs) could be improved here: the array utility appends to a single large document. Although it performs well for a small number of appends, it slows down massively depending on a) the document size and b) the number of elements.
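For reference, here is a minimal sketch of the single-document pattern described above, written against the Node SDK 3.0 connection style. The bucket name "source" matches the attached test program, but the credentials, document key ("queue"), and path ("items") are placeholders, not taken from that program:

const couchbase = require('couchbase');

async function singleDocAppend() {
  // placeholder credentials and a local single-node cluster are assumed
  const cluster = new couchbase.Cluster('couchbase://127.0.0.1', {
    username: 'Administrator',
    password: 'password',
  });
  const coll = cluster.bucket('source').defaultCollection();

  // seed one document that will hold the entire array
  await coll.upsert('queue', { items: [] });

  // every append is a sub-document mutation against the same ever-growing
  // document, which is why throughput degrades as the array gets large
  for (let i = 0; i < 100000; i++) {
    await coll.mutateIn('queue', [
      couchbase.MutateInSpec.arrayAppend('items', { seq: i, payload: 'small item ' + i }),
    ]);
  }
}

singleDocAppend().catch(console.error);

Because every arrayAppend targets the same document, the server has to rewrite an ever-larger value, which matches the degradation shown in the first example output below.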
This can be alleviated by implementing a wrapper that uses a two-level array, storing chunks of documents.
The same test as above using a two-level chunked array of 100K small documents never slowed down; in fact I ran it with 2M documents and it maintained at or above 5,000 inserts/sec.
The attached Node program (needs a 500MB bucket called source) can be used to experiment with a two-level array; it shows that, with almost no overhead, we can scale to very large item counts.
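To illustrate the chunking idea itself, here is a hypothetical sketch (not the attached program; the BATCH constant, the key naming scheme, and the head/chunk document layout are all assumptions). Appends always target a small per-chunk document, so no single document grows beyond the batch size:

const couchbase = require('couchbase');

const BATCH = 1000; // elements per chunk document (assumption, mirrors -batch 1000)

// assumes `coll` is a connected Collection on the "source" bucket, as in the sketch above
async function chunkedAppend(coll, name, count) {
  // head document only tracks how many items exist in total
  await coll.upsert(name, { count: 0 });

  for (let i = 0; i < count; i++) {
    // level 1: pick the chunk document; level 2: the array inside it
    const chunkId = name + '::' + Math.floor(i / BATCH);
    if (i % BATCH === 0) {
      await coll.upsert(chunkId, { items: [] }); // start a fresh, small chunk
    }
    // the append target stays small, so throughput does not degrade
    await coll.mutateIn(chunkId, [
      couchbase.MutateInSpec.arrayAppend('items', { seq: i }),
    ]);
  }

  // record the final count on the head document
  await coll.mutateIn(name, [couchbase.MutateInSpec.replace('count', count)]);
}

A third level of chunk keys would extend the same idea to array sizes in the billions, as noted in the program header below. The attached program's usage notes and test commands follow.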
# node ./Node_arrayAppend_test.js
Tests the Node SDK performance (when twolevel is false): shows a slow-down, as the API is apparently not really
made for large data (it appends to a single document) and as such is not useful for at-scale large queues.

Refer to: couchbase.MutateInSpec.arrayAppend, couchbase.MutateInSpec.remove, and couchbase.LookupInSpec.get
(a sketch of the removal side using these APIs appears after the test commands below)

Setting twolevel to true makes a two-level chunked array keyed by batch size; this does not slow down (I tested
up to 2M items, and if needed a three-level chunked array would support array sizes in the billions).

All operations are synchronous, as I am trying to emulate the addition of an array SDK utility to Eventing.
Tiny tests

./Node_arrayAppend_test.js -count 20 -batch 10 -verbose 2 -twolevel false
./Node_arrayAppend_test.js -count 20 -batch 10 -verbose 2 -twolevel true

Large tests (limit the verbosity)

./Node_arrayAppend_test.js -count 50000 -batch 1000 -verbose 1 -twolevel false
./Node_arrayAppend_test.js -count 100000 -batch 1000 -verbose 1 -twolevel false

./Node_arrayAppend_test.js -count 50000 -batch 1000 -verbose 1 -twolevel true
./Node_arrayAppend_test.js -count 100000 -batch 1000 -verbose 1 -twolevel true
./Node_arrayAppend_test.js -count 500000 -batch 1000 -verbose 1 -twolevel true
./Node_arrayAppend_test.js -count 2000000 -batch 1000 -verbose 1 -twolevel true
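As the header above notes, the removal ("shift them out") side uses couchbase.LookupInSpec.get and couchbase.MutateInSpec.remove. Here is a hypothetical sketch of one such shift, not taken from the attached program (the document key and the "items" path are placeholders):

const couchbase = require('couchbase');

// Read the first array element via sub-document lookup, then remove it.
// Assumes `coll` is a connected Collection and `key` holds { items: [...] }.
// Like the attached test, this is done synchronously (awaited), one item at a time.
async function shiftOne(coll, key) {
  const res = await coll.lookupIn(key, [
    couchbase.LookupInSpec.get('items[0]'),    // fetch only the head element
  ]);
  const first = res.content[0].value;

  await coll.mutateIn(key, [
    couchbase.MutateInSpec.remove('items[0]'), // drop it so the next call sees the next element
  ]);
  return first;
}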
Example: see the slow-down with the current implementation; it gets much worse as documents get bigger or as the number appended grows.
# node ./Node_arrayAppend_test.js -count 100000 -batch 1000 -verbose 1 -twolevel false
will push 100000 documents into double array, then shift them out

step 1 make empty array 2020-05-29T19:27:45.332Z
step 1 batch 0 added 100000 2020-05-29T19:27:45.537Z
ins ms 193
ins ops/sec 5181.347150259067
step 1 batch 1 added 100000 2020-05-29T19:27:45.759Z
ins ms 221
ins ops/sec 4524.886877828054

*** suppressed until last two timings ***

step 1 batch 98 added 100000 2020-05-29T19:29:48.383Z
ins ms 2528
ins ops/sec 395.56962025316454
step 1 batch 99 added 100000 2020-05-29T19:29:50.912Z
ins ms 2528
ins ops/sec 395.56962025316454

TOTAL ITEMS inserted 100000
ins ms 125580
ins ops/sec 796.3051441312311

step 2 batch 0 removed 100000 2020-05-29T19:29:53.846Z
del ms 2932
del ops/sec 341.06412005457025
step 2 batch 1 removed 100000 2020-05-29T19:29:56.779Z
del ms 2932
del ops/sec 341.06412005457025

*** suppressed until last two timings ***

step 2 batch 98 removed 100000 2020-05-29T19:32:14.795Z
del ms 310
del ops/sec 3225.8064516129034
step 2 batch 99 removed 100000 2020-05-29T19:32:15.088Z
del ms 289
del ops/sec 3460.2076124567475

TOTAL ITEMS removed 100000
del ms 144175
del ops/sec 693.601525923357

tot ms 269756
tot combined ops/sec 370.70537819362687 steady state streaming
Example showing no slow-down with a two-level chunked array.
# node ./Node_arrayAppend_test.js -count 100000 -batch 1000 -verbose 1 -twolevel true
will push 100000 documents into double array, then shift them out

step 1 make empty array 2020-05-29T19:44:18.377Z
step 2 batch 0 insert 1000 2020-05-29T19:44:18.390Z
ins ms 188
ins ops/sec 5319.148936170212
step 2 batch 1 insert 1000 2020-05-29T19:44:18.579Z
ins ms 181
ins ops/sec 5524.861878453039

*** suppressed until last two timings ***

step 2 batch 98 insert 1000 2020-05-29T19:44:35.142Z
ins ms 164
ins ops/sec 6097.560975609756
step 2 batch 99 insert 1000 2020-05-29T19:44:35.306Z
ins ms 167
ins ops/sec 5988.023952095808

TOTAL ITEMS inserted 100000
ins ms 17096
ins ops/sec 5849.32147870847

step 2 batch 0 remove 1000 2020-05-29T19:44:35.474Z
del ms 315
del ops/sec 3174.6031746031745
step 2 batch 1 remove 1000 2020-05-29T19:44:35.790Z
del ms 316
del ops/sec 3164.5569620253164

*** suppressed until last two timings ***

step 2 batch 98 remove 1000 2020-05-29T19:45:02.954Z
del ms 290
del ops/sec 3448.2758620689656
step 2 batch 99 remove 1000 2020-05-29T19:45:03.245Z
del ms 296
del ops/sec 3378.3783783783783

TOTAL ITEMS removed 100000
del ms 28068
del ops/sec 3562.776115148924

tot ms 45164
tot combined ops/sec 2214.1528651138074 steady state streaming
The point here is that if we provide a utility for an at-scale database, it should be able to handle large, at-scale data sets.
Attachments
Issue Links
- relates to: MB-30046 Upgrade embedded jsonsl in subjson (Open)