Details
Bug
Resolution: Fixed
Critical
6.6.5, 7.0.2, 7.1.0
RC1 - Enterprise Edition 7.1.0 build 2475
Untriaged
1
No
Description
1. Load 1 billion 5k items (the body field contains the major chunk of the data) into a default magma bucket.
2. Create 50% fragmentation in KV.
3. Create a secondary index idx0 and wait for it to build.
4. Start a data load as below:
Read Start: 0 | Read End: 500000000
Update Start: 0 | Update End: 500000000
Expiry Start: 0 | Expiry End: 0
Delete Start: 500000000 | Delete End: 1000000000
Create Start: 1000000000 | Create End: 1500000000
5. Rebalance IN 1 KV node. Rebalance finished in 11019.869 sec
6. Create another index idx1 on body field while the above rebalance was at 35%. Index was building while KV rebalances were running. Total plasma disk size was at ~2TB.
7. Rebalance OUT 1 KV node. Rebalance finished in 10870.6289999 sec
8. Rebalance IN 2 KV nodes and OUT 1 KV node. Rebalance failed at 80%.
{u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.110.64', u'tstamp': 1647477583397L, u'shortText': u'message', u'serverTime': u'2022-03-16T17:39:43.397Z', u'text': u"Rebalance exited with reason {service_rebalance_failed,index,\n {agent_died,<26112.18949.96>,\n {lost_connection,\n {'ns_1@172.23.110.66',shutdown}}}}.\nRebalance Operation Id = 6e737e537f7cc57e8962bec1a189e45e"} |
2022-03-16 17:39:52,405 | test | ERROR | pool-3-thread-6 | [rest_client:print_UI_logs:2708] {u'code': 0, u'module': u'ns_log', u'type': u'info', u'node': u'ns_1@172.23.110.66', u'tstamp': 1647477583388L, u'shortText': u'message', u'serverTime': u'2022-03-16T17:39:43.388Z', u'text': u"Service 'index' exited with status 2. Restarting. Messages:\nruntime.gopark(0x14603b8, 0x0, 0x1809, 0x1)\n\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/proc.go:336 +0xe5 fp=0xc018ea99e0 sp=0xc018ea99c0 pc=0x441ae5\nruntime.selectgo(0xc018ea9c50, 0xc018ea9ba4, 0x0, 0x0, 0x3, 0x1459601, 0x0, 0x1)\n\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/select.go:327 +0xef7 fp=0xc018ea9b18 sp=0xc018ea99e0 pc=0x452a97\ngithub.com/couchbase/indexing/secondary/indexer.(*Rebalancer).processDropIndexQueue(0xc018e96000)\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:1383 +0x1b2 fp=0xc018ea9fd8 sp=0xc018ea9b18 pc=0xfe9cb2\nruntime.goexit()\n\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc018ea9fe0 sp=0xc018ea9fd8 pc=0x4784c1\ncreated by github.com/couchbase/indexing/secondary/indexer.NewRebalancer\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:171 +0x545\n"} |
2022-03-16 17:39:52,407 | test | ERROR | pool-3-thread-6 | [rest_client:print_UI_logs:2708] {u'code': 0, u'module': u'ns_memcached', u'type': u'info', u'node': u'ns_1@172.23.110.65', u'tstamp': 1647477562188L, u'shortText': u'message', u'serverTime': u'2022-03-16T17:39:22.188Z', u'text': u'Shutting down bucket "GleamBookUsers0" on \'ns_1@172.23.110.65\' for deletion'} |
2022-03-16 17:39:52,407 | test | ERROR | pool-3-thread-6 | [rest_client:print_UI_logs:2708] {u'code': 0, u'module': u'ns_log', u'type': u'info', u'node': u'ns_1@172.23.110.66', u'tstamp': 1647476292642L, u'shortText': u'message', u'serverTime': u'2022-03-16T17:18:12.642Z', u'text': u"Service 'index' exited with status 2. Restarting. Messages:\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/mvcc.go:824 +0x3bd fp=0xc05021feb8 sp=0xc05021fc70 pc=0xe2c33d\ngithub.com/couchbase/plasma.(*Plasma).VisitPartition(0xc005ae2000, 0x46, 0xc014640000, 0xc014641300, 0xc0136419c0, 0x0, 0x0)\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/page_visitor.go:87 +0x24a fp=0xc05021ff38 sp=0xc05021feb8 pc=0xddbdca\ngithub.com/couchbase/plasma.(*Plasma).PageVisitor.func1(0xc048f98530, 0xc005ae2000, 0xc0136419c0, 0xc0007c4900, 0x48, 0x48, 0x46, 0xc014640000, 0xc014641300)\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/page_visitor.go:41 +0x88 fp=0xc05021ff98 sp=0xc05021ff38 pc=0xe2d1c8\nruntime.goexit()\n\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc05021ffa0 sp=0xc05021ff98 pc=0x4784c1\ncreated by github.com/couchbase/plasma.(*Plasma).PageVisitor\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/page_visitor.go:39 +0x18b\n"} |
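The UI log entries above are printed newest first, so the order of events is easy to misread. The following is a minimal sketch that sorts the four entries by their `tstamp` values (milliseconds, copied from the logs above) to show the sequence. The one-line summaries are paraphrases written for this sketch, not log output, and the variable names are illustrative only.

```python
# Each tuple: (tstamp in ms from the log entry, module, paraphrased summary).
# The tstamp and module values are copied from the log entries above;
# the summary strings are shortened by hand for readability.
events = [
    (1647477583397, "ns_orchestrator",
     "Rebalance exited: service_rebalance_failed, index (agent_died)"),
    (1647477583388, "ns_log",
     "Service 'index' exited with status 2 on 172.23.110.66 (rebalancer frames)"),
    (1647477562188, "ns_memcached",
     "Shutting down bucket GleamBookUsers0 on 172.23.110.65"),
    (1647476292642, "ns_log",
     "Service 'index' exited with status 2 on 172.23.110.66 (plasma frames)"),
]

# Sort oldest first; tuples compare on their first element (the timestamp).
timeline = sorted(events)
first_ts = timeline[0][0]

for ts, module, summary in timeline:
    offset_s = (ts - first_ts) / 1000.0  # seconds since the first event
    print(f"+{offset_s:9.3f}s  {module:<16} {summary}")
```

Read this way, the earliest entry is the index service exit with plasma frames in the trace. The second index service exit and the rebalance failure follow much later, within a few milliseconds of each other.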
Note: This was found while trying to reproduce a recently reported customer use case (CBSE-11583). The environment is much better provisioned in this case: these are physical machines with 256GB RAM and 72 cores.
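For reference, the key ranges of the data load in step 4 can be checked with a short sketch. It assumes the ranges are half-open `[start, end)` intervals over item keys and that the initial 1 billion items occupy keys 0 to 1000000000. Neither assumption is stated in the report, and the helper names are hypothetical.

```python
# Key ranges from step 4, treated as half-open [start, end) intervals.
# ASSUMPTION: half-open semantics and an initial key space of [0, 1e9).
INITIAL_ITEMS = 1_000_000_000

ranges = {
    "read":   (0, 500_000_000),
    "update": (0, 500_000_000),
    "expiry": (0, 0),
    "delete": (500_000_000, 1_000_000_000),
    "create": (1_000_000_000, 1_500_000_000),
}


def span(op):
    """Number of keys covered by an operation's range."""
    start, end = ranges[op]
    return end - start


def overlaps(a, b):
    """True if the ranges of two operations share at least one key."""
    (s1, e1), (s2, e2) = ranges[a], ranges[b]
    return max(s1, s2) < min(e1, e2)


# Deletes remove the upper half of the original keys and creates add a new
# range of the same size, so the item count is unchanged once the load ends.
final_items = INITIAL_ITEMS - span("delete") + span("create")

print("items after load:", final_items)
print("update/delete overlap:", overlaps("update", "delete"))
```

Under these assumptions the load updates and reads the lower half of the original keys, deletes the upper half, and creates an equal number of new keys, so the bucket stays at roughly 1 billion items while every original key is either rewritten or removed.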
Issue Links
- is a backport of MB-51491 Indexer/Plasma crash lead to kv rebalance failure. (Closed)