Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Not a Bug
- Affects Version: 6.6.2
- None
- Triage: Untriaged
- 1
- Unknown
Description
Steps to reproduce:
- Create 4 buckets
- Create indexes with replicas on each of the 4 buckets.
- Run pillowfight to continuously load data (the buckets have 1M, 1M, 1M and 3M items). The bucket resident ratio (RR) needs to be under 10%; keep loading until it is.
- Run a shell script that runs the request_plus scans continuously.
- Run stress-ng with these params: stress-ng --vm 4 --vm-bytes 1G --metrics-brief --vm-keep --vm-locked -m 4 --aggressive --vm-populate (adjust the --vm-bytes param depending on the VM resources).
- Once enough stress-ng processes are running, the OOM killer will kick in. This can be verified by checking dmesg (dmesg -T | egrep -i 'killed process').
- There's a possibility that stress-ng itself gets spawned and killed, since the OOM-kill victim is chosen based on the oom_score_adj factor. To make sure that memcached gets killed, run the following (see the consolidated sketch after these steps):
echo 1000 > /proc/<memcached PID>/oom_score_adj
- Observe that the scans are timing out and that the index has rolled back to 0.
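For convenience, the memory-pressure part of the repro can be wrapped in a small script. This is only a rough sketch of the commands above; it assumes root privileges, a single memcached process, and that the pillowfight load and the request_plus scan script are already running. The --vm-bytes sizing still needs adjusting per VM.
#!/bin/sh
# Rough sketch of the memory-pressure steps above. Run as root.
# Make memcached the preferred OOM-kill victim (assumes a single memcached process).
MEMCACHED_PID=$(pidof memcached)
echo 1000 > /proc/${MEMCACHED_PID}/oom_score_adj
# Apply memory pressure; adjust --vm-bytes to the VM's resources.
stress-ng --vm 4 --vm-bytes 1G --metrics-brief --vm-keep --vm-locked -m 4 --aggressive --vm-populate &
# Watch for the OOM kill in dmesg.
watch -n 10 "dmesg -T | egrep -i 'killed process' | tail -5"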
Logs and pcaps are attached. There were 2 instances of rollbacks observed. Please use these timestamps for analysis:
Instance 1
Index node1 (172.23.106.159)
2022-07-26T03:26:54.738-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
Index node2 (172.23.106.163)
2022-07-26T03:26:58.186-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
Instance 2
Index node1 (172.23.106.159)
2022-07-26T05:06:12.658-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
Index node2 (172.23.106.163)
2022-07-26T05:06:10.805-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
Log bundles:
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.105.36.zip
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.105.37.zip
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.106.156.zip
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.106.159.zip
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.106.163.zip
s3://cb-customers-secure/cbse12279oomkil2/2022-07-26/collectinfo-2022-07-26t125548-ns_1@172.23.106.204.zip
I have the packet capture files too and can attach them if necessary.
I think the case where the bucket got flushed and the Indexer had to roll back to zero is completely fine.
This is when the bucket flush starts:
[ns_server:debug,2022-07-26T03:26:48.768-07:00,ns_1@172.23.105.36:ns_audit<0.28196.0>:ns_audit:handle_call:146]Audit flush_bucket: [{remote,{[{ip,<<"10.100.195.195">>},{port,49358}]}},
{sessionid,<<"634890a73e552079485242dc299216a91c34cdda">>},
{real_userid,{[{domain,builtin},
{user,<<"<ud>Administrator</ud>">>}]}},
{timestamp,<<"2022-07-26T03:26:48.768-07:00">>},
{bucket_name,<<"test8">>}]
This is when it finishes:
[ns_server:debug,2022-07-26T03:26:48.768-07:00,ns_1@172.23.105.36:ns_audit<0.28196.0>:ns_audit:handle_call:146]Audit flush_bucket: [{remote,{[{ip,<<"10.100.195.195">>},{port,49358}]}},
{sessionid,<<"634890a73e552079485242dc299216a91c34cdda">>},
{real_userid,{[{domain,builtin},
{user,<<"<ud>Administrator</ud>">>}]}},
{timestamp,<<"2022-07-26T03:26:48.768-07:00">>},
{bucket_name,<<"test8">>}]
Flush empties the bucket by deleting the vbuckets, which results in the creation of new vbuuids. It's not surprising that Indexing rolls back to zero a few seconds after the bucket flush finishes:
Index node1 ( 172.23.106.159)
2022-07-26T03:26:54.738-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
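As a sanity check independent of the attached logs, the vbuuid change can be confirmed directly from KV with cbstats. A minimal sketch, assuming the default Linux install path, the default KV port, and placeholder credentials:
/opt/couchbase/bin/cbstats localhost:11210 -u <user> -p <password> -b test8 vbucket-details | grep uuid
Running it before and after the flush should show the per-vbucket uuid values changing, which is why the Indexer can no longer match its snapshots against the failover log and has to roll back to zero.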
The second case at 5:06 is more interesting as it is the result of a failover induced by an OOM kill.
OOM kill:
2022-07-26T05:02:58.678-07:00, ns_memcached:0:info:message(ns_1@172.23.105.36) - Control connection to memcached on 'ns_1@172.23.105.36' disconnected. Check logs for details.
2022-07-26T05:03:00.548-07:00, ns_log:0:info:message(ns_1@172.23.105.36) - Service 'memcached' exited with status 137. Restarting. Messages:
Failover:
2022-07-26T05:05:02.810-07:00, ns_orchestrator:0:info:message(ns_1@172.23.105.36) - Starting failover of nodes ['ns_1@172.23.105.36']. Operation Id = 2139917b8f5387cec83fac633b7e7b5a
2022-07-26T05:05:05.290-07:00, ns_memcached:0:info:message(ns_1@172.23.105.36) - Shutting down bucket "test8" on 'ns_1@172.23.105.36' for deletion
2022-07-26T05:05:07.683-07:00, ns_memcached:0:info:message(ns_1@172.23.105.36) - Bucket "test6" loaded on node 'ns_1@172.23.105.36' in 124 seconds.
2022-07-26T05:05:08.608-07:00, failover:0:info:message(ns_1@172.23.105.36) - Failed over ['ns_1@172.23.105.36']: ok
2022-07-26T05:05:08.651-07:00, failover:0:info:message(ns_1@172.23.105.36) - Deactivating failed over nodes ['ns_1@172.23.105.36']
2022-07-26T05:05:08.690-07:00, auto_failover:0:info:message(ns_1@172.23.105.36) - Node ('ns_1@172.23.105.36') was automatically failed over. Reason: The data service did not respond for the duration of the auto-failover threshold. Either none of the buckets have warmed up or there is an issue with the data service.
Rollback to zero:
Index node1 ( 172.23.106.159)
2022-07-26T05:06:12.658-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test8
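To line up the OOM kill, the failover and the rollback in one place, the relevant messages can be grepped straight out of the bundles. A sketch, assuming the usual collectinfo file names (ns_server.info.log from the KV node, indexer.log from the index nodes); the exact names may differ in these zips:
grep -E "exited with status 137|Starting failover|automatically failed over" ns_server.info.log
grep "rollbackAllToZero" indexer.log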
CPU is high on all 3 memcached nodes for the last 10 minutes before the failover, but it doesn't seem to be that high prior to that.
That said, I see quite a lot of slow scheduling for the DCP connection notifier task on the non-IO thread workers during this time.
$ grep "T0[345].*test8.*Slow scheduling for NON_IO task .DCP connection" memcached.log | grep -o 2022-..-..T..:. | uniq -c
50 2022-07-26T03:0
13 2022-07-26T03:1
14 2022-07-26T03:2
87 2022-07-26T03:3
19 2022-07-26T03:4
65 2022-07-26T03:5
79 2022-07-26T04:0
41 2022-07-26T04:1
13 2022-07-26T04:2
22 2022-07-26T04:3
45 2022-07-26T04:4
27 2022-07-26T04:5
33 2022-07-26T05:0
Some of these waits are pretty lengthy. Dave, is this evidence of KV being slow to get mutations to indexing? Is there a reason for the slowness?
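For what it's worth, restricting the same grep to the hour of the failover and bucketing per minute (one extra dot in the -o pattern) would show whether the warnings cluster around the 05:05 failover:
grep "T05.*test8.*Slow scheduling for NON_IO task .DCP connection" memcached.log | grep -o '2022-..-..T..:..' | uniq -c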