Details
- Bug
- Resolution: Not a Bug
- Major
- None
- 6.6.2
- None
- Untriaged
- 1
- Unknown
Description
While trying to reproduce the CBSE ticket, we ran into the index rollback-to-zero issue.
Steps to reproduce:
1. Create 4 buckets.
2. Create indexes with replicas on each of the 4 buckets.
3. Run pillowfight to continuously load data (the buckets have 1M, 1M, 1M, and 3M items). Keep loading until the bucket RR is under 10%.
4. Run a shell script that runs request_plus scans continuously.
5. Simulate a memcached kill on the orchestrator node, which causes the KV node to be failed over.
6. Observe that the scans time out and that the index has rolled back to 0.
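The scan script itself is not attached to the ticket. A minimal sketch of what the continuous request_plus scan loop could look like is shown here, assuming the query service is reachable on port 8093 and that the statement passed in exercises the index under test. The variable names, default credentials, timeout value, and the `scan_ok` helper are hypothetical and not taken from the ticket.

```shell
#!/bin/sh
# Hypothetical defaults; override them in the environment for a real cluster.
QUERY_HOST="${QUERY_HOST:-127.0.0.1}"
QUERY_USER="${QUERY_USER:-Administrator}"
QUERY_PASS="${QUERY_PASS:-password}"
SCAN_TIMEOUT="${SCAN_TIMEOUT:-120s}"

# Succeeds if the query service response on stdin reports status "success".
scan_ok() {
  grep -Eq '"status": *"success"'
}

# Run one statement with request_plus scan consistency via the query REST API.
run_scan() {
  curl -s -u "$QUERY_USER:$QUERY_PASS" \
    "http://$QUERY_HOST:8093/query/service" \
    --data-urlencode "statement=$1" \
    --data-urlencode "scan_consistency=request_plus" \
    --data-urlencode "timeout=$SCAN_TIMEOUT"
}

# Loop forever over the given statements, logging every scan that does not succeed.
scan_forever() {
  while :; do
    for stmt in "$@"; do
      if ! run_scan "$stmt" | scan_ok; then
        echo "$(date) scan failed: $stmt" >&2
      fi
    done
  done
}
```

The block only defines functions. A run would call `scan_forever` with one statement per bucket, each written so that it uses the index being tested.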
For step 5, the following commands were run ->
sudo chmod 777 /proc/sysrq-trigger
sudo echo f > /proc/sysrq-trigger
Writing f to /proc/sysrq-trigger invokes the kernel OOM killer, which kills the most memory-intensive process. This was repeated until dmesg showed that memcached had been killed. After memcached was killed, auto-failover was triggered and the index rollback occurred. This was done on the orchestrator node (172.23.105.36).
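The repeat-until-killed procedure can be sketched as a small loop. This is an illustration rather than the script used for the ticket: the function names, the attempt limit, and the sleep interval are assumptions, and the `Killed process <pid> (memcached)` pattern assumes the usual kernel OOM-killer message format.

```shell
#!/bin/sh
# Succeeds if the kernel log text on stdin records an OOM kill of memcached.
memcached_killed() {
  grep -Eq 'Killed process [0-9]+ \(memcached\)'
}

# Invoke the OOM killer repeatedly until memcached is the victim.
# The attempt limit (default 20) and the 2-second pause are illustrative.
kill_until_memcached_dies() {
  max_attempts="${1:-20}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # tee runs under sudo, so the write itself has root privileges.
    echo f | sudo tee /proc/sysrq-trigger > /dev/null
    sleep 2
    if sudo dmesg | memcached_killed; then
      echo "memcached killed after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "memcached not killed after $max_attempts attempts" >&2
  return 1
}
```

Piping through `sudo tee` avoids the `chmod 777`: with `sudo echo f > file`, the redirection is performed by the calling shell rather than by root. Note that dmesg also retains kills from earlier runs, so the kernel log may need clearing (or the timestamps checking) before a repeat run.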
Index rollback
2022-07-22T05:18:38.879-07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM test1
Scans timing out
2022-07-22T05:07:21.412-07:00 [Info] SCAN##17582 RESPONSE status:(error = Indexer rollback), requestId: c4b8406d-9534-4173-af34-e39bb45a4af3
2022-07-22T05:17:00.692-07:00 [Info] SCAN##17652 RESPONSE status:(error = Indexer rollback), requestId: ca08cb49-821a-4309-8a02-c4fed2e47bca
2022-07-22T05:18:39.065-07:00 [Info] SCAN##17657 RESPONSE status:(error = Indexer rollback), requestId: fcea066f-4378-461f-9b30-5f22b7b4dd10
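To check another run for the same symptoms, the two log signatures quoted above can be searched for directly. The sketch below reads indexer log text on stdin; the function names are hypothetical, and the log file location in the usage note is an assumption about a default Linux install.

```shell
#!/bin/sh
# Print every rollback-to-zero event found in the indexer log text on stdin.
rollbacks_to_zero() {
  grep -F 'StorageMgr::rollbackAllToZero'
}

# Count scan responses that failed with "Indexer rollback".
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
count_rollback_scans() {
  grep -cF 'RESPONSE status:(error = Indexer rollback)' || true
}
```

For example, `count_rollback_scans < /opt/couchbase/var/lib/couchbase/logs/indexer.log` would report how many scans were rejected during the rollback window.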
Logs ->
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.105.36.zip
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.105.37.zip
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.106.156.zip
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.106.159.zip
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.106.163.zip
s3://cb-customers-secure/cbse122792/122792/2022-07-22/collectinfo-2022-07-22t124922-ns_1@172.23.106.204.zip
Cluster is still live -> http://172.23.105.36:8091/
Issue Links
is duplicated by:
- MB-53172 [6.6.5 build 10104] - Secondary Index rollback to zero after KV node auto failover (Resolved)
- MB-53180 [6.6.5 build 10104] - Primary Index rollback to zero after KV node auto failover (Closed)
- MB-53186 [6.6.5 build 10104] - Multiple primary Indexes rollback to zero after KV node auto failover (Closed)