Details
-
Bug
-
Resolution: Not a Bug
-
Critical
-
Cheshire-Cat
-
Untriaged
-
1
-
Unknown
Description
Saw these errors while investigating 46318 (logs - s3://cb-customers-secure/mb-46190/2021-05-21/collectinfo-2021-05-21t045513-ns_1@172.23.97.77.zip)
1. On node 172.23.97.77, Index gets built:
2021-05-20T14:07:23.072-07:00 [Info] Indexer::handleBuildIndex Added Index: [12917581944872651272] to Stream: INIT_STREAM State: INDEX_STATE_INITIAL
|
2021-05-20T14:07:23.074-07:00 [Info] MutationMgr::handleUpdateIndexInstMap, updated instances:
|
InstanceId: 12917581944872651272 Name: idx1_KDve3 Keyspace: bucket2:scope_2:coll_13 State: INDEX_STATE_INITIAL Stream: INIT_STREAM RState: RebalActive Version: 0 ReplicaId: 3
|
2. Disk snapshot gets created for both partition 2 and partition 4 as part of build done.
2021-05-20T14:11:25.363-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 12917581944872651272 PartitionId 4 Creating New Snapshot SnapshotInfo: count:236481 committed:true
|
2021-05-20T14:11:25.363-07:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 12917581944872651272 PartitionId: 4 SliceId: 0 Crc64: 9528950848570813173 (Sn
|
apshotInfo: count:236481 committed:true) SnapType DISK_SNAP SnapAligned true SnapCreateDur 144.248231ms SnapOpenDur 87.709µs
|
2021-05-20T14:11:25.363-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 12917581944872651272, PartitionId 4 Creating recovery point ...
|
2021-05-20T14:11:25.416-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 12917581944872651272 PartitionId 2 Creating New Snapshot SnapshotInfo: count:236459 committed:true
|
2021-05-20T14:11:25.416-07:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 12917581944872651272 PartitionId: 2 SliceId: 0 Crc64: 9528950848570813173 (Sn
|
apshotInfo: count:236459 committed:true) SnapType DISK_SNAP SnapAligned true SnapCreateDur 53.119628ms SnapOpenDur 82.812µs
|
2021-05-20T14:11:25.416-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 12917581944872651272, PartitionId 2 Creating recovery point ..
|
3. Merge happens and index moves to ACTIVE state
2021-05-20T14:11:25.601-07:00 [Info] Timekeeper::checkInitStreamReadyToMerge Index Ready To Merge. Index: 12917581944872651272 Stream: INIT_STREAM KeyspaceId: bucket2:scope_
|
2:coll_13 SessionId: 1 LastFlushTS: bucket: bucket2, scopeId: , collectionId: , vbuckets: 1024 Crc64: 9528950848570813173 snapType DISK_SNAP -
|
|
2021-05-20T14:11:25.606-07:00 [Info] MutationMgr::handleUpdateIndexInstMap, updated instances:
|
InstanceId: 12917581944872651272 Name: idx1_KDve3 Keyspace: bucket2:scope_2:coll_13 State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 3
|
4. Test kills indexer for recovery after 25mins.
2021-05-20T14:34:41.639-07:00 [Info] Indexer started with command line:
|
5. Plasma is seen recovering both the partitions:
2021-05-20T14:34:48.584-07:00 [Info] bucket2/idx1_72qm/Backstore#11860434538765390097:3 Plasma: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_4.index/docIndex: Repair 0 added 0 deleted
|
2021-05-20T14:34:49.161-07:00 [Info] bucket2/idx1_72qm/Backstore#11860434538765390097:3 Plasma: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_2.index/docIndex: Repair 0 added 0 deleted
|
|
2021-05-20T14:34:57.035-07:00 [Info] bucket2/idx1_72qm/Mainstore#11860434538765390097:3 Plasma: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_4.index/mainIndex: Repair 20 added 0 deleted
|
2021-05-20T14:34:57.110-07:00 [Info] bucket2/idx1_72qm/Mainstore#11860434538765390097:3 Plasma: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_2.index/mainIndex: Repair 27 added 0 deleted
|
6. Then "Unable to find recovery point" message appears for both the partitions and disk snapshot is not available for loading.
2021-05-20T14:34:57.211-07:00 [Info] plasmaSlice::doRecovery SliceId 0 IndexInstId 12917581944872651272 PartitionId 4 Recovering from recovery point ..
|
2021-05-20T14:34:57.211-07:00 [Info] plasmaSlice::doRecovery SliceId 0 IndexInstId 12917581944872651272 PartitionId 4 Unable to find recovery point. Resetting store ..
|
|
2021-05-20T14:36:37.598-07:00 [Info] plasmaSlice::doRecovery SliceId 0 IndexInstId 12917581944872651272 PartitionId 2 Recovering from recovery point ..
|
2021-05-20T14:36:37.598-07:00 [Info] plasmaSlice::doRecovery SliceId 0 IndexInstId 12917581944872651272 PartitionId 2 Unable to find recovery point. Resetting store ..
|
|
2021-05-20T14:36:37.596-07:00 [Info] plasmaSlice:NewplasmaSlice Created New Slice Id 0 IndexInstId 12917581944872651272 partitionId 4 WriterThreads 2
|
2021-05-20T14:36:37.596-07:00 [Info] Indexer::initPartnInstance Initialized Slice:
|
Index: 12917581944872651272 Slice: SliceId: 0 File: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_4.index Index: 12917581944872651272 Partition: 4
|
2021-05-20T14:36:37.596-07:00 [Info] Indexer::initPartnInstance Initialized Partition:
|
Index: 12917581944872651272 Partition: PartitionId: 2 Endpoints: [:9105]
|
|
2021-05-20T14:36:38.996-07:00 [Info] Indexer::initPartnInstance Initialized Slice:
|
Index: 12917581944872651272 Slice: SliceId: 0 File: /data/@2i/bucket2_idx1_KDve3_12917581944872651272_2.index Index: 12917581944872651272 Partition: 2
|
2021-05-20T14:36:38.996-07:00 [Info] StorageMgr::updateIndexSnapMapForIndex IndexInst 12917581944872651272 Partitions [4 2]
|
2021-05-20T14:36:38.996-07:00 [Info] StorageMgr::openSnapshot IndexInst:12917581944872651272 Partition:4 No Snapshot Found.
|
2021-05-20T14:36:38.996-07:00 [Info] StorageMgr::updateIndexSnapMapForIndex IndexInst 12917581944872651272 Adding Nil Snapshot.
|
7. There are multiple such messages. I have investigated only the last one.
cbcollect_info_ns_1@172.23.97.77_20210521-045516 > grep "Unable to find recovery point. Resetting store" ns_server.indexer.log | wc -l
|
49
|