Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.1
-
6.6.5-10080
7.1.1-3115
-
Untriaged
-
Centos 64-bit
-
1
-
Unknown
Description
STEPS
- Cluster consists of 5 nodes (2 kv, 1 eventing, index and query node each) running CB server 6.6.5-10080.
- Create the required set of buckets.
- Create 2 eventing functions - bucket op and timers.
- Deploy timer function.
- Perform offline upgrade of the cluster of 7.1.1-3115.
OBSERVATION
Timers handler is stuck in deploying state.
errors in eventing.log post upgrade
2022-06-07T23:28:07.055-07:00 [Error] Producer::Serve [timers:0] Failed to get meta data handle , err: App: timers is not registered by any of the member pools |
2022-06-07T23:28:56.610-07:00 [Info] {"allocated":65564296,"heap_allocated":65564296,"heap_idle":35356672,"heap_in_use":68321280,"heap_objects":220087,"heap_os_related":1417216,"heap_system":103677952,"last_gc_cycle":"2022-06-07 23:28:01.989445202 -0700 PDT","memory_allocations":790202,"memory_frees":570115,"next_gc_cycle":"95ms","num_gc":16,"stack_cache_in_use":4800,"stack_in_use":1179648,"stack_span_in_use":268600,"stack_system":1179648,"total_allocated":161863920} |
2022-06-07T23:29:56.611-07:00 [Info] {"allocated":72030136,"heap_allocated":72030136,"heap_idle":29523968,"heap_in_use":74219520,"heap_objects":282242,"heap_os_related":1417216,"heap_system":103743488,"last_gc_cycle":"2022-06-07 23:28:01.989445202 -0700 PDT","memory_allocations":855780,"memory_frees":573538,"next_gc_cycle":"95ms","num_gc":16,"stack_cache_in_use":4800,"stack_in_use":1114112,"stack_span_in_use":338232,"stack_system":1114112,"total_allocated":168329760} |
2022-06-07T23:30:56.611-07:00 [Info] {"allocated":6802928,"heap_allocated":6802928,"heap_idle":93437952,"heap_in_use":10371072,"heap_objects":54150,"heap_os_related":92807168,"heap_system":103809024,"last_gc_cycle":"2022-06-07 23:30:02.05224944 -0700 PDT","memory_allocations":871422,"memory_frees":817272,"next_gc_cycle":"11ms","num_gc":17,"stack_cache_in_use":4800,"stack_in_use":1048576,"stack_span_in_use":153680,"stack_system":1048576,"total_allocated":170257480} |
2022-06-07T23:31:14.846-07:00 [Error] Eventing::main metakv observe error for global config, err: unexpected EOF. Retrying... |
2022-06-07T23:31:14.846-07:00 [Error] Eventing::main metakv observe error for event handler code, err: unexpected EOF. Retrying... |
2022-06-07T23:31:14.846-07:00 [Warn] servicesChangeNotifier: Connection terminated for collection manifest notifier instance of http://%40eventing@127.0.0.1:8091, default, bucket: n1ql_op_dst, (unexpected EOF). Retrying... |
2022-06-07T23:31:14.846-07:00 [Error] ServiceMgr::initService metakv observe error for temp store, err: unexpected EOF. Retrying... |
2022-06-07T23:31:14.846-07:00 [Error] ServiceMgr::initService metakv observe error for setting store, err: unexpected EOF. Retrying... |
2022-06-07T23:31:14.846-07:00 [Error] Eventing::main metakv observe error for debugger, err: unexpected EOF. Retrying. |
Also following set of log statements indicate internal undeployment was triggered for timers handler due to deletion of metadata collection.
Can confirm that there was no step involving deletion of metadata collection in the test so not sure how it was triggered.
2022-06-07T23:28:02.243-07:00 [Info] SuperSupervisor::checkDeletedCid Undeploying timers Reason: metadata collection delete err: <nil> |
2022-06-07T23:28:02.243-07:00 [Info] SuperSupervisor::StopProducer [1] Function: timers stopping running producer instance, DeleteFunction: false, SkipMetadataCleanup: true, UpdateMetakv: true |
2022-06-07T23:28:02.243-07:00 [Info] SuperSupervisor::deleteFromLocallyDeployedApps [1] Function: timers deleting from locally deployed apps map |
2022-06-07T23:28:02.243-07:00 [Info] SuperSupervisor::CleanupProducer [1] Function: timers stopping running instance of Eventing.Producer, DeleteFunction: false, SkipMetadataCleanup: true, UpdateMetakv: true |
2022-06-07T23:28:02.243-07:00 [Info] Producer::StopRunningConsumers [timers:0] Stopping running instances of Eventing.Consumer |
2022-06-07T23:28:02.248-07:00 [Info] SuperSupervisor::deleteFromRunningProducers [0] Function: timers deleted from running functions |
NOTE
Issue is not reproducible consistently, test has ran into it only once.
PFA logs for eventing node.
Attachments
Issue Links
- backports to
-
MB-52560 [BP of MB-52474]: [Upgrade] Timers handler stuck in deploying state after offline upgrade from 6.6.5 to 7.1.1
- Closed