Details
- Bug
- Resolution: Fixed
- Critical
- 7.0.0, 7.0.1, 7.0.2, 7.1.0
- Untriaged
- 1
- No
Description
Some listeners appear not to be stopped, which leaks goroutines:
Looking for CollectionRoutingEventListener_ ...
Started times: 214
Stopped times: 186
Looking for DataClonedEventListener_ ...
Started times: 214
Stopped times: 186
Looking for DataFailedCREventListener_ ...
Started times: 428
Stopped times: 372
Looking for DataFilteredEventListener_ ...
Started times: 214
Stopped times: 186
Looking for DataProcessedEventListener_ ...
Started times: 214
Stopped times: 186
Looking for DataReceivedEventListener_ ...
Started times: 214
Stopped times: 186
Looking for DataSentCasChangedEventListener_ ...
Started times: 428
Stopped times: 372
Looking for DataSentEventListener_ ...
Started times: 428
Stopped times: 372
Looking for DataThrottledEventListener_ ...
Started times: 428
Stopped times: 372
Looking for DataThroughputThrottledEventListener_ ...
Started times: 214
Stopped times: 186
Looking for GetReceivedEventListener_ ...
Started times: 428
Stopped times: 372
Looking for TargetDataSkippedEventListener_ ...
Started times: 428
Stopped times: 372
Looking for CollectionRoutingEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for DataClonedEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for DataFailedCREventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
Looking for DataFilteredEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for DataProcessedEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for DataReceivedEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for DataSentCasChangedEventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
Looking for DataSentEventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
Looking for DataThrottledEventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
Looking for DataThroughputThrottledEventListener_ (mainPipeline)...
Started times: 90
Stopped times: 92
Looking for GetReceivedEventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
Looking for TargetDataSkippedEventListener_ (mainPipeline)...
Started times: 180
Stopped times: 184
The problem is in the backfill pipeline, where starts exceed stops:
Looking for CollectionRoutingEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for DataClonedEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for DataFailedCREventListener_ (backfill)...
Started times: 248
Stopped times: 188
Looking for DataFilteredEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for DataProcessedEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for DataReceivedEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for DataSentCasChangedEventListener_ (backfill)...
Started times: 248
Stopped times: 188
Looking for DataSentEventListener_ (backfill)...
Started times: 248
Stopped times: 188
Looking for DataThrottledEventListener_ (backfill)...
Started times: 248
Stopped times: 188
Looking for DataThroughputThrottledEventListener_ (backfill)...
Started times: 124
Stopped times: 94
Looking for GetReceivedEventListener_ (backfill)...
Started times: 248
Stopped times: 188
Looking for TargetDataSkippedEventListener_ (backfill)...
Started times: 248
Stopped times: 188
Some listeners (for example DataFailedCREventListener_ and DataSentEventListener_) are started and stopped twice as often as the others.
This exhibits the same goroutine leak as MB-48722, which can cause memory bloat.
Issue Links
- is duplicated by
  - MB-48838 [Magma, 10TB, KV+XDCR, 1%]: KV Rebalance-In failed during get_dcp_docs_estimate. (Closed)
- relates to
  - MB-48722 XDCR - async listener may not stop properly (Closed)
  - MB-48728 [System Test] XDCR OOM killed multiple times (Closed)
  - MB-48772 [System Test] Rebalance exited with reason not_all_nodes_are_ready_yet (Closed)
  - MB-48672 [System Test] batchGetMeta received fatal error and had to abort - error observed in longevity (Closed)
  - MB-48677 [System Test][XDCR] RuntimeCtx : Execution timed out - observed in longevity during topology changes (Closed)
For Gerrit Dashboard: MB-48787

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
163205,6 | MB-48787 - backfill pipelines that failed to start may be orphaned and left running | master | goxdcr | MERGED | +2 | +1 |
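
The subject of the merged change says that backfill pipelines that failed to start may be orphaned and left running. The sketch below illustrates that general failure mode and one way to guard against it: if startup fails after listeners have been launched, stop them before returning the error. This is not the goxdcr change itself; `Listener`, `Pipeline`, `errStartup`, and the `finish` callback are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errStartup is a hypothetical error used to simulate a failed pipeline start.
var errStartup = errors.New("simulated startup failure")

// Listener is a hypothetical async listener that owns exactly one goroutine.
type Listener struct {
	name string
	stop chan struct{} // closed to ask the goroutine to exit
	done chan struct{} // closed by the goroutine when it has exited
}

func NewListener(name string) *Listener {
	return &Listener{name: name, stop: make(chan struct{}), done: make(chan struct{})}
}

// Start launches the listener goroutine, which runs until Stop is called.
func (l *Listener) Start() {
	go func() {
		defer close(l.done)
		<-l.stop // a real listener would process events until told to stop
	}()
}

// Stop signals the goroutine and waits for it to exit. Call at most once.
func (l *Listener) Stop() {
	close(l.stop)
	<-l.done
}

// Pipeline is a hypothetical pipeline that tracks which listeners are running.
type Pipeline struct {
	listeners []*Listener
	running   []*Listener
}

// Start launches every listener and then runs the remaining startup step.
// If that step fails, the listeners already started are stopped so that
// the failed pipeline does not leave goroutines behind.
func (p *Pipeline) Start(finish func() error) error {
	for _, l := range p.listeners {
		l.Start()
		p.running = append(p.running, l)
	}
	if err := finish(); err != nil {
		p.Stop()
		return fmt.Errorf("pipeline start failed: %w", err)
	}
	return nil
}

// Stop stops every running listener; calling it again is a no-op.
func (p *Pipeline) Stop() {
	for _, l := range p.running {
		l.Stop()
	}
	p.running = nil
}

func main() {
	p := &Pipeline{listeners: []*Listener{
		NewListener("DataSentEventListener_1"),
		NewListener("DataReceivedEventListener_1"),
	}}
	err := p.Start(func() error { return errStartup })
	fmt.Println("start error:", err, "| listeners still running:", len(p.running))
}
```

Without the `p.Stop()` call on the error path, each failed start would leave one started listener per entry with no matching stop, which is the shape of the started/stopped mismatch reported for the backfill pipeline.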