Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Not a Bug
-
6.6.1
-
6.6.1 build 9147
-
Untriaged
-
1
-
Unknown
Description
Description:
The activity monitor does not appear to be logging inactivity.
Steps to reproduce:
- Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds
- Start a backup asynchronously.
- When ' DCP ' is spotted in the logs repeat the following 5 times:
- pause the memcached process for a total of 7 seconds by sending a SIGSTOP.
- Unpause memcached by sending a SIGCONT.
- Scan the logs for the inactivity message.
What happens:
In 6.6.1-9174, the inactivity message is not present in the backup logs.
However, in 7.0.0-3583, the inactivity message is present:
2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
|
2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
|
2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
|
2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
|
What I expected to happen:
I expected the equivalent message to be present in the 6.6.1-9174. backup logs.
Logs for 7.0.0:
cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip
[^collectinfo-2020-11-09T154308-ns_1@cb.local.zip]
Logs for 6.6.1:
cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip
[^collectinfo-2020-11-09T160228-ns_1@cb.local.zip]
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Description |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present. {code:java} 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present. {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. |
Description |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present. {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Included: Backup logs for 6.6.1: |
Description |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Included: Backup logs for 6.6.1: |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Logs: Logs for 7.0.0: [^cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip] [^collectinfo-2020-11-09T154308-ns_1@cb.local.zip] Logs for 6.6.1: [^cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip] [^collectinfo-2020-11-09T160228-ns_1@cb.local.zip] |
Description |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Logs: Logs for 7.0.0: [^cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip] [^collectinfo-2020-11-09T154308-ns_1@cb.local.zip] Logs for 6.6.1: [^cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip] [^collectinfo-2020-11-09T160228-ns_1@cb.local.zip] |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Logs for 7.0.0: [^cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip] [^collectinfo-2020-11-09T154308-ns_1@cb.local.zip] Logs for 6.6.1: [^cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip] [^collectinfo-2020-11-09T160228-ns_1@cb.local.zip] |
Description |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:java}2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Logs for 7.0.0: [^cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip] [^collectinfo-2020-11-09T154308-ns_1@cb.local.zip] Logs for 6.6.1: [^cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip] [^collectinfo-2020-11-09T160228-ns_1@cb.local.zip] |
*Description:*
The activity monitor does not appear to be logging inactivity. *Steps to reproduce:* # Set CB_DCP_ACTIVITY_MONITOR_FREQUENCY equal to 5 seconds # Start a backup asynchronously. # When ' DCP ' is spotted in the logs repeat the following 5 times: ## pause the memcached process for a total of 7 seconds by sending a SIGSTOP. ## Unpause memcached by sending a SIGCONT. # Scan the logs for the inactivity message. *What happens:* In 6.6.1-9174, the inactivity message is not present in the backup logs. However, in 7.0.0-3583, the inactivity message is present: {code:noformat} 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 153) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 167) Stream has been inactive for 5s, last seqno 113 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 209) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 2020-11-09T15:42:35.583+00:00 WARN: (DCP) (default) (vb 223) Stream has been inactive for 5s, last seqno 112 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247 {code} *What I expected to happen:* I expected the equivalent message to be present in the 6.6.1-9174. backup logs. Logs for 7.0.0: [^cbbackupmgr-collectinfo-entbackup_10.112.210.101-2020-11-09T154345.zip] [^collectinfo-2020-11-09T154308-ns_1@cb.local.zip] Logs for 6.6.1: [^cbbackupmgr-collectinfo-entbackup_10.112.202.101-2020-11-09T160246.zip] [^collectinfo-2020-11-09T160228-ns_1@cb.local.zip] |
Labels | backup |
Assignee | James Lee [ james.lee ] | Asad Zaidi [ JIRAUSER25025 ] |
Attachment | cbbackupmgr-collectinfo-couchbase-archive-2020-11-09T173359.zip [ 114962 ] |
Fix Version/s | Cheshire-Cat [ 15915 ] |
Resolution | Not a Bug [ 10200 ] | |
Status | Open [ 1 ] | Closed [ 6 ] |
Labels | backup | backup manual_testing |
Fix Version/s | 7.0.0 [ 17233 ] |
Fix Version/s | Cheshire-Cat [ 15915 ] |
Asad Zaidi and I spoke about this before the MB was raised, I'd like to highlight a few things from that conversation:
1) This was broken in 6.6.0 and should have been fixed in 6.6.1
2) Asad is using the same test case between MB and CC
3) There hasn't been any (significant) code change around the DCP client from MH to CC (most of, if not all of the changes are around line wrapping and function signatures)
4) The activity monitor is slightly more limited that it used to be due to synchronous backfill i.e. a stream is only marked active upon receiving the first mutation. If a stream never receives a mutation (but does receive a snapshot marker) the activity monitor will not be triggered.
5) We do have unit testing around this (however it may be wrong)
With that out of the way, the activity monitor seems to working as expected (note that I'm building with the 'mad-hatter' branch, all 6.6.1 builds should be equivalent); I'm using the following steps to test it:
1) Create a test cluster
2) Load the travel-sample bucket
3) Add a 'time.Sleep(5 * time.Second)' just before begin opening DCP streams (obviously this can't be done in QE)
4) Run 'cbbackupmgr' with 'CB_DCP_ACTIVITY_MONITOR_FREQUENCY=1s'
5) Whilst 'cbbackupmgr' is sleeping, use the 'tc' command to introduce an artificial delay to the network of 5s
6) Let the backup run for a little bit
7) Remove the artificial delay
8) Inspect the logs
Looking at the logs from the above steps, I see:
2020-11-09T17:28:24.278+00:00 TEMP: enable tc filter -- couchbase.(*Source).Data() at source.go:312
2020-11-09T17:28:29.279+00:00 (DCP) (travel-sample) (vb 0) Creating DCP stream {"uuid":54313299993859,"start_seqno":0,"end_seqno":38,"snap_start":0,"snap_end":0,"retries":0}
2020-11-09T17:28:38.273+00:00 WARN: (DCP) (travel-sample) (vb 0) Stream has been inactive for 1s, last seqno 12 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
2020-11-09T17:28:39.273+00:00 WARN: (DCP) (travel-sample) (vb 0) Stream is no longer inactive, last seqno 14 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:236
2020-11-09T17:28:40.273+00:00 WARN: (DCP) (travel-sample) (vb 0) Stream has been inactive for 1s, last seqno 14 -- couchbase.(*DCPAsyncWorker).monitorActivity.func1() at dcp_async_worker.go:247
2020-11-09T17:28:41.658+00:00 (DCP) (travel-sample) (vb 0) Stream closed because all items were streamed | {"uuid":54313299993859,"snap_start":38,"snap_end":38,"snap_complete":true,"last_seqno":38,"retries":0}
We can see the activity monitor pick up that things are going slowly i.e it detects the inactivity, and then the new activity. This may just be a case of this being difficult to test without modifying the code, although, it is odd that Asad Zaidi has been unable to trigger the activity monitor using testrunner.