Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
7.1.0
-
Untriaged
-
-
1
-
No
Description
Build : 7.1.0-1773
Test : -test tests/integration/neo/test_neo_couchstore_milestone3.yml -scope tests/integration/neo/scope_couchstore.yml
Scale : 3
Iteration : 3
Rebalance started at 2021-11-25T13:09:40 to remove query node 172.23.121.77 from the cluster
[2021-11-25T13:09:40-08:00, sequoiatools/couchbase-cli:7.1:03742a] rebalance -c 172.23.97.74:8091 --server-remove 172.23.121.77 -u Administrator -p password
|
It has been over 4 days, but rebalance is not yet complete. On 172.23.121.77, the following can be seen in the logs :
2021-11-25T13:10:34.403-08:00 [INFO] Graceful shutdown initiated.
|
2021-11-25T13:10:34.403-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
2021-11-25T13:10:34.436-08:00 [INFO] Incoming request from '172.23.104.254:54118' rejected during service shutdown.
|
2021-11-25T13:10:34.907-08:00 [Info] serviceChangeNotifier: received PoolChangeNotification
|
2021-11-25T13:10:34.913-08:00 [INFO] Incoming request from '172.23.104.254:53400' rejected during service shutdown.
|
and even later
2021-11-29T16:52:15.047-08:00 [INFO] Incoming request from '172.23.104.254:55748' rejected during service shutdown.
|
2021-11-29T16:52:20.583-08:00 [INFO] Incoming request from '172.23.104.254:56002' rejected during service shutdown.
|
2021-11-29T16:52:26.090-08:00 [INFO] Incoming request from '172.23.104.254:56234' rejected during service shutdown.
|
2021-11-29T16:52:37.283-08:00 [INFO] Incoming request from '172.23.104.254:56730' rejected during service shutdown.
|
2021-11-29T16:52:14.627-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
2021-11-29T16:52:24.680-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
2021-11-29T16:52:34.735-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
2021-11-29T16:52:44.795-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
2021-11-29T16:52:54.883-08:00 [INFO] Shutdown: Waiting for 1 active request(s) and 0 active transaction(s) to complete.
|
Upon querying the system:active_requests, I can find one query in timeout state.
{
|
"active_requests": {
|
"clientContextID": "query_thread_4",
|
"elapsedTime": "99h48m1.920782733s",
|
"executionTime": "99h48m1.920713777s",
|
"n1qlFeatCtrl": 76,
|
"node": "172.23.121.77:8091",
|
"phaseOperators": {
|
"authorize": 1,
|
"stream": 1
|
},
|
"phaseTimes": {
|
"authorize": "38.83µs",
|
"instantiate": "63.99µs",
|
"parse": "2.179918ms",
|
"plan": "3.920764ms",
|
"stream": "3.652µs"
|
},
|
"remoteAddr": "172.23.104.254:45268",
|
"requestId": "a295f87b-e361-4959-aaa8-74ed2b290c06",
|
"requestTime": "2021-11-25T13:07:09.949-08:00",
|
"scanConsistency": "scan_plus",
|
"state": "timeout",
|
"statement": "SELECT h.name, h.city, r.author FROM `default`:`bucket6`.`scope_2`.`coll_0` h UNNEST reviews AS r WHERE r.ratings.Rooms < 2 AND h.avg_rating >= 3 ORDER BY r.author DESC LIMIT 100;",
|
"statementType": "SELECT",
|
"useCBO": true,
|
"userAgent": "couchbase-java-client/${project.version} (git: ${git.commit.id.describe}, core: ${git.commit.id.describe}) (Linux/3.10.0-1062.9.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_302-b08)",
|
"users": "Administrator"
|
}
|
}
|
So this query must be the one that is holding up the graceful shutdown.
1. Shouldn't a query that has timed out not get listed as an active request.
2. Graceful shutdown should wait only for requests in active state.
Query nodes : 172.23.120.58, 172.23.121.77, 172.23.120.81
The logs attached are from about 2 hrs after the rebalance started. If you want logs from any other time stamp, please let me know.
Attachments
For Gerrit Dashboard: MB-49827 | ||||||
---|---|---|---|---|---|---|
# | Subject | Branch | Project | Status | CR | V |
166583,2 | MB-49827 Graceful shutdown: interrogate active request state and don't wait if not running | master | query | Status: MERGED | +2 | +1 |