Description
I am not quite sure why it is panicking, but our upgrade test panics only during this upgrade type (online upgrade using failovers). That is, we fail over a server, upgrade it while it is failed over, then recover the node.
4 nodes-
172.23.217.148:8091 => {'services': ['index', 'kv', 'n1ql']
172.23.217.149:8091 => {'services': ['fts', 'index', 'kv', 'n1ql'],
172.23.217.150:8091 => {'services': ['index', 'kv', 'n1ql'],
172.23.217.151:8091 => {'services': ['index', 'kv', 'n1ql'],
first node that is upgraded is .149 (then mixed mode testing takes place) -
2024-01-05 15:27:58,570 - root - INFO - Failing over 172.23.217.149:8091 with graceful=False
2024-01-05 15:33:42,878 - root - INFO - rebalancing was completed with progress: 100% in 90.16561341285706 sec
2024-01-05 15:33:42,878 - root - INFO - upgraded 1 servers: [ip:172.23.217.149 port:8091 ssh_username:root]
then .151 is upgraded-
2024-01-05 15:34:30,966 - root - INFO - Failing over 172.23.217.151:8091 with graceful=False
2024-01-05 15:40:15,858 - root - INFO - rebalancing was completed with progress: 100% in 90.23029613494873 sec
then .150 is upgraded-
2024-01-05 15:40:16,913 - root - INFO - Failing over 172.23.217.150:8091 with graceful=False
2024-01-05 15:46:02,700 - root - INFO - rebalancing was completed with progress: 100% in 90.2540225982666 sec
and finally .148 is upgraded-
2024-01-05 15:46:03,754 - root - INFO - Failing over 172.23.217.148:8091 with graceful=False
2024-01-05 15:52:06,830 - root - INFO - rebalancing was completed with progress: 100% in 111.37833738327026 sec
2024-01-05 15:52:06,831 - root - INFO - successfully upgraded 3 remaining servers: [ip:172.23.217.151 port:8091 ssh_username:root, ip:172.23.217.150 port:8091 ssh_username:root, ip:172.23.217.148 port:8091 ssh_username:root]
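To narrow down where the panic is introduced, it may help to map each panic timestamp onto the upgrade step that was running at that moment. The following is a minimal sketch, not part of the test suite: the `WINDOWS` table is copied from the log lines above, `window_for` is a hypothetical helper, and it assumes the test-runner clock and the query log (which is stamped `-08:00`) are in the same timezone. The timestamp used is the one from the example panic further down in this report.

```python
from datetime import datetime

# Upgrade windows copied from the test log above:
# (node, failover start, rebalance completed).
WINDOWS = [
    ("172.23.217.149", "2024-01-05 15:27:58,570", "2024-01-05 15:33:42,878"),
    ("172.23.217.151", "2024-01-05 15:34:30,966", "2024-01-05 15:40:15,858"),
    ("172.23.217.150", "2024-01-05 15:40:16,913", "2024-01-05 15:46:02,700"),
    ("172.23.217.148", "2024-01-05 15:46:03,754", "2024-01-05 15:52:06,830"),
]

# Python logging's default timestamp layout: date, time, comma, milliseconds.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def parse(ts):
    """Parse a test-log timestamp into a naive datetime."""
    return datetime.strptime(ts, FMT)


def window_for(event_ts):
    """Return the node whose upgrade window contains event_ts, or None."""
    event = parse(event_ts)
    for node, start, end in WINDOWS:
        if parse(start) <= event <= parse(end):
            return node
    return None


# Assumption: the query.log timestamp 2024-01-05T15:50:29.852-08:00 is in the
# same timezone as the test log, so it is rewritten here in the test-log format.
panic_ts = "2024-01-05 15:50:29,852"
print(window_for(panic_ts))
```

If the timezone assumption holds, the example panic falls inside the window in which .148, the last node, was failed over and upgraded. The other nine panics would need the same check before drawing any conclusion.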
Here is the console log (no obvious panics are visible when inspecting it):
http://qa.sc.couchbase.com/job/test_suite_executor/660280/console
It is important to note that the other upgrade paths we test (offline upgrade, online upgrade via swap rebalance, online upgrade via rebalance) do not see this panic. I am not sure where in the test the panic is being introduced.
before upgrade - we create UDFs and some cbo stats
mixed mode - we run various tests including using the above udfs and cbo stats
fully upgraded - we run various tests including using the udfs and cbo stats from pre upgrade
Logs from each node will be attached.
Let me know if more info is required here. I am hoping there is something in the logs that points to what exactly is going wrong.
we see 10 panics according to our test, here is an example:
Stack:
2024-01-05T15:50:29.852-08:00 [INFO] n1fty: NewFTSIndexer2, server: http://127.0.0.1:8091, namespace: default, bucket: N1QL_SYSTEM_BUCKET, scope: N1QL_SYSTEM_SCOPE, keyspace: N1QL_CBO_STATS |
2024-01-05T15:50:29.851-08:00 [Info] GSIC[default/N1QL_SYSTEM_BUCKET-N1QL_SYSTEM_SCOPE-N1QL_CBO_STATS-1704498629842454637] started ... |
2024-01-05T15:50:29.851-08:00 [Info] Receive security change notification. encryption=false |
2024-01-05T15:50:29.852-08:00 [Info] Certificate refreshed successfully with certFile /opt/couchbase/var/lib/couchbase/config/certs/chain.pem, keyFile /opt/couchbase/var/lib/couchbase/config/certs/pkey.pem, caFile /opt/couchbase/var/lib/couchbase/config/certs/ca.pem |
panic: runtime error: invalid memory address or nil pointer dereference
|
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x2a3dfb5] |
|
|
goroutine 1 [running]: |
github.com/couchbase/query/datastore/couchbase.cbAuthorize({0x3a1d820?, 0xc0000524b0?}, 0x18?, 0x0) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/datastore/couchbase/auth.go:263 +0x35 |
github.com/couchbase/query/datastore/couchbase.(*store).Authorize(0x0?, 0x11e0c00?, 0xc00229f960?) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/datastore/couchbase/couchbase.go:483 +0x2c |
github.com/couchbase/query/planner.seqScanAuth({0xc00236c3c0?, 0xc000134d20?}, 0x4?) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/planner/build_scan.go:894 +0x2a8 |
github.com/couchbase/query/planner.allIndexes({0x3a45ab0, 0xc001ccea00}, {0x0, 0x0, 0x70?}, {0x0, 0x0, 0x0?}, 0xc00229fbf8?, 0x0, ...) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/planner/build_scan.go:843 +0x474 |
github.com/couchbase/query/planner.(*builder).buildPredicateScan(0xc0020e3600, {0x3a45ab0, 0xc001ccea00}, 0xc00056ac60, 0xc0013d9900, {0x3a54680?, 0xc002382c80}, {0x0, 0x0, 0x0}, ...) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/planner/build_scan.go:229 +0x6a5 |
github.com/couchbase/query/planner.(*builder).buildScan(0xc0020e3600, {0x3a45ab0, 0xc001ccea00}, 0xc00056ac60) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/planner/build_scan.go:166 +0xd45 |
github.com/couchbase/query/planner.(*builder).selectScan(0xc0020e3600, {0x3a45ab0?, 0xc001ccea00?}, 0xc00056ac60, 0xff?) |
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/planner/build_scan.go:61 +0x2da |
github.com/couchbase/query/planner.(*builder).VisitKeyspaceTerm(0xc0020e3600, 0xc00056ac60) |