Description
Sync Gateway version
SG version - 2.1.0-121
Couchbase server
CBS - 6.0.0-1643
Operating system
Centos for CBS and Sync gateway
Sync gateway Config file
{{ "interface":":4984",
"adminInterface": "0.0.0.0:4985",
"maxIncomingConnections": 0,
"maxCouchbaseConnections": 16,
"maxFileDescriptors": 90000,
"slowServerCallWarningThreshold": 500,
"compressResponses": false,
"logging": {"debug":
{"enabled": true}},
"databases":{
"db":
}
}}}
Sync gateway Log output
https://gist.github.com/sridevi-15/f64ac3ba064e887188938478142f7265
Steps to reproduce
- Set up 3 couchbase servers in the CBS cluster.
- Start sync-gateway with above config by providing 3 cbs urls on the config
Expected behavior
It should start successfully
Actual behavior
Sync gateway failed to start
Note : It passed with previous version 5.5.2-3733 and also 6.0.0-1592. It passed in DI mode(sg accel) with CBS 6.0.0-1643.
It passed with use views in CC mode with CBS 6.0.0-1643
It failed only with GSI
Attachments
- cbs1.zip
- 4.10 MB
- cbs2.zip
- 2.69 MB
- cbs3.zip
- 2.67 MB
- index-logs.txt
- 7 kB
Activity
All the indexes are Ready on 192.168.33.10
2018-09-16T06:34:28.789+00:00 [Info] ClustMgr:handleIndexMap
|
Message: MsgUpdateInstMap
|
InstanceId: 1754351140469947657 Name: sg_syncDocs_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
InstanceId: 2183354563651944360 Name: sg_access_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
InstanceId: 2586860435507563506 Name: sg_roleAccess_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
InstanceId: 7683416839874631805 Name: sg_channels_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
InstanceId: 15621229286760607597 Name: sg_allDocs_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
InstanceId: 13398647996422839803 Name: sg_tombstones_x1 Bucket: data-bucket State: INDEX_STATE_ACTIVE Stream: MAINT_STREAM RState: RebalActive Version: 0 ReplicaId: 0
|
How does sync gateway check for the status of the indexes?
For the failing operation in this test, Sync Gateway is issuing a limit=1 query that exercises the index.
Based on the timing in SG logs on the github ticket, the last check was made at:
2018-09-16T03:05:22.311Z [INF] Timeout waiting for index "sg_channels_x1" to be ready for bucket "data-bucket" - retrying...
Is the timestamp in the above message (2018-09-16T06:34:28.789) correspond to when the indexes became ready? Or were they also ready before that time?
Some additional context:
- SG doesn't issue that query until the index state is showing as online (based on system:indexes), so based on the logs, Sync Gateway detects that the index is online as of 2018-09-16T00:37:51.275Z, but queries continue to time out until at least 2018-09-16T03:05:22.311Z
Sridevi Saragadam - I'm assuming the test is expecting an empty bucket at this point (so the query isn't timing out due to data volume). Can you confirm?
One additional observation.
Since the timeout error in the sync gateway log is caused as a result of a failed query
I took a quick look at the ns_server.query.log on node 192.168.33.10. Its full of null pointer exceptions like
_time=2018-09-16T06:34:30.197+00:00 _level=SEVERE _msg= panic=runtime error: invalid memory address or nil pointer dereference stack=goroutine 67 [running]:
github.com/couchbase/query/server.(*Server).serviceRequest.func1()
goproj/src/github.com/couchbase/query/server/server.go:501 +0xbc
panic(0xe3b140, 0x1873120)
<.......>
Adam Fraser if you could post the specific query that is being issued it would help the query team triage further.
Adam Fraser, Yes it is expecting empty bucket . No docs loaded yet at this point.
Hey @Amit, Adam is traveling and he asked me to look into this. This is the query: https://github.com/couchbase/sync_gateway/blob/master/db/indexes.go#L107-L112
Traun Leyden: That sounds good. We're unfamiliar with sync gateway code. Please debug and let us know if you need any help.
/cc Daniel Petersen
To debug this I'll either have to get a repro setup locally, which will take a decent chunk of effort, or find out from Sridevi Saragadam the details of the hosts where this test is running and coordinate with her so that my debugging doesn't interfere with the automated testing.
I did some local inspection and enabled additional logging, and here are the log messages for the index creation and query strings. (attached as [^index-logs.txt).]
In particular, the query string that corresponds to the code link I linked to earlier:
SELECT [op.name, LEAST(`test_data_bucket`._sync.sequence, op.val.seq),IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][1] AS sequence FROM `test_data_bucket` UNNEST OBJECT_PAIRS(`test_data_bucket`._sync.channels) AS op WHERE [op.name, LEAST(`test_data_bucket`._sync.sequence, op.val.seq),IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)] BETWEEN ['foo', 0] AND ['foo', 1] ORDER BY sequence LIMIT 1
I'm not sure if there is some unexpected/unwanted formatting in that string, if so let me know and I can dig into how to get it formatted better.
Sridevi Saragadam the following steps might reveal something from the failing test:
1. Enable trace level logs for queries by updating your sync gateway config to enable trace level logging on the Query log key, for example:
{
"adminInterface":":4985",
"logging": {
"console":
},
2. Reproduce the issue using those settings.
3. Look into the logs and find the equivalent query mentioned in my previous comment.
4. Try running the query directly on the query workbench to see if it returns in a timely manner.
_time=2018-09-16T08:13:16.014+00:00 _level=SEVERE _msg= panic=runtime error: invalid memory address or nil pointer dereference stack=goroutine 52 [running]:
github.com/couchbase/query/server.(*Server).serviceRequest.func1()
goproj/src/github.com/couchbase/query/server/server.go:501 +0xbc
panic(0xe3b140, 0x1873120)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.8.5/go/src/runtime/panic.go:489 +0x2cf
github.com/couchbase/query/planner.(*Filter).Copy(0xc4211fad20, 0xc42102af38)
goproj/src/github.com/couchbase/query/planner/filter.go:56 +0x68
github.com/couchbase/query/planner.addUnnestPreds(0xc4211fab10, 0xc420e81d80)
goproj/src/github.com/couchbase/query/planner/build_scan_unnest.go:215 +0x473
github.com/couchbase/query/planner.(*builder).buildScan(0xc4206d7a40, 0x18a7860, 0xc42053bf90, 0xc42054caf0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/planner/build_scan.go:105 +0x7d2
github.com/couchbase/query/planner.(*builder).selectScan(0xc4206d7a40, 0x18a7860, 0xc42053bf90, 0xc42054caf0, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/planner/build_scan.go:39 +0x3b1
github.com/couchbase/query/planner.(*builder).VisitKeyspaceTerm(0xc4206d7a40, 0xc42054caf0, 0x18d88a0, 0xc42141a240, 0xf4ada0, 0xc4211a4870)
goproj/src/github.com/couchbase/query/planner/build_select_from.go:118 +0xf3
github.com/couchbase/query/algebra.(*KeyspaceTerm).Accept(0xc42054caf0, 0x18aacc0, 0xc4206d7a40, 0x18ad6c0, 0xc4211a4870, 0xc4211fab10, 0x0)
goproj/src/github.com/couchbase/query/algebra/from_keyspace.go:58 +0x3b
github.com/couchbase/query/planner.(*builder).VisitExpressionTerm(0xc4206d7a40, 0xc4211fa630, 0x9371f3, 0xc4211fa630, 0x18a5540, 0xc4211fa630)
goproj/src/github.com/couchbase/query/planner/build_select_from.go:176 +0x628
github.com/couchbase/query/algebra.(*ExpressionTerm).Accept(0xc4211fa630, 0x18aacc0, 0xc4206d7a40, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/algebra/from_expression.go:38 +0x3b
github.com/couchbase/query/planner.(*builder).VisitUnnest(0xc4206d7a40, 0xc4214892c0, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/planner/build_select_from.go:446 +0x9c
github.com/couchbase/query/algebra.(*Unnest).Accept(0xc4214892c0, 0x18aacc0, 0xc4206d7a40, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/algebra/from_unnest.go:37 +0x3e
github.com/couchbase/query/planner.(*builder).visitFrom(0xc4206d7a40, 0xc420fed950, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/planner/build_select_from.go:74 +0x617
github.com/couchbase/query/planner.(*builder).VisitSubselect(0xc4206d7a40, 0xc420fed950, 0x0, 0x0, 0x0, 0x0)
goproj/src/github.com/couchbase/query/planner/build_select_sub.go:194 +0x80a
github.com/couchbase/query/algebra.(*Subselect).Accept(0xc420fed950, 0x18aacc0, 0xc4206d7a40, 0x18b0ac0, 0xc4211fa780, 0x0, 0x0)
CREATE INDEX `ix20` ON `default`(ALL (ARRAY [op.name, LEAST(`self`._sync.sequence,op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)] FOR op IN OBJECT_PAIRS(`self`._sync.channels) END), `self`._sync.rev, `self`._sync.sequence, `self`._sync.flags); |
|
SELECT [op.name, LEAST(`d`._sync.sequence, op.val.seq),IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][1] AS sequence FROM `default` UNNEST OBJECT_PAIRS(`d`._sync.channels) AS op WHERE [op.name, LEAST(`d`._sync.sequence, op.val.seq),IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)] BETWEEN ['foo', 0] AND ['foo', 1] ORDER BY sequence LIMIT 1; |
Raju Suravarjjala, Keshav Murthy If this need for 6.0.0 please triage this.
Build couchbase-server-6.5.0-1343 contains query commit 7b4d40c with commit message:
MB-31367. Copy expression when it is not nil
verified on 6.0.0-1658 and 6.5.0-1343. Test has passed now on both the builds
https://github.com/couchbase/sync_gateway/issues/3755