Details
- Bug
- Resolution: Fixed
- Critical
- 4.0.0
- Security Level: Public
- None
- CentOS 6.x, 4 cores, 15 GB RAM on each node
- Untriaged
- No
Description
Build
4.0.0-1767
Clusters
-----------
C1 : http://172.23.105.44:8091/
C2 : http://172.23.105.54:8091/
The clusters are available for investigation.
What we do in XDCR System test
------------------------------
1. Load both clusters, C1 [8 nodes] and C2 [8 nodes], until vb_active_resident_items_ratio < 50 for standardbucket and < 70 for standardbucket1.
2. Create XDCR:
C1.standardbucket <--> C2.standardbucket, no filter
C1.standardbucket1 --> C1.standardbucket1, no filter
No replication on the sasl bucket.
3. Access phase with 98% gets and 2% sets, run for 3 hours.
4. Rebalance out 1 node at C1 with workload.
5. Rebalance in the same node with workload.
6. Fail over one node (172.23.105.47) with workload, then rebalance. The rebalance failed here, although .47 is no longer part of C1.
7. A few more steps that did not run because of the failure at step 6.
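The stop condition for the load phase in step 1 can be sketched as a small check. This is only an illustration, not the system test framework's code: `THRESHOLDS` and `load_phase_done` are hypothetical names, and the ratios are assumed to have been read already from each bucket's vb_active_resident_items_ratio stat.

```python
# Hypothetical helper; the thresholds are the ones given in step 1.
THRESHOLDS = {"standardbucket": 50.0, "standardbucket1": 70.0}


def load_phase_done(ratios, thresholds=THRESHOLDS):
    """Return True once every bucket's resident ratio is below its threshold.

    `ratios` maps bucket name -> latest vb_active_resident_items_ratio (percent).
    A bucket that is missing from `ratios` counts as not done yet.
    """
    for bucket, limit in thresholds.items():
        ratio = ratios.get(bucket)
        if ratio is None or ratio >= limit:
            return False
    return True
```

For example, `load_phase_done({"standardbucket": 48.2, "standardbucket1": 71.0})` is false, because standardbucket1 has not yet dropped below 70.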
goxdcr crashes have taken over the UI log on C1. The same is seen on C2.
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: github.com/couchbase/gomemcached/client.(*UprFeed).runFeed(0xc294d2d560, 0xc231a2d1e0)
/home/couchbase/jenkins/workspace/sherlock-unix/godeps/src/github.com/couchbase/gomemcached/client/upr_feed.go:398 +0x162
created by github.com/couchbase/gomemcached/client.(*UprFeed).StartFeed
/home/couchbase/jenkins/workspace/sherlock-unix/godeps/src/github.com/couchbase/gomemcached/client/upr_feed.go:328 +0x65
[goport] 2015/04/09 07:54:15 /opt/couchbase/bin/goxdcr terminated: exit status 2
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: github.com/couchbase/goxdcr/parts.(*DcpNozzle).processData(0xc28e0f45a0, 0x0, 0x0)
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/dcp_nozzle.go:269 +0x1605
created by github.com/couchbase/goxdcr/parts.(*DcpNozzle).Start
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/dcp_nozzle.go:168 +0x381
[goport] 2015/04/09 07:49:21 /opt/couchbase/bin/goxdcr terminated: exit status 2
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: net/http.(*persistConn).readLoop(0xc264bd0630)
/usr/local/go/src/pkg/net/http/transport.go:782 +0x95
created by net/http.(*Transport).dialConn
/usr/local/go/src/pkg/net/http/transport.go:600 +0x93f
[goport] 2015/04/09 07:43:59 /opt/couchbase/bin/goxdcr terminated: exit status 2
3 repeated crashes on .50:
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: github.com/couchbase/goxdcr/parts.func·007(0x1f4, 0xc20df37da0, 0xc20c415350, 0xc20c415380, 0xc20c415320, 0x3b9aca00, 0xbd4b68, 0x0, 0x0, 0xc20de93260, ...)
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go:1186 +0x430
created by github.com/couchbase/goxdcr/parts.(*XmemNozzle).batchGetMeta
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go:1220 +0x390
[goport] 2015/04/09 07:40:56 /opt/couchbase/bin/goxdcr terminated: exit status 2 ns_log000 ns_1@172.23.105.50 07:40:56 - Thu Apr 9, 2015
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: github.com/couchbase/goxdcr/parts.func·007(0x1f4, 0xc2164f4180, 0xc215ce9ec0, 0xc215ce9ef0, 0xc215ce9e90, 0x3b9aca00, 0xbd4b68, 0x0, 0x0, 0xc212f7b180, ...)
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go:1186 +0x430
created by github.com/couchbase/goxdcr/parts.(*XmemNozzle).batchGetMeta
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go:1220 +0x390
[goport] 2015/04/09 07:40:16 /opt/couchbase/bin/goxdcr terminated: exit status 2 ns_log000 ns_1@172.23.105.50 07:40:16 - Thu Apr 9, 2015
Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: net/http.(*persistConn).readLoop(0xc216288840)
/usr/local/go/src/pkg/net/http/transport.go:782 +0x95
created by net/http.(*Transport).dialConn
/usr/local/go/src/pkg/net/http/transport.go:600 +0x93f
[goport] 2015/04/09 07:39:16 /opt/couchbase/bin/goxdcr terminated: exit status 2 ns_log000 ns_1@172.23.105.50 07:39:16 - Thu Apr 9, 2015
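To see which goroutine shows up most often across entries like the ones quoted above, the UI log text can be grouped by the first frame printed after "Messages:". The following is a rough sketch, assuming the log has been saved as plain text; the two patterns are derived only from the entries in this ticket, and `summarize_crashes` is a hypothetical helper, not part of any Couchbase tool.

```python
import re
from collections import Counter

# First frame printed after "Messages:" in a babysitter restart entry.
FRAME_RE = re.compile(
    r"Port server goxdcr on node \S+ exited with status \d+\. "
    r"Restarting\. Messages: (\S+?)\(0x"
)
# Timestamp and exit status from the matching [goport] line.
EXIT_RE = re.compile(
    r"\[goport\] (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\S*goxdcr terminated: exit status (\d+)"
)


def summarize_crashes(log_text):
    """Count crashes per first frame and list terminations in time order."""
    frames = Counter(FRAME_RE.findall(log_text))
    # Timestamps are YYYY/MM/DD HH:MM:SS, so string order is time order.
    exits = sorted(EXIT_RE.findall(log_text))
    return frames, exits
```

Calling `frames.most_common()` on the result lists the frames by crash count, and `exits` shows how closely the restarts follow each other. Note that the frame shown in the UI log is only the tail of the trace, so this grouping is a triage aid rather than a root-cause analysis.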
Attaching logs from C1 and C2.
Issue Links
- is duplicated by
  - MB-14360 [system test] a large number of errors Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: net/http.(*persistConn).writeLoop(0xc208044420) (Closed)
  - MB-14387 [GoXDCR-System test] Replication is extremely slow (doc receival rate is in 10000s and replication rate is in 10s/100s) (Closed)
- relates to
  - MB-14421 go-couchbase upr panic in UprCloseStream (Closed)