  1. Couchbase Server
  2. MB-30147

Unable to find given hostport in cbauth database

Details

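    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Fix Version/s: 5.5.0
    • Affects Version/s: 5.5.0
    • Component/s: ns_server
    • Triage: Untriaged
    • Operating System: Centos 64-bit
    • Is this a Regression?: Unknown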

    Description

      Noticed the following panic on one of the Eventing nodes in the longevity system test cluster:

      panic: Unable to find given hostport in cbauth database: `172.23.96.210:11210'
       
      goroutine 213266 [running]:
      panic(0xc5e760, 0xc43115b3c0)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/panic.go:500 +0x1a1 fp=0xc431188ac0 sp=0xc431188a30
      github.com/couchbase/eventing/util.(*CbAuthHandler).AuthenticateMemcachedConn(0xc4262bd5c0, 0xc4322fb4a0, 0x13, 0xc422c2e8d0, 0xc422c2e801, 0x0)
      	goproj/src/github.com/couchbase/eventing/util/kv.go:63 +0x22b fp=0xc431188b50 sp=0xc431188ac0
      github.com/couchbase/eventing/dcp.defaultMkConn(0xc4322fb4a0, 0x13, 0x153b460, 0xc4262bd5c0, 0x10000c4281f2000, 0x0, 0x7b)
      	goproj/src/github.com/couchbase/eventing/dcp/conn_pool.go:60 +0x110 fp=0xc431188be8 sp=0xc431188b50
      github.com/couchbase/eventing/dcp.(*connectionPool).GetWithTimeout(0xc42d467940, 0x9356907420000, 0x0, 0x0, 0x0)
      	goproj/src/github.com/couchbase/eventing/dcp/conn_pool.go:139 +0x555 fp=0xc431188dd0 sp=0xc431188be8
      github.com/couchbase/eventing/dcp.(*connectionPool).Get(0xc42d467940, 0xc42984abd0, 0x64, 0xd8df3b)
      	goproj/src/github.com/couchbase/eventing/dcp/conn_pool.go:152 +0x37 fp=0xc431188e08 sp=0xc431188dd0
      github.com/couchbase/eventing/dcp.(*connectionPool).StartDcpFeed(0xc42d467940, 0xc4272d5a80, 0x7b, 0x400000000, 0xc4202383c0, 0xc42984abcd, 0xc42707a210, 0x9, 0x0, 0x0)
      	goproj/src/github.com/couchbase/eventing/dcp/conn_pool.go:216 +0x38 fp=0xc431188e80 sp=0xc431188e08
      github.com/couchbase/eventing/dcp.(*DcpFeed).connectToNodes(0xc423d2b710, 0xc42849afb0, 0x1, 0x1, 0x421e4abcd, 0xc42707a210, 0x67, 0x8)
      	goproj/src/github.com/couchbase/eventing/dcp/upr.go:368 +0x5f2 fp=0xc4311892c0 sp=0xc431188e80
      github.com/couchbase/eventing/dcp.(*Bucket).StartDcpFeedOver(0xc429ff0780, 0xc42ff25740, 0x60, 0x400000000, 0xc42849afb0, 0x1, 0x1, 0xabcd, 0xc42707a210, 0x0, ...)
      	goproj/src/github.com/couchbase/eventing/dcp/upr.go:204 +0x54b fp=0xc431189398 sp=0xc4311892c0
      github.com/couchbase/eventing/consumer.glob..func17(0xc421ddce40, 0x3, 0x3, 0x1, 0x1)
      	goproj/src/github.com/couchbase/eventing/consumer/bucket_ops.go:516 +0x1fd fp=0xc4311895a8 sp=0xc431189398
      github.com/couchbase/eventing/util.Retry(0x1543a60, 0xc4289fb760, 0xc4257660d8, 0xe3aba8, 0xc421ddce40, 0x3, 0x3, 0xd88ed7, 0x1)
      	goproj/src/github.com/couchbase/eventing/util/retry.go:65 +0x63 fp=0xc431189618 sp=0xc4311895a8
      github.com/couchbase/eventing/consumer.(*Consumer).dcpRequestStreamHandle(0xc428389400, 0x100, 0xc42d6b7980, 0x26baa, 0xc42685e930, 0x7)
      	goproj/src/github.com/couchbase/eventing/consumer/process_events.go:901 +0x237a fp=0xc431189e20 sp=0xc431189618
      github.com/couchbase/eventing/consumer.(*Consumer).processReqStreamMessages.func1(0xc432926b60, 0xc428389400, 0xda2eb6, 0x22)
      	goproj/src/github.com/couchbase/eventing/consumer/process_events.go:1187 +0x65 fp=0xc431189f70 sp=0xc431189e20
      runtime.goexit()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc431189f78 sp=0xc431189f70
      created by github.com/couchbase/eventing/consumer.(*Consumer).processReqStreamMessages
      	goproj/src/github.com/couchbase/eventing/consumer/process_events.go:1204 +0xc8f
      

      Flow of events on the Eventing side that led to this panic:

      • Eventing tried to open a DCP stream for one of the vbuckets.
      • It requested the vbmap from ns_server to determine which KV node holds the active copy of that vbucket.
      • Eventing tried to open a DCP connection to that KV node (172.23.96.210), and the attempt failed because the node was not found in the cbauth database.

      Could this be because the vbmap was out of sync?

      2018-06-16T15:47:47-07:00 - time the panic occurred
      2018-06-16T15:11:57-07:00 - rebalance was kicked off
      Between 2018-06-16T15:09:09-07:00 and 2018-06-16T15:11:36-07:00 - Couchbase was stopped on 3 KV nodes after enabling auto-failover

      =====

      Additional details:

      jenkins job link - http://172.23.109.231/job/centos-systest-launcher-2/112/consoleFull

      Prior to this panic, a couple of requests to fail over nodes returned a 503 error:

      [2018-06-16T13:14:38-07:00, sequoiatools/couchbase-cli:2b528b] failover -c 172.23.96.206:8091 --server-failover 172.23.96.210:8091 -u Administrator -p password
       
      Error occurred on container - sequoiatools/couchbase-cli:[failover -c 172.23.96.206:8091 --server-failover 172.23.96.210:8091 -u Administrator -p password]
       
      docker logs 2b528b
      docker start 2b528b
       
      ERROR: Received unexpected status 503
      [pull] sequoiatools/couchbase-cli
      [2018-06-16T13:17:07-07:00, sequoiatools/couchbase-cli:359ee9] failover -c 172.23.96.206:8091 --server-failover 172.23.96.212:8091 -u Administrator -p password --force
       
      Error occurred on container - sequoiatools/couchbase-cli:[failover -c 172.23.96.206:8091 --server-failover 172.23.96.212:8091 -u Administrator -p password --force]
       
      docker logs 359ee9
      docker start 359ee9
       
      ERROR: Received unexpected status 503
      

      The test then enabled auto-failover and stopped the Couchbase service on 3 KV nodes, including 172.23.96.210. A rebalance was then kicked off, during which Eventing panicked.

          People

            vikas.chaudhary Vikas Chaudhary
            asingh Abhishek Singh (Inactive)