Couchbase Server / MB-31699

panic: socket: too many open files - service indexer exited with status 2 accompanied by rebalance failure

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: User Error
    • 6.0.0
    • None
    • secondary-index
    • Untriaged
    • Unknown

    Description

      Longevity test on the CentOS cluster, RC4 build 6.0.0-1693, 8th day. The following panic was observed:

      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Restarting listener
      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Restarting listener
      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Restarting listener
      2018-10-19T00:42:16.250-07:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2018-10-19T00:42:16.251-07:00 [Error] [Queryport ":9101"] Restarting listener
      2018-10-19T00:42:16.251-07:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2018-10-19T00:42:16.251-07:00 [Error] [Queryport ":9101"] Restarting listener
      2018-10-19T00:42:16.251-07:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2018-10-19T00:42:16.251-07:00 [Error] [Queryport ":9101"] failed starting listen tcp :9101: socket: too many open files !!
      panic: listen tcp :9101: socket: too many open files
       
      goroutine 150813773 [running]:
      panic(0xe54d60, 0xc53fb1ba40)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/panic.go:500 +0x1a1
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func1()
      	goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:134 +0x2bc
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func2(0xc42015a180, 0xc6bf40df48)
      	goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:146 +0x1b9
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener(0xc42015a180)
      	goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:163 +0x332
      created by github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func1
      	goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:137 +0x1c0
      [goport(/opt/couchbase/bin/indexer)] 2018/10/19 00:42:23 child process exited with status 2
      2018-10-19T00:42:23.993-07:00 [Info] Indexer started with command line: [/opt/couchbase/bin/indexer -vbuckets=1024 -cluster=127.0.0.1:8091 -adminPort=9100 -scanPort=9101 -httpPort=9102 -streamInitPort=9103 -streamCatchupPort=9104 -streamMaintPort=9105 -storageDir=/data/@2i -diagDir=/opt/couchbase/var/lib/couchbase/crash -nodeUUID=7866ce507b8b5d010227caff7df987f4 -ipv6=false -isEnterprise=true --httpsPort=19102 --certFile=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem --keyFile=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem]
      2018-10-19T00:42:24.031-07:00 [Info] Indexer::NewIndexer Status Warmup
      2018-10-19T00:42:24.043-07:00 [Info] Setting buffer block size to 16384 bytes
      2018-10-19T00:42:24.043-07:00 [Info] Setting maxcpus = 8
      2018-10-19T00:42:24.043-07:00 [Info] Setting log level to Info
      2018-10-19T00:42:24.043-07:00 [Info] Indexer::NewIndexer Build Mode Set Enterprise
      2018-10-19T00:42:24.043-07:00 [Info] Indexer::Cluster Storage Mode Set plasma
      2018-10-19T00:42:24.043-07:00 [Info] Indexer::NewIndexer Starting with Vbucke
      

      It is always accompanied by rebalance failure:

      2018-10-19T00:42:26.383-07:00, ns_orchestrator:0:critical:message(ns_1@172.23.104.164) - Rebalance exited with reason {service_rebalance_failed,index,{lost_connection,shutdown}}
      2018-10-19T00:43:14.196-07:00, compaction_new_daemon:0:info:message(ns_1@172.23.108.103) - User-triggered compaction of bucket `ITEM` completed.
      2018-10-19T00:44:41.865-07:00, compaction_new_daemon:0:info:messag

      supportal: https://supportal.couchbase.com/snapshot/13804bb7d32cb90dab23bfd3d076a08a::0
      cluster live at : http://172.23.108.103:8091/ui/index.html#!/servers/list
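
      The "too many open files" error means the process reached its file descriptor limit (RLIMIT_NOFILE). A minimal Go sketch for comparing a process's open descriptor count against its soft limit follows. It assumes Linux (it reads /proc/self/fd); `fdUsage`, `nearLimit`, and the 0.9 threshold are hypothetical names and values chosen for illustration, not part of the indexer.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fdUsage returns the number of descriptors currently open in this process
// and the soft RLIMIT_NOFILE limit. Linux only: it lists /proc/self/fd, so
// the count includes the descriptor briefly opened to read that directory.
func fdUsage() (open int, soft uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, 0, err
	}
	return len(entries), uint64(rl.Cur), nil
}

// nearLimit reports whether open descriptors have reached the given
// fraction (for example 0.9) of the soft limit.
func nearLimit(open int, soft uint64, fraction float64) bool {
	if soft == 0 {
		return true
	}
	return float64(open) >= fraction*float64(soft)
}

func main() {
	open, soft, err := fdUsage()
	if err != nil {
		fmt.Println("could not read descriptor usage:", err)
		return
	}
	fmt.Printf("open=%d soft limit=%d near limit=%v\n", open, soft, nearLimit(open, soft, 0.9))
}
```

      Logging these numbers periodically during a long-running test would show whether descriptors leak gradually or spike under load.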

        Activity

          jliang John Liang added a comment -

          Indexer node 72.23.96.145 is hitting 100% CPU. If the test keeps flooding the indexer with requests while CPU is high, it could cause this issue.

          wayne Wayne Siu added a comment -

          Re-opening the ticket to remove the fix version.


          People

            arunkumar Arunkumar Senthilnathan (Inactive)
            arunkumar Arunkumar Senthilnathan (Inactive)