  1. Couchbase Server
  2. MB-32826

System Test : GSI crash because of too many open files


Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • 5.5.4
    • 5.5.4
    • secondary-index
    • centos-2

    Description

      Build : 5.5.4-4302
      Test : -test tests/integration/test_allFeaturesWithGSI_vulcan.yml -scope tests/integration/scope_Xattrs_GSI_Vulcan.yml
      Scale : 3
      Iteration : 1

      The indexer log on 172.23.96.215 shows the following panic:

      2019-01-27T01:49:13.477-08:00 [Error] [Queryport ":9101"] failed starting listen tcp :9101: socket: too many open files !!
      panic: listen tcp :9101: socket: too many open files
       
      goroutine 56646123 [running]:
      panic(0xe54100, 0xc75df37d10)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/panic.go:500 +0x1a1 fp=0xc630f5fdd0 sp=0xc630f5fd40
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func1()
              goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:134 +0x2bc fp=0xc630f5fe70 sp=0xc630f5fdd0
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func2(0xc420210200, 0xc630f5ff48)
              goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:146 +0x1b9 fp=0xc630f5fef8 sp=0xc630f5fe70
      github.com/couchbase/indexing/secondary/queryport.(*Server).listener(0xc420210200)
              goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:163 +0x332 fp=0xc630f5ffa8 sp=0xc630f5fef8
      runtime.goexit()
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc630f5ffb0 sp=0xc630f5ffa8
      created by github.com/couchbase/indexing/secondary/queryport.(*Server).listener.func1
              goproj/src/github.com/couchbase/indexing/secondary/queryport/server.go:137 +0x1c0
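
      The "too many open files" error means the process reached its file descriptor limit (RLIMIT_NOFILE). As a debugging aid, the following is a minimal Go sketch, assuming a Linux host, that compares a process's open descriptor count with its soft limit. The names countFDs, softFDLimit and nearLimit are hypothetical helpers written for this illustration and are not part of the indexer code.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// countFDs counts the entries in a /proc/<pid>/fd directory (Linux only).
// When pointed at /proc/self/fd, the descriptor used to read the directory
// itself is included in the count.
func countFDs(dir string) (int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// softFDLimit returns the soft RLIMIT_NOFILE of the current process.
func softFDLimit() (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, err
	}
	return uint64(lim.Cur), nil
}

// nearLimit reports whether open is at or above frac of limit.
// The 0.9 threshold used in main is an arbitrary choice for illustration.
func nearLimit(open int, limit uint64, frac float64) bool {
	return float64(open) >= frac*float64(limit)
}

func main() {
	open, err := countFDs("/proc/self/fd")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot list descriptors:", err)
		os.Exit(1)
	}
	limit, err := softFDLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read RLIMIT_NOFILE:", err)
		os.Exit(1)
	}
	fmt.Printf("open=%d soft_limit=%d near_limit=%v\n",
		open, limit, nearLimit(open, limit, 0.9))
}
```

      To inspect another process such as the indexer, countFDs can be pointed at /proc/<pid>/fd instead, given sufficient privileges. The limit reported by softFDLimit applies only to the process running the sketch.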
      

      Before this crash, there were several entries like the following in the indexer logs.

      2019-01-27T01:49:13.476-08:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2019-01-27T01:49:13.476-08:00 [Error] [Queryport ":9101"] Restarting listener
      2019-01-27T01:49:13.476-08:00 [Error] [Queryport ":9101"] Accept() Error: accept tcp [::]:9101: accept4: too many open files
      2019-01-27T01:49:13.476-08:00 [Error] [Queryport ":9101"] Restarting listener
      

      Also, an unusually large number of TCP connections involving the indexer were open. The output of netstat -alp is attached.

      From the test perspective, GSI was building a primary index on 172.23.96.215 and the index build got stuck at 98%.

      [2019-01-26T09:15:44-08:00, sequoiatools/cbq:e8ae70] -e=http://172.23.96.14:8093 -u=Administrator -p=password -script=drop primary index on `default` using GSI
      [pull] sequoiatools/cbq
      [2019-01-26T09:17:52-08:00, sequoiatools/cbq:c545db] -e=http://172.23.96.14:8093 -u=Administrator -p=password -script=drop index `default`.default_rating using GSI
      [pull] sequoiatools/cbq
      [2019-01-26T09:20:01-08:00, sequoiatools/cbq:51d9b9] -e=http://172.23.96.14:8093 -u=Administrator -p=password -script=drop index `CUSTOMER`.o1_claims using GSI
      [pull] sequoiatools/cmd
      [2019-01-26T09:20:09-08:00, sequoiatools/cmd:9ba377] 600
      [pull] sequoiatools/cbq
      [2019-01-26T09:30:16-08:00, sequoiatools/cbq:e3db5b] -e=http://172.23.96.14:8093 -u=Administrator -p=password -script=create index default_rating on `default`(rating) using GSI
      [pull] sequoiatools/cbq
      [2019-01-26T09:44:40-08:00, sequoiatools/cbq:3f62c4] -e=http://172.23.96.14:8093 -u=Administrator -p=password -script=create primary index on `default` using GSI
      

      The cluster is available for debugging - http://172.23.96.14:8091


          People

            mihir.kamdar Mihir Kamdar (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)