Couchbase Server / MB-22040

GSI client panics when FDs exhausted


Details

    • Bug
    • Resolution: Fixed
    • Major
    • 5.0.0
    • 5.0.0
    • secondary-index
    • None
    • Dev environment, on Mac, with cluster_run
    • Untriaged
    • No

    Description

      Cloned from MB-21985 so that the original bug can track the cbq-engine issue while this clone tracks the GSI issue: better error handling when the system runs out of file descriptors.

      Original Issue reported by Eben:


      I did a repo sync yesterday and rebuilt, so my code is up-to-date.

      While testing query monitoring, I set up a script to keep running the query:

      select * from `travel-sample` where foo = "bar";

      It ran successfully for a while, then I started getting panics in the query engine every time.

      I uploaded the logs to: https://s3.amazonaws.com/cb-customers/Eben+Haber/collectinfo-2016-12-15T182153-n_0%40127.0.0.1.zip

      Here is an example of the stack traces from query.log:

      goroutine 4150128 [running]:
      github.com/couchbase/query/execution.(*Context).Recover(0xc427ba7d40)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/context.go:392 +0xc7
      panic(0x483fb40, 0xc4200160d0)
      /Users/eben/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:458 +0x243
      github.com/couchbase/indexing/secondary/queryport/n1ql.makeResponsehandler.func2(0x4ee1cc0, 0xc42496e100, 0xc420188120)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:847 +0xc21
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiScanClient).streamResponse(0xc420127650, 0x4eeb820, 0xc420188120, 0xc4220b0240, 0xc4237d0e70, 0xc422ca4a50, 0x24, 0x0, 0x0, 0x0)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/scan_client.go:611 +0x5e4
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiScanClient).ScanAll(0xc420127650, 0xa8e36a8e7a3aafec, 0xc422ca4a50, 0x24, 0x7fffffffffffffff, 0x1, 0x0, 0xc4237d0e70, 0x0, 0x0, ...)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/scan_client.go:372 +0x371
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).ScanAll.func1(0xc420127650, 0xc4237b0b60, 0x4f41920, 0xf, 0xc420d7b0c8)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:511 +0x147
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).doScan(0xc42090ee10, 0xa8e36a8e7a3aafec, 0xc422ca4a50, 0x24, 0xc421d19d98, 0xc4266c95f8, 0x4667637)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:710 +0xa8d
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).ScanAll(0xc42090ee10, 0xa8e36a8e7a3aafec, 0xc422ca4a50, 0x24, 0x7fffffffffffffff, 0xc4207faa01, 0x0, 0xc4237d0e70, 0xc4238de6c0, 0xc4237d0e70)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:512 +0x308
      github.com/couchbase/indexing/secondary/queryport/n1ql.(*secondaryIndex).ScanEntries(0xc427736b00, 0xc422ca4a50, 0x24, 0x7fffffffffffffff, 0x49560be, 0x9, 0x4ede580, 0xc4238d8ee0, 0xc4294df740)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:742 +0x2ee
      github.com/couchbase/query/execution.(*PrimaryScan).scanEntries(0xc423e19c20, 0xc427ba7d40, 0xc4294df740)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/scan_primary.go:210 +0x308
      created by github.com/couchbase/query/execution.(*PrimaryScan).scanPrimary
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/scan_primary.go:71 +0x164
      2016/12/15 09:57:56 http: Accept error: accept tcp [::]:9093: accept: too many open files; retrying in 5ms
      2016-12-15T09:57:56.452-08:00 [Error] [GsiScanClient:"127.0.0.1:10011"] ScanAll(da3b144d-337d-4a96-8619-e5e5e2a0d5a7) response failed `Index scan timed out`
      _time=2016-12-15T09:57:56.489-08:00 _level=ERROR _msg= panic=runtime error: invalid memory address or nil pointer dereference stack=goroutine 4151342 [running]:
      github.com/couchbase/query/execution.(*Context).Recover(0xc423a23440)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/context.go:392 +0xc7
      panic(0x483fb40, 0xc4200160d0)
      /Users/eben/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:458 +0x243
      github.com/couchbase/indexing/secondary/queryport/n1ql.makeResponsehandler.func2(0x4ee1cc0, 0xc4207f0400, 0xc427098fe8)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:847 +0xc21
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiScanClient).streamResponse(0xc420127650, 0x4eeb820, 0xc427098fe8, 0xc42455ced0, 0xc421e36160, 0xc427317c50, 0x24, 0x0, 0x0, 0x0)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/scan_client.go:611 +0x5e4
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiScanClient).ScanAll(0xc420127650, 0xa8e36a8e7a3aafec, 0xc427317c50, 0x24, 0x7fffffffffffffff, 0x1, 0x0, 0xc421e36160, 0x0, 0x0, ...)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/scan_client.go:372 +0x371
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).ScanAll.func1(0xc420127650, 0xc4237b0b60, 0x4f41920, 0xf, 0xc420d7b0c8)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:511 +0x147
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).doScan(0xc42090ee10, 0xa8e36a8e7a3aafec, 0xc427317c50, 0x24, 0xc429c93d98, 0xc4203e45f8, 0x4667637)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:710 +0xa8d
      github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).ScanAll(0xc42090ee10, 0xa8e36a8e7a3aafec, 0xc427317c50, 0x24, 0x7fffffffffffffff, 0xc422c20001, 0x0, 0xc421e36160, 0xc4238de6c0, 0xc421e36160)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/client.go:512 +0x308
      github.com/couchbase/indexing/secondary/queryport/n1ql.(*secondaryIndex).ScanEntries(0xc42184d080, 0xc427317c50, 0x24, 0x7fffffffffffffff, 0x49560be, 0x9, 0x4ede580, 0xc423bf5f80, 0xc4234b5230)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:742 +0x2ee
      github.com/couchbase/query/execution.(*PrimaryScan).scanEntries(0xc423e18780, 0xc423a23440, 0xc4234b5230)
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/scan_primary.go:210 +0x308
      created by github.com/couchbase/query/execution.(*PrimaryScan).scanPrimary
      /Users/eben/src/spock/goproj/src/github.com/couchbase/query/execution/scan_primary.go:71 +0x164

      goroutine 4151342 [running]:


      Looking at the log files in MB-21985, it is clear the system has run out of FDs. We should handle this case more gracefully in the backfill code path: when we can't open a backfill file, return an error to terminate the scan rather than panic.


          People

            prataprc Pratap Chakravarthy (Inactive)
            siri Sriram Melkote (Inactive)