Couchbase Server / MB-51457

[System Test] cbq-engine crash "runtime: program exceeds 10000-thread limit"

Details

    • Untriaged
    • 1
    • No

    Description

      Build : 7.1.0-2475 (RC1)
      Test : -test tests/integration/neo/test_neo_couchstore_milestone4.yml -scope tests/integration/neo/scope_couchstore.yml
      Scale : 3
      Iteration : 6th (5th day)

      cbq-engine process on 172.23.97.119 crashed with the following error and stack dump :

      2022-03-15T11:19:44.798-07:00 [INFO] (Attempt: 0) Pool Get returned bucket5: no connection pool
      2022-03-15T11:19:44.805-07:00 [INFO] (Attempt: 0) Pool Get returned bucket5: no connection pool
      2022-03-15T11:19:44.807-07:00 [INFO] (Attempt: 0) Pool Get returned bucket5: no connection pool
      2022-03-15T11:19:44.892-07:00 [Info] GsiClient::UpdateUsecjson: using collatejson as data format between indexer and GsiClient
      2022-03-15T11:19:44.892-07:00 [INFO] Retrying bucket keyspacenameplaceholder
      2022-03-15T11:19:44.892-07:00 [INFO] keyspace keyspacenameplaceholder not found No bucket named keyspacenameplaceholder
      runtime: program exceeds 10000-thread limit
      fatal error: thread exhaustion
       
      runtime stack:
      runtime.throw(0x2665923, 0x11)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/panic.go:1117 +0x72
      runtime.checkmcount()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:701 +0xac
      runtime.mReserveID(0x3ad22d8)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:717 +0x3e
      runtime.startm(0xc00005b800, 0xd7e300)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:2370 +0x92
      runtime.handoffp(0xc00005b800)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:2412 +0x65
      runtime.retake(0x1b23ee6217148d, 0x2)
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:5366 +0x17d
      runtime.sysmon()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:5274 +0x185
      runtime.mstart1()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:1306 +0xc8
      runtime.mstart()
      	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.6/go/src/runtime/proc.go:1272 +0x6e
      

      At this point the cluster was in a steady state, with ongoing KV mutations and queries running. Could the crash be caused by the constantly running queries that execute N1QL UDFs?
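For context on the error in the stack dump: the Go runtime limit applies to OS threads rather than goroutines, and it defaults to 10000. The trace shows sysmon retaking a P and trying to start a new thread, which typically happens when many goroutines are blocked in system calls or cgo calls. The following is a minimal sketch, not taken from cbq-engine, of how a Go process could report its thread count against that limit. The helper names `osThreadsCreated` and `currentMaxThreads` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"runtime/debug"
	"runtime/pprof"
)

// osThreadsCreated (hypothetical helper) returns the number of OS threads
// recorded by the runtime's built-in "threadcreate" profile.
func osThreadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

// currentMaxThreads (hypothetical helper) reads the thread limit.
// debug.SetMaxThreads returns the previous limit, so the limit is briefly
// set to the largest accepted value (which cannot trip the check) and then
// restored to what it was.
func currentMaxThreads() int {
	prev := debug.SetMaxThreads(math.MaxInt32)
	debug.SetMaxThreads(prev)
	return prev
}

func main() {
	fmt.Printf("goroutines=%d os_threads_created=%d max_threads=%d\n",
		runtime.NumGoroutine(), osThreadsCreated(), currentMaxThreads())
}
```

Logging these values periodically would show whether the thread count climbs steadily while the UDF queries run, or jumps shortly before the crash. Raising the limit with `debug.SetMaxThreads` would only postpone the failure if threads are leaking.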

      172.23.99.21 is the other query node.

      The following logs were collected at about 1 PM PST on 3/15/22 and contain the stack trace for the crash:
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.137.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.155.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.157.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.5.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.67.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.69.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.104.70.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.105.107.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.105.111.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.105.168.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.106.100.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.106.188.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.108.103.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.120.107.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.120.245.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.121.117.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.123.28.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.96.148.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.96.192.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.96.251.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.96.252.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.96.253.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.97.119.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.97.121.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.97.122.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.97.239.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.99.11.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.99.21.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.99.25.zip
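Each archive name in the list above encodes the node it came from, after a percent-encoded `@` (`%40`). As a rough sketch, assuming every URL follows the `collectinfo-<timestamp>-ns_1%40<node>.zip` pattern seen here, the archive for the crashed query node could be picked out like this. The function name `nodeFromCollectinfoURL` is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// nodeFromCollectinfoURL (hypothetical helper) extracts the node address
// from a collectinfo archive URL of the form
// .../collectinfo-<timestamp>-ns_1%40<node>.zip
func nodeFromCollectinfoURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	// u.Path is already percent-decoded, so "%40" appears as "@".
	base := path.Base(u.Path)
	name := strings.TrimSuffix(base, ".zip")
	i := strings.LastIndex(name, "@")
	if i < 0 || !strings.HasPrefix(name, "collectinfo-") {
		return "", fmt.Errorf("unexpected archive name %q", base)
	}
	return name[i+1:], nil
}

func main() {
	// Two of the URLs from the list above.
	urls := []string{
		"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.97.119.zip",
		"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647375338/collectinfo-2022-03-15T201543-ns_1%40172.23.99.21.zip",
	}
	const crashedNode = "172.23.97.119"
	for _, raw := range urls {
		node, err := nodeFromCollectinfoURL(raw)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		if node == crashedNode {
			fmt.Println("archive for crashed node:", raw)
		}
	}
}
```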

      This set of logs was collected shortly before the crash:

      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.137.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.155.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.157.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.5.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.67.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.69.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.104.70.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.105.107.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.105.111.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.105.168.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.106.100.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.106.188.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.108.103.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.120.107.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.120.245.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.121.117.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.123.28.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.96.148.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.96.192.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.96.251.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.96.252.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.96.253.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.97.119.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.97.121.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.97.122.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.97.239.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.99.11.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.99.21.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1647368158/collectinfo-2022-03-15T181600-ns_1%40172.23.99.25.zip

            People

              mihir.kamdar Mihir Kamdar (Inactive)
              mihir.kamdar Mihir Kamdar (Inactive)