  1. Couchbase Server
  2. MB-49083

[System Test] Indexer crashed while getting stats from another node

Details

    Description

      Build : 7.1.0-1542
      Test : -test tests/2i/cheshirecat/test_idx_clusterops_cheshire_cat_recovery.yml -scope tests/2i/cheshirecat/scope_idx_cheshire_cat_dgm.yml
      Scale : 2
      Iteration : 3rd

      This is the GSI recovery test, run on the ARM platform with AMZN2 OS on AWS EC2 instances.

      In the third iteration, the indexer crashed on one indexer node, 172.31.25.241. Here is an excerpt from the indexer log, including the stack trace:

      2021-10-21T13:36:32.655+00:00 [Warn] stats::restGetStats: Failed to get the most recent stats from node: 172.31.29.243:9102, err: Get "http://172.31.29.243:9102/stats?async=false&partition=true&consumerFilter=smartBatching": dial tcp 172.31.29.243:9102: connect: connection refused Try fetch cached stats.
      2021-10-21T13:36:32.655+00:00 [Error] stats::restGetStats: Failed to get cached stats from node: 172.31.29.243:9102, err: Get "http://172.31.29.243:9102/stats?async=true&partition=true&consumerFilter=smartBatching": dial tcp 172.31.29.243:9102: connect: connection refused
      2021-10-21T13:36:32.655+00:00 [Warn] stats::parallelStatsRestCall took 512.3µs for addr 172.31.29.243:9102 with err Get "http://172.31.29.243:9102/stats?async=true&partition=true&consumerFilter=smartBatching": dial tcp 172.31.29.243:9102: connect: connection refused
      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xdd59f0]
       
      goroutine 19982454 [running]:
      github.com/couchbase/indexing/secondary/indexer.(*NodeLoad).addStats(0x4024c9eec0, 0x0)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:930 +0x20
      github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).getNodeIndexerStats(0x401a931800, 0x401a333fb0)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:1054 +0x230
      github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).publishTransferTokenBatch(0x401a931800, 0x40185d8f01)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:631 +0x50
      github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).doRebalance(0x401a931800)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:524 +0xd8
      created by github.com/couchbase/indexing/secondary/indexer.(*Rebalancer).initRebalAsync
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/rebalancer.go:423 +0x960
      Initializing write barrier = 8000
      2021-10-21T13:36:33.680+00:00 [Info] Indexer started with command line: [/opt/couchbase/bin/indexer -adminPort=9100 -scanPort=9101 -httpPort=9102 -streamInitPort=9103 -streamCatchupPort=9104 -streamMaintPort=9105 --httpsPort=19102 --certFile=/opt/couchbase/var/lib/couchbase/config/chain.pem --keyFile=/opt/couchbase/var/lib/couchbase/config/pkey.pem --caFile=/opt/couchbase/var/lib/couchbase/config/ca.pem -ipv4=required -ipv6=optional -vbuckets=1024 -cluster=127.0.0.1:8091 -storageDir=/data/@2i -diagDir=/opt/couchbase/var/lib/couchbase/crash -logDir=/opt/couchbase/var/lib/couchbase/logs -nodeUUID=5969cf630d7f541b4b480d685c3062fb -isEnterprise=true]
      2021-10-21T13:36:33.680+00:00 [Info] Setting ipv6=false
      

      Note: This may or may not be ARM-specific. We haven't run the test with the same build on the regular CentOS 7 cluster, and this is a one-off crash.
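
In the trace above, `(*NodeLoad).addStats` appears to receive `0x0` as its argument immediately after the stats REST call to 172.31.29.243:9102 failed with "connection refused", which suggests a nil stats value was passed on and dereferenced. The following is only a minimal sketch of that failure pattern and one possible guard. It is not the actual code in `rebalancer.go`: the types `IndexerStats` and `NodeLoad`, their fields, and the helpers `fetchStats` and `collectLoad` are hypothetical stand-ins, and only the method name `addStats` is taken from the trace.

```go
package main

import (
	"errors"
	"fmt"
)

// IndexerStats is a hypothetical stand-in for the stats payload fetched from a node.
type IndexerStats struct {
	MemoryUsed int64
}

// NodeLoad is a hypothetical stand-in for the per-node load summary.
type NodeLoad struct {
	Addr       string
	MemoryUsed int64
	HasStats   bool
}

// addStats copies stats into the load summary.
// A nil stats value is ignored instead of being dereferenced.
func (nl *NodeLoad) addStats(stats *IndexerStats) {
	if stats == nil {
		return
	}
	nl.MemoryUsed = stats.MemoryUsed
	nl.HasStats = true
}

// fetchStats simulates the stats REST call (hypothetical helper).
// It returns (nil, err) when the remote node refuses the connection.
func fetchStats(addr string, reachable bool) (*IndexerStats, error) {
	if !reachable {
		return nil, errors.New("dial tcp " + addr + ": connect: connection refused")
	}
	return &IndexerStats{MemoryUsed: 1 << 20}, nil
}

// collectLoad builds a NodeLoad for addr. On a fetch error it returns the
// empty load together with the error, so the caller can decide how to proceed.
func collectLoad(addr string, reachable bool) (*NodeLoad, error) {
	nl := &NodeLoad{Addr: addr}
	stats, err := fetchStats(addr, reachable)
	if err != nil {
		return nl, fmt.Errorf("stats unavailable for %s: %w", addr, err)
	}
	nl.addStats(stats)
	return nl, nil
}

func main() {
	// Unreachable node: an error is reported and nothing is dereferenced.
	nl, err := collectLoad("172.31.29.243:9102", false)
	fmt.Println(nl.HasStats, err)

	// Reachable node: stats are applied to the load summary.
	nl, err = collectLoad("172.31.29.243:9102", true)
	fmt.Println(nl.HasStats, nl.MemoryUsed, err)
}
```

In this sketch the guard sits in two places: the caller checks the error before using the result, and `addStats` tolerates nil as a second line of defense. Whether the real fix should skip the node, retry, or fail the rebalance is a design decision this sketch does not settle.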

          People

            kevin.cherkauer Kevin Cherkauer (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)