  1. Couchbase Server
  2. MB-56320

[CDC] System is hung because of goroutine 920054 [select, 137 minutes]: net/http.(*persistConn).roundTrip(0xc021666240, 0xc014b47f00) /home/couchbase/.cbdepscache/exploded/x86_64/g... (Mutex is locked at the start of loadIndexes().)

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • 7.2.0
    • 7.2.0
    • secondary-index
    • 7.2.0-5285

    Description

      Note:
      While debugging MB-56318, Donald Haggart noticed that the system was hung because of

      goroutine 920054 [select, 137 minutes]:
      net/http.(*persistConn).roundTrip(0xc021666240, 0xc014b47f00)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/transport.go:2620 +0x974
      net/http.(*Transport).roundTrip(0x4046ac0, 0xc006ce5e00)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/transport.go:595 +0x7ba
      net/http.(*Transport).RoundTrip(0xeeafdf?, 0x0?)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/roundtrip.go:17 +0x19
      github.com/couchbase/cbauth.(*cbauthRoundTripper).RoundTrip(0xc00039c1c0, 0xc006ce5d00)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/convenience.go:76 +0x43f
      net/http.send(0xc006ce5d00, {0x2d31b60, 0xc00039c1c0}, {0x26415c0?, 0x1?, 0x0?})
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/client.go:251 +0x5f7
      net/http.(*Client).send(0xc000190d50, 0xc006ce5d00, {0x0?, 0xeeafdf?, 0x0?})
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/client.go:175 +0x9b
      net/http.(*Client).do(0xc000190d50, 0xc006ce5d00)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/client.go:715 +0x8fc
      net/http.(*Client).Do(...)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.19.7/go/src/net/http/client.go:581
      github.com/couchbase/cbauth/metakv.doCallInner(0xc0003948f0, {0x2706758, 0x3}, {0x272d225?, 0x40?}, 0x0)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/metakv/metakv.go:102 +0x30e
      github.com/couchbase/cbauth/metakv.doCall(0xc00291b360?, {0x2706758?, 0xeeb327?}, {0x272d225?, 0x2553ee0?}, 0x261c201?)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/metakv/metakv.go:119 +0x65
      github.com/couchbase/cbauth/metakv.doJSONCall(0xc02c9a7470?, {0x2706758?, 0xc004a42a60?}, {0x272d225?, 0xffffffffffffffff?}, 0x7b?, {0x22f94e0, 0xc003a7caf0})
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/metakv/metakv.go:130 +0x2d
      github.com/couchbase/cbauth/metakv.(*store).get(0xc00291b4a0?, {0x272d225, 0x19})
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/metakv/metakv.go:171 +0xa9
      github.com/couchbase/cbauth/metakv.Get(...)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/cbauth/metakv/metakv.go:333
      github.com/couchbase/indexing/secondary/common.GetSettingsConfig.func1(0xf33469?, {0xc00291b540?, 0xfd098a?})
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/common/settings.go:37 +0x19e
      github.com/couchbase/indexing/secondary/common.(*RetryHelper).Run(0xc00291b570)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/common/retry_helper.go:36 +0x83
      github.com/couchbase/indexing/secondary/common.GetSettingsConfig(0x2724553?)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/common/settings.go:49 +0x90
      github.com/couchbase/indexing/secondary/queryport/n1ql.NewGSIIndexer2({0xc0150840f0, 0x15}, {0x270c095, 0x7}, {0xc02333b728, 0xf}, {0xc01a379068, 0x8}, {0xc015084078, 0x11}, ...)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:170 +0x3e9
      github.com/couchbase/query/datastore/couchbase.(*collection).loadIndexes(0xc031853e00)
              /home/couchbase/jenkins/workspace/couchbase-server-
      

      which has been holding the collection's lock for 137 minutes. (Mutex is locked at the start of loadIndexes().)

      Based on Donald Haggart's suggestion, logging a separate bug.
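      The goroutine header quoted in this report (`goroutine 920054 [select, 137 minutes]:`) carries the goroutine's state and how long it has been waiting. As a rough aid for triaging similar dumps, the sketch below scans a dump for goroutines blocked at least a given number of minutes. The function and type names are hypothetical, and the second goroutine in the sample dump is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// header matches lines such as "goroutine 920054 [select, 137 minutes]:".
// Goroutines that have waited less than a minute carry no duration and are skipped.
var header = regexp.MustCompile(`^\s*goroutine (\d+) \[([^,\]]+), (\d+) minutes[^\]]*\]:`)

// blocked describes one long-waiting goroutine found in a dump.
type blocked struct {
	ID      int
	State   string
	Minutes int
}

// longBlocked returns the goroutines whose reported wait is at least minMinutes.
func longBlocked(dump string, minMinutes int) []blocked {
	var out []blocked
	sc := bufio.NewScanner(strings.NewReader(dump))
	for sc.Scan() {
		m := header.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		id, _ := strconv.Atoi(m[1])
		mins, _ := strconv.Atoi(m[3])
		if mins >= minMinutes {
			out = append(out, blocked{ID: id, State: m[2], Minutes: mins})
		}
	}
	return out
}

func main() {
	// First header is from this report; the second is a made-up example.
	dump := "goroutine 920054 [select, 137 minutes]:\n" +
		"net/http.(*persistConn).roundTrip(0xc021666240, 0xc014b47f00)\n" +
		"goroutine 1 [running]:\n"

	for _, b := range longBlocked(dump, 60) {
		fmt.Printf("goroutine %d blocked in %q for %d minutes\n", b.ID, b.State, b.Minutes)
	}
}
```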

      Steps:

      1. Create a cluster with 2 KV nodes and 1 index/query node.
      2. Create a magma bucket (replicas=1) and collections (total collection count, including the default collection, is 51).
      3. Create 500000000 items sequentially (After creation of few thousands of documents update
      4. Update the 500000000 items created in the above step.
      5. Create 500000000 items sequentially.
      6. Update the 500000000 items created in the above step.
      7. Create five indexes. Wait for index building.
      8. Rebalance in KV with loading of docs. (Rebalance completed successfully.)
      9. Rebalance out KV with loading of docs. (Rebalance completed successfully.)
      10. Rebalance in_out KV with loading of docs.
      11. Pause the rebalance and enable CDC (bucket_history_retention_seconds=259200, bucket_history_retention_bytes=10000000000000).
      12. Trigger rebalance in_out KV again with loading of docs. (Rebalance completed successfully.)
      13. Gracefully fail over a node, add a node, and trigger a rebalance (a swap rebalance).
      14. Rebalance exited with reason {service_rebalance_failed,index,
        {agent_died,<34340.5783.0>,

      QE-TEST:

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/ankush_temp_job3.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=2,graceful=True,skip_cleanup=True,num_items=50000000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=5,batch_size=1000,sdk_timeout=60,log_level=info,infra_log_level=error,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=5,pc=10,index_nodes=1,cbas_nodes=0,fts_nodes=0,ops_rate=200000,ramQuota=102400,doc_ops=create:update:delete:read,mutation_perc=100,rebl_ops_rate=30000,key_type=RandomKey -m rest'
      

          People

            varun.velamuri Varun Velamuri
            ankush.sharma Ankush Sharma