MB-32812: Forward Port MH (MB-32693) - client thread hangs

Details

    • Untriaged
    • Unknown

    Description

      Issue reported by Prathibha:

      Steps:
      1. Add "time.Sleep(30 * time.Second)" at line 363, right after doRecovery() in plasma_slice.go, to simulate a slow warm-up.
      2. Two node setup: ./cluster_connect -n2 -s 2048 -I 2048 -T n0:kv+n1ql+index,n1:index
      3. Load 1000 docs: ./cbworkloadgen -u Administrator -p asdasd -n 127.0.0.1:9000 -j -i 1000
      4. Run query: create index i1 on default(age) partition by HASH(meta().id) with { "num_replica": 1 }

      5. Kill the indexer on n_1. This pushes the n_1 indexer into a slow warm-up state. Each partition takes 30 s; with 8 partitions, each index takes 4 minutes, and with a total of 2 replicas the warm-up takes 8 minutes.
      6. Run query: select age, meta().id from default where age is not null

      The query does not always hang: sometimes it hangs, sometimes it runs fine. Another observation is that the client hangs for about 10 minutes, after which it succeeds and returns 1000 rows.

      This issue also reproduces on the Alice branch. The goroutine dump of the query process, taken while the client was hung, is attached (from the madhatter repo). Below are the stacks of the hung goroutines:

      1 @ 0x403042b 0x40401fd 0x48576e1 0x486c76a 0x488506e 0x405e4b1
      #	0x48576e0	github.com/couchbase/indexing/secondary/queryport/client.(*Queue).Enqueue+0xc0			/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/queue.go:113
      #	0x486c769	github.com/couchbase/indexing/secondary/queryport/client.(*RequestBroker).SendEntries+0x1d9	/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/indexing/secondary/queryport/client/scatter.go:1164
      #	0x488506d	github.com/couchbase/indexing/secondary/queryport/n1ql.makeResponsehandler.func1+0x65d		/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:1344
      

      1 @ 0x403042b 0x40304d3 0x4040dac 0x40409e9 0x4075ff4 0x488462b 0x487edb1 0x46e44fa 0x405e4b1
      #	0x40409e8	sync.runtime_Semacquire+0x38									/Users/prathibha/.cbdepscache/exploded/x86_64/go-1.11/go/src/runtime/sema.go:56
      #	0x4075ff3	sync.(*WaitGroup).Wait+0x63									/Users/prathibha/.cbdepscache/exploded/x86_64/go-1.11/go/src/sync/waitgroup.go:130
      #	0x488462a	github.com/couchbase/indexing/secondary/queryport/n1ql.(*secondaryIndex3).Scan3.func2+0x2a	/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:1105
      #	0x487edb0	github.com/couchbase/indexing/secondary/queryport/n1ql.(*secondaryIndex3).Scan3+0x6e0		/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/indexing/secondary/queryport/n1ql/secondary_index.go:1135
      #	0x46e44f9	github.com/couchbase/query/execution.(*IndexScan3).scan+0x549					/Users/prathibha/Documents/source/madhatter3/goproj/src/github.com/couchbase/query/execution/scan_index3.go:178
      

      [^cbq_goroutine.pprof]

          Activity

            jliang John Liang added a comment -

            We also need to test the following:
            1) Start 2 indexers.
            2) Create a large partitioned index so that it takes time to scan.
            3) Initiate a scan.
            4) Kill one of the indexers soon after the scan starts.
            5) n1ql should return a partial result with an error.

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.5.0-2250 contains indexing commit 38f3737 with commit message:
            MB-32812: Wait for backfill goroutine to terminate before exiting scan
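
The commit message describes waiting for a background goroutine to terminate before the scan returns. The actual change in commit 38f3737 is not shown in this ticket, so the following is only a generic sketch of that pattern: the scan signals its background goroutine to stop and then waits on a sync.WaitGroup before returning. The `scan` function, its parameters, and the `stop` channel are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// scan streams entries to sink through a background "backfill" goroutine
// and returns only after that goroutine has terminated.
// (Hypothetical illustration; not the indexing code.)
func scan(entries []int, limit int, sink func(int)) (sent int) {
	stop := make(chan struct{})
	buf := make(chan int)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // backfill goroutine
		defer wg.Done()
		defer close(buf)
		for _, e := range entries {
			select {
			case buf <- e:
			case <-stop: // scan is exiting; stop producing
				return
			}
		}
	}()
	for e := range buf {
		if sent == limit {
			break // caller wants no more rows
		}
		sink(e)
		sent++
	}
	close(stop) // tell the backfill goroutine to quit
	wg.Wait()   // and wait for it to terminate before returning
	return sent
}

func main() {
	var got []int
	n := scan([]int{1, 2, 3, 4, 5}, 2, func(v int) { got = append(got, v) })
	fmt.Println(n, got) // 2 [1 2]
}
```

Because scan does not return until wg.Wait() completes, no background goroutine from this call is still running or blocked on a send once the caller resumes.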

            mihir.kamdar Mihir Kamdar (Inactive) added a comment -

            Prathibha Bisarahalli, can you instrument the code as done in MB-32693 and let us know if the issue is resolved? QE will try out the test that John has listed above.

            girish.benakappa Girish Benakappa added a comment -

            Tested the steps given by John, got partial results with an error as expected.

            mihir.kamdar Mihir Kamdar (Inactive) added a comment -

            Prathibha Bisarahalli, please instrument the code as done in MB-32693 and close the bug after testing. QE has executed the steps mentioned above.

            People

              prathibha Prathibha Bisarahalli (Inactive)
              jliang John Liang