  1. Couchbase Server
  2. MB-56275

DCP Consumer StreamEnd when buffering can leak the flow control ack_bytes (was: [System test] [CDC]: System test rebalance hangs)

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.2.0
    • 6.6.0, 6.5.0, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3
    • couchbase-bucket
    • 7.2.0 build 5275
    • Untriaged
    • Centos 64-bit
    • 0
    • No

    Description

      Steps to Repro
      1. Start a longevity test on 7.2 using the following script.

      ./sequoia -client 172.23.104.27:2375 -provider file:centos_pine.yml -test tests/integration/7.2/test_7.2.yml -scope tests/integration/7.2/scope_7.2_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.2.0-5275 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. We hit a few other bugs during the run, MB-56097 and MB-56274 (noting them here in case they could cause a side effect like this).

      [2023-04-02T16:33:50-07:00, sequoiatools/cbutil:141f26] /cbinit.py 172.23.106.100 root couchbase stop
      [2023-04-02T16:34:04-07:00, sequoiatools/cmd:abb0b1] 10
      [2023-04-02T16:34:20-07:00, sequoiatools/couchbase-cli:7.1:3061e7] rebalance -c 172.23.108.103:8091 -u Administrator -p password
       
      Error occurred on container - sequoiatools/couchbase-cli:7.1:[rebalance -c 172.23.108.103:8091 -u Administrator -p password]
       
      docker logs 3061e7
      docker start 3061e7
       
      WARNING: couchbase-cli version 7.1.0-1345-enterprise does not match couchbase server version 7.2.0-5275-enterprise
      Unable to display progress bar on this os
      ERROR: Rebalance failed. See logs for detailed reason. You can try again.
      [2023-04-02T16:34:57-07:00, sequoiatools/cmd:64231d] 60
      [2023-04-02T16:36:02-07:00, sequoiatools/cmd:a7bef4] 180
      [2023-04-02T16:39:08-07:00, sequoiatools/cbutil:df95b6] /cbinit.py 172.23.106.100 root couchbase start
      [2023-04-02T16:39:15-07:00, sequoiatools/cmd:465a53] 300
      [2023-04-02T16:44:21-07:00, sequoiatools/couchbase-cli:7.1:94deea] server-add -c 172.23.108.103:8091 --server-add https://172.23.106.100 -u Administrator -p password --server-add-username Administrator --server-add-password password --services data
       
      Error occurred on container - sequoiatools/couchbase-cli:7.1:[server-add -c 172.23.108.103:8091 --server-add https://172.23.106.100 -u Administrator -p password --server-add-username Administrator --server-add-password password --services data]
       
      docker logs 94deea
      docker start 94deea
       
      WARNING: couchbase-cli version 7.1.0-1345-enterprise does not match couchbase server version 7.2.0-5275-enterprise
      ERROR: Prepare join failed. Node is already part of cluster.
      [2023-04-02T16:44:28-07:00, sequoiatools/couchbase-cli:7.1:e1fa83] rebalance -c 172.23.108.103:8091 -u Administrator -p password
       
      Error occurred on container - sequoiatools/couchbase-cli:7.1:[rebalance -c 172.23.108.103:8091 -u Administrator -p password]
       
      docker logs e1fa83
      docker start e1fa83
       
      WARNING: couchbase-cli version 7.1.0-1345-enterprise does not match couchbase server version 7.2.0-5275-enterprise
      Unable to display progress bar on this os
      ERROR: Rebalance failed. See logs for detailed reason. You can try again.
      [2023-04-02T16:44:46-07:00, sequoiatools/cmd:9c8f5f] 60
      [2023-04-02T16:45:52-07:00, sequoiatools/couchbase-cli:7.1:e3a8e4] setting-autofailover -c 172.23.108.103:8091 -u Administrator -p password --enable-auto-failover=0
      [2023-04-02T16:45:58-07:00, sequoiatools/cmd:75e31f] 600
      [2023-04-02T16:56:03-07:00, sequoiatools/pillowfight:7.0:8bf8f0] -U couchbase://172.23.108.103/default?select_bucket=true -I 3000 -B 300 -t 4 -c 100 -P password
      [2023-04-02T21:38:46-07:00, sequoiatools/couchbase-cli:7.1:ab7223] server-add -c 172.23.108.103:8091 --server-add https://172.23.104.5 -u Administrator -p password --server-add-username Administrator --server-add-password password --services fts
       
      Error occurred on container - sequoiatools/couchbase-cli:7.1:[server-add -c 172.23.108.103:8091 --server-add https://172.23.104.5 -u Administrator -p password --server-add-username Administrator --server-add-password password --services fts]
       
      docker logs ab7223
      docker start ab7223
       
      WARNING: couchbase-cli version 7.1.0-1345-enterprise does not match couchbase server version 7.2.0-5275-enterprise
      ERROR: Node addition is disallowed while rebalance is in progress
      [2023-04-02T21:38:53-07:00, sequoiatools/couchbase-cli:7.1:6b57da] rebalance -c 172.23.108.103:8091 -u Administrator -p password
      

      One of the rebalances hangs in KV. I cannot pinpoint exactly which kind of rebalance it was, because the logs above also show several failures when adding nodes. The rebalance stayed stuck at exactly the same vBucket and incoming/outgoing document counts for over 4 hours.
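
      The "stuck at the same counts for hours" observation can be checked mechanically. The following is a minimal sketch, assuming progress has already been sampled into `(timestamp_seconds, counters)` pairs; the sample shape, the counter tuple, and the `longest_stall` name are hypothetical and do not correspond to any ns_server or couchbase-cli output format.

      ```python
      def longest_stall(samples):
          """Return the longest span, in seconds, over which counters never changed.

          samples: time-ordered list of (timestamp_seconds, counters) pairs, where
          counters is any comparable value such as a tuple of
          (vbuckets_left, docs_incoming, docs_outgoing). Hypothetical shape.
          """
          longest = 0
          start = 0  # index of the first sample in the current unchanged run
          for i in range(1, len(samples)):
              if samples[i][1] != samples[start][1]:
                  start = i  # counters moved, so a new run begins here
              longest = max(longest, samples[i][0] - samples[start][0])
          return longest


      # Illustrative data only: counters freeze between t=3600 and t=18000.
      example = [
          (0, (10, 5, 5)),
          (3600, (8, 7, 7)),
          (7200, (8, 7, 7)),
          (18000, (8, 7, 7)),
          (18060, (7, 8, 8)),
      ]
      stalled_for = longest_stall(example)
      ```

      A threshold on the returned value (for example, several hours) would flag a hang like the one described here.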

      cbcollect_info attached.
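
      The issue title attributes the hang to flow-control ack bytes leaking when a DCP consumer receives a StreamEnd while it is still buffering. The toy model below sketches how such a leak could stall a producer. It is an illustration under stated assumptions, not kv_engine code: `ToyProducer`, `ToyConsumer`, the `ack_on_stream_end` switch, and the byte sizes are all hypothetical.

      ```python
      from dataclasses import dataclass, field


      @dataclass
      class ToyProducer:
          """Sends only while unacknowledged bytes fit in the consumer's buffer."""
          buffer_size: int
          unacked: int = 0

          def can_send(self, nbytes: int) -> bool:
              return self.unacked + nbytes <= self.buffer_size

          def send(self, nbytes: int) -> None:
              self.unacked += nbytes

          def on_buffer_ack(self, nbytes: int) -> None:
              self.unacked -= nbytes


      @dataclass
      class ToyConsumer:
          """Buffers incoming message sizes and acks bytes it has disposed of."""
          producer: ToyProducer
          ack_on_stream_end: bool  # False models the leak, True models a fix
          buffered: list = field(default_factory=list)

          def receive(self, nbytes: int) -> None:
              self.buffered.append(nbytes)

          def stream_end(self) -> None:
              dropped = sum(self.buffered)
              self.buffered.clear()
              if self.ack_on_stream_end:
                  # Discarded bytes still free buffer space, so ack them.
                  self.producer.on_buffer_ack(dropped)


      def run(ack_on_stream_end: bool) -> bool:
          """Return True if the producer can still send after the stream ends."""
          producer = ToyProducer(buffer_size=100)
          consumer = ToyConsumer(producer, ack_on_stream_end)
          for size in (40, 60):  # fill the flow-control window exactly
              assert producer.can_send(size)
              producer.send(size)
              consumer.receive(size)
          consumer.stream_end()
          return producer.can_send(1)
      ```

      In this model, `run(False)` leaves the producer believing the buffer is permanently full, which would match a rebalance that stops making progress, while `run(True)` returns the window to the producer.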

            People

              Balakumaran.Gopal Balakumaran Gopal
              Balakumaran.Gopal Balakumaran Gopal