  1. Couchbase Server
  2. MB-13286

Disconnection by consumer because of no-op not sent by producer causes rebalance hang

Details

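    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 3.0.3, 4.0.0
    • Affects Version/s: 3.0, 3.0.1, 3.0.2
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels: None
    • Triage: Untriaged
    • Is this a Regression?: Unknown
    • Sprint: Mar 9 - Mar 27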

    Description

      If the consumer does not receive a no-op message within a specific interval, it closes the connection, which causes the rebalance to hang.

      It was noted that the step function of that particular producer was never visited, so the no-op was never sent and the consumer disconnected. The step function was not visited because we were failing to notify the producer connection.
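
The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the ep-engine code: the names `Producer`, `Consumer`, `notify`, `step`, and `simulate`, as well as the interval and timeout values, are hypothetical and chosen for illustration. It models a producer that can only send a no-op when its step function is visited, and a step function that is only visited after the connection has been notified.

```python
from dataclasses import dataclass


@dataclass
class Producer:
    """Hypothetical producer: a no-op can only go out from inside step()."""
    noop_interval: int
    last_noop_sent: int = 0
    notified: bool = False

    def notify(self) -> None:
        # Marks the connection as runnable so that step() does real work.
        self.notified = True

    def step(self, now: int) -> bool:
        """Return True if a no-op was sent at logical time `now`."""
        if not self.notified:
            # Connection was never notified: the step body is never visited.
            return False
        self.notified = False
        if now - self.last_noop_sent >= self.noop_interval:
            self.last_noop_sent = now
            return True
        return False


@dataclass
class Consumer:
    """Hypothetical consumer: disconnects when no no-op arrives in time."""
    idle_timeout: int
    last_seen: int = 0
    connected: bool = True

    def on_noop(self, now: int) -> None:
        self.last_seen = now

    def tick(self, now: int) -> None:
        if self.connected and now - self.last_seen > self.idle_timeout:
            self.connected = False


def simulate(notify_producer: bool, ticks: int = 100,
             noop_interval: int = 10, idle_timeout: int = 30) -> bool:
    """Run the logical clock and report whether the consumer stayed connected."""
    producer = Producer(noop_interval)
    consumer = Consumer(idle_timeout)
    for now in range(1, ticks + 1):
        if notify_producer:
            producer.notify()
        if producer.step(now):
            consumer.on_noop(now)
        consumer.tick(now)
    return consumer.connected
```

With `simulate(True)` the producer is notified on every tick, no-ops flow at the configured interval, and the consumer stays connected. With `simulate(False)` the notification is skipped, no no-op is ever sent, and the consumer drops the connection once the idle timeout elapses, which corresponds to the hang reported here.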

      The logs were filled with checkpoint persistence timeouts:
      ...
      n_3/logs/memcached.log.0.txt:Tue Feb 3 14:07:30.466913 PST 3: (default) Notified the timeout on checkpoint persistence for vbucket 890, id 0, cookie 0x10db21480
      n_3/logs/memcached.log.0.txt:Tue Feb 3 14:07:51.466905 PST 3: (default) Notified the timeout on checkpoint persistence for vbucket 890, id 0, cookie 0x10db21480
      n_3/logs/memcached.log.0.txt:Tue Feb 3 14:08:22.466663 PST 3: (default) Notified the timeout on checkpoint persistence for vbucket 890, id 0, cookie 0x10db21480
      n_3/logs/memcached.log.0.txt:Tue Feb 3 14:08:53.473040 PST 3: (default) Notified the timeout on checkpoint persistence for vbucket 890, id 0, cookie 0x10db21480
      ...

            People

              Assignee: Abhi Dangeti (abhinav)
              Reporter: Abhi Dangeti (abhinav)
              Votes: 0
              Watchers: 7
