  1. Couchbase Server
  2. MB-19656

CBAuth stale errors due to slow server start

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 4.1.2, 4.5.1
    • Affects Version/s: 4.0.0, 4.1.0, 4.1.1, 4.5.0
    • Component/s: ns_server
    • None
    • Untriaged
    • Unknown

    Description

      As seen in MB-19610:

      The CBAuth "stale" error was caused by slow server restarts.
      If more than 5 seconds pass between a component starting and the menelaus barrier being lifted, here is what happens:
      the component starts and sends a revrpc request
      the request hangs
      after 5 seconds the component gets a "stale" error and panics
      the component is restarted and sends another revrpc request
      and so on
      Since we do not check the health of the socket while we wait for the barrier, requests from multiple instances of the component pile up behind it,
      which produces multiple "stale" messages.
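
The restart loop described above can be sketched as a small discrete-time model. This is only an illustration, not ns_server or cbauth code: `simulate`, `Request`, and the `check_socket` flag are hypothetical names, the restart is assumed to be instantaneous, and `check_socket` models one possible mitigation, not the fix recorded for this ticket.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sent_at: int        # second at which the component instance sent it
    alive: bool = True  # False once that instance has panicked


def simulate(barrier_lift_at, stale_timeout=5, check_socket=False):
    """Model one component restarting in a loop until the barrier lifts.

    Returns (panics, dead_requests_still_queued) at the moment the
    barrier lifts.
    """
    queue = []   # requests parked behind the barrier
    panics = 0
    t = 0
    while t < barrier_lift_at:
        req = Request(sent_at=t)
        queue.append(req)
        deadline = t + stale_timeout
        if deadline >= barrier_lift_at:
            break  # barrier lifts in time; this request gets answered
        # No answer within the timeout: "stale" error, the instance panics.
        req.alive = False
        panics += 1
        if check_socket:
            queue.remove(req)  # hypothetical: drop requests whose sender is gone
        t = deadline  # assumption: the restart is instantaneous
    dead = sum(1 for r in queue if not r.alive)
    return panics, dead
```

Under these assumptions, a barrier that lifts after 12 seconds gives `simulate(12) == (2, 2)`: two panics, and two requests from dead instances still queued when the barrier lifts, each of which would yield its own "stale" message. A barrier that lifts within the timeout, such as `simulate(3)`, produces none.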

            People

              Assignee: Artem Stemkovski
              Reporter: Artem Stemkovski