  1. Couchbase Server
  2. MB-5425

Mc-engine might never receive response from mccouch and get stuck waiting

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 2.0-beta
    • 2.0-developer-preview-4
    • couchbase-bucket
    • Security Level: Public
    • None

    Description

      This is a follow-up issue to MB-5367 (synchronous select_bucket). The problem is critical because it shows different wrong behaviors with and without synchronous select_bucket. For tracking purposes, I list both here.

      1. Async select_bucket

      Say a notify_update message has been sent to mccouch, but mc-engine times out waiting for the response. It resets the connection and keeps waiting for the response in waitForReadable. I guess the assumption was that the socket connection had been re-established. The problem is that the response handler for the notify_update request was deleted as part of resetConnection, and an async select_bucket was sent as part of the reset as well. Back in waitForReadable, what does mc-engine receive? Not the response to notify_update, but the response to select_bucket. In fact, a response to the original notify_update will never come, because the old socket connection has been reset. In this case, at least it returns from the wait and continues. However, the end result is wrong, and back in couch-kvstore it could abort the system, because the callback would have neither success nor etmpfail.
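
The sequence above can be modelled with a small simulation. This is only a sketch: `FakeMccouchConnection`, its methods, and the opaque-based matching are hypothetical stand-ins for illustration, not the actual mc-engine code.

```python
from collections import deque


class FakeMccouchConnection:
    """Hypothetical stand-in for the mc-engine <-> mccouch socket."""

    def __init__(self):
        self.handlers = {}        # opaque -> request still awaiting a response
        self.incoming = deque()   # responses readable on the current socket
        self.next_opaque = 0

    def send(self, request):
        opaque = self.next_opaque
        self.next_opaque += 1
        self.handlers[opaque] = request
        return opaque

    def reset_connection(self):
        # The old socket is gone: pending handlers are deleted and any
        # response to them is lost for good.
        self.handlers.clear()
        self.incoming.clear()
        # An async select_bucket goes out on the new socket; assume mccouch
        # answers it.
        opaque = self.send("select_bucket")
        self.incoming.append((opaque, "select_bucket"))

    def wait_for_readable(self):
        return self.incoming.popleft() if self.incoming else None


def notify_update_with_async_select(conn):
    outcome = {"success": False, "etmpfail": False}
    opaque = conn.send("notify_update")
    # Simulated timeout: no response arrived, so the connection is reset
    # and the caller goes back to waiting.
    conn.reset_connection()
    response = conn.wait_for_readable()
    if response is not None and response[0] == opaque:
        outcome["success"] = True
    # Nothing sets etmpfail here, so the callback may end up with neither.
    return response, outcome
```

In this model the wait does return, but with the select_bucket response, and the callback outcome is left with neither flag set.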

      2. Synchronous select_bucket

      With MB-5367, select_bucket by itself is not recursive. It keeps re-sending the request until it succeeds. Back in waitForReadable, no response will come back, because select_bucket has its own wait-and-resend logic. mc-engine then simply gets stuck waiting for a response that never arrives.
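
A rough model of the synchronous case follows. The class and function names are hypothetical, and the `max_polls` limit exists only so the sketch terminates; the situation described above has no such limit and blocks indefinitely.

```python
from collections import deque


class SyncSelectConnection:
    """Hypothetical connection whose reset runs select_bucket synchronously."""

    def __init__(self, select_failures=2):
        self.incoming = deque()
        self.select_failures = select_failures  # failed tries before success
        self.select_attempts = 0

    def select_bucket_sync(self):
        # Wait-and-resend loop: every reply is read right here, so nothing
        # is left on the socket for the caller.
        while True:
            self.select_attempts += 1
            ok = self.select_attempts > self.select_failures
            self.incoming.append("select_bucket:" + ("ok" if ok else "fail"))
            reply = self.incoming.popleft()
            if reply.endswith("ok"):
                return

    def reset_connection(self):
        self.incoming.clear()      # responses on the old socket are lost
        self.select_bucket_sync()

    def wait_for_readable(self, max_polls):
        for _ in range(max_polls):
            if self.incoming:
                return self.incoming.popleft()
        return None  # stands in for "blocked forever"


def notify_update_then_wait(conn, max_polls=1000):
    # notify_update was sent on the old socket and timed out.
    conn.reset_connection()
    # Back in the wait: no request is outstanding on the new socket.
    return conn.wait_for_readable(max_polls)
```

Here `notify_update_then_wait` returns `None` after exhausting its polls, which corresponds to the stuck wait in the real system.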

      In short, the mc-engine synchronous wait code is incorrect, because no response will ever come back for a request sent before the connection was reset. Instead, notify_update, delVBucket, and flush should all re-send their requests after resetting the connection.
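
The proposed behavior can be sketched as a resend loop. This is an illustration under assumptions, not the actual fix: `FlakyMccouch`, `send_with_resend`, and the `max_attempts` bound are all hypothetical, and whether the real code bounds its retries is not stated here.

```python
class FlakyMccouch:
    """Hypothetical peer that never answers the first `drops` requests."""

    def __init__(self, drops):
        self.drops = drops
        self.requests_seen = []
        self.resets = 0

    def send_and_wait(self, request):
        self.requests_seen.append(request)
        if self.drops > 0:
            self.drops -= 1
            return None              # timed out waiting for the response
        return request + ":success"

    def reset_connection(self):
        self.resets += 1


def send_with_resend(conn, request, max_attempts=5):
    """Re-send the request after every reset instead of waiting on the
    new socket for a response that was lost with the old one."""
    for _ in range(max_attempts):
        response = conn.send_and_wait(request)
        if response is not None:
            return response
        conn.reset_connection()
    # Giving up still reports a definite outcome to the callback.
    return request + ":etmpfail"
```

With this shape, the same helper could serve notify_update, delVBucket, and flush, and the caller always gets either a success or an etmpfail result.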

          People

            liang Liang Guo (Inactive)
            liang Liang Guo (Inactive)