  1. Couchbase Server
  2. MB-43818

Expand information captured from exceptions which are (probably) fatal

    Description

      Background

      The C++ code in Couchbase Server has a couple of "hail-mary" catch blocks, intended to catch exceptions which should not have been thrown (and left uncaught) at that point; if they have been, we try to do something about them. The two main instances which spring to mind are:

      1. Frontend worker thread (Connection::executeCommandsCallback) - if an uncaught exception is thrown while executing a client request, we close that connection.
      2. Backend executor pool (GlobalTask::execute) - if an uncaught exception is thrown during background task execution, we cannot do very much as the state of the system is unknown, so we explicitly terminate the process.

      Problem

      When such exceptions are caught, we have limited information about where the exception was thrown. We have the exception's what() message, which might contain enough information to uniquely identify where it was thrown from, but that is unlikely for some of our more common exceptions - for example ThrowExceptionUnderflowPolicy as thrown by NonNegativeCounter, of which there are many instances.

      This makes debugging such problems difficult - particularly for case (#2), where the effects of such uncaught exceptions are severe (complete process termination).

      Additionally, for case (#2), until recently we did not explicitly catch and terminate: with CB3ExecutorPool no "hail-mary" catch block was present, so the uncaught exception would fall out of the background thread and terminate the process for us. This has the advantage that the recorded minidump backtrace typically shows where the exception was thrown (as it was never caught).

      Approaches

      1. Automatically record backtraces of all exceptions when thrown, and log these in the last-catch exception handlers mentioned above. At least one library exists which can do this - Folly's exception_tracer. However, this has a number of potential issues (and I suspect other similar libraries would too):
        • Automagically adds backtraces to all exceptions.
        • Relies on libstdc++ internals; so Linux-only.
        • Adds cost (space + time) to all thrown exceptions (to record the backtrace).
        • Experience tells me getting it working across the multiple executables / shared libraries which make up our processes is likely to be hard.
      2. Manually record backtraces of selected exceptions on a case-by-case basis. This is the approach which Boost.Stacktrace supports - at the point exceptions are thrown they can be wrapped with the current backtrace, which can later be examined when the exception is caught.
        • Cross platform.
        • Adds cost (space + time) only to chosen exceptions. Given we typically know which exceptions are of interest (are going to terminate the process), the extra cost is not a problem.
        • Requires manually throwing in a special way (throw_with_trace).
      3. Manually add extra context information to relevant exceptions. For example, Monotonic<> already adds the type of the underlying Monotonic<> instantiation and an optional label to the exception message. We could do similar for cb::NonNegativeCounter and friends.
        • Cross platform, easy to incrementally add.
        • Doesn't give full context (i.e. where exception was thrown from), just the identity.
        • Doesn't work for non-custom exceptions (e.g. std::runtime_error); we'd have to create our own subclasses to store the extra info.
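
      To make approaches 2 and 3 more concrete, here is a minimal standard-library-only sketch. It is not Boost.Stacktrace and not the real cb::NonNegativeCounter: LabelledCounter, ThrowSite, WithThrowSite, throwWithSite, THROW_WITH_SITE and describe are all hypothetical names. As a simplification, the approach 2 part records only the file and line of the throw statement rather than a full backtrace, so it shows the "wrap at throw, inspect at catch" mechanism but not the caller chain a real stack trace would give.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Approach 3: put the counter's identity into the exception message.
// Hypothetical stand-in for a labelled cb::NonNegativeCounter.
class LabelledCounter {
public:
    LabelledCounter(std::string name, std::uint64_t initial)
        : label(std::move(name)), value(initial) {}

    void subtract(std::uint64_t delta) {
        if (delta > value) {
            throw std::underflow_error(
                    "LabelledCounter(" + label + ") underflow: " +
                    std::to_string(value) + " - " + std::to_string(delta));
        }
        value -= delta;
    }

    std::uint64_t get() const { return value; }

private:
    std::string label;
    std::uint64_t value;
};

// Approach 2 (simplified): extra data attached to an exception when thrown.
struct ThrowSite {
    const char* file;
    int line;
};

// Derives from the original exception type as well as ThrowSite, so existing
// catch (const E&) and catch (const std::exception&) handlers still match.
template <class E>
struct WithThrowSite : E, ThrowSite {
    WithThrowSite(const E& e, const char* file, int line)
        : E(e), ThrowSite{file, line} {}
};

template <class E>
[[noreturn]] void throwWithSite(const E& e, const char* file, int line) {
    throw WithThrowSite<E>(e, file, line);
}

#define THROW_WITH_SITE(ex) throwWithSite((ex), __FILE__, __LINE__)

// Catch side: report what() plus the throw site, if one was attached.
std::string describe(const std::exception& e) {
    std::string out = e.what();
    if (const auto* site = dynamic_cast<const ThrowSite*>(&e)) {
        out += " [thrown at " + std::string(site->file) + ":" +
               std::to_string(site->line) + "]";
    }
    return out;
}
```

      A last-resort handler could then log describe(e) instead of e.what(). Exceptions thrown with a plain throw still work and simply carry no site information, which mirrors the "only chosen exceptions pay the cost" property listed for approach 2.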

            People

              drigby Dave Rigby (Inactive)