  1. Couchbase Server
  2. MB-50167

Add transactions metrics via new Sub-Document specs

Details

    • Improvement
    • Resolution: Won't Do
    • Major
    • None
    • None
    • memcached
    • None
    • 1

    Description

      High-level request

      From CBD-4208:

      For usage tracking and adoption tracking, we should have a stat for tracking created transactions.

      This stat would have to be collected by phonehome and cbcollect, so the server needs to be aware of it in some manner.

      Background context

      The transactions protocol involves writing an Active Transaction Record (ATR) entry on each attempt.  There can be multiple attempts in a single transaction due to retries.

      These ATR entries are created in the ATR documents.  The format is detailed here.  An ATR document contains an 'attempts' xattr (a JSON map), with each entry being an object in this map, keyed by that attempt's attempt-id (a UUID).  The contents of the object include the transaction attempt's state - Pending, Committed, or Aborted.
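
To make the layout above concrete, here is a minimal sketch of the 'attempts' map as a plain Python dict. Only the map keyed by attempt-id and the three states come from the description; the "state" key name and the helper functions are assumptions for illustration, not the real ATR format.

```python
import uuid

# Hypothetical shape of the 'attempts' xattr on an ATR document.
# The "state" key name is an assumption; the real field names are in the
# linked format description, which is not reproduced here.
VALID_STATES = ("Pending", "Committed", "Aborted")

def new_attempt_entry(attempts):
    """Add a Pending entry for a fresh attempt and return its attempt-id."""
    attempt_id = str(uuid.uuid4())
    attempts[attempt_id] = {"state": "Pending"}
    return attempt_id

def set_state(attempts, attempt_id, state):
    """Move an existing entry to one of the three known states."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state}")
    attempts[attempt_id]["state"] = state

attempts = {}
first = new_attempt_entry(attempts)   # first attempt of a transaction
set_state(attempts, first, "Aborted")
retry = new_attempt_entry(attempts)   # retry: second attempt, same transaction
set_state(attempts, retry, "Committed")
```

One transaction here produced two entries, which is why attempts and transactions need separate counters.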

      Creating an entry, or updating it as the attempt passes through these states, is done using the Sub-Document API.  Details of exactly how the API is used are here.  On updating these states we wish to pass additional information for tracking and metrics purposes.

      Request

      After evaluating several options on CBD-4208 with the KV team, the desired approach to add metrics and tracking for transactions is as follows.

      Support a new Sub-Document spec, solely for transactions use, that can be sent at three points in the transaction lifecycle (the Pending, Committed and Aborted states mentioned above), and allow JSON to be sent in the spec value containing tracking information, that memcached will turn into Prometheus stats.

      Prometheus stat names in the form of "transaction_attempts_initiated" are used below.  I am not very familiar with the stat tracking, so please suggest more appropriate names if these don't fit the existing style.

       

      Pending

      Here are the fields we would like to send at the Pending point, when the ATR entry is created on a new transaction attempt.

      Technical limitation: There are some fields we want to send only on the first attempt in a transaction.  But there is a current technical limitation with that: at present, query intentionally does not know whether it is starting a new attempt on a new transaction, or a new attempt inside the same transaction.  We will need to pass additional information to query in a future server release to resolve this.  Until then, the stats won't be perfectly correct.

       

       

      JSON field: t
      Affects stat: transactions_initiated
      Description: The presence of a "t" field increments this stat by one, indicating that a new transaction is initiated.
        Only sent on the first attempt in a transaction (except for query).

      JSON field: a
      Affects stat: transaction_attempts_initiated
      Description: The presence of an "a" field increments this stat by one, indicating that a new transaction attempt - not necessarily a transaction - is initiated.

      JSON field: i
      Affects stat: transaction_implementation
      Description: "java", "cpp", "go" etc.
        Only sent on the first attempt in a transaction (except for query).
        I am not sure what is possible with Prometheus stat tracking.  It would be useful to record how many of each there are.  Perhaps separate stats for transaction_implementation_java etc., created dynamically (assuming that's possible)?  Or maybe there are better ways of tracking such discrete stats - let me know.
        I'd prefer that the mapping be done in memcached, rather than transaction_implementation_java having to be sent in the JSON, so we can easily support future clients.

      JSON field: p
      Affects stat: transaction_protocol
      Description: "2.0", "2.1" etc.  Will always be in format "X.Y" - no patch version.
        The version of the protocol used by the implementation, to help with tracking updates and future deprecation decisions.
        Only sent on the first attempt in a transaction (except for query).
        Handled in the same way as transaction_implementation, e.g. mapped inside memcached to "transaction_protocol_2_0" etc.  Unless more advanced/useful ways of tracking these kinds of discrete stats are possible.

      JSON field: f
      Affects stat: transaction_*
      Description: An array of transaction features, e.g. ["feature_x", "field_y"].
        Will not always be sent.
        Mapped inside memcached to integer stats "transaction_feature_x", "transaction_field_y" etc., which are created dynamically (assuming that is possible).  The presence of "feature_x" increments transaction_feature_x by one.
        This allows us to track basic usage of future transaction features in a backwards-compatible way and without needing server changes.
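
The field-to-stat mapping described above can be sketched as follows. This is an illustration in Python, not memcached code: the function name apply_pending, the use of a Counter as the stat store, and the dot-to-underscore rule for building names are assumptions drawn from the "transaction_protocol_2_0" example.

```python
from collections import Counter

def _suffix(value):
    # "2.1" -> "2_1", so the derived stat name contains no dots.
    return str(value).replace(".", "_")

def apply_pending(stats, payload):
    """Apply a Pending payload (already parsed from JSON) to a Counter of stats."""
    if "t" in payload:   # presence alone increments; the value is not inspected
        stats["transactions_initiated"] += 1
    if "a" in payload:
        stats["transaction_attempts_initiated"] += 1
    if "i" in payload:   # the mapping to a per-implementation stat happens here
        stats["transaction_implementation_" + _suffix(payload["i"])] += 1
    if "p" in payload:
        stats["transaction_protocol_" + _suffix(payload["p"])] += 1
    for feature in payload.get("f", []):
        stats["transaction_" + feature] += 1   # created dynamically on first use
    return stats

stats = Counter()
apply_pending(stats, {"t": 1, "a": 1, "i": "java", "p": "2.1", "f": ["feature_x"]})
apply_pending(stats, {"a": 1})  # a retry: only the attempt counter moves
```

Prometheus also supports labels on a single metric, which may be an alternative to dynamically created names; whether the memcached stat framework can expose that is not established here.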

      So the JSON would be sent as:

      {
        "t":1,       // sent on 1st attempt only
        "a":1,       // sent always
        "i":"java",  // sent on 1st attempt only
        "p":"2.1",   // sent on 1st attempt only
        "f":["mc"]   // optionally sent
      } 
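
On the client side, building this value could look roughly like the sketch below. The function name pending_payload and its parameters are hypothetical. It assumes the caller knows whether this is the first attempt, which per the limitation above is not currently true for query.

```python
import json

def pending_payload(first_attempt, implementation, protocol, features=()):
    """Build the compact JSON value for the Pending spec."""
    payload = {}
    if first_attempt:
        payload["t"] = 1               # new transaction
    payload["a"] = 1                   # every attempt
    if first_attempt:
        payload["i"] = implementation  # e.g. "java"
        payload["p"] = protocol        # always "X.Y"
    if features:
        payload["f"] = list(features)  # optional
    # Compact separators avoid spending bytes on whitespace.
    return json.dumps(payload, separators=(",", ":"))

first = pending_payload(True, "java", "2.1", ["mc"])
retry = pending_payload(False, "java", "2.1")
```

Here `first` carries all five fields, while `retry` carries only "a".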

       

      I propose we send 1 rather than "true" or "false" to save space (it wins out by one byte over another space-saving packing - sending an array).
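
The saving from sending 1 instead of a boolean can be checked with a few lines of Python. This compares only the integer and boolean encodings; the array packing mentioned above is not reproduced because its exact form is not given here.

```python
import json

def compact(obj):
    """Serialize without optional whitespace, as a client would on the wire."""
    return json.dumps(obj, separators=(",", ":"))

as_int = compact({"t": 1, "a": 1})         # {"t":1,"a":1}
as_bool = compact({"t": True, "a": True})  # {"t":true,"a":true}
saved = len(as_bool) - len(as_int)         # three bytes per flag field
```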

       

      Committed & Aborted

      Here are the fields we would like to send when modifying an ATR entry at the Committed and Aborted points, when we have more context on the transaction:

       

      JSON field: q
      Affects stat: transaction_attempts_involving_query
      Description: The presence of a "q" field increments this stat by 1, indicating that a transaction attempt was (possibly partially) processed by the query service.

      JSON field: d
      Affects stat: transaction_number_of_docs
      Description: How many docs were involved in this transaction attempt, as an integer.
        I am not sure what is possible with Prometheus stat tracking.  It would be good to record max, min, histograms, etc.

      JSON field: f
      Affects stat: transaction_*
      Description: Same as the field sent at the Pending point.

      So the JSON would appear as:

      {
        "q":1,     // sent always
        "d":43,    // sent always
        "f":["mc"] // optionally sent
      }
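
As an illustration of how the Committed and Aborted fields might be aggregated, the sketch below tracks min, max and a bucketed histogram for "d". The class name, the bucket bounds and the in-memory storage are all assumptions; what memcached's stat framework actually supports is an open question in this request.

```python
from collections import Counter

# Hypothetical upper bounds for the docs-per-attempt histogram buckets.
DOC_BUCKETS = (1, 10, 100, 1000)

class CompletionStats:
    def __init__(self):
        self.counters = Counter()
        self.docs_min = None
        self.docs_max = None
        self.docs_histogram = Counter()

    def apply(self, payload):
        """Apply a parsed Committed/Aborted payload."""
        if "q" in payload:
            self.counters["transaction_attempts_involving_query"] += 1
        docs = payload.get("d")
        # Accept only non-negative integers (bool is an int subclass in Python).
        if isinstance(docs, int) and not isinstance(docs, bool) and docs >= 0:
            self.docs_min = docs if self.docs_min is None else min(self.docs_min, docs)
            self.docs_max = docs if self.docs_max is None else max(self.docs_max, docs)
            bucket = next((b for b in DOC_BUCKETS if docs <= b), "+Inf")
            self.docs_histogram[bucket] += 1
        for feature in payload.get("f", []):
            self.counters["transaction_" + feature] += 1

completion = CompletionStats()
completion.apply({"q": 1, "d": 43, "f": ["mc"]})
completion.apply({"d": 3})
```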

       

      Compatibility

      Any fields in the JSON that are not understood should be ignored, for compatibility with future releases supporting more tracking info.

      The "f" field allows tracking some basic feature usage in a backwards-compatible way.
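
Both compatibility rules can be sketched together: unknown top-level fields are skipped, while unknown feature names inside "f" are still counted. The names below (KNOWN_FLAG_STATS, apply_payload) are hypothetical, and the sketch handles only the flag fields and "f" for brevity.

```python
import json
from collections import Counter

# Flag fields this (hypothetical) server version understands.
KNOWN_FLAG_STATS = {
    "t": "transactions_initiated",
    "a": "transaction_attempts_initiated",
}

def apply_payload(stats, raw):
    """Apply a raw JSON payload; return the top-level keys that were ignored."""
    ignored = []
    for key, value in json.loads(raw).items():
        if key in KNOWN_FLAG_STATS:
            stats[KNOWN_FLAG_STATS[key]] += 1
        elif key == "f":
            for feature in value:
                stats["transaction_" + feature] += 1  # no server change needed
        else:
            ignored.append(key)  # sent by a newer client: skip, do not fail
    return ignored

stats = Counter()
ignored = apply_payload(stats, '{"a":1,"f":["new_feature"],"z":{"future":true}}')
```

In this run the field "z" is ignored without an error, and "new_feature" is counted even though this version has never seen it.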

            People

              graham.pople Graham Pople