  1. Couchbase Server
  2. MB-30553

'Hash' memcached stat collection causes significant intra-cluster replication delay

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 6.5.0
    • Affects Version/s: 3.1.6, 4.1.2, 4.5.1, 4.6.5, 5.0.1, 5.1.1, 5.5.0
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Untriaged
    • Release Note
    • No

    Description

      Collecting the hash statistic from memcached causes significant replication delay, which severely affects the response times of replicateTo requests.

      The suspected cause is that the hash stat is collected through the hashtable's visitDepth() method (the hash stat is the only caller of this method in the codebase), which locks inefficiently:

      void HashTable::visitDepth(HashTableDepthVisitor &visitor) {
          if (valueStats.getNumItems() == 0 || !isActive()) {
              return;
          }
          size_t visited = 0;
          VisitorTracker vt(&visitors);
       
          for (int l = 0; l < static_cast<int>(mutexes.size()); l++) {
              LockHolder lh(mutexes[l]);
              for (int i = l; i < static_cast<int>(size); i+= mutexes.size()) {
                  size_t depth = 0;
                  StoredValue* p = values[i].get().get();
      

      In this code, each lock is held for the whole inner loop, that is, until every hashtable 'bucket' guarded by that lock has been iterated over, rather than being released between buckets.
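The alternative described above, releasing the lock between buckets, can be sketched as follows. This is only an illustration and not the change that was actually made for this issue: StripedTable, Node, insert() and depths() are hypothetical, simplified stand-ins for the real HashTable types. The only thing carried over from the excerpt is the striping rule (bucket i is guarded by mutex i % mutexes.size()).

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stored value: a singly linked chain node.
struct Node {
    std::size_t key;
    std::unique_ptr<Node> next;
};

// Hypothetical hashtable with striped locks: bucket i is guarded by
// mutexes[i % mutexes.size()], as in the excerpt above.
// Assumes numBuckets > 0 and numLocks > 0.
class StripedTable {
public:
    StripedTable(std::size_t numBuckets, std::size_t numLocks)
        : buckets(numBuckets), mutexes(numLocks) {}

    // Push a key onto the front of its bucket's chain.
    void insert(std::size_t key) {
        const std::size_t b = key % buckets.size();
        std::lock_guard<std::mutex> lh(mutexes[b % mutexes.size()]);
        auto node = std::make_unique<Node>();
        node->key = key;
        node->next = std::move(buckets[b]);
        buckets[b] = std::move(node);
    }

    // Return the chain depth of every bucket. The lock is acquired and
    // released once per bucket, so a writer waits for at most one chain
    // walk instead of a walk over every bucket in the stripe.
    std::vector<std::size_t> depths() {
        std::vector<std::size_t> result(buckets.size(), 0);
        for (std::size_t i = 0; i < buckets.size(); ++i) {
            std::lock_guard<std::mutex> lh(mutexes[i % mutexes.size()]);
            for (const Node* p = buckets[i].get(); p != nullptr;
                 p = p->next.get()) {
                ++result[i];
            }
        }  // lh is destroyed at the end of each iteration
        return result;
    }

private:
    std::vector<std::unique_ptr<Node>> buckets;
    std::vector<std::mutex> mutexes;
};
```

The trade-off is that the depths no longer form a consistent snapshot of a whole stripe, since writers can run between buckets. For a diagnostic statistic that is usually acceptable.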

      This is a significant problem: although the hash statistics are very rarely needed, they are requested as part of every cbcollect_info run.

      Reproduction
      Below is a very basic async Java application which runs upserts with replicateTo=1:

      package com.matt;
       
      import com.couchbase.client.java.*;
      import com.couchbase.client.java.document.JsonDocument;
      import com.couchbase.client.java.document.json.JsonArray;
      import com.couchbase.client.java.document.json.JsonObject;
      import rx.Observable;
       
      import java.text.DateFormat;
      import java.text.SimpleDateFormat;
      import java.util.Date;
      import java.util.TimeZone;
      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.TimeoutException;
       
       
      public class Main {
       
          public static void main(String... args) {
       
              // Initialize the Connection
              Cluster cluster = CouchbaseCluster.create("localhost");
              cluster.authenticate("matt.carabine", "correcthorsebatterystaple");
              AsyncBucket bucket = cluster.openBucket("default").async();
              // Create a JSON Document
              JsonObject arthur = JsonObject.create()
                      .put("name", "Arthur")
                      .put("email", "kingarthur@couchbase.com")
                      .put("interests", JsonArray.from("Holy Grail", "African Swallows"))
                      .put("lorem_ipsum", "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.");
       
              for (int i = 0; i < 100000000; i++) {
                  JsonDocument doc = JsonDocument.create("Doc::" + i, arthur);
                  Observable
                          .just(doc)
                          .flatMap(v -> bucket.upsert(v, ReplicateTo.ONE).timeout(1, TimeUnit.SECONDS))
                          .forEach(document -> {
                                      // Success: nothing to report
                                  }, error -> {
                                      if (error instanceof TimeoutException) {
                                          // Print the UTC time at which the timeout was observed
                                          TimeZone tz = TimeZone.getTimeZone("UTC");
                                          DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ"); // Unquoted Z prints the RFC 822 offset, +0000 for UTC
                                          df.setTimeZone(tz);
                                          String nowAsISO = df.format(new Date());
                                          System.out.println(nowAsISO);
                                      } else {
                                          error.printStackTrace();
                                      }
                                  }
                          );
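                  try {
                      // Throttle to roughly 500 upserts per second
                      Thread.sleep(2);
                  } catch (InterruptedException e) {
                      // Restore the interrupt flag and stop issuing upserts
                      Thread.currentThread().interrupt();
                      break;
                  }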
              }
          }
      }
      

      Running the following command during the execution of the program causes timeouts:

      /opt/couchbase/bin/cbstats -u matt.carabine -p correcthorsebatterystaple localhost:11210 -b default hash
      

      As soon as the cbstats command finishes, the timeouts stop.


            People

              arunkumar Arunkumar Senthilnathan (Inactive)
              matt.carabine Matt Carabine (Inactive)
