Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Versions: 3.1.6, 4.1.2, 4.5.1, 4.6.5, 5.0.1, 5.1.1, 5.5.0
- Security Level: Public
- Triage: Untriaged
- Release Note: No
Description
Collecting the hash statistic from memcached causes significant replication delay, which severely affects the response times of replicateTo requests.
The suspected cause is that the hash stat is gathered via the HashTable::visitDepth() method (the only caller of this method in the codebase), which uses coarse-grained locking:
void HashTable::visitDepth(HashTableDepthVisitor &visitor) {
    if (valueStats.getNumItems() == 0 || !isActive()) {
        return;
    }

    size_t visited = 0;
    VisitorTracker vt(&visitors);

    for (int l = 0; l < static_cast<int>(mutexes.size()); l++) {
        LockHolder lh(mutexes[l]);

        for (int i = l; i < static_cast<int>(size); i += mutexes.size()) {
            size_t depth = 0;
            StoredValue* p = values[i].get().get();
            ...
In this code the lock for the relevant hashtable 'buckets' is held until all of them have been iterated over, rather than being released between iterations.
This is a significant problem because, even though the hash statistics are rarely needed directly, they are requested as part of every single cbcollect_info run.
Reproduction
Below is a very basic async Java application which runs upserts with replicateTo=1:
package com.matt;

import com.couchbase.client.java.*;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.document.json.JsonObject;
import rx.Observable;

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;


public class Main {

    public static void main(String... args) {

        // Initialize the connection
        Cluster cluster = CouchbaseCluster.create("localhost");
        cluster.authenticate("matt.carabine", "correcthorsebatterystaple");
        AsyncBucket bucket = cluster.openBucket("default").async();

        // Create a JSON document
        JsonObject arthur = JsonObject.create()
                .put("name", "Arthur")
                .put("email", "kingarthur@couchbase.com")
                .put("interests", JsonArray.from("Holy Grail", "African Swallows"))
                .put("lorem_ipsum", "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.");

        for (int i = 0; i < 100000000; i++) {
            JsonDocument doc = JsonDocument.create("Doc::" + i, arthur);
            Observable
                    .just(doc)
                    .flatMap(v -> bucket.upsert(v, ReplicateTo.ONE).timeout(1, TimeUnit.SECONDS))
                    .forEach(document -> {
                        // Success: nothing to do
                    }, error -> {
                        if (error.getClass() == TimeoutException.class) {
                            // Log each timeout with an ISO 8601 UTC timestamp
                            TimeZone tz = TimeZone.getTimeZone("UTC");
                            DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
                            df.setTimeZone(tz);
                            String nowAsISO = df.format(new Date());
                            System.out.println(nowAsISO);
                        } else {
                            error.printStackTrace();
                        }
                    });

            try {
                Thread.sleep(2); // throttle the upsert rate slightly
            } catch (InterruptedException e) {
            }
        }
    }
}
Running the following command during the execution of the program causes timeouts:
/opt/couchbase/bin/cbstats -u matt.carabine -p correcthorsebatterystaple localhost:11210 -b default hash
As soon as the cbstats command finishes, the timeouts stop.