Details

Type: Task
Resolution: Won't Fix
Priority: Major
Fix Version/s: 6.5.0
Affects Version/s: 4.1.1
Component/s: secondary-index
Labels: None
Environment: Ubuntu, 64-bit, 3 nodes, 4 CPU x 4 GB RAM

Description

GSI mishandles expired documents, which causes index fragmentation (up to 98%) and enormous growth in used disk space ([index data size]:[index disk size] =~ [1]:[1/(1-fragmentation rate)]). Disk space usage returns to normal only after a Couchbase Server restart.
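To make the ratio above concrete, here is a minimal sketch that evaluates it for a few fragmentation rates. The class name, method names, and the 100 MB data size are hypothetical and chosen only for illustration; the formula is the one stated in the description, not something taken from the indexer's code.

```java
final class FragmentationEstimate {

    // Disk size implied by the ratio in the description:
    // disk = data / (1 - fragmentation), with fragmentation in [0, 1).
    static double impliedDiskSize(double dataSize, double fragmentation) {
        if (fragmentation < 0.0 || fragmentation >= 1.0) {
            throw new IllegalArgumentException("fragmentation must be in [0, 1)");
        }
        return dataSize / (1.0 - fragmentation);
    }

    // Inverse direction: fragmentation rate implied by observed data and disk sizes.
    static double impliedFragmentation(double dataSize, double diskSize) {
        return 1.0 - dataSize / diskSize;
    }

    public static void main(String[] args) {
        double dataMb = 100.0; // hypothetical index data size, in MB
        for (double frag : new double[] {0.50, 0.90, 0.98}) {
            System.out.printf("fragmentation %.0f%% -> about %.0f MB on disk%n",
                    frag * 100, impliedDiskSize(dataMb, frag));
        }
    }
}
```

At 98% fragmentation the implied disk size is roughly 50 times the data size, which is in line with the 1:50 ratio named in the related forum thread's URL.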
How to reproduce:
1. 4.1.1-EE GA (5914), 3 nodes (4 CPU x 4 GB RAM)
2. All nodes have all services enabled
3. Establish a cluster with the default bucket (256 MB, full eviction, no password, 1 replica, view index replicas, I/O priority = high, flush = enabled)
4. Run the following code (or write your own: 20 threads, each inserting 100,000 docs with indexed fields through the synchronous API; the indexes must be created too):

package highcpuafterload;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;
import com.couchbase.client.java.query.N1qlQuery;
import java.util.LinkedList;
import java.util.concurrent.Phaser;

public class BombardaMaxima extends Thread {

    private final int tid;

    // configure here
    private static final int threads = 20;
    private static final int docsPerThread = 100000;
    private static final int docTTLms = 30 * 1000;
    private static final int dumpToConsoleEachNDocs = 1000;

    // one party per worker thread plus one for main()
    private static final Phaser phaser = new Phaser(threads + 1);
    private static final CouchbaseEnvironment ce;
    private static final Cluster cluster;
    private static final String bucket = "default";

    static {
        ce = DefaultCouchbaseEnvironment.create();
        final LinkedList<String> nodes = new LinkedList<>();
        nodes.add("A.node");
        nodes.add("B.node");
        nodes.add("C.node");
        cluster = CouchbaseCluster.create(ce, nodes);
        final Bucket b = cluster.openBucket(bucket);
        // create both GSI indexes once, before any worker thread starts
        final String iQA = "CREATE INDEX iQA ON `default`(a, b) WHERE a is valued USING GSI";
        final String iQX = "CREATE INDEX iQX ON `default`(a, c) WHERE a is valued USING GSI";
        b.query(N1qlQuery.simple(iQA));
        b.query(N1qlQuery.simple(iQX));
    }

    public BombardaMaxima(final int tid) {
        this.tid = tid;
    }

    @Override
    public final void run() {
        try {
            Bucket b = null;
            synchronized(cluster) { b = cluster.openBucket(bucket); }
            final long stm = System.currentTimeMillis();
            final JsonObject jo = JsonObject
                    .empty()
                    .put("a", stm)
                    .put("b", stm)
                    .put("c", stm);
            for (int i = 0; i < docsPerThread; i++) {
                b.upsert(JsonDocument.create(
                        // document id: thread id plus current time in ms
                        tid + ":" + System.currentTimeMillis(),
                        // expiry as an absolute Unix timestamp in seconds: now + TTL
                        (int)((System.currentTimeMillis() + docTTLms) / 1000),
                        jo));
                if (i % dumpToConsoleEachNDocs == 0) System.out.println("T[" + tid + "] = " + i);
            }
        } catch(final Exception e) {
            e.printStackTrace();
        } finally {
            // signal completion and wait for the other threads
            phaser.arriveAndAwaitAdvance();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < threads; i++) new BombardaMaxima(i).start();
        // block until every worker has finished
        phaser.arriveAndAwaitAdvance();
        System.out.println("DONE");
    }
}

5. Watch the index fragmentation rate grow in the UI stats.
6. Wait for the code run to end.
7. Wait a little longer (you can also force compaction) until all documents in the bucket have expired (do not press "Documents", because of https://issues.couchbase.com/browse/MB-19758)
8. Finally, go to the UI stats. You will see a high fragmentation rate, and [index data size]:[index disk size] =~ [1]:[1/(1-fragmentation rate)]
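The reproduction code in step 4 passes each document's expiry as an absolute Unix timestamp (current time plus the 30-second TTL, in seconds) rather than as a relative offset. The sketch below isolates that calculation. It relies on the Couchbase convention that expiry values above 30 days' worth of seconds are read as absolute timestamps, while smaller values are read as relative offsets. The class and method names are hypothetical and are not part of the SDK.

```java
final class ExpirySketch {

    // Expiry values above this many seconds (30 days) are treated as
    // absolute Unix timestamps; smaller values are relative offsets.
    static final int RELATIVE_LIMIT_SECONDS = 30 * 24 * 60 * 60;

    // Same arithmetic as the reproduction code: (now + TTL) in whole seconds.
    static int absoluteExpiry(long nowMillis, long ttlMillis) {
        return (int) ((nowMillis + ttlMillis) / 1000);
    }

    // True when the value would be read as an absolute timestamp.
    static boolean isAbsolute(int expirySeconds) {
        return expirySeconds > RELATIVE_LIMIT_SECONDS;
    }

    public static void main(String[] args) {
        int expiry = absoluteExpiry(System.currentTimeMillis(), 30 * 1000);
        System.out.println("expiry = " + expiry + ", absolute = " + isAbsolute(expiry));
    }
}
```

Passing the plain value 30 would request the same 30-second lifetime as a relative offset. The report uses the absolute form, so this sketch mirrors it.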

Example final results from my run are shown in the attached images (if-1.png to if-4.png).

Related forum thread: https://forums.couchbase.com/t/index-data-size-index-disk-size-1-50-is-it-normal/8467/2

Attachments

if-1.png, 30/May/16 3:50 AM, 21 kB, Georgy Repin
if-2.png, 30/May/16 3:50 AM, 69 kB, Georgy Repin
if-3.png, 30/May/16 3:50 AM, 49 kB, Georgy Repin
if-4.png, 30/May/16 3:50 AM, 21 kB, Georgy Repin

People

Assignee: Prathibha Bisarahalli (Inactive)

Reporter: Georgy Repin

Votes: 0

Watchers: 6

Dates

Created: 30/May/16 3:49 AM

Updated: 05/Nov/19 12:44 PM

Resolved: 18/Jan/19 4:05 PM

GSI: incorrect handling of expired documents causes index fragmentation and enormous grow of used disk space
