Couchbase Server: MB-41857

CouchKVStoreFileCache not "releasing" files after compaction

Details

    Description

      Script to Repro

       

      ./testrunner -i /tmp/durability_volume.ini sdk_client_pool=True,rerun=False,get-cbcollect-info=True,GROUP=rebalance_with_collection_crud_durability_PERSIST_TO_MAJORITY -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=2,override_spec_params=durability;replicas,durability=PERSIST_TO_MAJORITY,replicas=2,bucket_spec=multi_bucket.buckets_all_membase_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,GROUP=rebalance_with_collection_crud_durability_PERSIST_TO_MAJORITY

      Steps to Reproduce

      1.  Create a 3 node cluster
      2020-10-04 05:09:50,635 | test | INFO | pool-1-thread-7 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.105.211 | kv       | Cluster node |
      | 172.23.105.212 | None     | <--- IN ---  |
      | 172.23.105.213 | None     | <--- IN ---  |
      +----------------+----------+--------------+
      2. Create buckets and initial data load
      2020-10-04 05:21:57,029 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
      | Bucket  | Type      | Replicas | Durability | TTL | Items  | RAM Quota  | RAM Used  | Disk Used |
      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
      | bucket1 | couchbase | 2        | none       | 0   | 3000   | 629145600  | 186185152 | 348621350 |
      | bucket2 | couchbase | 2        | none       | 0   | 3000   | 629145600  | 186140784 | 217369177 |
      | default | couchbase | 2        | none       | 0   | 500000 | 6291456000 | 473287136 | 396946078 |
      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
      3. Rebalance-in with crud on collections in parallel
      2020-10-04 05:22:05,269 | test | INFO | pool-1-thread-16 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.105.212 | kv       | Cluster node |
      | 172.23.105.213 | kv       | Cluster node |
      | 172.23.105.211 | kv       | Cluster node |
      | 172.23.105.215 | None     | <--- IN ---  |
      | 172.23.105.217 | None     | <--- IN ---  |
      +----------------+----------+--------------+
      The rebalance operation fails.

      Observations

      Output of the following command on 172.23.105.211:

      grep WARN memcached.log | grep -v Slow | grep -v "The stream closed early because the conn was disconnected"

      2020-10-04T05:25:31.141474-07:00 WARNING (default) CouchKVStore::compactDB openDB error:error opening file, vb:292, fileRev:15
      2020-10-04T05:25:53.246033-07:00 WARNING (default) VBucket::addStats: Exception caught during getDbFileInfo for vb:0 - what(): CouchKVStore::getDbInfo: failed to open database file for vb:0 rev = 16 with error:error opening file: No such file or directory
      2020-10-04T05:25:53.247064-07:00 WARNING (default) VBucket::addStats: Exception caught during getDbFileInfo for vb:1 - what(): CouchKVStore::getDbInfo: failed to open database file for vb:1 rev = 16 with error:error opening file: No such file or directory
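
      The warnings above can be summarised per vBucket. The following is a minimal sketch, not part of the original report: it assumes the only two line shapes of interest are the ones quoted above (`vb:N, fileRev:M` and `vb:N rev = M`), and the function name `missing_file_warnings` is hypothetical.

```python
import re
from collections import defaultdict

# Patterns derived only from the warning lines quoted in this ticket;
# other memcached.log formats are not handled (assumption).
VB_RE = re.compile(r"vb:(\d+)")
REV_RE = re.compile(r"(?:fileRev:|rev = )(\d+)")


def missing_file_warnings(lines):
    """Return {vbid: {revisions}} for WARNING lines that report a file open error."""
    found = defaultdict(set)
    for line in lines:
        # Keep only warnings that mention a failed file open.
        if " WARNING " not in line or "error opening file" not in line:
            continue
        vb = VB_RE.search(line)
        rev = REV_RE.search(line)
        if vb and rev:
            found[int(vb.group(1))].add(int(rev.group(1)))
    return dict(found)


if __name__ == "__main__":
    # Reads the log from the current directory, as the grep command does.
    with open("memcached.log", encoding="utf-8", errors="replace") as log:
        for vbid, revs in sorted(missing_file_warnings(log).items()):
            print(f"vb:{vbid} failed to open revisions {sorted(revs)}")
```

      Grouping by vBucket and revision makes it easier to see whether the failing revisions are the ones that compaction should have replaced.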
      

       

       

          Activity

            Build couchbase-server-7.0.0-3480 contains kv_engine commit 9ab0c3e with commit message:
            MB-41857: Make rollback use openDb

            Comment added by Couchbase Build Team (build-team).

            Build couchbase-server-7.0.0-3482 contains kv_engine commit fe97b16 with commit message:
            MB-41857: Only adjust file cache limit if open db successful

            Comment added by Couchbase Build Team (build-team).

            Thanks for the logs Ritam Sharma. This looks like a different issue. We have errors such as these in the logs.

            2020-10-21T08:03:59.718420-07:00 WARNING (testbucket1) CouchKVStore::openSpecificDB: error:no such file [No such file or directory], name:/opt/couchbase/var/lib/couchbase/data/testbucket1/0.couch.1, option:2, fileRev:1 
            ...
            2020-10-21T08:03:59.718697-07:00 WARNING (testbucket1) CouchKVStore::openSpecificDB: No such file, found:0 alternative files for /opt/couchbase/var/lib/couchbase/data/testbucket1/0.couch.1
            2020-10-21T08:03:59.718740-07:00 WARNING (testbucket1) CouchKVStore::initBySeqnoScanContext: makeFileHandle failure 0
            2020-10-21T08:03:59.718764-07:00 WARNING 2007: (testbucket1) DCP (Producer) eq_dcpq:replication:ns_1@172.23.120.121->ns_1@172.23.123.41:testbucket1 - DCPBackfillBySeqnoDisk::create() failed to create scan for vb:0, startSeqno:1, PointInTimeEnabled:false
            

            We don't appear to have a couchstore_local.log in your cbcollect (at least for the first node I looked at) and we didn't try to run couch_dbdump for anything either. This leads me to believe that your vBucket files are no longer present. Could you please raise this as a new bug and confirm the presence of these files after your test run.

            Comment added by Ben Huddleston (ben.huddleston).
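
            The file-presence check requested in the comment above could be sketched as follows. This is an illustrative sketch, not a Couchbase tool: it assumes vBucket files are named `<vbid>.couch.<rev>`, as in the path `0.couch.1` quoted in the log, and the function names and the list of expected vBucket ids are hypothetical and must be supplied by the caller.

```python
import re
from pathlib import Path

# File naming inferred from the log path ".../testbucket1/0.couch.1" (assumption).
COUCH_FILE_RE = re.compile(r"^(\d+)\.couch\.(\d+)$")


def vbucket_files(bucket_dir):
    """Map each vBucket id to the sorted file revisions found in bucket_dir."""
    revisions = {}
    for entry in Path(bucket_dir).iterdir():
        match = COUCH_FILE_RE.match(entry.name)
        if match and entry.is_file():
            vbid, rev = int(match.group(1)), int(match.group(2))
            revisions.setdefault(vbid, []).append(rev)
    return {vbid: sorted(revs) for vbid, revs in revisions.items()}


def missing_vbuckets(bucket_dir, expected_vbids):
    """Return the expected vBucket ids that have no file on disk."""
    present = vbucket_files(bucket_dir)
    return sorted(vbid for vbid in expected_vbids if vbid not in present)
```

            For example, `missing_vbuckets("/opt/couchbase/var/lib/couchbase/data/testbucket1", [0])` would report whether any file for vb:0 is still present after the test run. Which vBuckets a node is expected to host depends on the cluster's vBucket map, which this sketch does not read.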

            The bulk of the file cache code was reverted, which should fix these issues. This went into build CC-3509.

            Comment added by Ben Huddleston (ben.huddleston).

            The issue was not seen after the fix (based on weekly build results), so I am closing the ticket.

            Comment added by Sumedh Basarkod (sumedh.basarkod, Inactive).

            People

              sumedh.basarkod Sumedh Basarkod (Inactive)
              sumedh.basarkod Sumedh Basarkod (Inactive)