  1. Couchbase Server
  2. MB-51414

[Magma] - ActiveStream::processItems checkpoint_end:42608 should not be in the current snapshot range

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Yes
    • KV March-22

    Description

      Script to Repro

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.54049.ini GROUP=rebalance_in_out_P0_set1,rerun=False,disk_optimized_thread_settings=True,get-cbcollect-info=True,autoCompactionDefined=true,get-cbcollect-info=True,infra_log_level=info,log_level=info,upgrade_version=7.1.0-2475 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in_out,nodes_init=5,nodes_in=2,nodes_out=1,update_replica=True,updated_num_replicas=3,bucket_spec=magma_dgm.5_percent_dgm.5_node_2_replica_magma_512,doc_size=512,randomize_value=True,data_load_stage=during,skip_validations=False,GROUP=rebalance_in_out_P0_set1'
      

      Steps to Repro
      1. Create a 5 node cluster

      2022-03-12 06:05:05,697 | test  | INFO    | MainThread | [table_view:display:72] Cluster statistics
      +----------------+----------+-----------------+-----------+-----------+---------------------+-------------------+-----------------------+
      | Node           | Services | CPU_utilization | Mem_total | Mem_free  | Swap_mem_used       | Active / Replica  | Version               |
      +----------------+----------+-----------------+-----------+-----------+---------------------+-------------------+-----------------------+
      | 172.23.106.163 | kv       | 0.313008639038  | 11.45 GiB | 10.67 GiB | 0.0 Byte / 3.50 GiB | 0 / 0             | 7.1.0-2475-enterprise |
      | 172.23.105.36  | kv       | 1.01669386218   | 11.45 GiB | 10.72 GiB | 0.0 Byte / 3.50 GiB | 0 / 0             | 7.1.0-2475-enterprise |
      | 172.23.105.33  | kv       | 1.83624701295   | 11.45 GiB | 10.61 GiB | 0.0 Byte / 3.50 GiB | 0 / 0             | 7.1.0-2475-enterprise |
      | 172.23.107.164 | kv       | 0               | 0.0 Byte  | 0.0 Byte  | 0.0 Byte / 0.0 Byte | 0 / 0             | 7.1.0-2475-enterprise |
      | 172.23.105.37  | kv       | 0.162886856284  | 11.45 GiB | 10.72 GiB | 0.0 Byte / 3.50 GiB | 0 / 0             | 7.1.0-2475-enterprise |
      +----------------+----------+-----------------+-----------+-----------+---------------------+-------------------+-----------------------+
      

      2. Create Bucket/scopes/collections/data

      2022-03-12 06:13:31,703 | test  | INFO    | MainThread | [table_view:display:72] Bucket statistics
      +---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
      | Bucket  | Type      | Storage Backend | Replicas | Durability | TTL | Items    | RAM Quota | RAM Used   | Disk Used  | ARR           |
      +---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
      | bucket1 | couchbase | couchstore      | 2        | none       | 0   | 50000    | 9.77 GiB  | 237.54 MiB | 171.67 MiB | 100           |
      | bucket2 | couchbase | magma           | 2        | none       | 0   | 50000    | 4.88 GiB  | 510.53 MiB | 317.55 MiB | 100           |
      | default | couchbase | magma           | 2        | none       | 0   | 32575000 | 2.50 GiB  | 1.87 GiB   | 35.97 GiB  | 5.32952264006 |
      +---------+-----------+-----------------+----------+------------+-----+----------+-----------+------------+------------+---------------+
      

      3. Add 2 nodes (172.23.106.156, 172.23.106.159), remove 1 node (172.23.105.37), update all bucket replicas to 3, and rebalance. The rebalance completes successfully.

      2022-03-12 06:13:43,536 | test  | INFO    | pool-7-thread-14 | [table_view:display:72] Rebalance Overview
      +----------------+----------+-----------------------+---------------+--------------+-----------------------+
      | Nodes          | Services | Version               | CPU           | Status       | Membership / Recovery |
      +----------------+----------+-----------------------+---------------+--------------+-----------------------+
      | 172.23.106.163 | kv       | 7.1.0-2475-enterprise | 6.43863179074 | Cluster node | active / none         |
      | 172.23.105.36  | kv       | 7.1.0-2475-enterprise | 7.10230856566 | Cluster node | active / none         |
      | 172.23.105.33  | kv       | 7.1.0-2475-enterprise | 6.43982356648 | Cluster node | active / none         |
      | 172.23.107.164 | kv       | 7.1.0-2475-enterprise | 5.58842039018 | Cluster node | active / none         |
      | 172.23.106.156 | kv       | 7.1.0-2475-enterprise | 0             | Cluster node | inactiveAdded / none  |
      | 172.23.106.159 | kv       | 7.1.0-2475-enterprise | 0             | Cluster node | inactiveAdded / none  |
      | 172.23.105.37  | kv       | 7.1.0-2475-enterprise | 6.41590137124 | --- OUT ---> | active / none         |
      +----------------+----------+-----------------------+---------------+--------------+-----------------------+
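
      For anyone reproducing step 3 by hand instead of through testrunner, the topology change can be sketched as a list of REST requests. This is a minimal sketch and not what the test framework itself runs: it assumes the standard Couchbase Server REST endpoints (/controller/addNode, /pools/default/buckets/<bucket>, /controller/rebalance) and the ns_1@<ip> node naming seen in the logs. The helper names (otp_node, build_rebalance_requests) are hypothetical, and node credentials, which addNode also needs, are deliberately left out. Nothing is sent over the network; the function only builds the payloads.

```python
def otp_node(ip):
    """Return the otpNode name for an IP (assumes the ns_1@<ip> convention)."""
    return "ns_1@" + ip


def build_rebalance_requests(current, add, remove, buckets, replicas):
    """Build (path, form_fields) pairs for a rebalance in/out with a replica update.

    Hypothetical helper: it only describes the requests, it does not send them.
    """
    requests = []
    # Add each incoming node as a KV node (credentials omitted in this sketch).
    for ip in add:
        requests.append(("/controller/addNode", {"hostname": ip, "services": "kv"}))
    # Raise the replica count on every bucket before rebalancing.
    for bucket in buckets:
        requests.append(("/pools/default/buckets/" + bucket,
                         {"replicaNumber": str(replicas)}))
    # knownNodes lists every node in the cluster, including the ones being ejected.
    known = [otp_node(ip) for ip in current + add]
    ejected = [otp_node(ip) for ip in remove]
    requests.append(("/controller/rebalance",
                     {"knownNodes": ",".join(known),
                      "ejectedNodes": ",".join(ejected)}))
    return requests


if __name__ == "__main__":
    plan = build_rebalance_requests(
        current=["172.23.106.163", "172.23.105.36", "172.23.105.33",
                 "172.23.107.164", "172.23.105.37"],
        add=["172.23.106.156", "172.23.106.159"],
        remove=["172.23.105.37"],
        buckets=["bucket1", "bucket2", "default"],
        replicas=3,
    )
    for path, fields in plan:
        print(path, fields)
```

      The node and bucket names in the usage block are taken from the tables above; everything else is illustrative.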
      

      Once the rebalance completed, we noticed the following error messages in the logs on 172.23.106.163:

      2022-03-12 06:25:10,500 | test  | CRITICAL | MainThread | [basetestcase:check_coredump_exist:933] 172.23.106.163: Found ' ERROR ' logs - ['2022-03-12T06:17:18.604112-08:00 ERROR 5726: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.106.163->ns_1@172.23.107.164:default - ActiveStream::processItems checkpoint_end:37081 should not be in the current snapshot range s:37022->e:37090\n', '2022-03-12T06:17:19.381800-08:00 ERROR 5726: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.106.163->ns_1@172.23.107.164:default - ActiveStream::processItems checkpoint_end:42608 should not be in the current snapshot range s:42539->e:42612\n']
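
      To triage similar logs, the reported values can be pulled out and checked mechanically. The sketch below is only an illustration: the regular expression is written against the two messages quoted above, the helper name find_snapshot_violations is hypothetical, and treating the range s->e as inclusive at both ends is an assumption (it makes no difference for the values in this report).

```python
import re

# Matches the message text quoted in this report; other builds may word it differently.
PATTERN = re.compile(
    r"checkpoint_end:(\d+) should not be in the current snapshot range "
    r"s:(\d+)->e:(\d+)"
)


def find_snapshot_violations(log_text):
    """Return one dict per matching log message found in log_text.

    'inside' is True when the checkpoint_end seqno falls within [start, end],
    which is the condition the producer is complaining about.
    """
    results = []
    for match in PATTERN.finditer(log_text):
        seqno, start, end = (int(g) for g in match.groups())
        results.append({
            "checkpoint_end": seqno,
            "start": start,
            "end": end,
            "inside": start <= seqno <= end,  # inclusive bounds are an assumption
        })
    return results


if __name__ == "__main__":
    sample = (
        "ActiveStream::processItems checkpoint_end:37081 should not be in "
        "the current snapshot range s:37022->e:37090\n"
        "ActiveStream::processItems checkpoint_end:42608 should not be in "
        "the current snapshot range s:42539->e:42612\n"
    )
    for item in find_snapshot_violations(sample):
        print(item)
```

      For both messages in this report the checkpoint_end seqno lies strictly between the snapshot start and end, which matches what the error text describes.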
      

      cbcollect_info attached. This issue was not seen on 7.1.0-2434.

          People

            Balakumaran.Gopal Balakumaran Gopal
            Balakumaran.Gopal Balakumaran Gopal