  1. Couchbase Server
  2. MB-36392

[Volume]: KV_TEMPORARY_FAILURE, CHANNEL_CLOSED_WHILE_IN_FLIGHT occurred on changing the bucket replica from 2 to 1

Details

    • Untriaged
    • Unknown

    Description

      1. Start with 7 nodes in the cluster and GleamBookUsers bucket with replica=1. Loaded 10M items
      2. rebalance_in 1 node and 2M creates in parallel.
      3. rebalance_out 1 and 2M creates in parallel.
      4. rebalance_in 2 and out 1 node and 2M creates in parallel.
      5. Rebalance_swap 1 node and 2M creates in parallel.
      6. Change bucket Replica from 1 to 2. Hit rebalance and load 2M creates in parallel.
      7. Memcached SIGSTOP/SIGCONT on 1 node and 2M creates.
      8. Graceful Failover and rebalance out 1 node. Load 2M creates in parallel.
      9. Graceful Failover 1 node and addback with full recovery. Hit rebalance and load 2M creates in parallel.
      10. Graceful Failover 1 node and addback with delta recovery. Hit rebalance and load 2M creates in parallel.
      Everything is fine up to this step. Total items in the bucket at this stage: 28M; vb_active_resident_items_ratio: 94.7402464286
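
The 28M figure can be cross-checked from the steps above. This is a minimal sketch that assumes every create in steps 2 to 10 succeeded and that nothing was deleted or expired along the way; the constant names are illustrative only.

```python
# Item-count check for steps 1-10, using the figures given in the steps above.
INITIAL_ITEMS = 10_000_000     # step 1: initial load of 10M items
CREATES_PER_STEP = 2_000_000   # steps 2-10: 2M creates in each step
LOAD_STEPS = range(2, 11)      # steps 2 through 10 inclusive (9 steps)

# Assumption: all creates succeeded, no deletes or expirations.
expected_total = INITIAL_ITEMS + CREATES_PER_STEP * len(LOAD_STEPS)
print(f"expected items before step 11: {expected_total:,}")
```

With these inputs the expected total is 28,000,000, which matches the reported item count before step 11.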

      11. Change bucket Replica from 2 to 1, hit rebalance and load 2M creates in parallel.
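
For anyone reproducing step 11 by hand, the replica change can be expressed as a REST request against the cluster manager. The sketch below only builds the request and does not send it. The host name is a placeholder, authentication is omitted, and the report does not say how the test framework actually applies the change, so treat this as one possible way rather than the method used here. Some server versions may require further bucket parameters in the same request.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

def build_replica_change_request(host: str, bucket: str, replicas: int) -> Request:
    """Build (but do not send) a bucket-edit request that sets replicaNumber."""
    # Assumption: the cluster manager listens on the default port 8091.
    url = f"http://{host}:8091/pools/default/buckets/{quote(bucket, safe='')}"
    body = urlencode({"replicaNumber": replicas}).encode("ascii")
    return Request(url, data=body, method="POST")

# "cluster.example.com" is a hypothetical host, not one of the nodes in this report.
req = build_replica_change_request("cluster.example.com", "GleamBookUsers", 1)
print(req.get_method(), req.full_url, req.data.decode())
```

The new replica count only takes effect after a rebalance, which is why the step pairs the change with "hit rebalance".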

      Actual result at step 11: Insert requests fail for a few documents with the messages below.

      For a few docs, InsertRequest failed due to:
      com.couchbase.client.core.error.RequestCanceledException: InsertRequest {"retried":0,"reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":74280843,"timeoutMs":30000,"service":{"bucket":"GleamBookUsers","scope":"_default","collection":"_default","type":"kv","key":"Users-29007959"},"cancelled":true,"coreId":103,"completed":true} 
      

      and a few failed because of:

      com.couchbase.client.core.error.RequestTimeoutException: InsertRequest {"retried":58,"reason":"TIMEOUT","requestId":74281065,"timeoutMs":30000,"service":{"bucket":"GleamBookUsers","scope":"_default","collection":"_default","type":"kv","key":"Users-28007784"},"timings":null,"cancelled":true,"coreId":99,"completed":true,"retryReasons":["KV_TEMPORARY_FAILURE"]}
      

      Expected: InsertRequest should not fail with reason: KV_TEMPORARY_FAILURE or CHANNEL_CLOSED_WHILE_IN_FLIGHT
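
Both failure messages above carry a JSON context after the request type, which makes them easy to compare side by side. The following sketch parses the two messages quoted in this report. The helper name `parse_error_context` and the choice of fields are illustrative, and the message layout is assumed to be `<exception class>: <request type> {json}` as seen above.

```python
import json

def parse_error_context(message: str) -> dict:
    """Split '<exception class>: <request type> {json}' and decode the JSON part."""
    prefix, _, payload = message.partition(" {")
    exception_class, _, request_type = prefix.partition(": ")
    context = json.loads("{" + payload)
    return {
        "exception": exception_class.rsplit(".", 1)[-1],
        "request": request_type,
        "key": context["service"]["key"],
        "reason": context["reason"],
        "retried": context["retried"],
        "retry_reasons": context.get("retryReasons", []),
    }

# The two messages exactly as quoted in this report.
messages = [
    'com.couchbase.client.core.error.RequestCanceledException: InsertRequest '
    '{"retried":0,"reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)",'
    '"requestId":74280843,"timeoutMs":30000,"service":{"bucket":"GleamBookUsers",'
    '"scope":"_default","collection":"_default","type":"kv","key":"Users-29007959"},'
    '"cancelled":true,"coreId":103,"completed":true}',
    'com.couchbase.client.core.error.RequestTimeoutException: InsertRequest '
    '{"retried":58,"reason":"TIMEOUT","requestId":74281065,"timeoutMs":30000,'
    '"service":{"bucket":"GleamBookUsers","scope":"_default","collection":"_default",'
    '"type":"kv","key":"Users-28007784"},"timings":null,"cancelled":true,"coreId":99,'
    '"completed":true,"retryReasons":["KV_TEMPORARY_FAILURE"]}',
]

parsed = [parse_error_context(m) for m in messages]
for entry in parsed:
    print(entry)
```

Read this way, the two failures differ in kind: the first request was cancelled without any retry when its channel closed, while the second was retried 58 times on KV_TEMPORARY_FAILURE before hitting the 30000 ms timeout.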

      Logs:
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.105.168.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.134.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.137.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.138.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.82.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.85.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/collectinfo-2019-10-09T061439-ns_1%40172.23.106.86.zip
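
The node each log bundle belongs to is encoded in its file name (`%40` is a percent-encoded `@`). As a small convenience, the sketch below recovers the node names from the URLs listed above. The helper name `node_from_log_url` is illustrative, and the file-name layout `collectinfo-<timestamp>-<node>.zip` is inferred from these seven links.

```python
from urllib.parse import unquote, urlsplit

BASE = "https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_change/"
LOG_FILES = [
    "collectinfo-2019-10-09T061439-ns_1%40172.23.105.168.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.134.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.137.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.138.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.82.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.85.zip",
    "collectinfo-2019-10-09T061439-ns_1%40172.23.106.86.zip",
]

def node_from_log_url(url: str) -> str:
    """Return the node name embedded in a collectinfo log URL."""
    # Take the last path segment and decode percent-escapes (%40 -> @).
    filename = unquote(urlsplit(url).path.rsplit("/", 1)[-1])
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    # Assumed layout: collectinfo-<timestamp>-<node>; the node is the last field.
    return stem.rsplit("-", 1)[-1]

nodes = [node_from_log_url(BASE + name) for name in LOG_FILES]
for node in nodes:
    print(node)
```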

            People

              ritesh.agarwal Ritesh Agarwal