  1. Couchbase Server
  2. MB-60754

Failover: Server responds with unknown status code for a few mutations


Details

    Description

      Build: 7.6.0-2111

      Steps:

      • 5 node KV cluster

        172.23.108.67 (Orchestrator)
        172.23.108.68
        172.23.108.69
        172.23.108.70
        172.23.108.72

      • 3 buckets, each with replicas=3 and 100 collections, all loaded with documents

      +---------+-------------------+----------+--------+
      | Bucket  | Type / Storage    | Replicas | Items  |
      +---------+-------------------+----------+--------+
      | bucket1 | couchbase / magma | 3        | 30000  |
      | bucket2 | ephemeral / -     | 3        | 30000  |
      | default | couchbase / magma | 3        | 500000 |
      +---------+-------------------+----------+--------+

      • Fail over 2 nodes while the data load is in progress:

        172.23.108.70
        172.23.108.72
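For anyone reproducing the failover step by hand rather than through TAF, the request could be sketched roughly as below. This is only an illustration, not what the test itself runs: it assumes the hard failover REST endpoint `/controller/failOver` on admin port 8091 of the orchestrator, the conventional `ns_1@<ip>` node naming, and placeholder credentials. `build_failover_request` is a hypothetical helper that only builds the request and does not send it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_failover_request(orchestrator, node_ip, user, password):
    """Build (but do not send) a hard failover request for one node.

    Assumptions: admin port 8091, otpNode named ns_1@<ip>, basic auth.
    """
    body = urlencode({"otpNode": "ns_1@" + node_ip}).encode("ascii")
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    return Request(
        "http://%s:8091/controller/failOver" % orchestrator,
        data=body,  # supplying a body makes urllib issue a POST
        headers={"Authorization": "Basic " + token},
    )


# One request per node to fail over; the credentials here are placeholders.
requests_to_send = [
    build_failover_request("172.23.108.67", ip, "Administrator", "password")
    for ip in ("172.23.108.70", "172.23.108.72")
]
```

Each request would then be passed to `urllib.request.urlopen` while the data load is running.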

      Observation:

      While the failover is in progress, some mutations fail with status 'UNKNOWN' and the description "Command can't be executed in a config-only bucket":

      u'test_collections-731': {'error': 'com.couchbase.client.core.error.CouchbaseException | RemoveRequest failed with unexpected status code UNKNOWN {"completed":true,"coreId":"0x6cc277ce0000001a","idempotent":false,"lastChannelId":"6CC277CE0000001A/000000001A4111DC","lastDispatchedFrom":"172.23.106.205:48408","lastDispatchedTo":"172.23.108.70:11210","requestId":602081,"requestType":"RemoveRequest","retried":0,"serverDuration":3,"service":{"bucket":"default","collection":"collection-46","documentId":"test_collections-731","errorCode":{"description":"Command can\'t be executed in a config-only bucket","name":"ECONFIG_ONLY"},"opaque":"0x98eb4","scope":"scope-1","type":"kv","vbucket":773},"status":"UNKNOWN","timeoutMs":120000,"timings":{"dispatchMicros":884,"totalDispatchMicros":884,"totalServerMicros":0,"totalMicros":1377,"serverMicros":0}}', 'cas': 0, 'value': {}}
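To pick these failures out of a larger set of load errors, the JSON context at the end of the exception message can be parsed. The following is a minimal sketch that assumes the message always has the shape shown above (free text followed by one JSON object). `extract_error_context` and `is_config_only_failure` are hypothetical helper names, and `sample` is a shortened copy of the error above.

```python
import json


def extract_error_context(message):
    """Return the JSON context embedded in an SDK error message, or None."""
    start = message.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses the first JSON value and ignores any trailing text
        context, _ = json.JSONDecoder().raw_decode(message[start:])
    except ValueError:
        return None
    return context


def is_config_only_failure(message):
    """True if the message reports status UNKNOWN with error code ECONFIG_ONLY."""
    context = extract_error_context(message)
    if not context:
        return False
    error_code = context.get("service", {}).get("errorCode", {})
    return context.get("status") == "UNKNOWN" and error_code.get("name") == "ECONFIG_ONLY"


sample = (
    "com.couchbase.client.core.error.CouchbaseException | RemoveRequest failed "
    "with unexpected status code UNKNOWN "
    '{"lastDispatchedTo":"172.23.108.70:11210","requestType":"RemoveRequest",'
    '"service":{"bucket":"default","errorCode":{"description":'
    '"Command can\'t be executed in a config-only bucket","name":"ECONFIG_ONLY"},'
    '"vbucket":773},"status":"UNKNOWN"}'
)
```

Grouping the matching errors by `lastDispatchedTo` would show whether they all come from the nodes being failed over.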

      Sample pcap log:

      Request:
      Frame: 1772508
      Col: 0x3e
      key: test_collections-731
      Opaque: 0x66240a00
      Dest server: 172.23.108.70
       
      Response:
      Frame: 1772547
      Opaque: 0x66240a00
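The request and response frames are paired by their opaque value. When going through the full pcap, a response header can be decoded with a sketch like the one below. It assumes the standard 24-byte memcached binary protocol header and handles only the two response magics 0x81 (classic) and 0x18 (flexible framing). `parse_response_header` is a hypothetical helper, and the numeric status that corresponds to ECONFIG_ONLY is deliberately not assumed here.

```python
import struct

# 24-byte header, big-endian. Bytes 2 and 3 are read separately because their
# meaning depends on the magic byte.
_HEADER = struct.Struct(">BBBBBBHIIQ")


def parse_response_header(data):
    """Decode the fixed header of a KV response packet into a dict."""
    (magic, opcode, b2, b3, extras_len, datatype,
     status, body_len, opaque, cas) = _HEADER.unpack(data[:24])
    if magic == 0x81:
        # Classic response: bytes 2-3 form a 16-bit key length
        framing_len, key_len = 0, (b2 << 8) | b3
    elif magic == 0x18:
        # Flexible framing response: framing extras length, then 8-bit key length
        framing_len, key_len = b2, b3
    else:
        raise ValueError("not a response magic handled by this sketch: 0x%02x" % magic)
    return {
        "opcode": opcode,
        "framing_extras_len": framing_len,
        "key_len": key_len,
        "extras_len": extras_len,
        "datatype": datatype,
        "status": status,
        "body_len": body_len,
        "opaque": opaque,
        "cas": cas,
    }
```

Filtering the decoded headers on `opaque == 0x66240a00` would then isolate the response in frame 1772547.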

      Full Pcap:

      https://cb-jira.s3.us-east-2.amazonaws.com/logs/trinity_unknown_status_code.pcap.zip

      TAF test

      ./guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i node.ini -p get-cbcollect-info=False,skip_cluster_reset=True,skip_collections_cleanup=True,use_https=False -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_rebalance_out,data_load_stage=before,nodes_failover=2,rebalance_moves_per_node=32,nodes_init=5,bucket_spec=multi_bucket.buckets_for_rebalance_tests,scrape_interval=5,infra_log_level=debug,log_level=debug,batch_size=20'
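The `-t` argument above packs the test path and its parameters into one comma-separated string. A small sketch for splitting it into readable parts follows. It assumes the first field is the test path and every later field is a `key=value` pair, and `parse_test_spec` is a hypothetical helper, not part of TAF.

```python
def parse_test_spec(spec):
    """Split a '-t' style test spec into (test_path, params dict)."""
    fields = spec.split(",")
    test_path = fields[0]
    # Split on the first '=' only, so values may themselves contain '='
    params = dict(field.split("=", 1) for field in fields[1:] if "=" in field)
    return test_path, params


spec = (
    "bucket_collections.collections_rebalance.CollectionsRebalance."
    "test_data_load_collections_with_hard_failover_rebalance_out,"
    "data_load_stage=before,nodes_failover=2,rebalance_moves_per_node=32,"
    "nodes_init=5,bucket_spec=multi_bucket.buckets_for_rebalance_tests,"
    "scrape_interval=5,infra_log_level=debug,log_level=debug,batch_size=20"
)
test_path, params = parse_test_spec(spec)
```

Here `params["nodes_init"]` and `params["nodes_failover"]` correspond to the 5-node cluster and the 2 failed-over nodes described in the steps.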
      

       

       


            People

              ashwin.govindarajulu Ashwin Govindarajulu