  1. Couchbase Server
  2. MB-51435

[BP 6.6.6 MB-42968] - Eventing Enabled Cluster Fails to Recover - long DCP connection names

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Cheshire-Cat, 6.6.5
    • Fix Version/s: 6.6.6
    • Component/s: eventing
    • Environment: Kubernetes 1.19, Operator 2.1
    • Triage: Untriaged
    • 1
    • Unknown

    Description

      What the test does

      Spins up a 3-node cluster, kills a pod, and waits for recovery. It repeats this N times.
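
      The loop described above can be sketched roughly as follows. This is only an illustration of the test's shape, not the Operator test suite's code: `kill_pod` and `is_balanced` are hypothetical hooks the caller supplies (for example, wrappers around the Kubernetes API and a cluster status check), and the timeout values are arbitrary.

```python
import time
from typing import Callable


def kill_and_recover(
    kill_pod: Callable[[int], None],
    is_balanced: Callable[[], bool],
    iterations: int,
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
) -> int:
    """Kill one pod per iteration, then wait for the cluster to report balanced.

    Returns the number of completed iterations. Raises TimeoutError if the
    cluster is still unbalanced at the deadline, which is the failure mode
    this issue describes.
    """
    for i in range(iterations):
        kill_pod(i)  # hypothetical hook supplied by the caller
        deadline = time.monotonic() + timeout_s
        while not is_balanced():  # hypothetical hook supplied by the caller
            if time.monotonic() >= deadline:
                raise TimeoutError(f"cluster still unbalanced after kill #{i + 1}")
            time.sleep(poll_s)
    return iterations
```

      With a bounded timeout, a cluster that never rebalances surfaces as a `TimeoutError` on the affected iteration instead of a test that hangs.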

      What happened

      The first pod is killed, the operator sees it go down and fails it over, and we scale back up to 3 nodes.  The same happens for the second instance.  On the third attempt, the rebalance of the new node fails, and it keeps failing indefinitely.  The failure shows up as the cluster continuing to report an unbalanced status.

      Expectation

      At the very least, when the cluster reports as balanced, it should be safe to kill pods and the cluster should be recoverable.  This is a deadlock situation for the Operator and Couchbase Cloud.

          Activity

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.6.6-10557 contains eventing commit 38b8a29 with commit message:
            MB-51435: Restrict DCP feed name length to 200 chars
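
            For readers unfamiliar with this kind of fix, one way a client could keep a feed name within a 200-character limit is sketched below. This is not the content of commit 38b8a29, which is not shown on this page. The function name `restrict_feed_name` and the choice to append a short hash so that shortened names stay distinct are assumptions made for illustration.

```python
import hashlib

DCP_NAME_LIMIT = 200  # limit named in the commit message and in the KV log


def restrict_feed_name(name: str, limit: int = DCP_NAME_LIMIT) -> str:
    """Return `name` unchanged if it fits in `limit` bytes, else a shortened form.

    Hypothetical approach: keep a prefix of the original name and append a
    16-character SHA-1 digest of the full name, so two long names that share
    a prefix still map to different shortened names.
    """
    raw = name.encode("utf-8")
    if len(raw) <= limit:
        return name
    digest = hashlib.sha1(raw).hexdigest()[:16]
    # Reserve room for "-" plus the digest; drop any partially cut character.
    head = raw[: limit - len(digest) - 1].decode("utf-8", errors="ignore")
    return f"{head}-{digest}"
```

            Plain truncation would also satisfy the limit, but names that differ only near the end (for example, in a per-worker suffix) would then collide, which is why this sketch keeps a digest of the full name.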
            sujay.gad Sujay Gad added a comment -

            Reproduced this issue on 6.6.6-10556 and validated the fix on 6.6.6-10574.

            STEPS
              1. Deploy a single-node cluster with the KV and Eventing services colocated.
              2. Specify a fairly long hostname for the node.
              3. Create the source and metadata buckets for the Eventing function.
              4. Deploy the Eventing function.

            Observation
            On 6.6.6-10556, the function is stuck in the deploying state because the DCP connection name exceeds 200 characters. The rejected DCP_OPEN packets in the KV log below carry a 250-byte key.

            2023-01-10T04:27:47.215385-08:00 WARNING 61: Invalid format specified for "DCP_OPEN" - Status: "Invalid arguments" - Closing connection. Packet:[{"bodylen":258,"cas":0,"datatype":"raw","extlen":8,"keylen":250,"magic":"ClientRequest","opaque":16822206,"opcode":"DCP_OPEN","vbucket":0}] Reason:"Dcp name limit is 200 characters"
            2023-01-10T04:27:47.302572-08:00 INFO 64: Client {"ip":"172.23.106.64","port":50416} authenticated as <ud>@eventing</ud>
            2023-01-10T04:27:47.303328-08:00 WARNING 64: Invalid format specified for "DCP_OPEN" - Status: "Invalid arguments" - Closing connection. Packet:[{"bodylen":258,"cas":0,"datatype":"raw","extlen":8,"keylen":250,"magic":"ClientRequest","opaque":16822206,"opcode":"DCP_OPEN","vbucket":0}] Reason:"Dcp name limit is 200 characters"
            

            On 6.6.6-10574, which contains the fix, the function deployed successfully.
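
            The log lines above already contain the numbers needed to confirm the diagnosis: `bodylen` (258) equals `extlen` (8) plus `keylen` (250), and in a DCP_OPEN request the key is the connection name, so the name was 50 bytes over the limit. A minimal sketch for pulling this out of a log line follows. The helper name `dcp_open_name_overrun` and the regular expression are illustrative assumptions, written against the line format shown in this comment only.

```python
import json
import re
from typing import Optional

DCP_NAME_LIMIT = 200  # from the "Dcp name limit is 200 characters" reason text
_PACKET_RE = re.compile(r'Packet:\[(\{.*?\})\]')


def dcp_open_name_overrun(line: str, limit: int = DCP_NAME_LIMIT) -> Optional[int]:
    """Return how many bytes a DCP_OPEN name exceeds `limit` by.

    Returns None when the line has no Packet:[{...}] section or the packet
    is not a DCP_OPEN request; returns 0 when the name fits.
    """
    match = _PACKET_RE.search(line)
    if match is None:
        return None
    packet = json.loads(match.group(1))
    if packet.get("opcode") != "DCP_OPEN":
        return None
    return max(0, packet["keylen"] - limit)


# First WARNING line from the comment above, copied verbatim.
SAMPLE = (
    '2023-01-10T04:27:47.215385-08:00 WARNING 61: Invalid format specified for "DCP_OPEN" '
    '- Status: "Invalid arguments" - Closing connection. '
    'Packet:[{"bodylen":258,"cas":0,"datatype":"raw","extlen":8,"keylen":250,'
    '"magic":"ClientRequest","opaque":16822206,"opcode":"DCP_OPEN","vbucket":0}] '
    'Reason:"Dcp name limit is 200 characters"'
)
```

            Calling `dcp_open_name_overrun(SAMPLE)` returns 50 for the WARNING line quoted in this comment.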


            People

              sujay.gad Sujay Gad
              jeelan.poola Jeelan Poola