  1. Couchbase Server
  2. MB-45892

[Upgrade][XDCR][UI] Errors thrown while upgrading source cluster from 6.6.2 to 7.0.0

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • Cheshire-Cat
    • 7.0.0
    • UI, XDCR
    • Untriaged
    • 1
    • Unknown

    Description

      Build : 7.0.0-5016

      Steps :
      1. Set up 2x2 clusters running 6.6.2 with kv+n1ql+index+fts on both nodes, install all 3 sample buckets, add a default bucket as well, and pump in data
      2. Set up replications between all the buckets
      3. Upgrade the target cluster to 7.0 (pump in data while the cluster is in mixed mode)
      4. Ensure data is all caught up
      5. Start upgrading the source cluster and pause in mixed mode

      Check replications - errors observed in UI - attaching screenshots

      Logs:
      https://s3.amazonaws.com/bugdb/jira/upgradetesting/collectinfo-2021-04-23T215242-ns_1%40172.23.107.126.zip
      https://s3.amazonaws.com/bugdb/jira/upgradetesting/collectinfo-2021-04-23T215242-ns_1%40172.23.107.142.zip

          Activity

            neil.huang Neil Huang added a comment -

            The errors are coming from the 7.0.0 node in the mixed-mode cluster. When DCP attempts to start, it sends HELO with collections to its local memcached, but memcached responds saying that collections are not supported.

            2021-04-23T14:57:21.095-07:00 INFO GOXDCR.DcpNozzle: Successfully sent HELO command with userAgent=Goxdcr Dcp  SourceBucket:travel-sample TargetBucket:travel-sample. attributes={false 2 false false}
            

            The last attribute is "collections".

            Jim Walker, is it true that when a client sends HELO with collections to a 7.0 node in a mixed-mode cluster, the response will not acknowledge that collections are supported?

            According to the KV doc:

            Changing: HELLO
            A new capability flag will be available for clients to indicate that they are collection aware and thus can access documents in collections and understand new error codes. The setting of this flag changes the interpretation of data access commands and enables new error codes.
             
            The new flag is defined as:
            PROTOCOL_BINARY_FEATURE_COLLECTIONS = 0x12
            

            When implementing per the spec doc, my understanding was that in a mixed-mode cluster, clients can connect to a 7.0 node with collections enabled. All documents are sitting in the default collection anyway. In a mixed-mode cluster, it's probably impossible to create scopes/collections either. Only once all nodes are upgraded can scopes/collections be created, and all the associated system events will then be able to come down those pre-existing DCP connections.

            Otherwise, it would mean that XDCR on a 7.0 node in a mixed-mode cluster would need to temporarily connect without HELO'ing collections, keep track of all nodes of the cluster until they have all been upgraded to 7.0, and then reconnect with HELO collections. I see no harm in just letting the client connect with HELO collections in this case, since all the mutations would be in the default collection anyway.
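
            The negotiation discussed above can be sketched in Go as follows. This is a minimal illustration, assuming the HELO request and response bodies are lists of 16-bit feature codes in network byte order. Only the 0x12 value comes from the quoted KV doc; every function name is hypothetical and this is not goxdcr's actual code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// featureCollections is the feature code quoted from the KV doc (0x12).
const featureCollections uint16 = 0x12

// encodeFeatures packs the requested feature codes into a HELO body:
// one 16-bit value per feature, big-endian (assumed wire format).
func encodeFeatures(features []uint16) []byte {
	body := make([]byte, 2*len(features))
	for i, f := range features {
		binary.BigEndian.PutUint16(body[2*i:], f)
	}
	return body
}

// decodeFeatures unpacks a HELO body into a list of feature codes.
func decodeFeatures(body []byte) ([]uint16, error) {
	if len(body)%2 != 0 {
		return nil, fmt.Errorf("HELO body has odd length %d", len(body))
	}
	features := make([]uint16, 0, len(body)/2)
	for i := 0; i < len(body); i += 2 {
		features = append(features, binary.BigEndian.Uint16(body[i:]))
	}
	return features, nil
}

// collectionsAcknowledged reports whether the response body lists the
// collections feature, i.e. whether the server agreed to enable it.
func collectionsAcknowledged(responseBody []byte) (bool, error) {
	features, err := decodeFeatures(responseBody)
	if err != nil {
		return false, err
	}
	for _, f := range features {
		if f == featureCollections {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	request := encodeFeatures([]uint16{featureCollections})
	fmt.Printf("request body: % x\n", request)

	// Hypothetical response that omits collections, as the mixed-mode
	// 7.0 node did in this report.
	ok, err := collectionsAcknowledged([]byte{})
	fmt.Println("collections acknowledged:", ok, "error:", err)
}
```

            A client that requires collections would treat a response without 0x12 as an error, which matches the symptom reported here.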

            jwalker Jim Walker added a comment -

            Neil Huang I think this is a side effect of Collections originally being a developer preview feature in 6.5, where it would be disabled or forcefully enabled. In 7.0 the HELLO code is still checking the memcached settings, which in mixed mode correctly (depending on your view) say that collections are not enabled, e.g. no collections can be created/dropped in mixed mode. However, there is no harm in a collection-aware SDK connecting to a 7.0 node and using the default collection, or even requesting collection-aware DCP.
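
            As a toy model of the behaviour described in this comment, the Go sketch below contrasts a HELO handler that gates the collections feature on a node setting with one that grants it on any 7.0 node. The type and function names are hypothetical and do not reflect the kv_engine source; the "after" variant only restates the intent of the fix's commit message.

```go
package main

import "fmt"

// nodeState is a hypothetical view of what a 7.0 node knows when it
// handles HELO. It is not a kv_engine type.
type nodeState struct {
	// collectionsEnabledSetting is false while the cluster is in
	// mixed mode, because collections cannot be created or dropped.
	collectionsEnabledSetting bool
}

// grantCollectionsBeforeFix models the reported behaviour: the feature
// is acknowledged only when the setting says collections are enabled.
func grantCollectionsBeforeFix(n nodeState) bool {
	return n.collectionsEnabledSetting
}

// grantCollectionsAfterFix models the intent of the fix: any 7.0 node
// acknowledges the feature, regardless of the mixed-mode setting.
func grantCollectionsAfterFix(n nodeState) bool {
	return true
}

func main() {
	mixedMode := nodeState{collectionsEnabledSetting: false}
	fmt.Println("before fix:", grantCollectionsBeforeFix(mixedMode))
	fmt.Println("after fix:", grantCollectionsAfterFix(mixedMode))
}
```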

            neil.huang Neil Huang added a comment -

            Thank you, Jim Walker


            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.0.0-5038 contains kv_engine commit 14cecb5 with commit message:
            MB-45892: Clients should be able to HELO collections on any 7.0 node.

            neil.huang Neil Huang added a comment -

            According to Mihir's comment on https://issues.couchbase.com/browse/MB-45893?focusedCommentId=495809&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-495809 this fix is verified.

            People

              pavithra.mahamani Pavithra Mahamani
              arunkumar Arunkumar Senthilnathan