  1. Couchbase Server
  2. MB-54361

Cluster run indexer invalid packet header size

Details

    • Untriaged
    • MacOSX 64-bit
    • 1
    • Unknown

    Description

      There seems to be an issue with indexing on a fresh cluster_run cluster built from the mad-hatter branch (6.6.6). The setup, from my shell history (most recent command first):

      ns_server (6c52c5b) via △ v3.24.2 via  v24.1.6 took 14s❯ history
      ./cluster_connect -n1 -Tn0:index+kv+n1ql # Last called to initialize cluster
      ./cluster_run -n1
      make -j8
      repo sync
      repo init -u https://github.com/couchbase/manifest.git -m couchbase-server/mad-hatter.xml -g all

      After creating a bucket that uses indexing, I see the following over and over in the indexer logs:

      2022-11-01T09:44:58.188-07:00 [Error] PeerPipe.doRecieve() : ecounter error when received mesasage from Peer 127.0.0.1:53739.  Error = Validate packet header: Invalid size 5135603447292250196. Kill Pipe.
      2022-11-01T09:44:58.188-07:00 [Error] LeaderSyncProxy.updateAcceptEpochAfterQuorum(): Error encountered = Server Error : SyncProxy.listen(): channel closed. Terminate
      2022-11-01T09:44:58.188-07:00 [Error] LeaderServer:startProxy(): Leader Fail to synchronization with follower (TCP conn = 127.0.0.1:53739)
      2022-11-01T09:44:59.203-07:00 [Info] [Queryport ":9101"] connection "127.0.0.1:53740" doReceive() ...
      2022-11-01T09:44:59.203-07:00 [Error] transport - unknown encoding scheme 0x20 flags 0x2f20
      2022-11-01T09:44:59.203-07:00 [Error] [Queryport ":9101"] connection "127.0.0.1:53740" exited transport.decoderUnknown
      2022-11-01T09:44:59.203-07:00 [Info] [Queryport ":9101"] connection 127.0.0.1:53740 closed
      2022-11-01T09:45:03.363-07:00 [Info] ServiceMgr::GetCurrentTopology []
      2022-11-01T09:45:03.363-07:00 [Info] ServiceMgr::GetCurrentTopology returns &{[0 0 0 0 0 0 0 4] [58ab6599e3e981b9d03263dac7bd5573] true []}
      2022-11-01T09:45:03.363-07:00 [Info] ServiceMgr::GetTaskList []
      2022-11-01T09:45:03.363-07:00 [Info] ServiceMgr::GetTaskList returns &{[0 0 0 0 0 0 0 4] []}
      2022-11-01T09:45:03.363-07:00 [Info] ServiceMgr::GetCurrentTopology [0 0 0 0 0 0 0 4]
      2022-11-01T09:45:03.364-07:00 [Info] ServiceMgr::GetTaskList [0 0 0 0 0 0 0 4]
      2022-11-01T09:45:06.916-07:00 [Info] janitor: running cleanup.
      

      The indexer does not function properly after this. Most of the time I cannot even create indexes, so it completely blocks any usage of indexing in this situation.

      This really feels like some sort of compilation, integer-width, or type-conversion problem in decoding a protobuf-based protocol: one of the integers, which represents the size of a particular header, is far too large, which suggests an integer-handling bug, perhaps related to endianness or integer width. Here are the versions of Go that my binaries are linked with:

      dev/workspace/mh666 via △ v3.24.2
      ❯ for f in install/bin/go*
            echo $f && strings $f | grep 'go1\.'
            echo "------------------------------"
        end
      install/bin/gometa
      go1.17.13
      /Users/bryanmccoid/.cbdepscache/exploded/x86_64/go-1.17.13/go/src/vendor/golang.org/x/crypto/poly1305/bits_go1.13.go
      ------------------------------
      install/bin/goport
      go1.17.13
      ------------------------------
      install/bin/gosecrets
      go1.17.13
      ------------------------------
      install/bin/goxdcr
      go1.18.7
      /Users/bryanmccoid/.cbdepscache/exploded/x86_64/go-1.18.7/go/src/vendor/golang.org/x/crypto/internal/poly1305/bits_go1.13.go
      go1.18.7
      ------------------------------
      install/bin/gozip
      go1.17.13
      ------------------------------

      Specifically for indexer:

      dev/workspace/mh666 via △ v3.24.2
      ❯ strings install/bin/indexer | grep 'go1\.'
      go1.17.13
      /Users/bryanmccoid/.cbdepscache/exploded/x86_64/go-1.17.13/go/src/vendor/golang.org/x/crypto/poly1305/bits_go1.13.go
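
      As a quick sanity check on the integer theory above (this was not run as part of the original report), the rejected size and the Queryport flags can be unpacked into raw bytes. This is a minimal sketch that assumes both values were read from the wire as big-endian unsigned integers; the actual header layouts are not confirmed here, and `bytes_of_u64` is a hypothetical helper name.

```python
import struct


def bytes_of_u64(value: int) -> bytes:
    """Return the 8 big-endian bytes of an unsigned 64-bit integer."""
    return struct.pack(">Q", value)


# Size from "Validate packet header: Invalid size ..." in the indexer log.
invalid_size = 5135603447292250196
# Flags from "unknown encoding scheme 0x20 flags 0x2f20" in the same log.
queryport_flags = 0x2F20

# Assumption: both fields are big-endian on the wire.
size_bytes = bytes_of_u64(invalid_size)
flag_bytes = struct.pack(">H", queryport_flags)

print(size_bytes.hex(" "))  # 47 45 54 20 2f 20 48 54
print(size_bytes)           # b'GET / HT'
print(flag_bytes)           # b'/ '
```

      Under that assumption the "size" is printable ASCII, `GET / HT`, which looks like the start of an HTTP request line rather than a mis-sized integer, and the flags `0x2f20` decode to `/ ` (with `0x20` being a space). If that reading holds, it may be worth checking whether another local process is sending HTTP requests to the indexer's ports, although nothing in the logs above confirms the sender.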

      I am not sure how to proceed from here, but something is definitely wrong, since I have re-created the issue many times from a fresh initialization of the repo. I am using macOS Big Sur 11.6.8.

      All other versions of the product work fine on my system; only mad-hatter has this issue.

          People

            dhananjay.kshirsagar Dhananjay Kshirsagar
            bryan.mccoid Bryan McCoid