  1. Couchbase Server
  2. MB-62768

[Mini Volume - BYOK] Rebalance hung while scaling columnar cluster from 8 to 16 nodes

Details

    • Bug
    • Resolution: Not a Bug
    • Critical
    • Columnar 1.0.0
    • Columnar 1.0.0
    • analytics
    • Columnar Edition 1.0.0 build 2209
    • Untriaged
    • Linux x86_64
    • 0
    • Unknown

    Description

      Test Steps

      1. Load 50 million documents into a MongoDB collection.
      2. Create a Confluent Kafka topic for the MongoDB collection created in the previous step.
      3. Deploy a 4-node Columnar cluster with 8 vCPUs and 64 GB of memory per node.
      4. Create 4 Kafka links, with 5 collections on each link.
      5. Connect 2 of the 4 Kafka links.
      6. While data is being ingested from the Kafka source, perform multiple scale-up and scale-down operations on the cluster.
      7. Start a continuous query workload on the cluster while data ingestion is ongoing.

      Observation

      The rebalance operation hung while scaling the Columnar cluster up from 8 to 16 nodes.

      The following error was observed on the nodes being added to the cluster (for example, node-036):

      2024-07-17T04:54:42.939+00:00 WARN CBAS.work.StartTasksWork [Worker:613646d22f7701a36a478e9412217f3a] Failure starting a task
      org.apache.hyracks.api.exceptions.HyracksException: java.lang.ClassCastException: cannot assign instance of java.util.HashMap to field com.couchbase.analytics.runtime.SimpleTopicRouterOperatorDescriptor.topicStreamMap of type com.couchbase.analytics.lang.ConnectLinkStatement$OneToOneMap in instance of com.couchbase.analytics.runtime.SimpleTopicRouterOperatorDescriptor
      	at org.apache.hyracks.api.exceptions.HyracksException.create(HyracksException.java:43) ~[hyracks-api.jar:1.0.0-2216]
      	at org.apache.hyracks.control.common.deployment.DeploymentUtils.deserialize(DeploymentUtils.java:126) ~[hyracks-control-common.jar:1.0.0-2216]
      	at org.apache.hyracks.control.nc.work.StartTasksWork.getOrCreateLocalJoblet(StartTasksWork.java:212) ~[hyracks-control-nc.jar:1.0.0-2216]
      	at org.apache.hyracks.control.nc.work.StartTasksWork.run(StartTasksWork.java:125) [hyracks-control-nc.jar:1.0.0-2216]
      	at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127) [hyracks-control-common.jar:1.0.0-2216]
      Caused by: java.lang.ClassCastException: cannot assign instance of java.util.HashMap to field com.couchbase.analytics.runtime.SimpleTopicRouterOperatorDescriptor.topicStreamMap of type com.couchbase.analytics.lang.ConnectLinkStatement$OneToOneMap in instance of com.couchbase.analytics.runtime.SimpleTopicRouterOperatorDescriptor
      	at java.base/java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2096) ~[?:?]
      	at java.base/java.io.ObjectStreamClass$FieldReflector.checkObjectFieldValueTypes(ObjectStreamClass.java:2060) ~[?:?]
      	at java.base/java.io.ObjectStreamClass.checkObjFieldValueTypes(ObjectStreamClass.java:1347) ~[?:?]
      	at java.base/java.io.ObjectInputStream$FieldValues.defaultCheckFieldValues(ObjectInputStream.java:2679) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2486) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2257) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1733) ~[?:?]
      	at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2606) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2457) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2257) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1733) ~[?:?]
      	at java.base/java.io.ObjectInputStream.skipCustomData(ObjectInputStream.java:2518) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2467) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2257) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1733) ~[?:?]
      	at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2606) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2457) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2257) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1733) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:509) ~[?:?]
      	at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:467) ~[?:?]
      	at java.base/java.util.HashMap.readObject(HashMap.java:1552) ~[?:?]
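
      For readers unfamiliar with this failure, the sketch below shows one generic way Java deserialization produces the same "cannot assign instance of java.util.HashMap to field ..." message: the stream carries a plain HashMap for a field whose declared type is a HashMap subclass. All class names here (FieldTypeMismatchDemo, StrictMap, Descriptor) are hypothetical stand-ins, and the writeReplace trick is only a convenient way to force the mismatch in a single JVM. It does not reproduce the Couchbase code path and makes no claim about the root cause of this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

class FieldTypeMismatchDemo {

    // Hypothetical HashMap subclass, standing in for a type such as OneToOneMap.
    static class StrictMap extends HashMap<String, String> {
        private static final long serialVersionUID = 1L;

        // Assumption for the demo: make the writer emit a plain HashMap
        // in place of this subclass, so the stream and the field disagree.
        private Object writeReplace() {
            return new HashMap<>(this);
        }
    }

    // Hypothetical holder whose field is declared with the subclass type.
    static class Descriptor implements Serializable {
        private static final long serialVersionUID = 1L;
        StrictMap topicStreamMap = new StrictMap();
    }

    // Serializes a Descriptor, reads it back, and returns either "ok"
    // or the ClassCastException message raised while assigning the field.
    static String roundTrip() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(new Descriptor());
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                in.readObject();
                return "ok";
            } catch (ClassCastException e) {
                return e.getMessage();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

      In the demo the mismatch is created on the writing side. In a distributed job the same symptom can in principle appear whenever the serialized form produced by the sender does not match the field type declared in the class loaded by the receiver, which is why comparing the builds on the sending and receiving nodes is a reasonable first check.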
      

          People

            ritik.raj Ritik Raj
            sujay.gad Sujay Gad