Couchbase Server: MB-62203

[System Test] NullPointerException seen after turning the cluster on

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Columnar 1.0.0
    • Columnar 1.0.0
    • analytics
    • Columnar Edition 1.0.0 build 2126
    • Untriaged
    • 0
    • Unknown

    Description

      After turning the cluster back on (it had been off for about 12 hours), I see the following exceptions in the log.

      Seen on node 002:

      2024-06-06T05:53:55.591+00:00 INFO CBAS.active.RecoveryTask [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] Attempt to revive Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB) failed
      java.lang.NullPointerException: Cannot invoke "com.couchbase.analytics.lang.CBStatementExecutor.detachFromActiveListener(com.couchbase.analytics.metadata.KafkaEventsListener)" because "this.statementExecutor" is null
      	at com.couchbase.analytics.metadata.KafkaEventsListener.stopCheck(KafkaEventsListener.java:192) ~[columnar-connector.jar:1.0.0-2126]
      	at com.couchbase.analytics.metadata.KafkaEventsListener.doRecover(KafkaEventsListener.java:174) ~[columnar-connector.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.doRecover(RecoveryTask.java:142) ~[asterix-app.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.lambda$recover$1(RecoveryTask.java:70) ~[asterix-app.jar:1.0.0-2126]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
      	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      2024-06-06T05:53:55.595+00:00 INFO CBAS.metadata.RecoveryRetryPolicy [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] will retry recovery (attempt 1) in 1s
      2024-06-06T05:53:56.599+00:00 INFO CBAS.active.RecoveryTask [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] Attempt to revive Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB) failed
      java.lang.NullPointerException: Cannot invoke "com.couchbase.analytics.lang.CBStatementExecutor.detachFromActiveListener(com.couchbase.analytics.metadata.KafkaEventsListener)" because "this.statementExecutor" is null
      	at com.couchbase.analytics.metadata.KafkaEventsListener.stopCheck(KafkaEventsListener.java:192) ~[columnar-connector.jar:1.0.0-2126]
      	at com.couchbase.analytics.metadata.KafkaEventsListener.doRecover(KafkaEventsListener.java:174) ~[columnar-connector.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.doRecover(RecoveryTask.java:142) ~[asterix-app.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.lambda$recover$1(RecoveryTask.java:70) ~[asterix-app.jar:1.0.0-2126]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
      	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      2024-06-06T05:53:56.600+00:00 INFO CBAS.metadata.RecoveryRetryPolicy [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] will retry recovery (attempt 2) in 1s
      2024-06-06T05:53:56.609+00:00 INFO CBAS.adapter.CouchbaseConnectorFactory [default1 on link linkTvXUzluG aggregated shadow state calculator] beginning starting state calculation for [remotedatasetYRyeBJov.scope0JMMmOBWG.remotedatasetYRyeBJov, remotedatasetsJccscCv.scope0WDytWTpM.remotedatasetsJccscCv, remotedatasethVXfLAZu.scope1UMKjAwGw.remotedatasethVXfLAZu, remotedatasetdybfXYbi.scope1mtADVgPF.remotedatasetdybfXYbi, remotedatasetxKlPVlmK.scope1lrXRPpsi.remotedatasetxKlPVlmK, remotedatasetUixMovfA.scope1sfhVYNAE.remotedatasetUixMovfA, remotedatasetdZavqilB.scope1wqVnpwJK.remotedatasetdZavqilB, remotedatasetYLPTAaPH.scope1SNkVUOZO.remotedatasetYLPTAaPH, remotedatasetwuJUDOgE.scope1lbYMFXFl.remotedatasetwuJUDOgE, remotedatasetohTMGTyE.scope0lKYRbKuL.remotedatasetohTMGTyE, remotedatasetalCVKIrj.scope1dqsugHIM.remotedatasetalCVKIrj, remotedatasetdOzNuVIj.scope0lKYRbKuL.remotedatasetdOzNuVIj]
      2024-06-06T05:53:56.642+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:bc7826b85537b9e89e0be60443b84b47] Received message: ShadowStatesRequest{(linkTvXUzluG/default1)[-1]:StateRequest-0xc}
      2024-06-06T05:53:56.646+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:bc7826b85537b9e89e0be60443b84b47] Received message: ShadowStatesRequest{(linkTvXUzluG/default1)[-1]:StateRequest-0xd}
      2024-06-06T05:53:57.604+00:00 INFO CBAS.active.RecoveryTask [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] Attempt to revive Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB) failed
      java.lang.NullPointerException: Cannot invoke "com.couchbase.analytics.lang.CBStatementExecutor.detachFromActiveListener(com.couchbase.analytics.metadata.KafkaEventsListener)" because "this.statementExecutor" is null
      	at com.couchbase.analytics.metadata.KafkaEventsListener.stopCheck(KafkaEventsListener.java:192) ~[columnar-connector.jar:1.0.0-2126]
      	at com.couchbase.analytics.metadata.KafkaEventsListener.doRecover(KafkaEventsListener.java:174) ~[columnar-connector.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.doRecover(RecoveryTask.java:142) ~[asterix-app.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.lambda$recover$1(RecoveryTask.java:70) ~[asterix-app.jar:1.0.0-2126]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
      	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      2024-06-06T05:53:57.604+00:00 INFO CBAS.metadata.RecoveryRetryPolicy [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] will retry recovery (attempt 3) in 1s
      2024-06-06T05:53:58.610+00:00 INFO CBAS.active.RecoveryTask [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] Attempt to revive Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB) failed
      java.lang.NullPointerException: Cannot invoke "com.couchbase.analytics.lang.CBStatementExecutor.detachFromActiveListener(com.couchbase.analytics.metadata.KafkaEventsListener)" because "this.statementExecutor" is null
      	at com.couchbase.analytics.metadata.KafkaEventsListener.stopCheck(KafkaEventsListener.java:192) ~[columnar-connector.jar:1.0.0-2126]
      	at com.couchbase.analytics.metadata.KafkaEventsListener.doRecover(KafkaEventsListener.java:174) ~[columnar-connector.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.doRecover(RecoveryTask.java:142) ~[asterix-app.jar:1.0.0-2126]
      	at org.apache.asterix.app.active.RecoveryTask.lambda$recover$1(RecoveryTask.java:70) ~[asterix-app.jar:1.0.0-2126]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
      	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      2024-06-06T05:53:58.611+00:00 INFO CBAS.metadata.RecoveryRetryPolicy [RecoveryTask (Default.KafkaLinkjMyWDHzKBS.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))] will retry recovery (attempt 4) in 2s
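
The trace above shows `KafkaEventsListener.stopCheck` calling `detachFromActiveListener` on `this.statementExecutor` while that field is null, so every recovery attempt fails the same way. The sketch below only reproduces that failure pattern and shows one possible null guard. The classes `RecoverySketch`, `Listener`, and `StatementExecutor` are hypothetical stand-ins, not the actual Couchbase code, and the guard is an assumption for illustration. It does not explain why the field is null after the restart, and it is not necessarily the fix that was applied.

```java
public class RecoverySketch {

    /** Hypothetical stand-in for CBStatementExecutor. */
    public interface StatementExecutor {
        void detachFromActiveListener(Listener listener);
    }

    /** Hypothetical stand-in for KafkaEventsListener. */
    public static class Listener {
        // Stays null until attach() is called, e.g. after a restart
        // where no executor has been bound yet (assumed scenario).
        private StatementExecutor statementExecutor;

        public void attach(StatementExecutor executor) {
            this.statementExecutor = executor;
        }

        /** Unguarded variant: throws NullPointerException when nothing is attached. */
        public void stopCheckUnguarded() {
            statementExecutor.detachFromActiveListener(this);
            statementExecutor = null;
        }

        /**
         * Guarded variant: detaches only if an executor is attached.
         * Returns true if a detach happened, false if there was nothing to detach.
         */
        public boolean stopCheckGuarded() {
            StatementExecutor executor = statementExecutor;
            if (executor == null) {
                return false;
            }
            executor.detachFromActiveListener(this);
            statementExecutor = null;
            return true;
        }
    }

    public static void main(String[] args) {
        Listener listener = new Listener();
        try {
            listener.stopCheckUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
        System.out.println("guarded, no executor: " + listener.stopCheckGuarded());
        listener.attach(l -> { /* no-op detach for the sketch */ });
        System.out.println("guarded, with executor: " + listener.stopCheckGuarded());
    }
}
```

A guard like this would let recovery proceed past `stopCheck` instead of retrying on the same exception, but whether skipping the detach is safe depends on the real listener lifecycle, which this ticket does not describe.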
      

      There appear to be three different exceptions. I am unsure whether they are all related, but I discussed it with Ritik Raj and we concluded that it is better to file three separate tickets; if they all turn out to have the same root cause, they can be closed as duplicates.

      cbcollect logs:

      https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun5/collectinfo-2024-06-06T055222-ns_1%40svc-da-node-001.ojc8vwsi23xsigjw.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun5/collectinfo-2024-06-06T055222-ns_1%40svc-da-node-002.ojc8vwsi23xsigjw.sandbox.nonprod-project-avengers.com.zip

          People

            pavan.pb Pavan PB
            pavan.pb Pavan PB