  1. Couchbase Java Client
  2. JCBC-134

resubscriber IllegalArgumentException during topology changes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0, 1.1.1
    • Fix Version/s: 1.1.1
    • Component/s: Core
    • Security Level: Public
    • Labels:
      None
    • Environment:

      Description

      Exception in thread "couchbase cluster resubscriber - running" java.lang.IllegalArgumentException: Bucket name cannot be null and must never be re-set to a new object.
      at com.couchbase.client.vbucket.ConfigurationProviderHTTP.subscribe(ConfigurationProviderHTTP.java:240)
      at com.couchbase.client.vbucket.ConfigurationProviderHTTP.finishResubscribe(ConfigurationProviderHTTP.java:215)
      at com.couchbase.client.CouchbaseConnectionFactory$Resubscriber.run(CouchbaseConnectionFactory.java:322)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      at java.lang.Thread.run(Thread.java:679)

      Unfortunately I don't have much more insight into what's happening, but the stack trace might be helpful to examine. Assigning to myself until I have more info.
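
For readers unfamiliar with the message in the stack trace, here is a minimal, hypothetical Java sketch of the kind of guard that could produce this exception text. It is not the actual `ConfigurationProviderHTTP` source: the class name `SubscriptionGuard`, the field `subscribedBucket`, and the one-argument `subscribe` signature are all assumptions made for illustration. It only shows how a resubscribe call that arrives with a null or different bucket name would fail with an `IllegalArgumentException`.

```java
// Hypothetical illustration only; not the Couchbase client's real implementation.
public class SubscriptionGuard {
    // Assumed field: the bucket name recorded by the first successful subscribe.
    private String subscribedBucket;

    public synchronized void subscribe(String bucketName) {
        // Reject a null name, or a name that differs from the one already recorded.
        if (bucketName == null
                || (subscribedBucket != null && !bucketName.equals(subscribedBucket))) {
            throw new IllegalArgumentException(
                "Bucket name cannot be null and must never be re-set to a new object.");
        }
        subscribedBucket = bucketName;
    }

    public static void main(String[] args) {
        SubscriptionGuard guard = new SubscriptionGuard();
        guard.subscribe("default"); // first subscription is accepted
        guard.subscribe("default"); // resubscribing to the same bucket is accepted
        try {
            guard.subscribe(null);  // a resubscribe that lost its bucket name is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

If the real check has a similar shape, an uncaught exception of this kind thrown from a background thread would surface the way the trace above does, with the thread name in the first line.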

      1. daschl-4-rebealance_two_nodes.log
        1.74 MB
        Michael Nitschinger
      2. daschl-4-restart.log
        255 kB
        Michael Nitschinger
      3. log2.txt.bz2
        54 kB
        Mark Nunberg
      4. log2.txt.bz2
        54 kB
        Mark Nunberg

        Activity

        mnunberg Mark Nunberg created issue -
        daschl Michael Nitschinger added a comment -

        Hi Mark,

        Can you elaborate a bit more on what's going on during the test? This particular exception can come up when the Java SDK tries to subscribe to a new node when the old connection is closed. I think the scenario should give us a connection to what is happening at runtime in the Java SDK.

        Thanks,
        Michael

        daschl Michael Nitschinger made changes -
        Field Original Value New Value
        Fix Version/s 1.1-dp5 [ 10410 ]
        Affects Version/s 1.1-dp3 [ 10372 ]
        Component/s library [ 10140 ]
        daschl Michael Nitschinger made changes -
        Priority Major [ 3 ] Blocker [ 1 ]
        mnunberg Mark Nunberg made changes -
        Link This issue depends on SDKQE-2 [ SDKQE-2 ]
        daschl Michael Nitschinger made changes -
        Assignee Mark Nunberg [ mnunberg ] Michael Nitschinger [ daschl ]
        daschl Michael Nitschinger made changes -
        Priority Blocker [ 1 ] Critical [ 2 ]
        daschl Michael Nitschinger made changes -
        Fix Version/s 1.1beta [ 10370 ]
        Fix Version/s 1.1-dp5 [ 10410 ]
        ingenthr Matt Ingenthron made changes -
        Fix Version/s 1.1.0 [ 10274 ]
        Fix Version/s 1.1-beta [ 10370 ]
        daschl Michael Nitschinger made changes -
        Fix Version/s 1.1.1 [ 10430 ]
        Fix Version/s 1.1.0 [ 10274 ]
        mnunberg Mark Nunberg added a comment -

        I've run the Java SDKD tests several times already and cannot reproduce this. I'll reopen it if I see it again.

        mnunberg Mark Nunberg made changes -
        Status Open [ 1 ] Closed [ 6 ]
        Resolution Cannot Reproduce [ 5 ]
        mnunberg Mark Nunberg added a comment -

        Seen again at: http://review.couchbase.org/#/c/24092/

        mnunberg Mark Nunberg made changes -
        Resolution Cannot Reproduce [ 5 ]
        Status Closed [ 6 ] Reopened [ 4 ]
        mnunberg Mark Nunberg made changes -
        Summary "resubscriber IllegalArgumentException during swap-rebalance" "resubscriber IllegalArgumentException during topology changes"
        mnunberg Mark Nunberg made changes -
        Affects Version/s 1.0 [ 10273 ]
        Affects Version/s 1.1.1 [ 10430 ]
        mnunberg Mark Nunberg made changes -
        Attachment log2.txt.bz2 [ 16236 ]
        mnunberg Mark Nunberg added a comment -

        As I mentioned in the bug, I closed it because I hadn't seen this error again. Just now, both Michael and I encountered this error while running the SDKD tests.

        mnunberg Mark Nunberg made changes -
        Attachment log2.txt.bz2 [ 16237 ]
        daschl Michael Nitschinger added a comment -

        Attaching the logs for http://review.couchbase.org/#/c/24092 changeset 4 (prefixed with daschl-4-) on some test runs.

        Deepti is currently running the changeset against the brun cluster, expect some info in 2 hours.

        daschl Michael Nitschinger added a comment -

        ./stester -i 20devcluster.ini --service ALL --svcaction RESTART --num_nodes 3 --no_fo 1 -c failover.Once --dsw_timeres 1 -d -o restart.log -C 127.0.0.1:8050

        daschl Michael Nitschinger made changes -
        Attachment daschl-4-restart.log [ 16244 ]
        daschl Michael Nitschinger added a comment -

        ./stester -i 20devcluster.ini -c rebalance.Once --mode out --rbcount 2 --dsw_timeres 1 -d -o rebealance_two_nodes.log -C 127.0.0.1:8050

        daschl Michael Nitschinger made changes -
        Attachment daschl-4-rebealance_two_nodes.log [ 16245 ]
        deeptida Deepti Dawar added a comment -

        Attaching the functional test results.
        This was run against a local 2.0.0 node.
        Pass rate is better this time - 92%.

        deeptida Deepti Dawar made changes -
        Attachment junit.zip [ 16246 ]
        deeptida Deepti Dawar added a comment -

        For the Hybrid tests, failures are still occurring.

        Attaching the intermittent log.

        deeptida Deepti Dawar added a comment - - edited

        The error that seems to be problematic in the unit test logs is this one -

        'Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
        junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.'

        Most of the issues are due to timeouts.

        Note that these tests were run against a local cluster, so such problems should not be occurring.

        daschl Michael Nitschinger added a comment -

        Merged in today, right before the 1.1.1 release.

        daschl Michael Nitschinger made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        ingenthr Matt Ingenthron made changes -
        Workflow jira [ 21340 ] Couchbase SDK Workflow [ 38418 ]

          People

          • Assignee:
            daschl Michael Nitschinger
            Reporter:
            mnunberg Mark Nunberg