Couchbase Server / MB-8038

apparent deadlock in ep-engine (was: [system test] cluster crashed with error "exception exit: couldnt_connect_to_memcached")

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.1.0
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels:
      None
    • Environment:
      Ubuntu 12.04 64bit

      Description

      Environment: Ubuntu 12.04 64 bit
      1:10.3.3.239
      2:10.3.3.199
      3:10.3.3.215
      4:10.3.3.240
      5:10.3.2.85
      6:10.3.3.218

      Create 2 buckets: default and sasl. Each bucket is 1.2 GB.
      Create one doc with 2 views for each bucket.
      The default bucket has 1 replica, with replica index enabled.
      The sasl bucket has 1 replica, with replica index disabled.
      Run the system test as in the following specification:
      http://hub.internal.couchbase.com/confluence/display/QA/views+%28and+now+with+XDCR%29+tests

      Rebalance out node 10.3.3.218; the rebalance failed.
      Then, one by one, each node went into an unstable state (yellow).

      Link to collect info of all nodes: https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_1/201304/6ubuntu_1204_build_210-185_cluster-unstable_20130408-160059.tgz

      Link to manifest file of build 2.0.1-170: http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.1-170-rel.deb.manifest.xml

      On node 10.3.3.239, ran kill -s SIGUSR1 1074 to generate an Erlang crash dump.
      Link to Erlang crash dump: https://s3.amazonaws.com/packages.couchbase/erlang/unix/erl_crash.dump.04-05-2013-node-3_239.gz

      The cluster is currently in a failed state.


        Activity

        chiyoung Chiyoung Seo added a comment -

        The fix is in gerrit for review:

        http://review.couchbase.org/#/c/25834/

        chiyoung Chiyoung Seo added a comment -

        The fix was just merged.

        maria Maria McDuff (Inactive) added a comment -

        Tony, please verify and close on the newest build, which will come out tomorrow, 4/23.

        thuan Thuan Nguyen added a comment -

        Integrated in github-ep-engine-2-0 #485 (See http://qa.hq.northscale.net/job/github-ep-engine-2-0/485/)
        MB-8038 Release hash table partition lock before notifyIOComplete (Revision e38e9e49855362bcab0fa72258d888cf2423e4d5)

        Result = SUCCESS
        Chiyoung Seo :
        Files :

        • src/tapconnection.cc
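
The commit title above ("Release hash table partition lock before notifyIOComplete") describes a common way to avoid this kind of deadlock: do not call out to a notification routine while still holding a lock. The sketch below only illustrates that general pattern; it is not the actual change in src/tapconnection.cc. Partition, addWaiter, pendingWaiters, completeAll and runDemo are hypothetical names, and the notifyIOComplete parameter is just a callback that borrows the name from the commit title.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for one hash table partition guarded by its own lock.
class Partition {
public:
    using Notifier = std::function<void(const std::string&)>;

    void addWaiter(const std::string& cookie) {
        std::lock_guard<std::mutex> guard(lock_);
        waiters_.push_back(cookie);
    }

    std::size_t pendingWaiters() {
        std::lock_guard<std::mutex> guard(lock_);
        return waiters_.size();
    }

    // Collect the waiters under the lock, release the lock, then notify.
    // The callback may take other locks or call back into this partition,
    // so it must never run while lock_ is held.
    std::size_t completeAll(const Notifier& notifyIOComplete) {
        std::vector<std::string> ready;
        {
            std::lock_guard<std::mutex> guard(lock_);
            ready.swap(waiters_);
        }  // lock_ is released here, before any callback runs
        for (const auto& cookie : ready) {
            notifyIOComplete(cookie);
        }
        return ready.size();
    }

private:
    std::mutex lock_;
    std::vector<std::string> waiters_;
};

// Usage: the callback re-enters the partition and takes lock_ again.
// That is only safe because completeAll() has already released lock_.
inline std::size_t runDemo() {
    Partition p;
    p.addWaiter("conn-1");
    p.addWaiter("conn-2");
    std::size_t seenPending = 99;
    std::size_t notified = p.completeAll([&](const std::string&) {
        seenPending = p.pendingWaiters();
    });
    assert(seenPending == 0);
    return notified;
}
```

If completeAll() invoked the callback inside the locked scope instead, the re-entrant pendingWaiters() call would try to lock a std::mutex the same thread already owns, which is undefined behavior and in practice typically hangs. With two different locks taken in opposite orders by two threads, the same mistake produces a classic lock-order deadlock.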
        thuan Thuan Nguyen added a comment -

        I no longer see this issue in the system test on Windows 2008 R2 64-bit with build 2.1.0-7xx.


          People

          • Assignee:
            thuan Thuan Nguyen
            Reporter:
            thuan Thuan Nguyen
          • Votes:
            0
            Watchers:
            5
