Couchbase Server / MB-11764

Memcached crashed during rebalance-in

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 3.0
    • 3.0
    • couchbase-bucket
    • Security Level: Public
    • None
    • Build 3.0.0-973

    Description

      http://qa.hq.northscale.net/job/centos_x64--107_01--rebalanceXDCR-P1/27/consoleFull

      [Test]
      ./testrunner -i centos_x64107_01-rebalanceXDCR-P1.ini get-cbcollect-info=True,get-logs=False,stop-on-failure=False,get-coredumps=True -t xdcr.rebalanceXDCR.Rebalance.async_rebalance_in,items=100000,rdirection=unidirection,ctopology=chain,doc-ops=update-delete,expires=60,rebalance=destination,num_rebalance=2,GROUP=P0;xmem

      [Test error]
      [2014-07-17 03:21:01,543] - [rest_client:1216] INFO - rebalance percentage : 38.8497796331 %
      [2014-07-17 03:21:11,557] - [rest_client:1200] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
      [2014-07-17 03:21:11,577] - [rest_client:2011] INFO - Latest logs from UI on 10.5.2.231:
      [2014-07-17 03:21:11,578] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.231', u'code': 2, u'text': u'Rebalance exited with reason {{badmatch,{error,closed,\n [{mc_client_binary,cmd_vocal_recv,5,\n [{file,"src/mc_client_binary.erl"},\n {line,151}]},\n {mc_client_binary,select_bucket,2,\n [{file,"src/mc_client_binary.erl"},\n {line,346}]},\n {ns_memcached,ensure_bucket,2,\n [{file,"src/ns_memcached.erl"},\n {line,1269}]},\n {ns_memcached,handle_info,2,\n [{file,"src/ns_memcached.erl"},\n {line,744}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,604}]},\n {ns_memcached,init,1,\n [{file,"src/ns_memcached.erl"},\n {line,171}]},\n {gen_server,init_it,6,\n [{file,"gen_server.erl"},{line,304}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'ns_memcached-default\',\n {delete_vbucket,958},\n 360000]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\'ns_1@10.5.2.233\'},\n {if_rebalance,<0.18873.2>,\n {get_vbucket_high_seqno,794}},\n infinity]}}\n', u'shortText': u'message', u'serverTime': u'2014-07-17T03:21:29.566Z', u'module': u'ns_orchestrator', u'tstamp': 1405592489566, u'type': u'info'}
      [2014-07-17 03:21:11,579] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.231', u'code': 0, u'text': u'<0.2146.3> exited with {{badmatch,{error,closed,\n [{mc_client_binary,cmd_vocal_recv,5,\n [{file,"src/mc_client_binary.erl"},{line,151}]},\n {mc_client_binary,select_bucket,2,\n [{file,"src/mc_client_binary.erl"},{line,346}]},\n {ns_memcached,ensure_bucket,2,\n [{file,"src/ns_memcached.erl"},{line,1269}]},\n {ns_memcached,handle_info,2,\n [{file,"src/ns_memcached.erl"},{line,744}]},\n {gen_server,handle_msg,5,\n [{file,"gen_server.erl"},{line,604}]},\n {ns_memcached,init,1,\n [{file,"src/ns_memcached.erl"},{line,171}]},\n {gen_server,init_it,6,\n [{file,"gen_server.erl"},{line,304}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},{line,239}]}]},\n {gen_server,call,\n [\'ns_memcached-default\',\n {delete_vbucket,958},\n 360000]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\'ns_1@10.5.2.233\'},\n {if_rebalance,<0.18873.2>,\n {get_vbucket_high_seqno,794}},\n infinity]}}', u'shortText': u'message', u'serverTime': u'2014-07-17T03:21:29.554Z', u'module': u'ns_vbucket_mover', u'tstamp': 1405592489554, u'type': u'critical'}
      [2014-07-17 03:21:11,579] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.233', u'code': 0, u'text': u"Port server memcached on node 'babysitter_of_ns_1@127.0.0.1' exited with status 134. Restarting. Messages: Thu Jul 17 03:21:29.110004 PDT 3: (default) Notified the completion of checkpoint persistence for vbucket 894, cookie 0x7258c00\nThu Jul 17 03:21:29.113711 PDT 3: (default) Notified the completion of checkpoint persistence for vbucket 756, cookie 0x7259200\nThu Jul 17 03:21:29.113797 PDT 3: (default) Notified the completion of checkpoint persistence for vbucket 824, cookie 0x7258f00\nThu Jul 17 03:21:29.121074 PDT 3: (default) Deletion of vbucket 809 was completed.\nmemcached: /buildbot/build_slave/centos-5-x64-300-builder/build/build/memcached/daemon/memcached.c:7731: decrement_session_ctr: Assertion `session_cas.ctr != 0' failed.", u'shortText': u'message', u'serverTime': u'2014-07-17T03:21:29.450Z', u'module': u'ns_log', u'tstamp': 1405592489450, u'type': u'info'}

      [2014-07-17 03:21:11,580] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.233', u'code': 0, u'text': u'Control connection to memcached on \'ns_1@10.5.2.233\' disconnected: badmatch,\n {error,\n closed,\n [{mc_client_binary,\n cmd_vocal_recv,\n 5,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 151}]},\n {mc_client_binary,\n select_bucket,\n 2,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 346}]},\n {ns_memcached,\n ensure_bucket,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 1269}]},\n {ns_memcached,\n handle_info,\n 2,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 744}]},\n {gen_server,\n handle_msg,\n 5,\n [{file,\n "gen_server.erl"},\n {line,\n 604}]},\n {ns_memcached,\n init,1,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 171}]},\n {gen_server,\n init_it,\n 6,\n [{file,\n "gen_server.erl"},\n {line,\n 304}]},\n {proc_lib,\n init_p_do_apply,\n 3,\n [{file,\n "proc_lib.erl"},\n {line,\n 239}]}]}', u'shortText': u'message', u'serverTime': u'2014-07-17T03:21:29.447Z', u'module': u'ns_memcached', u'tstamp': 1405592489447, u'type': u'info'}
      [2014-07-17 03:21:11,582] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.231', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:32.787Z', u'module': u'ns_vbucket_mover', u'tstamp': 1405592432787, u'type': u'info'}
      [2014-07-17 03:21:11,582] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.234', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.5.2.234\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:32.376Z', u'module': u'ns_memcached', u'tstamp': 1405592432376, u'type': u'info'}
      [2014-07-17 03:21:11,583] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.3.5.68', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.3.5.68\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:32.318Z', u'module': u'ns_memcached', u'tstamp': 1405592432318, u'type': u'info'}
      [2014-07-17 03:21:11,583] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.231', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:31.422Z', u'module': u'ns_rebalancer', u'tstamp': 1405592431422, u'type': u'info'}
      [2014-07-17 03:21:11,584] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.5.2.231', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@10.5.2.233','ns_1@10.5.2.231',\n 'ns_1@10.5.2.234','ns_1@10.5.2.232',\n 'ns_1@10.3.5.68'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:31.337Z', u'module': u'ns_orchestrator', u'tstamp': 1405592431337, u'type': u'info'}
      [2014-07-17 03:21:11,585] - [rest_client:2012] ERROR - {u'node': u'ns_1@10.3.5.68', u'code': 3, u'text': u'Node ns_1@10.3.5.68 joined cluster', u'shortText': u'message', u'serverTime': u'2014-07-17T03:20:31.293Z', u'module': u'ns_cluster', u'tstamp': 1405592431293, u'type': u'info'}

      ERROR

      [Core]
      Core was generated by `/opt/couchbase/bin/memcached -C /opt/couchbase/var/lib/couchbase/config/memcach'.
      Program terminated with signal 6, Aborted.
      #0 0x00007f00de1548a5 in raise () from /lib64/libc.so.6

      Thread 13 (Thread 0x7f00d3a99700 (LWP 1868)):
      #0 0x00007f00de20385d in fdatasync () from /lib64/libc.so.6
      #1 0x00007f00d71eb767 in couch_sync (errinfo=0xa041b20, handle=<value optimized out>) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/couchstore/src/os.c:133
      #2 0x00007f00d74e2dbb in cfs_sync(struct {...} *, couch_file_handle) (errinfo=0xa041b20, h=0x8eceaa0) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-fs-stats.cc:128
      #3 0x00007f00d71d56fa in couchstore_commit (db=0xa041b00) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/couchstore/src/couch_db.cc:194
      #4 0x00007f00d74d5860 in CouchKVStore::saveDocs (this=0x7346a40, vbid=887, rev=<value optimized out>, docs=0x2a06480, docinfos=0x2a063a8, docCount=<value optimized out>, kvctx=...) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-kvstore.cc:1819
      #5 0x00007f00d74d6306 in CouchKVStore::commit2couchstore (this=0x7346a40, cb=0x7f00d3a989a0) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-kvstore.cc:1713
      #6 0x00007f00d74d6625 in CouchKVStore::commit (this=0x3d, cb=0x3d) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-kvstore.cc:1138
      #7 0x00007f00d74270af in EventuallyPersistentStore::flushVBucket (this=0x72e4240, vbid=<value optimized out>) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/ep.cc:2581
      #8 0x00007f00d746532c in Flusher::flushVB (this=0x739c3c0) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/flusher.cc:281
      #9 0x00007f00d7465f9e in Flusher::step (this=0x3d, task=0x72f2780) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/flusher.cc:172
      #10 0x00007f00d7494676 in FlusherTask::run (this=0x0) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/tasks.cc:40
      #11 0x00007f00d746f51c in ExecutorThread::run (this=0x72f2460) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/executorthread.cc:95
      #12 0x00007f00d746fa56 in launch_executor_thread (arg=0x3d) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/executorthread.cc:33
      #13 0x00007f00e02b4bff in platform_thread_wrap (arg=<value optimized out>) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/platform/src/cb_pthreads.c:19
      #14 0x00007f00df068851 in start_thread () from /lib64/libpthread.so.0
      #15 0x00007f00de20a90d in clone () from /lib64/libc.so.6

      Thread 12 (Thread 0x7f00d449a700 (LWP 1867)):
      #0 0x00007f00df06f054 in __lll_lock_wait () from /lib64/libpthread.so.0
      #1 0x00007f00df06a388 in _L_lock_854 () from /lib64/libpthread.so.0
      #2 0x00007f00df06a257 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #3 0x00007f00e02b4a89 in cb_mutex_enter (mutex=<value optimized out>) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/platform/src/cb_pthreads.c:85
      #4 0x00007f00d746e11d in Mutex::acquire (this=0x733f110) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/mutex.cc:31
      #5 0x00007f00d74e572b in lock (this=0x733f000, vbs=..., file_version=1, header_offset=18446744073709551615, cb=...) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/locks.h:66
      #6 LockHolder (this=0x733f000, vbs=..., file_version=1, header_offset=18446744073709551615, cb=...) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/locks.h:44
      #7 CouchNotifier::notify_update (this=0x733f000, vbs=..., file_version=1, header_offset=18446744073709551615, cb=...) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-notifier.cc:846
      #8 0x00007f00d74d8782 in CouchKVStore::setVBucketState (this=0x7345d40, vbucketId=662, vbstate=..., vb_change_type=2, kvcb=0x7f00d4499c60, notify=true) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-kvstore.cc:1064
      #9 0x00007f00d74d8dc7 in CouchKVStore::snapshotVBuckets (this=0x7345d40, vbstates=std::map with 138 elements = {...}, cb=0x7f00d4499c60) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/couch-kvstore/couch-kvstore.cc:932
      #10 0x00007f00d74279af in EventuallyPersistentStore::snapshotVBuckets (this=0x72e4240, priority=..., shardId=<value optimized out>) at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/ep.cc:921
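
      [Note on exit status]
      The babysitter message above reports that memcached "exited with status 134", and the core reports "signal 6, Aborted". The two are consistent if the status follows the common 128 + signal-number convention for processes killed by a signal, which is assumed here rather than confirmed from the ns_server source. A minimal Python sketch of that arithmetic; the helper name `decode_exit_status` is hypothetical and is not part of testrunner or ns_server:

```python
import signal


def decode_exit_status(status):
    """Map an exit status to a signal, assuming the 128 + N convention.

    Returns a signal.Signals member, or None when the status does not
    look like a signal death under that convention.
    """
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128)
    except ValueError:
        # No signal with that number on this platform.
        return None


# 134 - 128 = 6, the number gdb reports for the abort.
sig = decode_exit_status(134)
print(sig == signal.SIGABRT)
```

      An abort is what a failed assert() produces, which matches the `session_cas.ctr != 0' assertion failure in decrement_session_ctr logged just before the restart.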

          People

            chiyoung Chiyoung Seo (Inactive)
            sangharsh Sangharsh Agarwal