  1. Couchbase Server
  2. MB-11861

{DCP} :: Rebalance-in exited with memcached crash

Details

    • Bug
    • Resolution: Fixed
    • Test Blocker
    • 3.0
    • 3.0
    • couchbase-bucket
    • Security Level: Public
    • None
    • 1:10.6.2.144
      2:10.6.2.145
      3:10.6.2.146
      4:10.6.2.147
      5:10.6.2.148
      6:10.6.2.149
      7:10.6.2.150
    • Untriaged
    • Unknown

    Description

      Build 1072, CentOS 6.x

      1. Create 1 node cluster
      2. Add default bucket with 100 K items
      3. Rebalance in 2 nodes with create ops running in parallel

      When step 3 is run for the third time (5 nodes in the cluster, 2 more being rebalanced in), the rebalance exits and memcached crash dumps are seen.
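The node counts implied by the steps above can be tabulated with a short sketch. It assumes, going only by the description and the seven hosts listed, a 1-node starting cluster that grows by 2 nodes per iteration up to 7 nodes; the function name and parameters are hypothetical and are not part of testrunner.

```python
def rebalance_in_schedule(initial_nodes=1, nodes_per_step=2, total_nodes=7):
    """Return a list of (iteration, nodes_before, nodes_after) tuples."""
    schedule = []
    current = initial_nodes
    iteration = 1
    # Keep adding nodes while a full step still fits in the cluster.
    while current + nodes_per_step <= total_nodes:
        schedule.append((iteration, current, current + nodes_per_step))
        current += nodes_per_step
        iteration += 1
    return schedule


for iteration, before, after in rebalance_in_schedule():
    print(f"iteration {iteration}: {before} -> {after} nodes")
```

Under these assumptions the third iteration is the 5 -> 7 node rebalance, which matches the "5 nodes, 2 rebalance-in" case where the crash was reported.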

      Port server memcached on node 'babysitter_of_ns_1@127.0.0.1' exited with status 134. Restarting. Messages: Thu Jul 31 15:22:24.799695 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.144->ns_1@10.6.2.145:default - (vb 122) Stream closing, 0 items sent from disk, 572 items sent from memory, 1937 was last seqno sent
      Thu Jul 31 15:22:24.799711 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.144->ns_1@10.6.2.145:default - (vb 124) Stream closing, 0 items sent from disk, 526 items sent from memory, 1880 was last seqno sent
      Thu Jul 31 15:22:24.815042 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.144->ns_1@10.6.2.149:default - (vb 18) Stream closing, 0 items sent from disk, 0 items sent from memory, 1784 was last seqno sent
      Thu Jul 31 15:22:24.815093 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.144->ns_1@10.6.2.149:default - (vb 64) Stream closing, 1689 items sent from disk, 0 items sent from memory, 1743 was last seqno sent
      asssertion failed [highSeqno <= vb->getHighSeqno()] at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/ep.cc:2724
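The exit status 134 reported above is consistent with the failed assertion. By the common Unix convention, a status above 128 means the process was killed by signal number (status - 128), and 134 - 128 = 6 is SIGABRT on Linux. The sketch below decodes such a status; the helper name is hypothetical, and it assumes the port server follows that convention.

```python
import signal


def describe_exit_status(status):
    """Interpret an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            # Map the number to a symbolic name where the platform knows it.
            return signal.Signals(signum).name
        except ValueError:
            return f"signal {signum}"
    return f"exit code {status}"


print(describe_exit_status(134))
```

On a Linux host this reports SIGABRT, the signal raised by `abort()` after an assertion failure.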

      Rebalance exited with reason {unexpected_exit,
      {'EXIT',<0.19183.31>,
      {wait_seqno_persisted_failed,"default",44,1677,
      [{'ns_1@10.6.2.146',
      {'EXIT',
      badmatch,{error,closed,
      {gen_server,call,
      [{'janitor_agent-default','ns_1@10.6.2.146'},
      {if_rebalance,<0.18471.31>,
      {wait_seqno_persisted,44,1677}},
      infinity]}}}}]}}}

      <0.19096.31> exited with {unexpected_exit,
      {'EXIT',<0.19183.31>,
      {wait_seqno_persisted_failed,"default",44,1677,
      [{'ns_1@10.6.2.146',
      {'EXIT',
      badmatch,{error,closed,
      {gen_server,call,
      [{'janitor_agent-default','ns_1@10.6.2.146'},
      {if_rebalance,<0.18471.31>,
      {wait_seqno_persisted,44,1677}},
      infinity]}}}}]}}}

      Port server memcached on node 'babysitter_of_ns_1@127.0.0.1' exited with status 134. Restarting. Messages: Thu Jul 31 15:22:24.346280 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.146->ns_1@10.6.2.149:default - (vb 30) Sending disk snapshot with start seqno 0 and end seqno 1762
      Thu Jul 31 15:22:24.352786 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.146->ns_1@10.6.2.149:default - (vb 30) Backfill complete, 605 items read from disk, last seqno read: 1762
      Thu Jul 31 15:22:24.352815 PDT 3: (default) Backfill task (1 to 1598) finished for vb 30 disk seqno 1762 memory seqno 1762
      Thu Jul 31 15:22:24.353154 PDT 3: (default) UPR (Producer) eq_uprq:replication:ns_1@10.6.2.146->ns_1@10.6.2.147:default - (vb 102) Stream closing, 1311 items sent from disk, 581 items sent from memory, 1931 was last seqno sent
      asssertion failed [highSeqno <= vb->getHighSeqno()] at /buildbot/build_slave/centos-5-x64-300-builder/build/build/ep-engine/src/ep.cc:2724
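When comparing many producer streams, the "Stream closing" lines above can be reduced to their numeric fields. The following is a minimal sketch whose regular expression is written against the log lines quoted in this report only; the function name is hypothetical, and other builds may use a different message format.

```python
import re

# Pattern derived from the "Stream closing" lines quoted in this report.
STREAM_CLOSING = re.compile(
    r"\(vb (?P<vb>\d+)\) Stream closing, "
    r"(?P<disk>\d+) items sent from disk, "
    r"(?P<memory>\d+) items sent from memory, "
    r"(?P<last_seqno>\d+) was last seqno sent"
)


def parse_stream_closing(line):
    """Return the numeric fields of a 'Stream closing' line, or None."""
    match = STREAM_CLOSING.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


sample = (
    "Thu Jul 31 15:22:24.353154 PDT 3: (default) UPR (Producer) "
    "eq_uprq:replication:ns_1@10.6.2.146->ns_1@10.6.2.147:default - "
    "(vb 102) Stream closing, 1311 items sent from disk, "
    "581 items sent from memory, 1931 was last seqno sent"
)
print(parse_stream_closing(sample))
```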

      Test case:

      ./testrunner -i centos_x64_rebalance_in.ini get-cbcollect-info=False,get-logs=False,stop-on-failure=False,get-coredumps=True,force_kill_memached=False,verify_unacked_bytes=True,total_vbuckets=128,std_vbuckets_dist=5 -t rebalance.rebalancein.RebalanceInTests.incremental_rebalance_in_with_ops,replicas=3,items=100000,doc_ops=create,max_verify=100000,GROUP=IN;P2
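To make the test parameters easier to read, the `-t` argument above can be split into the test name and its key=value pairs. This is an illustrative sketch only: `parse_test_spec` is a hypothetical helper, not testrunner's own parser, and it assumes the parameters are comma-separated with the first field being the test name.

```python
def parse_test_spec(spec):
    """Split a '-t' test spec into (test_name, params dict of strings)."""
    name, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # partition keeps everything after the first '=' as the value
        key, _, value = pair.partition("=")
        params[key] = value
    return name, params


spec = (
    "rebalance.rebalancein.RebalanceInTests.incremental_rebalance_in_with_ops,"
    "replicas=3,items=100000,doc_ops=create,max_verify=100000,GROUP=IN;P2"
)
name, params = parse_test_spec(spec)
print(name)
for key, value in params.items():
    print(f"  {key} = {value}")
```

Read this way, the run uses 3 replicas and 100000 items with create operations.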

      Attaching logs

          People

            parag Parag Agarwal (Inactive)
            parag Parag Agarwal (Inactive)