Couchbase Server / MB-36650

Ephemeral rebalance failed: abortStoredValue: Cannot call on a non-Pending StoredValue


Details

    Description

      Steps to Reproduce:

      Create a 7-node cluster:

      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.106.134 | [u'kv']  | Cluster node |
      | 172.23.106.136 | None     | <--- IN ---  |
      | 172.23.106.137 | None     | <--- IN ---  |
      | 172.23.106.138 | None     | <--- IN ---  |
      | 172.23.105.168 | None     | <--- IN ---  |
      | 172.23.106.82  | None     | <--- IN ---  |
      | 172.23.106.83  | None     | <--- IN ---  |
      +----------------+----------+--------------+ 

      1. Create an ephemeral bucket with replicas=1, eviction policy=noEviction, compression=off.

      2. Load 10M docs with durability=MAJORITY. This step completed successfully. Bucket stats after this step:

      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | Bucket         | Type      | Replicas | TTL | Items    | RAM Quota    | RAM Used    | Disk Used |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | GleamBookUsers | ephemeral | 1        | 0   | 10000000 | 139732189184 | 29729510584 | 238       |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
       

      3. Rebalance in 1 node (172.23.106.85) while running another 2M creates and 4M updates with durability=MAJORITY in parallel. This step completed successfully. Bucket stats after this step:

      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | Bucket         | Type      | Replicas | TTL | Items    | RAM Quota    | RAM Used    | Disk Used |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | GleamBookUsers | ephemeral | 1        | 0   | 12000000 | 159693930496 | 37605497792 | 272       |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
       

      4. Rebalance out 1 node (172.23.106.83) while running another 2M creates, 4M updates, and 2M deletes with durability=MAJORITY in parallel. This step completed successfully. Bucket stats after this step:

      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | Bucket         | Type      | Replicas | TTL | Items    | RAM Quota    | RAM Used    | Disk Used |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      | GleamBookUsers | ephemeral | 1        | 0   | 12000000 | 139732189184 | 40522783760 | 238       |
      +----------------+-----------+----------+-----+----------+--------------+-------------+-----------+
      

      5. Rebalance in 2 nodes (172.23.106.83, 172.23.106.86) and rebalance out 1 node (172.23.106.137) while running another 2M creates, 4M updates, and 2M deletes with durability=MAJORITY in parallel. The rebalance failed in this step.
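
      For anyone trying to reproduce step 1, the bucket settings can be expressed as a request to the Couchbase REST endpoint `/pools/default/buckets`. The sketch below only builds the URL and form body; it does not send anything. The helper name `bucket_create_request`, the admin port 8091, and the per-node quota of 19037 MiB are assumptions (the quota is inferred from the reported RAM Quota divided by 7 nodes), not details stated in this report.

```python
from urllib.parse import urlencode


def bucket_create_request(host, name, ram_quota_mb):
    """Build (url, form_body) for creating the ephemeral bucket from step 1.

    Hypothetical helper for illustration; it does not contact the cluster.
    """
    params = {
        "name": name,
        "bucketType": "ephemeral",
        "ramQuotaMB": ram_quota_mb,      # per-node quota in MiB (assumed value)
        "replicaNumber": 1,              # replicas=1
        "evictionPolicy": "noEviction",  # eviction policy=noEviction
        "compressionMode": "off",        # compression=off
    }
    url = f"http://{host}:8091/pools/default/buckets"
    return url, urlencode(params)


url, body = bucket_create_request("172.23.106.134", "GleamBookUsers", 19037)
print(url)
print(body)
```

      The body would be sent as a form-encoded POST with administrator credentials, for example via `curl -u <user>:<password> -d "<body>" <url>`.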

      Error Messages:

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.15239.12>,
      {{{{{child_interrupted,
      {'EXIT',<23442.27660.5>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"},
      {line,266}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"},
      {line,121}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,247}]}]},
      {gen_server,call,
      [<23442.27659.5>,
      {setup_replication,[416,430,431,432,433]},
      infinity]}},
      {gen_server,call,
      ['replication_manager-GleamBookUsers',
      {change_vbucket_replication,430,
      'ns_1@172.23.106.136'},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-GleamBookUsers',
      'ns_1@172.23.106.83'},
      {if_rebalance,<0.6571.12>,
      {update_vbucket_state,966,active,
      undefined,undefined,
      [['ns_1@172.23.106.83',
      'ns_1@172.23.106.86']]}},
      infinity]}}}}}.
      Rebalance Operation Id = 6e1498b429d0c4f57e09d5db488c9f4e 

      Worker <0.6765.12> (for action {move,{966,
      ['ns_1@172.23.106.137','ns_1@172.23.106.136'],
      ['ns_1@172.23.106.83','ns_1@172.23.106.86'],
      []}}) exited with reason {unexpected_exit,
      {'EXIT',<0.15239.12>,
      {{{{{child_interrupted,
      {'EXIT',<23442.27660.5>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"},{line,266}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"},{line,121}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,636}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,665}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,247}]}]},
      {gen_server,call,
      [<23442.27659.5>,
      {setup_replication,[416,430,431,432,433]},
      infinity]}},
      {gen_server,call,
      ['replication_manager-GleamBookUsers',
      {change_vbucket_replication,430,'ns_1@172.23.106.136'},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-GleamBookUsers','ns_1@172.23.106.83'},
      {if_rebalance,<0.6571.12>,
      {update_vbucket_state,966,active,undefined,undefined,
      [['ns_1@172.23.106.83','ns_1@172.23.106.86']]}},
      infinity]}}}}
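
      To make the worker exit message easier to read, the following sketch pulls the vBucket number and the two node lists out of the `{move,{...}}` action. It assumes the tuple is laid out as vBucket, old chain, new chain, which matches the nodes being rebalanced out and in here but is not stated in this report. `parse_move` is a hypothetical helper, and the input is an excerpt of the message above.

```python
import re

# Excerpt copied from the worker exit message in this report.
WORKER_EXIT = """
Worker <0.6765.12> (for action {move,{966,
['ns_1@172.23.106.137',
'ns_1@172.23.106.136'],
['ns_1@172.23.106.83',
'ns_1@172.23.106.86'],
[]}}) exited with reason {unexpected_exit,
"""

# Matches a quoted Erlang node name and captures its IP address.
NODE = r"'ns_1@([\d.]+)'"


def parse_move(text):
    """Return (vbucket, first_node_list, second_node_list), or None if absent.

    Assumption: the first list is the old chain and the second the new chain.
    """
    m = re.search(r"\{move,\{(\d+),\s*\[(.*?)\],\s*\[(.*?)\]", text, re.S)
    if m is None:
        return None
    vbucket = int(m.group(1))
    old_chain = re.findall(NODE, m.group(2))
    new_chain = re.findall(NODE, m.group(3))
    return vbucket, old_chain, new_chain


print(parse_move(WORKER_EXIT))
```

      Under that assumption, the failing action was moving vBucket 966 from nodes 172.23.106.137 and 172.23.106.136 to nodes 172.23.106.83 and 172.23.106.86.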

      Note: The following warning message was not found in the memcached logs:

      Disconnecting the connection as there is no memory to complete replication
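
      As a rough cross-check of memory headroom, the bucket stats from steps 2 to 4 can be turned into a per-node quota and a used-versus-quota percentage. This sketch assumes that RAM Quota and RAM Used are in bytes and summed across nodes, and that the cluster had 7, 8, and 7 nodes after steps 2, 3, and 4; both points are inferred from the steps, not stated. It says nothing about the threshold at which memcached would log the warning above.

```python
# (label, node count, RAM quota in bytes, RAM used in bytes), taken from the
# bucket stats tables; node counts are inferred from the rebalance steps.
STATS = [
    ("after step 2", 7, 139_732_189_184, 29_729_510_584),
    ("after step 3", 8, 159_693_930_496, 37_605_497_792),
    ("after step 4", 7, 139_732_189_184, 40_522_783_760),
]

MIB = 1024 * 1024


def summarize(nodes, quota, used):
    """Return (per-node quota in MiB, percent of total quota in use)."""
    per_node_mib = quota / nodes / MIB
    used_pct = 100.0 * used / quota
    return per_node_mib, used_pct


for label, nodes, quota, used in STATS:
    per_node_mib, used_pct = summarize(nodes, quota, used)
    print(f"{label}: {per_node_mib:.0f} MiB per node, {used_pct:.1f}% of quota used")
```

      Under these assumptions the per-node quota is the same in all three snapshots, and reported usage stays between roughly 21% and 29% of the bucket quota before the failing step.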

       

          People

            prateek.kumar Prateek Kumar (Inactive)