Couchbase Server / MB-42633

[Magma]: Bucket warmup gets stuck after a rebalance is aborted by killing memcached on all nodes in the cluster.

Details

    Description

      1. Create a 17-node cluster.

      Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.120.170 | kv       | Cluster node |
      | 172.23.121.115 | None     | <--- IN ---  |
      | 172.23.121.116 | None     | <--- IN ---  |
      | 172.23.121.123 | None     | <--- IN ---  |
      | 172.23.121.124 | None     | <--- IN ---  |
      | 172.23.121.126 | None     | <--- IN ---  |
      | 172.23.121.127 | None     | <--- IN ---  |
      | 172.23.121.128 | None     | <--- IN ---  |
      | 172.23.121.129 | None     | <--- IN ---  |
      | 172.23.121.130 | None     | <--- IN ---  |
      | 172.23.121.131 | None     | <--- IN ---  |
      | 172.23.121.132 | None     | <--- IN ---  |
      | 172.23.121.133 | None     | <--- IN ---  |
      | 172.23.121.134 | None     | <--- IN ---  |
      | 172.23.121.135 | None     | <--- IN ---  |
      | 172.23.121.136 | None     | <--- IN ---  |
      | 172.23.121.139 | None     | <--- IN ---  |
      +----------------+----------+--------------+
      

      2. Create a magma bucket
      3. Start a long-running expiry load with maxttl=60.
      4. Sleep for 300s while the expiry load is running.
      5. Rebalance in 1 node and kill memcached on all the nodes while the rebalance is running.

      Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.121.128 | kv       | Cluster node |
      | 172.23.121.115 | kv       | Cluster node |
      | 172.23.121.132 | kv       | Cluster node |
      | 172.23.121.136 | kv       | Cluster node |
      | 172.23.121.139 | kv       | Cluster node |
      | 172.23.121.135 | kv       | Cluster node |
      | 172.23.120.170 | kv       | Cluster node |
      | 172.23.121.129 | kv       | Cluster node |
      | 172.23.121.124 | kv       | Cluster node |
      | 172.23.121.126 | kv       | Cluster node |
      | 172.23.121.134 | kv       | Cluster node |
      | 172.23.121.116 | kv       | Cluster node |
      | 172.23.121.127 | kv       | Cluster node |
      | 172.23.121.123 | kv       | Cluster node |
      | 172.23.121.131 | kv       | Cluster node |
      | 172.23.121.130 | kv       | Cluster node |
      | 172.23.121.133 | kv       | Cluster node |
      | 172.23.121.48  | None     | <--- IN ---  |
      +----------------+----------+--------------+
      

      6. The bucket is stuck in warmup on all the nodes.
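
The rebalance-in, memcached kill, and warmup check in steps 5 and 6 can be sketched roughly as below. This is a minimal illustration, not the test framework's actual code. It assumes the default `ns_1@<ip>` otpNode naming, that memcached is killed with `pkill -9 memcached` over SSH, and that the bucket details returned by `GET /pools/default/buckets/<bucket>` carry a per-node `status` field that reads `warmup` while a node is warming up. The helper names and the sample payload are hypothetical.

```python
def rebalance_form(hosts):
    """Build form fields for POST /controller/rebalance.

    For a rebalance-in, every node (existing plus incoming) is listed in
    knownNodes and nothing is ejected. Assumes the default ns_1@<ip> naming.
    """
    known = ",".join(f"ns_1@{host}" for host in hosts)
    return {"knownNodes": known, "ejectedNodes": ""}


# Assumed kill command, to be run over SSH on every node mid-rebalance.
KILL_MEMCACHED = "pkill -9 memcached"


def nodes_in_warmup(bucket_info):
    """Return hostnames whose per-node bucket status is 'warmup'.

    bucket_info is the parsed JSON of GET /pools/default/buckets/<bucket>
    (assumed shape: {"nodes": [{"hostname": ..., "status": ...}, ...]}).
    """
    return [
        node["hostname"]
        for node in bucket_info.get("nodes", [])
        if node.get("status") == "warmup"
    ]


# Hypothetical, hand-written payload for illustration only (not from the logs).
sample_bucket_info = {
    "nodes": [
        {"hostname": "172.23.120.170:8091", "status": "warmup"},
        {"hostname": "172.23.121.115:8091", "status": "warmup"},
        {"hostname": "172.23.121.116:8091", "status": "healthy"},
    ]
}

print(rebalance_form(["172.23.120.170", "172.23.121.48"]))
print(nodes_in_warmup(sample_bucket_info))
```

Polling `nodes_in_warmup` with a timeout after the kill would distinguish a slow warmup from one that never completes, which is the symptom reported in step 6.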

      NOTE:
      Logs from the 2 nodes are attached because cbcollect is not working on those nodes.

          People

            owend Daniel Owen
            ritesh.agarwal Ritesh Agarwal