Details

Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: 3.0
Affects Version/s: 3.0
Component/s: XDCR
Security Level: Public
Labels:
- pending
- rebalance
- stuck
Environment:
CentOS 6.x
8 * 8 clusters, 1 bidirectional XDCR, 1 unidirectional XDCR. Each node: 15 GB RAM, 419 GB HDD for /data

Triage: Untriaged
Is this a Regression?: Unknown

Description

Build
--------
3.0.0-819 (XDCR on UPR, internal replication on UPR)

Clusters
-----------
Source : http://172.23.105.44:8091/
Destination : http://172.23.105.54:8091/
The clusters are available to investigate.

Steps
--------
1. Load data on both clusters until vb_active_resident_items_ratio < 30.
2. Run the access phase (98% gets, 2% sets) for 3 hours.
3. Rebalance out 1 node at cluster1 under workload [high DGM ~4%].
4. Rebalance in the same node under workload.
5. Fail over one node under workload, then rebalance to remove the node ==> rebalance gets stuck and 4 nodes go to pending state.
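
A minimal sketch of how "rebalance stuck" in step 5 could be checked from a series of progress readings. It is not taken from the test harness used here: the function name, the sample format (timestamp in seconds, progress in percent), and the stall threshold are all assumptions made for illustration, and the readings would have to be collected separately.

```python
from typing import List, Tuple


def is_rebalance_stuck(samples: List[Tuple[float, float]],
                       stall_seconds: float) -> bool:
    """Return True if the latest progress value has not changed for
    at least stall_seconds.

    samples: hypothetical (timestamp_seconds, progress_percent) pairs,
    in chronological order.
    """
    if not samples:
        return False

    last_time, last_progress = samples[-1]
    stall_start = last_time

    # Walk backwards over the trailing run of samples that share the
    # latest progress value; the oldest one marks the start of the stall.
    for ts, progress in reversed(samples):
        if progress != last_progress:
            break
        stall_start = ts

    return last_time - stall_start >= stall_seconds


# Hypothetical readings: progress stops moving at 20% after the first minute.
readings = [(0, 10.0), (60, 20.0), (120, 20.0), (600, 20.0)]
stuck = is_rebalance_stuck(readings, stall_seconds=300)
```

With a threshold of 300 seconds, the hypothetical readings above are reported as stuck because progress stayed at 20% from t=60 to t=600. The threshold is a judgment call: under a heavy DGM workload, slow but real progress is possible, so it should be set well above the normal time between progress changes.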

Attached
--------------
cbcollect_info for the source cluster. 172.23.105.52 was failed over and was being rebalanced out.
Rebalance did not progress even after pausing XDCR. Let me know if you need logs from the remote cluster.

People

Assignee: Aruna Piravi (Inactive)

Reporter: Aruna Piravi (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 17/Jun/14 1:41 PM

Updated: 18/Jun/14 12:45 PM

Resolved: 17/Jun/14 4:30 PM

KV+XDCR System Test : Rebalance after failover stuck, nodes go to pending state
