Details

Type: Bug
Resolution: Duplicate
Priority: Critical
Fix Version/s: 3.0
Affects Version/s: 3.0
Component/s: couchbase-bucket
Security Level: Public
Labels: None
Environment: CentOS
Triage: Untriaged
Is this a Regression?: Unknown
Sprint: June 30 - July 18

Description

Build
--------
3.0.0-900 (xdcr on upr, internal replication on upr)

Steps
--------
1. Load on both clusters until vb_active_resident_items_ratio < 50.
2. Set up bi-xdcr on "standardbucket" and uni-xdcr on "standardbucket1".
3. Access phase with 50% gets, 50% deletes for 3 hours.
4. Rebalance-out one node (.47) at C1.
5. Rebalance-in same node at C1.

Problem
-------------
During rebalance-in, progress stopped at 41.9%: the REST call that reports rebalance progress showed no increase for a little more than 5 mins. As a result, the test timed out, as shown below. This has never happened in previous runs of the same test against 2.2.0, 2.5.0 or 2.5.1.

[2014-06-30 14:17:56,782: ERROR/MainProcess] Running Phase: rebalance_in_one_source (Rebalance-in-1)
[2014-06-30 14:18:01,901: ERROR/MainProcess] Started workload workload_37e0ed9
[2014-06-30 14:18:01,930: ERROR/MainProcess] kill task workload_19da4d6
[2014-06-30 14:18:01,931: ERROR/MainProcess]

{'update_perc': 22, 'indexed_keys': [], 'del_perc': 3, 'postcondition_handler': None, 'create_perc': 3, 'bucket': 'standardbucket', 'exp_perc': 2, 'miss_queue': None, 'ops_per_sec': 3000, 'consume_queue': None, 'postconditions': None, 'template': 'default', 'ttl': 3000, 'cc_queues': ['std1ph5keys'], 'preconditions': None, 'password': '', 'get_perc': 70, 'miss_perc': 5, 'wait': None}

[2014-06-30 14:18:02,005: ERROR/MainProcess] start task sent to 1 consumers
[2014-06-30 14:18:03,909: ERROR/MainProcess] Started workload workload_a2f6b8a
[2014-06-30 14:18:03,930: ERROR/MainProcess] kill task workload_f42122d
[2014-06-30 14:18:03,931: ERROR/MainProcess]

{'update_perc': 22, 'indexed_keys': [], 'del_perc': 3, 'postcondition_handler': None, 'create_perc': 3, 'bucket': 'standardbucket1', 'exp_perc': 2, 'miss_queue': None, 'ops_per_sec': 3000, 'consume_queue': None, 'postconditions': None, 'template': 'default', 'ttl': 3000, 'cc_queues': ['std2ph5keys'], 'preconditions': None, 'password': '', 'get_perc': 70, 'miss_perc': 5, 'wait': None}

[2014-06-30 14:18:03,996: ERROR/MainProcess] start task sent to 1 consumers
[2014-06-30 14:18:05,917: ERROR/MainProcess] Started workload workload_28f856f
[2014-06-30 14:18:05,949: ERROR/MainProcess] kill task workload_640273e
[2014-06-30 14:18:05,950: ERROR/MainProcess]

{'update_perc': 22, 'indexed_keys': [], 'del_perc': 3, 'postcondition_handler': None, 'create_perc': 3, 'bucket': 'saslbucket', 'exp_perc': 2, 'miss_queue': None, 'ops_per_sec': 3000, 'consume_queue': None, 'postconditions': None, 'template': 'default', 'ttl': 3000, 'cc_queues': ['saslph5keys'], 'preconditions': None, 'password': 'password', 'get_perc': 70, 'miss_perc': 5, 'wait': None}

[2014-06-30 14:18:06,039: ERROR/MainProcess] start task sent to 1 consumers
[2014-06-30 14:27:50,156: ERROR/MainProcess] apparently rebalance progress code in infinite loop: 41.942552351
[2014-06-30 14:27:52,158: ERROR/MainProcess] Stopping workload workload_37e0ed9
[2014-06-30 14:27:52,184: ERROR/MainProcess] kill task workload_37e0ed9
[2014-06-30 14:27:52,189: ERROR/MainProcess] Stopping workload workload_28f856f
[2014-06-30 14:27:52,226: ERROR/MainProcess] kill task workload_28f856f
[2014-06-30 14:27:54,229: ERROR/MainProcess] Stopping workload workload_a2f6b8a
[2014-06-30 14:27:54,260: ERROR/MainProcess] kill task workload_a2f6b8a
[2014-06-30 14:28:03,270: ERROR/MainProcess]
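The time between two entries in the log above can be computed directly from their timestamps. This is a minimal sketch that assumes the line layout shown in this report ("[date time,millis: LEVEL/Process] message"); the helper names `log_timestamp` and `elapsed_seconds` are hypothetical and not part of the test framework.

```python
from datetime import datetime

# Timestamp layout as it appears in the log excerpt, e.g. "2014-06-30 14:17:56,782".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_timestamp(line: str) -> datetime:
    """Parse the timestamp between the leading "[" and the first ": "."""
    stamp = line[1:line.index(": ")]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def elapsed_seconds(start_line: str, end_line: str) -> float:
    """Seconds between two log lines (positive if end_line is later)."""
    return (log_timestamp(end_line) - log_timestamp(start_line)).total_seconds()


phase_start = "[2014-06-30 14:17:56,782: ERROR/MainProcess] Running Phase: rebalance_in_one_source (Rebalance-in-1)"
stall_report = "[2014-06-30 14:27:50,156: ERROR/MainProcess] apparently rebalance progress code in infinite loop: 41.942552351"

print(elapsed_seconds(phase_start, stall_report))
```

For the two lines used here, the gap between the start of the rebalance-in phase and the stall message is a little under 10 minutes; the stall itself (about 5 mins, per the description) falls inside that window.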

To continue testing, I'm increasing the timeout value to 15 mins. Please check if this has to do with rebalance performance.

Attaching cbcollect info.
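For reference, the kind of check that timed out can be sketched as a polling loop that gives up only when progress has not increased for a configurable interval. This is an illustration under assumptions, not the test framework's actual code: `get_progress` is a hypothetical callable standing in for the REST call that reports rebalance progress, and the 900-second default mirrors the 15-minute timeout mentioned above.

```python
import time
from typing import Callable


def wait_for_rebalance(get_progress: Callable[[], float],
                       stall_timeout: float = 900.0,
                       poll_interval: float = 2.0,
                       clock: Callable[[], float] = time.monotonic,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once progress reaches 100, False if it stalls too long.

    get_progress is a hypothetical stand-in for the progress REST call;
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    last_progress = get_progress()
    last_change = clock()
    while last_progress < 100.0:
        sleep(poll_interval)
        progress = get_progress()
        now = clock()
        if progress > last_progress:
            # Progress moved forward: restart the stall timer.
            last_progress = progress
            last_change = now
        elif now - last_change >= stall_timeout:
            # No increase for stall_timeout seconds: report a stall.
            return False
    return True
```

Measuring the stall from the last observed increase, rather than from the start of the rebalance, lets a slow but advancing rebalance run to completion while still flagging one that sits at a fixed value such as 41.9%.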

Attachments


masterEvents (2.19 MB, 11/Jul/14 2:04 PM)
masterEvents.txt (19.63 MB, 01/Jul/14 1:41 PM)

Issue Links

is duplicated by

MB-11720 Backfilling the entire vbucket can starve other streams that also need to backfill

Closed


Activity

People

Assignee: Aruna Piravi (Inactive)

Reporter: Aruna Piravi (Inactive)

Votes: 0

Watchers: 4

Dates

Created: 30/Jun/14 5:23 PM

Updated: 22/Jul/14 5:21 PM

Resolved: 21/Jul/14 2:34 PM

Gerrit Reviews

There are no open Gerrit changes

KV+XDCR System test : Rebalance gets temporarily stuck but eventually proceeds to completion
