Details

Type: Bug
Resolution: Cannot Reproduce
Priority: Major
Fix Version/s: 2.0-beta
Affects Version/s: 2.0-beta
Component/s: couchbase-bucket, ns_server
Security Level: Public
Labels: None

Description

Seen in the logs from the reopening of ~~MB-4673~~, where it was the root cause of the rebalance failure. The main bug is in ns_server (because we expect backfills from replicas when the replica count is > 1).

Still, I think we should investigate a possible issue in ep-engine, or possibly an error in how ns_server ensures that replicas are built.

This happens after vbucket 285 is moved, when we establish the new replication chain. The second replica sees this:

[rebalance:info,2012-08-30T8:24:01.533,ns_1@10.3.121.94:<0.21104.17>:ebucketmigrator_srv:init:485]Some vbuckets were not yet ready to replicate from:
[285]

which means the open checkpoint for this vbucket was either missing or 0 on the first replica. Looking at the first replica, we indeed see this:

[ns_server:debug,2012-08-30T8:24:01.296,ns_1@10.3.121.98:<0.26188.21>:ebucketmigrator_srv:init:536]Reusing old upstream:
[{vbuckets,[17,18,19,20,21,22,23,24,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,321,322,323,324,325,329,330,331,332,333,334,335,336,480,481,482,483,484,485,486,487,513,514,515,516,517,518,519,520,521,626,627,628,629,630,631,632,633,975,976,977,978,979,980,981,982]},
 {name,<<"replication_ns_1@10.3.121.98">>},
 {takeover,false}]
[rebalance:debug,2012-08-30T8:24:01.300,ns_1@10.3.121.98:<0.26188.21>:ebucketmigrator_srv:init:555]upstream_sender pid: <0.26189.21>
[rebalance:info,2012-08-30T8:24:01.301,ns_1@10.3.121.98:<0.26188.21>:ebucketmigrator_srv:process_upstream:880]Initial stream for vbucket 285
[ns_server:debug,2012-08-30T8:24:01.311,ns_1@10.3.121.98:<0.28617.6>:mc_connection:do_delete_vbucket:118]Notifying mc_couch_events of vbucket deletion: bucket-1/285

Because the first replica replicates from the new master, it cannot be ahead of the master on the open checkpoint, and because we're doing reliable replica building, we expect vbucket 285 on the first replica to be up to date when the new replication chain is built. In fact, the logs indicate that the first replica had closed checkpoint 4 when the vbucket filter was changed.
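
To make the two conditions above concrete, here is a minimal Python sketch. It is not the ns_server code (the real check lives in the Erlang module ebucketmigrator_srv and uses checkpoint stats from ep-engine). The function names and the checkpoint values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def not_ready_vbuckets(
    wanted: List[int], open_checkpoints: Dict[int, Optional[int]]
) -> List[int]:
    """Return the vbuckets whose upstream open checkpoint is missing or 0.

    Hypothetical helper: mirrors the condition behind the
    "Some vbuckets were not yet ready to replicate from" message.
    """
    return [vb for vb in wanted if not open_checkpoints.get(vb)]


def replica_ahead_of_master(master_open: int, replica_open: Optional[int]) -> bool:
    """True if the replica's open checkpoint id exceeds the master's.

    Hypothetical helper: a replica fed by the master should never satisfy this.
    """
    return replica_open is not None and replica_open > master_open


# Assumed state of the first replica: 284 has an open checkpoint, while
# 285 has none (for example, because the vbucket was just deleted).
first_replica = {284: 5, 285: None}
print(not_ready_vbuckets([284, 285], first_replica))  # prints [285]
```

Under these assumptions the second replica would report vbucket 285 as not ready, which matches the log message it printed.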

People

Assignee: Chiyoung Seo (Inactive)

Reporter: Aleksey Kondratenko (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 30/Aug/12 9:10 PM

Updated: 23/Sep/16 2:47 PM

Resolved: 11/Sep/12 8:02 PM

backfill from new master even after reliable replica building procedure