Details
- Bug
- Resolution: Duplicate
- Critical
- 3.0
- Security Level: Public
- None
- 3.0.0-673
- Untriaged
- Centos 64-bit
- Unknown
- June 30 - July 18
Description
Unix XDCR:
source:
172.23.105.156
172.23.105.157
172.23.105.158
172.23.105.22
destination:
172.23.105.159
172.23.105.160
172.23.105.206
172.23.105.207
1) Initially I had 3 nodes per cluster with 4 buckets.
2) After running the load for about 1 day, I set up replication from the src cluster to the dest cluster.
3) Do a rebalance-in at src.
Do a rebalance-in at dest.
4) Select graceful failover for a node in the src cluster, add it back, and rebalance.
5) Select hard failover for a node in the src cluster, add it back, and rebalance.
The loader kept running during all of these steps.
Then all activity was stopped and the clusters were left untouched for a few hours.
comparing the number of items on the UI:
bucket| source 172.23.105.156| destination 172.23.105.159
AbRegNums|637204|637124
MsgsCalls|0|0
RevAB|16184145|16184341
UserInfo|122640|122161
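To make the size and direction of each mismatch explicit, here is a minimal sketch that computes source minus destination per bucket. The numbers are copied from the table above; the names `ui_counts` and `count_deltas` are illustrative only.

```python
# Item counts copied from the UI table above: bucket -> (source, destination).
ui_counts = {
    "AbRegNums": (637204, 637124),
    "MsgsCalls": (0, 0),
    "RevAB": (16184145, 16184341),
    "UserInfo": (122640, 122161),
}


def count_deltas(counts):
    """Return bucket -> (source count - destination count)."""
    return {bucket: src - dst for bucket, (src, dst) in counts.items()}


deltas = count_deltas(ui_counts)
for bucket, delta in deltas.items():
    status = "match" if delta == 0 else "MISMATCH"
    print(f"{bucket:10s} source - destination = {delta:+d}  ({status})")
```

A positive delta means the source reports more items than the destination. RevAB is the only bucket where the destination reports more.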
Some investigation, using the UserInfo bucket as an example:
Created a view to get all docs:
172.23.105.156: "total_rows":122641
172.23.105.159: "total_rows":122651
diff 159_UserInfo 156_UserInfo
1c1
< {"total_rows":122651,"rows":[
---
> {"total_rows":122641,"rows":[
9072d9071
<
,
41842d41840
<
,
45483a45482
>
,
54151d54149
<
,
55788a55787
>
,
99450d99448
<
,
99560d99557
<
,
99761d99757
<
,
111226d111221
<
,
111278d111272
<
,
111830d111823
<
,
112470d112462
<
,
112475d112466
<
,
119047d119037
<
,
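The hunk headers in the diff above can be tallied to check them against the two `total_rows` values. The sketch below assumes normal (non-unified) diff output and that each view row occupies one line in the dumped files; `count_changes` is an illustrative helper, not part of any Couchbase tool.

```python
import re

# Hunk headers copied from the diff output above (159_UserInfo vs 156_UserInfo).
hunk_headers = """
1c1 9072d9071 41842d41840 45483a45482 54151d54149 55788a55787
99450d99448 99560d99557 99761d99757 111226d111221 111278d111272
111830d111823 112470d112462 112475d112466 119047d119037
""".split()

# Normal diff header: LEFT[,LEFT_END] op RIGHT[,RIGHT_END], op in a/c/d.
HUNK_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def count_changes(headers):
    """Count lines only in the left file (d), lines only in the right
    file (a), and the number of changed hunks (c)."""
    counts = {"a": 0, "c": 0, "d": 0}
    for header in headers:
        match = HUNK_RE.match(header)
        if match is None:
            continue  # not a hunk header
        left_start, left_end, op, right_start, right_end = match.groups()
        if op == "d":
            counts["d"] += int(left_end or left_start) - int(left_start) + 1
        elif op == "a":
            counts["a"] += int(right_end or right_start) - int(right_start) + 1
        else:
            counts["c"] += 1
    return counts


counts = count_changes(hunk_headers)
print(counts)
```

With the headers above, 12 lines appear only in the destination dump and 2 only in the source dump. The net difference of 10 is consistent with 122651 - 122641, and the single changed hunk is the `total_rows` header line itself.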
[root@centos-64-x64 tmp]# cat 159_UserInfo |grep DEV_efdhiftwqtcstipyzskubgzrwmlclkkwiuztxnsz
,
[root@centos-64-x64 tmp]# cat 156_UserInfo |grep DEV_efdhiftwqtcstipyzskubgzrwmlclkkwiuztxnsz
So, for example, DEV_efdhiftwqtcstipyzskubgzrwmlclkkwiuztxnsz does not exist on the source cluster but exists on the destination.
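Rather than grepping for one key at a time, the two dumps could be compared by document id in one pass. This is only a sketch: it assumes each dump is a complete JSON view response whose rows carry an `id` field, and the sample bodies and doc ids below are made up for illustration. They are not data from these clusters.

```python
import json


def row_ids(view_response_text):
    """Return the set of document ids found in a view response body."""
    body = json.loads(view_response_text)
    return {row["id"] for row in body["rows"]}


def compare_views(source_text, dest_text):
    """Return (ids only on destination, ids only on source), both sorted."""
    src_ids = row_ids(source_text)
    dst_ids = row_ids(dest_text)
    return sorted(dst_ids - src_ids), sorted(src_ids - dst_ids)


# Hypothetical stand-ins for the contents of 156_UserInfo and 159_UserInfo.
source_sample = (
    '{"total_rows":2,"rows":['
    '{"id":"doc_a","key":"doc_a","value":null},'
    '{"id":"doc_b","key":"doc_b","value":null}]}'
)
dest_sample = (
    '{"total_rows":2,"rows":['
    '{"id":"doc_b","key":"doc_b","value":null},'
    '{"id":"doc_c","key":"doc_c","value":null}]}'
)

only_on_dest, only_on_source = compare_views(source_sample, dest_sample)
print("only on destination:", only_on_dest)
print("only on source:", only_on_source)
```

For the real files, the two sample strings would be replaced by the contents of 156_UserInfo and 159_UserInfo read from disk.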
stats for the items:
source:
[root@centos-64-x64 bin]# ./cbstats 172.23.105.156:11210 -b UserInfo all |grep items
curr_items: 29687
curr_items_tot: 57911
curr_temp_items: 6
ep_access_scanner_num_items: 2
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 144376
ep_total_del_items: 15800
ep_total_new_items: 73771
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 29687
vb_pending_curr_items: 0
vb_replica_curr_items: 28224
[root@centos-64-x64 bin]# ./cbstats 172.23.105.157:11210 -b UserInfo all |grep items
curr_items: 29002
curr_items_tot: 57323
curr_temp_items: 1
ep_access_scanner_num_items: 0
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 145805
ep_total_del_items: 25123
ep_total_new_items: 82469
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 29002
vb_pending_curr_items: 0
vb_replica_curr_items: 28321
[root@centos-64-x64 bin]# ./cbstats 172.23.105.158:11210 -b UserInfo all |grep items
curr_items: 28687
curr_items_tot: 57285
curr_temp_items: 6
ep_access_scanner_num_items: 4
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 143946
ep_total_del_items: 27137
ep_total_new_items: 84438
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 28687
vb_pending_curr_items: 0
vb_replica_curr_items: 28598
[root@centos-64-x64 bin]# ./cbstats 172.23.105.22:11210 -b UserInfo all |grep items
curr_items: 35264
curr_items_tot: 63796
curr_temp_items: 5
ep_access_scanner_num_items: 435
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 64697
ep_total_del_items: 13925
ep_total_new_items: 77727
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 35264
vb_pending_curr_items: 0
vb_replica_curr_items: 28532
destination:
[root@centos-64-x64 bin]# ./cbstats localhost:11210 -b UserInfo all |grep items
curr_items: 29646
curr_items_tot: 60472
curr_temp_items: 42
ep_access_scanner_num_items: 15126
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 108378
ep_total_del_items: 31032
ep_total_new_items: 91551
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 29646
vb_pending_curr_items: 0
vb_replica_curr_items: 30826
[root@centos-64-x64 bin]# ./cbstats 172.23.105.159:11210 -b UserInfo all |grep items
curr_items: 29646
curr_items_tot: 60472
curr_temp_items: 42
ep_access_scanner_num_items: 15126
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 108378
ep_total_del_items: 31032
ep_total_new_items: 91551
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 29646
vb_pending_curr_items: 0
vb_replica_curr_items: 30826
[root@centos-64-x64 bin]# ./cbstats 172.23.105.160:11210 -b UserInfo all |grep items
curr_items: 28824
curr_items_tot: 60081
curr_temp_items: 179
ep_access_scanner_num_items: 14975
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 105485
ep_total_del_items: 32127
ep_total_new_items: 92393
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 28824
vb_pending_curr_items: 0
vb_replica_curr_items: 31257
[root@centos-64-x64 bin]# ./cbstats 172.23.105.206:11210 -b UserInfo all |grep items
curr_items: 28449
curr_items_tot: 60008
curr_temp_items: 242
ep_access_scanner_num_items: 15176
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 103941
ep_total_del_items: 32142
ep_total_new_items: 92398
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 28449
vb_pending_curr_items: 0
vb_replica_curr_items: 31559
[root@centos-64-x64 bin]# ./cbstats 172.23.105.207:11210 -b UserInfo all |grep items
curr_items: 35242
curr_items_tot: 64251
curr_temp_items: 27
ep_access_scanner_num_items: 15963
ep_chk_max_items: 5000
ep_diskqueue_items: 0
ep_items_rm_from_checkpoints: 114837
ep_total_del_items: 34698
ep_total_new_items: 98972
ep_uncommitted_items: 0
ep_warmup_min_items_threshold: 100
vb_active_curr_items: 35242
vb_pending_curr_items: 0
vb_replica_curr_items: 29009
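As a cross-check, the per-node `curr_items` values above can be summed per cluster and compared with the UserInfo counts shown in the UI. The sketch below copies the values from the cbstats output. The `localhost` run is left out because it reports the same numbers as 172.23.105.159, and `parse_stats` is an illustrative helper for the `name: value` lines, not a Couchbase API.

```python
def parse_stats(text):
    """Parse 'name: value' lines into a dict, keeping integer values only."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats


# curr_items per node, copied from the cbstats output above.
source_curr_items = {
    "172.23.105.156": 29687,
    "172.23.105.157": 29002,
    "172.23.105.158": 28687,
    "172.23.105.22": 35264,
}
dest_curr_items = {
    "172.23.105.159": 29646,
    "172.23.105.160": 28824,
    "172.23.105.206": 28449,
    "172.23.105.207": 35242,
}

source_total = sum(source_curr_items.values())
dest_total = sum(dest_curr_items.values())
print("source curr_items total:     ", source_total)
print("destination curr_items total:", dest_total)
print("difference:                  ", source_total - dest_total)

# Example of parsing a pasted excerpt (first lines of the .156 output).
excerpt = "curr_items: 29687\ncurr_items_tot: 57911\ncurr_temp_items: 6"
print(parse_stats(excerpt))
```

The totals come to 122640 for the source and 122161 for the destination, which agrees with the UserInfo row in the UI comparison, so the UI counts match the sum of active items reported by the nodes.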
clusters available:
source http://172.23.105.156:8091/
destination http://172.23.105.159:8091/
There is also a ticket describing the same problem, but its status is unclear: MB-11095
logs are loaded...
Attachments
Issue Links
- relates to MB-11104 [XDCR on UPR, internal rep= TAP] item count mismatch: expired items not deleted, shows up in queries, get() returns value (Closed)