Description
Scenario
1. Create a 7-node cluster
2. Create the default bucket and add 100K items
3. Start a graceful failover of 1 node
4. During the graceful failover, kill memcached on 3 other nodes; this causes the graceful failover to fail
5. Restart the graceful failover and let it run to completion
6. Do a full recovery of the failed-over node and rebalance
7. During the rebalance, kill memcached on 3 other nodes; this causes the rebalance to fail
8. Restart the rebalance and let it run to completion
After step 8, we collect the data using cbtransfer and compare it with the data loaded in step 2. Keys are missing from the cbtransfer output.
Note that no mutations run between step 3 and step 8. We always read from the couch store after the queues have drained and replication is complete. Before running cbtransfer, we also verified the item counts and the data items themselves.
This looks like a bug in cbtransfer.
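For anyone reproducing this, the comparison amounts to a set difference between the keys loaded in step 2 and the keys found in the cbtransfer output. The sketch below is only an illustration of that check, not the test framework's actual code: the function names `missing_keys` and `load_keys` and the one-key-per-line file format are assumptions made for this example.

```python
from typing import Iterable, List, Set


def missing_keys(before: Iterable[str], after: Iterable[str]) -> List[str]:
    """Return the keys present in `before` but absent from `after`, sorted."""
    after_set: Set[str] = set(after)
    return sorted(k for k in set(before) if k not in after_set)


def load_keys(path: str) -> Set[str]:
    """Read one key per line from a text file (hypothetical dump format)."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


if __name__ == "__main__":
    # Tiny in-memory example; real runs would load both key sets from files.
    baseline = ["failover1", "failover2", "failover3"]
    final = ["failover1", "failover3"]
    print(missing_keys(baseline, final))
```

Running the check in both directions (swapping the arguments) would also show whether the cbtransfer output contains keys that were never loaded.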
Missing keys
failover97727
failover96541
failover19942
failover72566
failover98994
failover21107
failover17597
failover47535
failover58469
failover47247
failover79250
failover95182
failover48606
failover885
failover98366
failover72214
failover24016
failover74124
failover51288
failover41177
failover47925
failover19220
failover6008
failover40281
failover94916
failover20361
failover29410
failover29800
failover61528
failover90103
failover73072
failover17817
failover46753
failover27955
failover91997
failover25502
failover99672
failover32149
failover19552
failover34279
failover26723
failover16113
failover79522
failover96951
failover11737
failover15332
failover70253
failover78036
failover20413
failover45200
failover13192
failover14154
failover31368
failover88099
failover44684
failover49460
failover25882
failover62699
failover12486
failover81678
failover23632
failover15850
failover27237
failover505
failover11045
failover49312
failover94496
failover95760
failover24186
failover10941
failover84769
failover72976
failover77295
failover20993
failover15440
failover12516
failover277
failover38589
failover92636
failover22844
failover72384
failover73700
failover95012
failover82459
failover22326
failover87548
failover87958
failover98414
failover78744
failover23140
failover27545
failover10223
failover61938
failover3119
failover14626
failover79932
failover74656
failover12896
failover41605
failover93322
failover42424
failover92144
failover99090
failover94274
failover91365
failover77867
failover44066
failover24764
failover40311
failover38809
failover20803
failover68259
failover54209
failover76163
failover70931
failover6198
failover44714
failover18734
failover51318
failover43642
failover98584
failover49870
failover43130
failover82849
failover41795
failover28104
failover13002
failover10551
failover76611
failover91807
failover17407
failover22454
failover91587
failover67618
failover29580
failover16661
failover915
failover90671
failover495
failover12906
failover9449
failover42356
failover97055
failover98804
failover77477
failover29990
failover40463
failover42834
failover45390
failover29362
failover9859
failover57028
failover38419
failover77305
failover17987
failover71035
failover76781
failover45962
failover71747
failover75820
failover18046
failover91417
failover20583
failover12264
failover25492
failover64439
failover94886
failover70521
failover28676
failover48796
failover45572
failover16083
failover25912
failover49282
failover64829
failover96233
failover26051
failover38999
failover99100
failover3089
failover48174
failover5229
failover21097
failover93450
failover37058
failover25270
failover46021
failover93840
failover62709
failover28094
failover40873
failover13770
failover58879
failover90093
failover88109
failover73690
failover67788
failover17375
failover94506
failover75342
failover54399
failover52139
failover75430
failover21675a