Details
Description
Build
-----
4.0.0-3494
Testcase
--------
./testrunner -i INI_FILE.ini get-cbcollect-info=True,get-logs=False,stop-on-failure=True,get-coredumps=True,fail_on_errors=1,GROUP=ONLINE,upgrade_version=4.0.0-3528-rel,initial_vbuckets=1024 -t xdcr.upgradeXDCR.UpgradeTests.online_cluster_upgrade,initial_version=2.5.0-1059-rel,bucket_topology=default:1>2;standard_bucket_1:1<2;sasl_bucket_1:1><2,expires=500,GROUP=ONLINE
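For reference, the `bucket_topology` parameter above encodes the replication direction per bucket (`1>2` C1 to C2, `1<2` C2 to C1, `1><2` bidirectional). A minimal sketch of parsing it; the helper name is mine, not part of testrunner:

```python
def parse_bucket_topology(spec):
    """Parse 'bucket:1>2;bucket:1<2;bucket:1><2' into a dict mapping
    bucket name -> direction, where direction is '>', '<' or '><'."""
    topology = {}
    for entry in spec.split(";"):
        bucket, rule = entry.split(":", 1)
        # rule looks like '1>2', '1<2' or '1><2'; drop the cluster ids
        topology[bucket] = rule.strip("12")
    return topology

print(parse_bucket_topology("default:1>2;standard_bucket_1:1<2;sasl_bucket_1:1><2"))
# → {'default': '>', 'standard_bucket_1': '<', 'sasl_bucket_1': '><'}
```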
Steps
=====
1. Online upgrade C1 [.11, .16] from 2.5.0 to 4.0 using extra node .21. All replications remain intact.
2. Online upgrade C2 [.19, .20] from 2.5.0 using .21.
At the end of this test (once C2 has been upgraded), the replications from C1 are missing.
Live cluster: http://10.1.2.11:8091/index.html#sec=replications
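For context, the "rebalance params" lines in the test log below show the form body testrunner POSTs to ns_server's /controller/rebalance endpoint. A minimal sketch of building that body (the helper name is mine, not testrunner's):

```python
from urllib.parse import quote

def rebalance_params(known_nodes, ejected_nodes,
                     user="Administrator", password="password"):
    """Build the /controller/rebalance form body, matching the
    'rebalance params' lines in the test log (otpNode names like
    'ns_1@10.1.2.21'; '@' and ',' are percent-encoded)."""
    return "password={}&ejectedNodes={}&user={}&knownNodes={}".format(
        password,
        quote(",".join(ejected_nodes), safe=""),
        user,
        quote(",".join(known_nodes), safe=""))

# Reproduces the swap-out request that ejects .21 from C2:
print(rebalance_params(
    ["ns_1@10.1.2.20", "ns_1@10.1.2.19", "ns_1@10.1.2.21"],
    ["ns_1@10.1.2.21"]))
```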
On .11 -
StatisticsManager 2015-07-25T20:11:37.811-07:00 [INFO] Rounter Router_dcp_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.11:11210_0 = map[xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_15:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_2:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_0:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_21:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_4:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_12:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_28:18 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_30:17 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_26:19 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_25:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_17:17 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_1:13 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_22:11 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_6:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_3:13 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_24:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_27:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_29:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_20:18 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_31:18 
xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_10:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_14:13 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_7:18 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_9:14 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_18:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_19:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_23:18 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_8:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_11:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_5:15 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_13:16 xmem_dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1_10.1.2.21:11210_16:15]
:
ReplicationSpecService 2015-07-25T20:11:41.328-07:00 [ERROR] spec dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1 refers to non-existent target bucket "sasl_bucket_1"
ReplicationSpecService 2015-07-25T20:11:41.328-07:00 [ERROR] Replication specification dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1 is no longer valid, garbage collect it. error=spec dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1 refers to non-existent target bucket "sasl_bucket_1"
ReplicationSpecChangeListener 2015-07-25T20:11:41.335-07:00 [INFO] metakvCallback called on listener ReplicationSpecChangeListener with path = /replicationSpec/dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1
ReplicationSpecService 2015-07-25T20:11:41.335-07:00 [INFO] ReplicationSpecServiceCallback called on path = /replicationSpec/dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1
ReplicationSpecChangeListener 2015-07-25T20:11:41.335-07:00 [INFO] specChangedCallback called on id = dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1, oldSpec=&{dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1 sasl_bucket_1 dcbe280157d4856d3c2b15315d6683b5 sasl_bucket_1 0xc20816e1b0 [131 108 0 0 0 1 104 2 109 0 0 0 32 50 48 99 51 100 49 52 53 54 53 50 54 101 98 57 50 100 100 99 55 48 55 102 53 55 50 100 51 54 100 57 100 104 2 97 1 110 5 0 150 197 40 207 14 106]}, newSpec=<nil>
ReplicationSpecChangeListener 2015-07-25T20:11:41.335-07:00 [INFO] old spec settings=&{xmem true 1800 500 2048 10 256 2 64 1000 50 Info 1000 <nil>}
PipelineManager 2015-07-25T20:11:41.335-07:00 [ERROR] Invalid replication status dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1, failed to retrieve spec. err=Requested resource not found
PipelineManager 2015-07-25T20:11:41.335-07:00 [INFO] Stopping pipeline dcbe280157d4856d3c2b15315d6683b5/sasl_bucket_1/sasl_bucket_1 since the replication spec has been deleted
At this point, per the test log, we had just removed the extra node .21, to which .11 was replicating, and that removal caused the replications to be deleted.
Please note that .19 and .20 were added to C2 only minutes before .21 was removed; goxdcr should be able to detect the new nodes and look for the target buckets on them.
Also note that C1 now points to the new C2 node (.19) as the remote cluster reference IP. While goxdcr does update the remote cluster reference to .19, for the replications it still consults .21, and deletes them because .21 is no longer part of C2.
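The failure mode can be illustrated with a toy model (all names are mine, not goxdcr's actual API): validating a replication spec through a cached node that has since left the cluster makes the target bucket look non-existent, whereas refreshing to the current node list finds it.

```python
# Toy model of the stale-reference behavior described above; purely
# illustrative, not goxdcr's real ReplicationSpecService logic.
class RemoteCluster:
    def __init__(self, nodes):
        # node address -> set of bucket names hosted there
        self.nodes = nodes

def spec_is_valid(cached_node, bucket, cluster, refresh=False):
    """Check the target bucket via the cached node; with refresh=True,
    fall back to the cluster's current nodes (analogous to the fix
    merged for MB-15873, which refreshes the remote ref first)."""
    if cached_node in cluster.nodes:
        return bucket in cluster.nodes[cached_node]
    if refresh:
        return any(bucket in b for b in cluster.nodes.values())
    return False  # stale node gone -> bucket "non-existent" -> spec GC'd

# .21 was swapped out of C2; only .19/.20 remain and host the bucket.
c2 = RemoteCluster({"10.1.2.19": {"sasl_bucket_1"},
                    "10.1.2.20": {"sasl_bucket_1"}})
print(spec_is_valid("10.1.2.21", "sasl_bucket_1", c2))                # → False (spec wrongly deleted)
print(spec_is_valid("10.1.2.21", "sasl_bucket_1", c2, refresh=True))  # → True
```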
(Clocks on the Jenkins slave and the cluster nodes are slightly out of sync.)
2015-07-25 20:11:37,657 - root - INFO - adding remote node @10.1.2.19:8091 to this cluster @10.1.2.21:8091
2015-07-25 20:11:40,082 - root - INFO - adding node 10.1.2.20:8091 to cluster
2015-07-25 20:11:40,083 - root - INFO - adding remote node @10.1.2.20:8091 to this cluster @10.1.2.21:8091
2015-07-25 20:11:43,251 - root - INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%4010.1.2.20%2Cns_1%4010.1.2.19%2Cns_1%4010.1.2.21
2015-07-25 20:11:43,262 - root - INFO - rebalance operation started
2015-07-25 20:11:43,277 - root - INFO - rebalance percentage : 0.00 %
2015-07-25 20:11:53,297 - root - INFO - rebalance percentage : 6.83 %
2015-07-25 20:12:03,317 - root - INFO - rebalance percentage : 14.19 %
2015-07-25 20:12:13,343 - root - INFO - rebalance percentage : 21.96 %
2015-07-25 20:12:23,363 - root - INFO - rebalance percentage : 28.91 %
2015-07-25 20:12:33,383 - root - INFO - rebalance percentage : 32.99 %
2015-07-25 20:12:43,403 - root - INFO - rebalance percentage : 39.15 %
2015-07-25 20:12:53,432 - root - INFO - rebalance percentage : 46.31 %
2015-07-25 20:13:03,451 - root - INFO - rebalance percentage : 53.43 %
2015-07-25 20:13:13,481 - root - INFO - rebalance percentage : 60.67 %
2015-07-25 20:13:23,499 - root - INFO - rebalance percentage : 64.76 %
2015-07-25 20:13:33,518 - root - INFO - rebalance percentage : 69.43 %
2015-07-25 20:13:43,547 - root - INFO - rebalance percentage : 76.63 %
2015-07-25 20:13:53,567 - root - INFO - rebalance percentage : 83.95 %
2015-07-25 20:14:03,586 - root - INFO - rebalance percentage : 91.32 %
2015-07-25 20:14:13,605 - root - INFO - rebalance percentage : 96.73 %
2015-07-25 20:14:23,644 - root - INFO - rebalancing was completed with progress: 100% in 160.381052971 sec
2015-07-25 20:14:23,645 - root - INFO - Rebalance in all 4.0.0-3528-enterprise nodes completed
2015-07-25 20:14:23,745 - root - INFO - Node versions in cluster [u'4.0.0-3528-enterprise', u'4.0.0-3528-enterprise', u'4.0.0-3528-enterprise']
2015-07-25 20:14:23,745 - root - INFO - sleep for 15 secs. ...
2015-07-25 20:14:38,777 - root - INFO - /diag/eval status on 10.1.2.21:8091: True content: 'ns_1@10.1.2.21' command: node(global:whereis_name(ns_orchestrator))
2015-07-25 20:14:38,778 - root - INFO - after rebalance in the master is ns_1@10.1.2.21
2015-07-25 20:14:38,778 - root - INFO - Rebalancing out all old version nodes
2015-07-25 20:14:39,707 - root - INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.1.2.21&user=Administrator&knownNodes=ns_1%4010.1.2.20%2Cns_1%4010.1.2.19%2Cns_1%4010.1.2.21
2015-07-25 20:14:39,877 - root - INFO - rebalance operation started
2015-07-25 20:14:39,896 - root - INFO - rebalance percentage : 0.00 %
2015-07-25 20:14:49,916 - root - INFO - rebalance percentage : 11.26 %
2015-07-25 20:14:59,947 - root - INFO - rebalance percentage : 22.83 %
2015-07-25 20:15:09,966 - root - INFO - rebalance percentage : 33.33 %
2015-07-25 20:15:19,990 - root - INFO - rebalance percentage : 33.77 %
2015-07-25 20:15:30,017 - root - INFO - rebalance percentage : 44.87 %
2015-07-25 20:15:40,097 - root - INFO - rebalance percentage : 54.92 %
2015-07-25 20:15:50,124 - root - INFO - rebalance percentage : 65.04 %
2015-07-25 20:16:00,144 - root - INFO - rebalance percentage : 66.67 %
2015-07-25 20:16:10,175 - root - INFO - rebalance percentage : 75.06 %
2015-07-25 20:16:20,195 - root - INFO - rebalance percentage : 85.72 %
2015-07-25 20:16:30,216 - root - INFO - rebalance percentage : 97.01 %
2015-07-25 20:16:40,238 - root - INFO - rebalance percentage : 100.00 %
2015-07-25 20:16:43,734 - root - ERROR - socket error while connecting to http
Attachments
For Gerrit Dashboard: MB-15873

| # | Subject | Branch | Project | Status | CR | V |
|---|---------|--------|---------|--------|----|---|
| 53940,4 | MB-15873 refresh remote ref when validating repl spec | master | goxdcr | MERGED | +2 | +1 |