Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version: 5.0.0
- Triage: Triaged
- Regression: Yes
Description
Build : 5.0.0-1648
Tests : To reproduce this issue, add the following commands to a shell script and run the script:

./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp2,query={"query":"Delta"},expected_hits=4

./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp_neg1,query={"query":"Delta"},expected_hits=0
Issue:
In some FTS tests that use the travel-sample bucket, the bucket is deleted in the test's cleanup phase. The subsequent test's setup method then rebalances the nodes out of the cluster and adds them back. This rebalance-in operation fails with the following error:
[2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 2, u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]}\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.957Z', u'module': u'ns_orchestrator', u'tstamp': 1484254825957, u'type': u'info'}
[2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.956Z', u'module': u'ns_rebalancer', u'tstamp': 1484254825956, u'type': u'critical'}
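The failure pattern above (bucket deletion times out, then the next rebalance exits with buckets_shutdown_wait_failed) suggests the setup phase should wait until the bucket is actually gone before starting the rebalance. Below is a minimal sketch of such a guard; the function and parameter names are illustrative, not testrunner's actual API. The cluster query is injected as a callable so the wait logic can be exercised offline; against a live node it would parse the bucket list returned by the REST endpoint GET /pools/default/buckets.

```python
import time

def wait_for_bucket_deletion(get_bucket_names, bucket, timeout=120, interval=2):
    """Poll until `bucket` disappears from the cluster's bucket list.

    `get_bucket_names` is any zero-argument callable returning the current
    bucket names (e.g. parsed from GET /pools/default/buckets). Returns True
    once the bucket is gone, or False if it is still present after `timeout`
    seconds.
    """
    deadline = time.time() + timeout
    while True:
        if bucket not in get_bucket_names():
            return True            # bucket fully shut down; safe to rebalance
        if time.time() >= deadline:
            return False           # still shutting down; caller should bail out
        time.sleep(interval)
```

A wait like this between the previous test's bucket deletion and the next test's rebalance-in would avoid racing ns_server's asynchronous bucket shutdown.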
Here is the testrunner console output from the previous test's cleanup and the subsequent test's setup:
[2017-01-12 12:59:28,073] - [fts_base:2801] INFO - ==== FTSbasetests cleanup is started for test #132 test_doc_config ====
[2017-01-12 12:59:28,095] - [fts_base:888] INFO - Deleting fulltext-index travel-index on 172.23.108.213
[2017-01-12 12:59:28,139] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
[2017-01-12 12:59:28,308] - [remote_util:215] INFO - Connected to 172.23.108.213
[2017-01-12 12:59:28,606] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
[2017-01-12 12:59:29,061] - [remote_util:2511] INFO - running command.raw on 172.23.108.213: ls /opt/couchbase/var/lib/couchbase/data/@fts |grep travel-index*.pindex | wc -l
[2017-01-12 12:59:29,150] - [remote_util:2548] INFO - command executed successfully
[2017-01-12 12:59:29,152] - [fts_base:1605] INFO - 0
[2017-01-12 12:59:29,267] - [fts_base:895] INFO - Validated: all index files for travel-index deleted from disk
[2017-01-12 12:59:29,267] - [fts_base:1697] INFO - removing nodes from cluster ...
[2017-01-12 12:59:29,282] - [fts_base:1699] INFO - cleanup [ip:172.23.108.213 port:8091 ssh_username:root]
[2017-01-12 12:59:29,307] - [bucket_helper:138] INFO - deleting existing buckets [u'travel-sample'] on 172.23.108.213
[2017-01-12 12:59:29,307] - [bucket_helper:140] INFO - remove bucket travel-sample ...
[2017-01-12 12:59:59,323] - [rest_client:790] ERROR - [http://172.23.108.213:8091/pools/default/buckets/travel-sample] error 500 reason: unknown {"_":"Bucket deletion not yet complete, but will continue.\r\n"}
[2017-01-12 12:59:59,324] - [rest_client:1880] WARNING - Bucket deletion timed out waiting for all nodes
[2017-01-12 12:59:59,342] - [bucket_helper:153] ERROR - Unable to get timings for bucket
[2017-01-12 12:59:59,342] - [bucket_helper:154] INFO - deleted bucket : travel-sample from 172.23.108.213
[2017-01-12 12:59:59,343] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
[2017-01-12 12:59:59,349] - [rest_client:134] INFO - node 172.23.108.213 existing buckets : []
[2017-01-12 12:59:59,382] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.108.213:8091
[2017-01-12 12:59:59,388] - [cluster_helper:80] INFO - ns_server @ 172.23.108.213:8091 is running
Cluster instance shutdown with force
[2017-01-12 12:59:59,389] - [fts_base:2807] INFO - ==== FTSbasetests cleanup is finished for test #132 test_doc_config ====
ok

----------------------------------------------------------------------
Ran 1 test in 107.368s

OK

Cluster instance shutdown with force
grabbing core dumps files from 172.23.108.213
[2017-01-12 12:59:59,392] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
[2017-01-12 12:59:59,560] - [remote_util:215] INFO - Connected to 172.23.108.213
[2017-01-12 12:59:59,863] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
[2017-01-12 13:00:00,337] - [remote_util:505] INFO - 172.23.108.213 **** The version file /opt/couchbase/ VERSION.txt exists
looking for crashes on 172.23.108.213 ...
erl_crash files under /opt/couchbase/var/lib/couchbase/
core* files under /opt/couchbase/var/lib/couchbase/
core* files under /tmp/
breakpad *dmp files under /opt/couchbase/var/lib/couchbase/
crashes not found on 172.23.108.213
grabbing core dumps files from 172.23.109.49
[2017-01-12 13:00:00,620] - [remote_util:181] INFO - connecting to 172.23.109.49 with username:root password:couchbase ssh_key:
[2017-01-12 13:00:00,788] - [remote_util:215] INFO - Connected to 172.23.109.49
[2017-01-12 13:00:01,057] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
[2017-01-12 13:00:01,512] - [remote_util:505] INFO - 172.23.109.49 **** The version file /opt/couchbase/ VERSION.txt exists
looking for crashes on 172.23.109.49 ...
erl_crash files under /opt/couchbase/var/lib/couchbase/
core* files under /opt/couchbase/var/lib/couchbase/
core* files under /tmp/
breakpad *dmp files under /opt/couchbase/var/lib/couchbase/
crashes not found on 172.23.109.49
summary so far suite fts.stable_topology_fts.StableTopFTS , pass 123 , fail 9
failures so far...
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
testrunner logs, diags and results are available under /data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_132
Logs will be stored at /data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_133
./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp_neg1,query={"query": "Delta"},expected_hits=0

Test Input params:
{'get-coredumps': 'True', 'logs_folder': '/data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_133', 'GROUP': 'ALL', 'expected_hits': '0', 'conf_file': 'conf/fts/py-fts-custmap-rqg-queries.conf', 'num_nodes': 2, 'cluster_name': 'FTS_INI_FILE', 'cluster': 'D+F', 'stop-on-failure': 'False', 'mode': 'docid_regexp_neg1', 'case_number': 133, 'get-logs': 'False', 'query': '{"query": "Delta"}', 'get-cbcollect-info': 'True', 'spec': 'py-fts-custmap-rqg-queries', 'ini': 'FTS_INI_FILE.ini'}
[2017-01-12 13:00:01,696] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
[2017-01-12 13:00:01,864] - [remote_util:215] INFO - Connected to 172.23.108.213
[2017-01-12 13:00:02,132] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
[2017-01-12 13:00:02,652] - [remote_util:505] INFO - 172.23.108.213 **** The version file /opt/couchbase/ VERSION.txt exists
looking for Erlang/Memcached crashes on 172.23.108.213 ...
dump files not found on 172.23.108.213
[2017-01-12 13:00:02,792] - [remote_util:181] INFO - connecting to 172.23.109.49 with username:root password:couchbase ssh_key:
[2017-01-12 13:00:02,960] - [remote_util:215] INFO - Connected to 172.23.109.49
[2017-01-12 13:00:03,236] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
[2017-01-12 13:00:03,712] - [remote_util:505] INFO - 172.23.109.49 **** The version file /opt/couchbase/ VERSION.txt exists
test_doc_config (fts.stable_topology_fts.StableTopFTS) ... looking for Erlang/Memcached crashes on 172.23.109.49 ...
dump files not found on 172.23.109.49
[2017-01-12 13:00:03,879] - [fts_base:2731] INFO - ==== FTSbasetests setup is started for test #133 test_doc_config ====
[2017-01-12 13:00:03,892] - [fts_base:1697] INFO - removing nodes from cluster ...
[2017-01-12 13:00:03,903] - [fts_base:1699] INFO - cleanup [ip:172.23.108.213 port:8091 ssh_username:root]
[2017-01-12 13:00:03,919] - [bucket_helper:138] INFO - deleting existing buckets [] on 172.23.108.213
[2017-01-12 13:00:03,951] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.108.213:8091
[2017-01-12 13:00:03,957] - [cluster_helper:80] INFO - ns_server @ 172.23.108.213:8091 is running
[2017-01-12 13:00:03,957] - [fts_base:1642] INFO - Initializing Cluster ...
[2017-01-12 13:00:04,895] - [task:117] INFO - server: ip:172.23.108.213 port:8091 ssh_username:root, nodes/self: {'ip': u'127.0.0.1', 'availableStorage': [], 'rest_username': '', 'id': u'ns_1@127.0.0.1', 'uptime': u'8925', 'mcdMemoryReserved': 3048, 'hostname': u'172.23.108.213:8091', 'storage': [<membase.api.rest_client.NodeDataStorage object at 0x7f5391f36650>], 'moxi': 11211, 'port': u'8091', 'version': u'5.0.0-1648-enterprise', 'memcached': 11210, 'status': u'healthy', 'clusterCompatibility': 327680, 'curr_items': 0, 'services': [u'fts', u'kv'], 'rest_password': '', 'clusterMembership': u'active', 'memoryFree': 1955012608, 'memoryTotal': 3996319744, 'memoryQuota': 2032, 'mcdMemoryAllocated': 3048, 'os': u'x86_64-unknown-linux-gnu', 'ports': []}
[2017-01-12 13:00:04,895] - [rest_client:891] INFO - pools/default params : memoryQuota=2032
[2017-01-12 13:00:04,900] - [rest_client:928] INFO - settings/indexes params : storageMode=forestdb
[2017-01-12 13:00:04,905] - [rest_client:807] INFO - settings/web params on 172.23.108.213:8091:username=Administrator&password=password&port=8091
[2017-01-12 13:00:05,945] - [rest_client:1304] INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%40127.0.0.1
[2017-01-12 13:00:05,951] - [rest_client:1308] INFO - rebalance operation started
[2017-01-12 13:00:05,969] - [rest_client:1439] INFO - rebalance percentage : 0.00 %
[2017-01-12 13:00:15,993] - [rest_client:1439] INFO - rebalance percentage : 0.00 %
[2017-01-12 13:00:26,055] - [rest_client:1422] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
[2017-01-12 13:00:26,079] - [rest_client:2702] INFO - Latest logs from UI on 172.23.108.213:
[2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 2, u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]}\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.957Z', u'module': u'ns_orchestrator', u'tstamp': 1484254825957, u'type': u'info'}
[2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.956Z', u'module': u'ns_rebalancer', u'tstamp': 1484254825956, u'type': u'critical'}
[2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@127.0.0.1'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:05.944Z', u'module': u'ns_orchestrator', u'tstamp': 1484254805944, u'type': u'info'}
[2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "travel-sample" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-01-12T12:59:29.410Z', u'module': u'ns_memcached', u'tstamp': 1484254769410, u'type': u'info'}
[2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Loading sample bucket travel-sample failed: {failed_to_load_samples_with_status,\n 1}', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:49.519Z', u'module': u'samples_loader_tasks', u'tstamp': 1484254729519, u'type': u'critical'}
[2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Bucket "travel-sample" loaded on node \'ns_1@127.0.0.1\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:33.630Z', u'module': u'ns_memcached', u'tstamp': 1484254713630, u'type': u'info'}
[2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 12, u'text': u'Created bucket "travel-sample" of type: couchbase\n[{num_replicas,1},\n {replica_index,false},\n {ram_quota,104857600},\n {auth_type,sasl},\n {flush_enabled,false},\n {num_threads,3},\n {eviction_policy,value_only},\n {conflict_resolution_type,seqno}]', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:33.521Z', u'module': u'menelaus_web', u'tstamp': 1484254713521, u'type': u'info'}
[2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 11, u'text': u'Deleted bucket "default"\n', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:32.467Z', u'module': u'menelaus_web', u'tstamp': 1484254712467, u'type': u'info'}
[2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "default" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:31.888Z', u'module': u'ns_memcached', u'tstamp': 1484254711888, u'type': u'info'}
[2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@127.0.0.1\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:25.280Z', u'module': u'ns_memcached', u'tstamp': 1484254705280, u'type': u'info'}
ERROR

NOTE : This issue is not seen with other buckets; with travel-sample, it is consistently reproducible.