Couchbase Server / MB-22317

Spatial view deletion hangs


Details

    • Triage: Triaged
    • Is this a Regression?: Yes

    Description

      Build : 5.0.0-1648

      Tests : To reproduce this issue, add the following commands to a shell script and run it (a minimal wrapper sketch follows the commands).

      ./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp2,query={"query":"Delta"},expected_hits=4

      ./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp_neg1,query={"query":"Delta"},expected_hits=0
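      A minimal wrapper sketch for the above, assuming the file is saved as repro.sh next to testrunner and that FTS_INI_FILE.ini points at your cluster; the -t argument is single-quoted so the shell preserves the double quotes inside the query JSON:

       #!/bin/sh
       # Hypothetical repro.sh: run the two testrunner invocations back to back.
       set -e
       COMMON="get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL"

       # First test passes; its cleanup deletes the travel-sample bucket.
       ./testrunner -i FTS_INI_FILE.ini $COMMON \
           -t 'fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp2,query={"query":"Delta"},expected_hits=4'

       # The second test's setup then fails while rebalancing the node back in.
       ./testrunner -i FTS_INI_FILE.ini $COMMON \
           -t 'fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp_neg1,query={"query":"Delta"},expected_hits=0'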

      Issue:
      In some FTS tests that use the travel-sample bucket, the bucket is deleted in the test's cleanup phase. The subsequent test's setup method then rebalances the nodes out of the cluster and adds them back. This rebalance operation fails with the following error:

      [2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 2, u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]}\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.957Z', u'module': u'ns_orchestrator', u'tstamp': 1484254825957, u'type': u'info'}
      [2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.956Z', u'module': u'ns_rebalancer', u'tstamp': 1484254825956, u'type': u'critical'}
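      Notably, the REST bucket listing goes empty before ns_server has actually finished shutting the bucket down (see "existing buckets : []" in the console output below), so the harness cannot use the bucket list as a readiness signal; the shutdown wait happens inside ns_server. Since the UI error itself says the rebalance can be retried, one harness-side mitigation is to retry with a backoff. A rough sketch against the standard ns_server REST endpoints, with host, credentials, and otpNode name assumed from this ticket's logs:

       #!/bin/sh
       # Sketch only: retry a rebalance that exits with buckets_shutdown_wait_failed.
       # HOST, AUTH, and the otpNode name are assumptions taken from this ticket.
       HOST=172.23.108.213:8091
       AUTH=Administrator:password

       for attempt in 1 2 3; do
           curl -s -u "$AUTH" -X POST "http://$HOST/controller/rebalance" \
               --data-urlencode 'knownNodes=ns_1@127.0.0.1' \
               --data-urlencode 'ejectedNodes='
           # Poll until ns_server reports the rebalance is no longer running.
           while curl -s -u "$AUTH" "http://$HOST/pools/default/rebalanceProgress" \
                   | grep -q '"running"'; do
               sleep 5
           done
           # After a failure the same endpoint carries an errorMessage,
           # as seen in the rest_client output in this ticket.
           if curl -s -u "$AUTH" "http://$HOST/pools/default/rebalanceProgress" \
                   | grep -q 'errorMessage'; then
               sleep 30   # give the hung bucket shutdown more time, then retry
           else
               echo "rebalance succeeded on attempt $attempt"
               exit 0
           fi
       done
       echo "rebalance still failing after 3 attempts" >&2
       exit 1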

      Here's the testrunner console output from the previous test's cleanup and the subsequent test's setup:

       [2017-01-12 12:59:28,073] - [fts_base:2801] INFO - ==== FTSbasetests cleanup is started for test #132 test_doc_config ====
       [2017-01-12 12:59:28,095] - [fts_base:888] INFO - Deleting fulltext-index travel-index on 172.23.108.213
       [2017-01-12 12:59:28,139] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
       [2017-01-12 12:59:28,308] - [remote_util:215] INFO - Connected to 172.23.108.213
       [2017-01-12 12:59:28,606] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
       [2017-01-12 12:59:29,061] - [remote_util:2511] INFO - running command.raw on 172.23.108.213: ls /opt/couchbase/var/lib/couchbase/data/@fts |grep travel-index*.pindex | wc -l
       [2017-01-12 12:59:29,150] - [remote_util:2548] INFO - command executed successfully
       [2017-01-12 12:59:29,152] - [fts_base:1605] INFO - 0
       [2017-01-12 12:59:29,267] - [fts_base:895] INFO - Validated: all index files for travel-index deleted from disk
       [2017-01-12 12:59:29,267] - [fts_base:1697] INFO - removing nodes from cluster ...
       [2017-01-12 12:59:29,282] - [fts_base:1699] INFO - cleanup [ip:172.23.108.213 port:8091 ssh_username:root]
       [2017-01-12 12:59:29,307] - [bucket_helper:138] INFO - deleting existing buckets [u'travel-sample'] on 172.23.108.213
       [2017-01-12 12:59:29,307] - [bucket_helper:140] INFO - remove bucket travel-sample ...
       [2017-01-12 12:59:59,323] - [rest_client:790] ERROR - [http://172.23.108.213:8091/pools/default/buckets/travel-sample] error 500 reason: unknown {"_":"Bucket deletion not yet complete, but will continue.\r\n"}
       [2017-01-12 12:59:59,324] - [rest_client:1880] WARNING - Bucket deletion timed out waiting for all nodes
       [2017-01-12 12:59:59,342] - [bucket_helper:153] ERROR - Unable to get timings for bucket
       [2017-01-12 12:59:59,342] - [bucket_helper:154] INFO - deleted bucket : travel-sample from 172.23.108.213
       [2017-01-12 12:59:59,343] - [bucket_helper:230] INFO - waiting for bucket deletion to complete....
       [2017-01-12 12:59:59,349] - [rest_client:134] INFO - node 172.23.108.213 existing buckets : []
       [2017-01-12 12:59:59,382] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.108.213:8091
       [2017-01-12 12:59:59,388] - [cluster_helper:80] INFO - ns_server @ 172.23.108.213:8091 is running
       Cluster instance shutdown with force
       [2017-01-12 12:59:59,389] - [fts_base:2807] INFO - ==== FTSbasetests cleanup is finished for test #132 test_doc_config ===
       ok
       
       ----------------------------------------------------------------------
       Ran 1 test in 107.368s
       
       OK
       Cluster instance shutdown with force
       grabbing core dumps files from 172.23.108.213
       [2017-01-12 12:59:59,392] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
       [2017-01-12 12:59:59,560] - [remote_util:215] INFO - Connected to 172.23.108.213
       [2017-01-12 12:59:59,863] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
       [2017-01-12 13:00:00,337] - [remote_util:505] INFO - 172.23.108.213 **** The version file /opt/couchbase/ VERSION.txt exists
       looking for crashes on 172.23.108.213 ... 
       erl_crash files under /opt/couchbase/var/lib/couchbase/
       core* files under /opt/couchbase/var/lib/couchbase/
       core* files under /tmp/
       breakpad *dmp files under /opt/couchbase/var/lib/couchbase/
       crashes not found on 172.23.108.213
       grabbing core dumps files from 172.23.109.49
       [2017-01-12 13:00:00,620] - [remote_util:181] INFO - connecting to 172.23.109.49 with username:root password:couchbase ssh_key:
       [2017-01-12 13:00:00,788] - [remote_util:215] INFO - Connected to 172.23.109.49
       [2017-01-12 13:00:01,057] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
       [2017-01-12 13:00:01,512] - [remote_util:505] INFO - 172.23.109.49 **** The version file /opt/couchbase/ VERSION.txt exists
       looking for crashes on 172.23.109.49 ... 
       erl_crash files under /opt/couchbase/var/lib/couchbase/
       core* files under /opt/couchbase/var/lib/couchbase/
       core* files under /tmp/
       breakpad *dmp files under /opt/couchbase/var/lib/couchbase/
       crashes not found on 172.23.109.49
       summary so far suite fts.stable_topology_fts.StableTopFTS , pass 123 , fail 9
       failures so far...
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping
       testrunner logs, diags and results are available under /data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_132
       Logs will be stored at /data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_133
       
       ./testrunner -i FTS_INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.test_doc_config,mode=docid_regexp_neg1,query={"query": "Delta"},expected_hits=0
       
       Test Input params:
       {'get-coredumps': 'True', 'logs_folder': '/data/workspace/cen006-p0-fts-vset00-00-custom-map-rqg/logs/testrunner-17-Jan-12_10-32-40/test_133', 'GROUP': 'ALL', 'expected_hits': '0', 'conf_file': 'conf/fts/py-fts-custmap-rqg-queries.conf', 'num_nodes': 2, 'cluster_name': 'FTS_INI_FILE', 'cluster': 'D+F', 'stop-on-failure': 'False', 'mode': 'docid_regexp_neg1', 'case_number': 133, 'get-logs': 'False', 'query': '{"query": "Delta"}', 'get-cbcollect-info': 'True', 'spec': 'py-fts-custmap-rqg-queries', 'ini': 'FTS_INI_FILE.ini'}
       [2017-01-12 13:00:01,696] - [remote_util:181] INFO - connecting to 172.23.108.213 with username:root password:couchbase ssh_key:
       [2017-01-12 13:00:01,864] - [remote_util:215] INFO - Connected to 172.23.108.213
       [2017-01-12 13:00:02,132] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
       [2017-01-12 13:00:02,652] - [remote_util:505] INFO - 172.23.108.213 **** The version file /opt/couchbase/ VERSION.txt exists
       looking for Erlang/Memcached crashes on 172.23.108.213 ... 
       dump files not found on 172.23.108.213
       [2017-01-12 13:00:02,792] - [remote_util:181] INFO - connecting to 172.23.109.49 with username:root password:couchbase ssh_key:
       [2017-01-12 13:00:02,960] - [remote_util:215] INFO - Connected to 172.23.109.49
       [2017-01-12 13:00:03,236] - [remote_util:2663] INFO - It could be other operating systyem. Go to check at other location
       [2017-01-12 13:00:03,712] - [remote_util:505] INFO - 172.23.109.49 **** The version file /opt/couchbase/ VERSION.txt exists
       test_doc_config (fts.stable_topology_fts.StableTopFTS) ... looking for Erlang/Memcached crashes on 172.23.109.49 ... 
       dump files not found on 172.23.109.49
       [2017-01-12 13:00:03,879] - [fts_base:2731] INFO - ==== FTSbasetests setup is started for test #133 test_doc_config ====
       [2017-01-12 13:00:03,892] - [fts_base:1697] INFO - removing nodes from cluster ...
       [2017-01-12 13:00:03,903] - [fts_base:1699] INFO - cleanup [ip:172.23.108.213 port:8091 ssh_username:root]
       [2017-01-12 13:00:03,919] - [bucket_helper:138] INFO - deleting existing buckets [] on 172.23.108.213
       [2017-01-12 13:00:03,951] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.108.213:8091
       [2017-01-12 13:00:03,957] - [cluster_helper:80] INFO - ns_server @ 172.23.108.213:8091 is running
       [2017-01-12 13:00:03,957] - [fts_base:1642] INFO - Initializing Cluster ...
       [2017-01-12 13:00:04,895] - [task:117] INFO - server: ip:172.23.108.213 port:8091 ssh_username:root, nodes/self: {'ip': u'127.0.0.1', 'availableStorage': [], 'rest_username': '', 'id': u'ns_1@127.0.0.1', 'uptime': u'8925', 'mcdMemoryReserved': 3048, 'hostname': u'172.23.108.213:8091', 'storage': [<membase.api.rest_client.NodeDataStorage object at 0x7f5391f36650>], 'moxi': 11211, 'port': u'8091', 'version': u'5.0.0-1648-enterprise', 'memcached': 11210, 'status': u'healthy', 'clusterCompatibility': 327680, 'curr_items': 0, 'services': [u'fts', u'kv'], 'rest_password': '', 'clusterMembership': u'active', 'memoryFree': 1955012608, 'memoryTotal': 3996319744, 'memoryQuota': 2032, 'mcdMemoryAllocated': 3048, 'os': u'x86_64-unknown-linux-gnu', 'ports': []}
       [2017-01-12 13:00:04,895] - [rest_client:891] INFO - pools/default params : memoryQuota=2032
       [2017-01-12 13:00:04,900] - [rest_client:928] INFO - settings/indexes params : storageMode=forestdb
       [2017-01-12 13:00:04,905] - [rest_client:807] INFO - settings/web params on 172.23.108.213:8091:username=Administrator&password=password&port=8091
       [2017-01-12 13:00:05,945] - [rest_client:1304] INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%40127.0.0.1
       [2017-01-12 13:00:05,951] - [rest_client:1308] INFO - rebalance operation started
       [2017-01-12 13:00:05,969] - [rest_client:1439] INFO - rebalance percentage : 0.00 %
       [2017-01-12 13:00:15,993] - [rest_client:1439] INFO - rebalance percentage : 0.00 %
       [2017-01-12 13:00:26,055] - [rest_client:1422] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
       [2017-01-12 13:00:26,079] - [rest_client:2702] INFO - Latest logs from UI on 172.23.108.213:
       [2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 2, u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]}\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.957Z', u'module': u'ns_orchestrator', u'tstamp': 1484254825957, u'type': u'info'}
       [2017-01-12 13:00:26,079] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@127.0.0.1\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["travel-sample"]}}}]\n', u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:25.956Z', u'module': u'ns_rebalancer', u'tstamp': 1484254825956, u'type': u'critical'}
       [2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@127.0.0.1'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2017-01-12T13:00:05.944Z', u'module': u'ns_orchestrator', u'tstamp': 1484254805944, u'type': u'info'}
       [2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "travel-sample" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-01-12T12:59:29.410Z', u'module': u'ns_memcached', u'tstamp': 1484254769410, u'type': u'info'}
       [2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Loading sample bucket travel-sample failed: {failed_to_load_samples_with_status,\n 1}', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:49.519Z', u'module': u'samples_loader_tasks', u'tstamp': 1484254729519, u'type': u'critical'}
       [2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Bucket "travel-sample" loaded on node \'ns_1@127.0.0.1\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:33.630Z', u'module': u'ns_memcached', u'tstamp': 1484254713630, u'type': u'info'}
       [2017-01-12 13:00:26,080] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 12, u'text': u'Created bucket "travel-sample" of type: couchbase\n[{num_replicas,1},\n {replica_index,false},\n {ram_quota,104857600},\n {auth_type,sasl},\n {flush_enabled,false},\n {num_threads,3},\n {eviction_policy,value_only},\n {conflict_resolution_type,seqno}]', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:33.521Z', u'module': u'menelaus_web', u'tstamp': 1484254713521, u'type': u'info'}
       [2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 11, u'text': u'Deleted bucket "default"\n', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:32.467Z', u'module': u'menelaus_web', u'tstamp': 1484254712467, u'type': u'info'}
       [2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "default" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:31.888Z', u'module': u'ns_memcached', u'tstamp': 1484254711888, u'type': u'info'}
       [2017-01-12 13:00:26,081] - [rest_client:2703] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@127.0.0.1\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2017-01-12T12:58:25.280Z', u'module': u'ns_memcached', u'tstamp': 1484254705280, u'type': u'info'}
       ERROR
      

      NOTE : This issue is not seen with other buckets, but with travel-sample it is consistently reproducible.
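      The UI log above also shows the sample loader itself failing ({failed_to_load_samples_with_status, 1}) right after travel-sample was created, which may be related to why only this bucket hits the hung shutdown. When setting the bucket up manually for a repro, the documented sample-bucket REST endpoint can be used, polling until the item count is nonzero and stable before starting the test. A sketch, with host and credentials assumed:

       #!/bin/sh
       # Sketch: install travel-sample via REST and wait for loading to settle.
       # HOST and AUTH are assumptions; adjust for your cluster.
       HOST=172.23.108.213:8091
       AUTH=Administrator:password

       curl -s -u "$AUTH" -X POST "http://$HOST/sampleBuckets/install" \
           -d '["travel-sample"]'

       prev=-1
       while true; do
           # itemCount comes from the bucket's basicStats in the REST response.
           count=$(curl -s -u "$AUTH" "http://$HOST/pools/default/buckets/travel-sample" \
               | sed -n 's/.*"itemCount":\([0-9]*\).*/\1/p')
           count=${count:-0}
           if [ "$count" -gt 0 ] && [ "$count" = "$prev" ]; then
               break
           fi
           prev=$count
           sleep 5
       done
       echo "travel-sample loaded with $count items"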

      Attachments


        Activity

          People

            Assignee: Harsha Havanur (harsha)
            Reporter: Mihir Kamdar (mihir.kamdar) (Inactive)
            Votes: 0
            Watchers: 8


              Gerrit Reviews

                There are no open Gerrit changes
