Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-27287

[FTS - Scorch toybuild] Service rebalance fails from call: ServiceAPI.GetTaskList",\n #Fun<json_rpc_connection.0.123522448>

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 6.5.0
    • 5.5.0
    • fts
    • Untriaged
    • Centos 64-bit
    • Unknown

    Description

      Build
      5.1.0-1004

      Testcase
      /data/workspace/centos-p0-fts-vset00-00-stabletopologyP1/logs/testrunner-17-Dec-14_15-29-04/test_5

      ./testrunner -i /tmp/testexec.14978.ini cluster=D+F,get-cbcollect-info=True,GROUP=P1 -t fts.stable_topology_fts.StableTopFTS.index_utf16_dataset,items=100,GROUP=P1

      Please note that this rebalance failure happens during the setup phase of the test and hence most probably has to do with the cluster state at the time of rebalance and the operations that were done in the previous tests.

      There were no indexes or buckets at the time of this rebalance failure. .208 is a single node cluster with kv+fts.

      Test Input params:
      {'logs_folder': '/data/workspace/centos-p0-fts-vset00-00-stabletopologyP1/logs/testrunner-17-Dec-14_15-29-04/test_5', 'GROUP': 'P1', 'items': '100', 'conf_file': 'fts/py-fts-simpletopology.conf', 'num_nodes': 2, 'cluster_name': 'testexec.14978', 'cluster': 'D+F', 'ini': '/tmp/testexec.14978.ini', 'case_number': 5, 'get-cbcollect-info': 'True', 'spec': 'py-fts-simpletopology'}
      [2017-12-14 15:36:22,486] - [fts_base:2818] INFO - ==== FTSbasetests setup is started for test #5 index_utf16_dataset ====
      [2017-12-14 15:36:33,715] - [fts_base:1715] INFO - removing nodes from cluster ...
      [2017-12-14 15:36:35,280] - [fts_base:1717] INFO - cleanup [ip:172.23.108.208 port:8091 ssh_username:root]
      [2017-12-14 15:36:36,393] - [bucket_helper:142] INFO - deleting existing buckets [] on 172.23.108.208
      [2017-12-14 15:36:36,394] - [bucket_helper:168] INFO - sleep 2 seconds to make sure all buckets were deleted completely.
      [2017-12-14 15:36:39,044] - [cluster_helper:78] INFO - waiting for ns_server @ 172.23.108.208:8091
      [2017-12-14 15:36:39,227] - [cluster_helper:80] INFO - ns_server @ 172.23.108.208:8091 is running
      [2017-12-14 15:36:39,228] - [fts_base:1740] INFO - Removing user 'cbadminbucket'...
      [2017-12-14 15:36:47,751] - [rest_client:804] ERROR - DELETE http://172.23.108.208:8091/settings/rbac/users/local/cbadminbucket body:  headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Accept': '*/*', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n'} error: 404 reason: unknown "User was not found." auth: Administrator:password
      [2017-12-14 15:36:47,752] - [fts_base:1744] INFO - "User was not found."
      [2017-12-14 15:36:47,753] - [fts_base:1661] INFO - Initializing Cluster ...
      [2017-12-14 15:36:49,521] - [task:125] INFO - server: ip:172.23.108.208 port:8091 ssh_username:root, nodes/self: {'ip': u'127.0.0.1', 'availableStorage': [], 'rest_username': '', 'id': u'ns_1@127.0.0.1', 'uptime': u'482', 'mcdMemoryReserved': 3048, 'storageTotalRam': 3811, 'hostname': u'172.23.108.208:8091', 'storage': [<membase.api.rest_client.NodeDataStorage object at 0x7f2e28b93b50>], 'moxi': 11211, 'port': u'8091', 'version': u'5.1.0-1004-enterprise', 'memcached': 11210, 'status': u'healthy', 'clusterCompatibility': 327681, 'curr_items': 0, 'services': [u'fts', u'kv'], 'rest_password': '', 'clusterMembership': u'active', 'memoryFree': 603607040, 'memoryTotal': 3996319744, 'memoryQuota': 2032, 'mcdMemoryAllocated': 3048, 'os': u'x86_64-unknown-linux-gnu', 'ports': []}
      [2017-12-14 15:36:49,522] - [rest_client:903] INFO - pools/default params : memoryQuota=2032
      [2017-12-14 15:36:50,431] - [rest_client:937] INFO - settings/indexes params : storageMode=forestdb
      [2017-12-14 15:36:50,725] - [rest_client:804] ERROR - POST http://172.23.108.208:8091/settings/indexes body: storageMode=forestdb headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Accept': '*/*', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n'} error: 400 reason: unknown {"errors":{"storageMode":"Storage mode cannot be set to 'forestdb' in 5.0 enterprise edition."}} auth: Administrator:password
      [2017-12-14 15:36:50,726] - [rest_client:822] INFO - settings/web params on 172.23.108.208:8091:username=Administrator&password=password&port=8091
      [2017-12-14 15:37:02,370] - [rest_client:1418] INFO - rebalance params : {'password': 'password', 'ejectedNodes': '', 'user': 'Administrator', 'knownNodes': u'ns_1@127.0.0.1'}
      [2017-12-14 15:37:02,845] - [rest_client:1423] INFO - rebalance operation started
      [2017-12-14 15:37:02,982] - [rest_client:1571] INFO - rebalance percentage : 0.00 %
      [2017-12-14 15:37:02,983] - [task:491] INFO - Rebalance - status: running, progress: 0
      [2017-12-14 15:37:16,607] - [rest_client:1571] INFO - rebalance percentage : 50.00 %
      [2017-12-14 15:37:16,608] - [task:491] INFO - Rebalance - status: running, progress: 50.0
      [2017-12-14 15:37:33,145] - [rest_client:1571] INFO - rebalance percentage : 50.00 %
      [2017-12-14 15:37:33,145] - [task:491] INFO - Rebalance - status: running, progress: 50.0
      [2017-12-14 15:37:49,564] - [rest_client:1571] INFO - rebalance percentage : 50.00 %
      [2017-12-14 15:37:49,565] - [task:491] INFO - Rebalance - status: running, progress: 50.0
      [2017-12-14 15:38:00,373] - [rest_client:1554] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try again.'} - rebalance failed
      [2017-12-14 15:38:01,885] - [rest_client:3029] INFO - Latest logs from UI on 172.23.108.208:
      [2017-12-14 15:38:01,885] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Rebalance exited with reason {service_rebalance_failed,fts,\n                              {linked_process_died,<0.16163.0>,\n                               {timeout,\n                                {gen_server,call,\n                                 [<0.763.0>,\n                                  {call,"ServiceAPI.GetTaskList",\n                                   #Fun<json_rpc_connection.0.123522448>},\n                                  60000]}}}}', u'shortText': u'message', u'serverTime': u'2017-12-14T17:37:56.732Z', u'module': u'ns_orchestrator', u'tstamp': 1513294676732, u'type': u'critical'}
      [2017-12-14 15:38:01,886] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@127.0.0.1'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2017-12-14T17:37:02.591Z', u'module': u'ns_orchestrator', u'tstamp': 1513294622591, u'type': u'info'}
      [2017-12-14 15:38:01,886] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 11, u'text': u'Deleted bucket "default"\n', u'shortText': u'message', u'serverTime': u'2017-12-14T17:36:06.357Z', u'module': u'menelaus_web', u'tstamp': 1513294566357, u'type': u'info'}
      [2017-12-14 15:38:01,887] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "default" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-12-14T17:35:59.211Z', u'module': u'ns_memcached', u'tstamp': 1513294559211, u'type': u'info'}
      [2017-12-14 15:38:01,887] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@127.0.0.1\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:56.708Z', u'module': u'ns_memcached', u'tstamp': 1513294496708, u'type': u'info'}
      [2017-12-14 15:38:01,887] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 12, u'text': u'Created bucket "default" of type: couchbase\n[{num_replicas,1},\n {replica_index,true},\n {ram_quota,1606418432},\n {flush_enabled,true},\n {num_threads,3},\n {eviction_policy,value_only},\n {conflict_resolution_type,seqno},\n {storage_mode,couchstore}]', u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:56.637Z', u'module': u'menelaus_web', u'tstamp': 1513294496637, u'type': u'info'}
      [2017-12-14 15:38:01,888] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Rebalance completed successfully.', u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:25.554Z', u'module': u'ns_orchestrator', u'tstamp': 1513294465554, u'type': u'info'}
      [2017-12-14 15:38:01,888] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 4, u'text': u"Starting rebalance, KeepNodes = ['ns_1@127.0.0.1'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes\n", u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:25.503Z', u'module': u'ns_orchestrator', u'tstamp': 1513294465503, u'type': u'info'}
      [2017-12-14 15:38:01,889] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 11, u'text': u'Deleted bucket "default"\n', u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:18.640Z', u'module': u'menelaus_web', u'tstamp': 1513294458640, u'type': u'info'}
      [2017-12-14 15:38:01,889] - [rest_client:3030] ERROR - {u'node': u'ns_1@127.0.0.1', u'code': 0, u'text': u'Shutting down bucket "default" on \'ns_1@127.0.0.1\' for deletion', u'shortText': u'message', u'serverTime': u'2017-12-14T17:34:17.226Z', u'module': u'ns_memcached', u'tstamp': 1513294457226, u'type': u'info'}
      ERROR
      
      

      I see no panic in the logs. Sorry, the timestamps are not matching, I'm making a note to update ntp right away.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            apiravi Aruna Piravi (Inactive)
            apiravi Aruna Piravi (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty