Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-7637

bucket does not respond after rebalance: Server error during processing: ["web request failed", {path,"/pools/default/buckets/default"

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 3.0
    • 2.0.1
    • ns_server
    • Security Level: Public
    • None

    Description

      http://qa.hq.northscale.net/job/centos-64-2.0-rebalance-regressions/175/consoleFull[2013-01-29 21:48:00,345] - [rest_client:896] INFO - rebalance percentage : 97.4609375 %
      [2013-01-29 21:48:10,384] - [rest_client:896] INFO - rebalance percentage : 98.28125 %
      [2013-01-29 21:48:20,432] - [rest_client:896] INFO - rebalance percentage : 99.1015625 %
      [2013-01-29 21:48:30,530] - [task:299] INFO - rebalancing was completed with progress: 100% in 2584.67042708 sec
      [2013-01-29 21:48:32,505] - [data_helper:289] INFO - creating direct client 10.3.121.93:11210 default
      [('testrunner', 325, '<module>', 'result = unittest.TextTestRunner(verbosity=2).run(suite)'), ('/opt/python2.7/lib/python2.7/unittest/runner.py', 151, 'run', 'test(result)'), ('/opt/python2.7/lib/python2.7/unittest/suite.py', 70, '_call', 'return self.run(*args, **kwds)'), ('/opt/python2.7/lib/python2.7/unittest/suite.py', 108, 'run', 'test(result)'), ('/opt/python2.7/lib/python2.7/unittest/case.py', 391, 'call_', 'return self.run(*args, **kwds)'), ('/opt/python2.7/lib/python2.7/unittest/case.py', 327, 'run', 'testMethod()'), ('pytests/rebalance/rebalancein.py', 382, 'rebalance_in_with_ddoc_compaction', 'self.verify_cluster_stats(self.servers[:self.nodes_in + 1])'), ('pytests/basetestcase.py', 368, 'verify_cluster_stats', 'self._wait_for_stats_all_buckets(servers, timeout=(timeout or 120))'), ('pytests/basetestcase.py', 287, '_wait_for_stats_all_buckets', 'task.result(timeout)'), ('lib/tasks/future.py', 161, 'result', 'self.set_exception(TimeoutError())'), ('lib/tasks/future.py', 263, 'set_exception', 'ERROR
      print traceback.extract_stack()')]
      Tue Jan 29 21:53:41 2013
      [2013-01-29 21:53:41,932] - [basetestcase:98] INFO - ============== basetestcase cleanup was started for test #11 rebalance_in_with_ddoc_compaction ==============
      [2013-01-29 21:53:42,080] - [task:351] INFO - Saw ep_queue_size 0 == 0 expected on '10.3.121.93:8091'
      [2013-01-29 21:53:42,147] - [data_helper:289] INFO - creating direct client 10.3.121.93:11210 default
      [2013-01-29 21:53:42,155] - [bucket_helper:127] INFO - deleting existing buckets on [ip:10.3.121.93 port:8091 ssh_username:root, ip:10.3.121.98 port:8091 ssh_username:root, ip:10.3.121.94 port:8091 ssh_username:root, ip:10.3.121.95 port:8091 ssh_username:root, ip:10.3.121.96 port:8091 ssh_username:root, ip:10.3.121.97 port:8091 ssh_username:root, ip:10.3.121.92 port:8091 ssh_username:root]
      [2013-01-29 21:53:42,353] - [rest_client:578] ERROR - http://10.3.121.93:8091/pools/default/buckets/default error 500 reason: unknown ["Unexpected server error, request logged."]
      Exception in thread Cluster_Thread:
      Traceback (most recent call last):
      File "/opt/python2.7/lib/python2.7/threading.py", line 552, in __bootstrap_inner
      self.run()
      File "lib/tasks/taskmanager.py", line 31, in run
      task.step(self)
      File "lib/tasks/task.py", line 55, in step
      self.check(task_manager)
      File "lib/tasks/task.py", line 331, in check
      client = self._get_connection(server)
      File "lib/tasks/task.py", line 365, in _get_connection
      self.conns[server] = MemcachedClientHelper.direct_client(server, self.bucket)
      File "lib/memcached/helper/data_helper.py", line 296, in direct_client
      client.vbucket_count = len(vBuckets)
      TypeError: object of type 'NoneType' has no len()

      [2013-01-29 21:53:45,711] - [bucket_helper:145] INFO - deleted bucket : default from 10.3.121.93
      [2013-01-29 21:53:45,712] - [bucket_helper:215] INFO - waiting for bucket deletion to complete....
      [2013-01-29 21:53:45,723] - [rest_client:109] INFO - existing buckets : []

      after rebalance we tried to check/wait stats ep_queue_size, ep_flusher_todo ( first stacktrace)
      then after 5 minutes we send request to delete bucket and get
      [2013-01-29 21:53:42,353] - [rest_client:578] ERROR - http://10.3.121.93:8091/pools/default/buckets/default error 500 reason: unknown ["Unexpected server error, request logged."]

      [ns_server:info,2013-01-29T21:58:24.639,ns_1@10.3.121.93:<0.32070.194>:compaction_daemon:spawn_bucket_compactor:544]Compacting bucket default with config:
      [{view_fragmentation_threshold,{undefined,undefined}},
      {database_fragmentation_threshold,{undefined,undefined}},

      {parallel_db_and_view_compaction,false}

      ]
      [ns_server:info,2013-01-29T21:58:28.792,ns_1@10.3.121.93:ns_config_rep<0.14294.131>:ns_config_rep:do_pull:341]Pulling config from: 'ns_1@10.3.121.94'

      [menelaus:warn,2013-01-29T21:58:40.430,ns_1@10.3.121.93:<0.26423.194>:menelaus_web:loop:433]Server error during processing: ["web request failed",

      {path,"/pools/default/buckets/default"}

      ,

      {type,error}

      ,
      {what,{badmatch,not_present}},
      {trace,
      [

      {ns_bucket,live_bucket_nodes,1}

      ,

      {menelaus_stats,basic_stats,1}

      ,

      {menelaus_web_buckets,build_bucket_info,5}

      ,

      {menelaus_web_buckets,handle_bucket_info,5}

      ,

      {menelaus_web_buckets, checking_bucket_access,4}

      ,

      {menelaus_web,loop,3}

      ,

      {mochiweb_http,headers,5}

      ,

      {proc_lib,init_p_do_apply,3}

      ]}]
      [rebalance:info,2013-01-29T21:58:40.434,ns_1@10.3.121.93:<0.20984.194>:ebucketmigrator_srv:do_confirm_sent_messages:701]Got close ack!

      [rebalance:info,2013-01-29T21:58:40.441,ns_1@10.3.121.93:<0.19872.194>:ebucketmigrator_srv:do_confirm_sent_messages:701]Got close ack!

      [rebalance:info,2013-01-29T21:58:40.443,ns_1@10.3.121.93:<0.19834.194>:ebucketmigrator_srv:do_confirm_sent_messages:701]Got close ack!

      [user:info,2013-01-29T21:58:40.456,ns_1@10.3.121.93:ns_memcached-default<0.11408.184>:ns_memcached:terminate:661]Shutting down bucket "default" on 'ns_1@10.3.121.93' for deletion
      [ns_server:info,2013-01-29T21:58:40.641,ns_1@10.3.121.93:ns_port_memcached<0.14372.131>:ns_port_server:log:171]memcached<0.14372.131>: Tue Jan 29 21:58:40.440409 PST 3: TAP (Consumer) eq_tapq:anon_9978 - disconnected

      I believe that all this time bucket was not available, even though I haven't found anything suspicious in the logs and stats

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            andreibaranouski Andrei Baranouski
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty