  1. Couchbase Server
  2. MB-6779

Apparently ep-engine keeps trying to write to an old revision of the vbucket db, continuously breaking persistence and rebalance (was: [consistent views enabled] Rebalance hangs, write commit failure appears)

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Blocker
    • 2.0-beta-2
    • None
    • couchbase-bucket
    • Security Level: Public
    • None
    • build 1776
      centOS 4 nodes cluster

    Description

      Rebalance from 4 to 5 nodes hangs.

      In memcached I see many occurrences of:
      Warning: failed to open database, vbucketId = 584 fileRev = 1 numDocs = 17
      Fatal error in persisting SET ``ui0005-2008_01_05'' on vb 584!!! Requeue it...
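
      To see which vbuckets and file revisions the warnings point at, the log can be tallied with a short script. This is a minimal sketch: the pattern is inferred from the single warning line quoted above, not from the ep-engine source, and `tally_open_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the one warning line quoted in this report
# (assumption: other occurrences use the same field layout).
OPEN_FAILURE = re.compile(
    r"failed to open database, vbucketId = (\d+) fileRev = (\d+)"
)

def tally_open_failures(lines):
    """Count 'failed to open database' warnings per (vbucket id, file revision)."""
    counts = Counter()
    for line in lines:
        match = OPEN_FAILURE.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts

sample = [
    "Warning: failed to open database, vbucketId = 584 fileRev = 1 numDocs = 17",
    "Fatal error in persisting SET ``ui0005-2008_01_05'' on vb 584!!! Requeue it...",
]
print(tally_open_failures(sample))
```

      If every warning names the same file revision for a vbucket, that would be consistent with the flusher retrying against one stale revision, as the title suggests.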

      The TAP stats are not moving on 10.2.2.65:
      [root@localhost bin]# ./cbstats localhost:11210 tap
      ep_tap_ack_grace_period: 300
      ep_tap_ack_interval: 1000
      ep_tap_ack_window_size: 10
      ep_tap_backoff_period: 5
      ep_tap_bg_fetch_requeued: 0
      ep_tap_bg_fetched: 0
      ep_tap_bg_max_pending: 500
      ep_tap_count: 11
      ep_tap_deletes: 0
      ep_tap_fg_fetched: 0
      ep_tap_noop_interval: 20
      ep_tap_queue_backfillremaining: 0
      ep_tap_queue_backoff: 0
      ep_tap_queue_drain: 0
      ep_tap_queue_fill: 0
      ep_tap_queue_itemondisk: 0
      ep_tap_throttle_queue_cap: 1000000
      ep_tap_throttle_threshold: 90
      ep_tap_throttled: 0
      ep_tap_total_backlog_size: 0
      ep_tap_total_fetched: 380
      ep_tap_total_queue: 0
      eq_tapq:anon_260:connected: true
      eq_tapq:anon_260:created: 248
      eq_tapq:anon_260:num_checkpoint_end: 3
      eq_tapq:anon_260:num_checkpoint_end_failed: 0
      eq_tapq:anon_260:num_checkpoint_start: 42
      eq_tapq:anon_260:num_checkpoint_start_failed: 0
      eq_tapq:anon_260:num_delete: 0
      eq_tapq:anon_260:num_delete_failed: 0
      eq_tapq:anon_260:num_flush: 0
      eq_tapq:anon_260:num_flush_failed: 0
      eq_tapq:anon_260:num_mutation: 55
      eq_tapq:anon_260:num_mutation_failed: 0
      eq_tapq:anon_260:num_opaque: 82
      eq_tapq:anon_260:num_opaque_failed: 0
      eq_tapq:anon_260:num_unknown: 0
      eq_tapq:anon_260:num_vbucket_set: 0
      eq_tapq:anon_260:num_vbucket_set_failed: 0
      eq_tapq:anon_260:pending_disconnect: false
      eq_tapq:anon_260:reserved: 0
      eq_tapq:anon_260:supports_ack: true
      eq_tapq:anon_260:type: consumer
      eq_tapq:anon_461:connected: true
      eq_tapq:anon_461:created: 422
      eq_tapq:anon_461:num_checkpoint_end: 11
      eq_tapq:anon_461:num_checkpoint_end_failed: 0
      eq_tapq:anon_461:num_checkpoint_start: 44
      eq_tapq:anon_461:num_checkpoint_start_failed: 0
      eq_tapq:anon_461:num_delete: 0
      eq_tapq:anon_461:num_delete_failed: 0
      eq_tapq:anon_461:num_flush: 0
      eq_tapq:anon_461:num_flush_failed: 0
      eq_tapq:anon_461:num_mutation: 152
      eq_tapq:anon_461:num_mutation_failed: 0
      eq_tapq:anon_461:num_opaque: 80
      eq_tapq:anon_461:num_opaque_failed: 0
      eq_tapq:anon_461:num_unknown: 0
      eq_tapq:anon_461:num_vbucket_set: 0
      eq_tapq:anon_461:num_vbucket_set_failed: 0
      eq_tapq:anon_461:pending_disconnect: false
      eq_tapq:anon_461:reserved: 0
      eq_tapq:anon_461:supports_ack: true
      eq_tapq:anon_461:type: consumer
      eq_tapq:anon_566:connected: true
      eq_tapq:anon_566:created: 568
      eq_tapq:anon_566:num_checkpoint_end: 0
      eq_tapq:anon_566:num_checkpoint_end_failed: 0
      eq_tapq:anon_566:num_checkpoint_start: 1
      eq_tapq:anon_566:num_checkpoint_start_failed: 0
      eq_tapq:anon_566:num_delete: 0
      eq_tapq:anon_566:num_delete_failed: 0
      eq_tapq:anon_566:num_flush: 0
      eq_tapq:anon_566:num_flush_failed: 0
      eq_tapq:anon_566:num_mutation: 16
      eq_tapq:anon_566:num_mutation_failed: 0
      eq_tapq:anon_566:num_opaque: 4
      eq_tapq:anon_566:num_opaque_failed: 0
      eq_tapq:anon_566:num_unknown: 0
      eq_tapq:anon_566:num_vbucket_set: 0
      eq_tapq:anon_566:num_vbucket_set_failed: 0
      eq_tapq:anon_566:pending_disconnect: false
      eq_tapq:anon_566:reserved: 0
      eq_tapq:anon_566:supports_ack: true
      eq_tapq:anon_566:type: consumer
      eq_tapq:anon_567:connected: true
      eq_tapq:anon_567:created: 568
      eq_tapq:anon_567:num_checkpoint_end: 0
      eq_tapq:anon_567:num_checkpoint_end_failed: 0
      eq_tapq:anon_567:num_checkpoint_start: 1
      eq_tapq:anon_567:num_checkpoint_start_failed: 0
      eq_tapq:anon_567:num_delete: 0
      eq_tapq:anon_567:num_delete_failed: 0
      eq_tapq:anon_567:num_flush: 0
      eq_tapq:anon_567:num_flush_failed: 0
      eq_tapq:anon_567:num_mutation: 21
      eq_tapq:anon_567:num_mutation_failed: 0
      eq_tapq:anon_567:num_opaque: 4
      eq_tapq:anon_567:num_opaque_failed: 0
      eq_tapq:anon_567:num_unknown: 0
      eq_tapq:anon_567:num_vbucket_set: 0
      eq_tapq:anon_567:num_vbucket_set_failed: 0
      eq_tapq:anon_567:pending_disconnect: false
      eq_tapq:anon_567:reserved: 0
      eq_tapq:anon_567:supports_ack: true
      eq_tapq:anon_567:type: consumer
      eq_tapq:anon_568:connected: true
      eq_tapq:anon_568:created: 569
      eq_tapq:anon_568:num_checkpoint_end: 0
      eq_tapq:anon_568:num_checkpoint_end_failed: 0
      eq_tapq:anon_568:num_checkpoint_start: 1
      eq_tapq:anon_568:num_checkpoint_start_failed: 0
      eq_tapq:anon_568:num_delete: 0
      eq_tapq:anon_568:num_delete_failed: 0
      eq_tapq:anon_568:num_flush: 0
      eq_tapq:anon_568:num_flush_failed: 0
      eq_tapq:anon_568:num_mutation: 15
      eq_tapq:anon_568:num_mutation_failed: 0
      eq_tapq:anon_568:num_opaque: 4
      eq_tapq:anon_568:num_opaque_failed: 0
      eq_tapq:anon_568:num_unknown: 0
      eq_tapq:anon_568:num_vbucket_set: 0
      eq_tapq:anon_568:num_vbucket_set_failed: 0
      eq_tapq:anon_568:pending_disconnect: false
      eq_tapq:anon_568:reserved: 0
      eq_tapq:anon_568:supports_ack: true
      eq_tapq:anon_568:type: consumer
      eq_tapq:anon_7:connected: true
      eq_tapq:anon_7:created: 8
      eq_tapq:anon_7:num_checkpoint_end: 10
      eq_tapq:anon_7:num_checkpoint_end_failed: 0
      eq_tapq:anon_7:num_checkpoint_start: 56
      eq_tapq:anon_7:num_checkpoint_start_failed: 0
      eq_tapq:anon_7:num_delete: 0
      eq_tapq:anon_7:num_delete_failed: 0
      eq_tapq:anon_7:num_flush: 0
      eq_tapq:anon_7:num_flush_failed: 0
      eq_tapq:anon_7:num_mutation: 145
      eq_tapq:anon_7:num_mutation_failed: 0
      eq_tapq:anon_7:num_opaque: 106
      eq_tapq:anon_7:num_opaque_failed: 0
      eq_tapq:anon_7:num_unknown: 0
      eq_tapq:anon_7:num_vbucket_set: 0
      eq_tapq:anon_7:num_vbucket_set_failed: 0
      eq_tapq:anon_7:pending_disconnect: false
      eq_tapq:anon_7:reserved: 0
      eq_tapq:anon_7:supports_ack: true
      eq_tapq:anon_7:type: consumer
      eq_tapq:anon_8:connected: true
      eq_tapq:anon_8:created: 10
      eq_tapq:anon_8:num_checkpoint_end: 0
      eq_tapq:anon_8:num_checkpoint_end_failed: 0
      eq_tapq:anon_8:num_checkpoint_start: 51
      eq_tapq:anon_8:num_checkpoint_start_failed: 0
      eq_tapq:anon_8:num_delete: 0
      eq_tapq:anon_8:num_delete_failed: 0
      eq_tapq:anon_8:num_flush: 0
      eq_tapq:anon_8:num_flush_failed: 0
      eq_tapq:anon_8:num_mutation: 0
      eq_tapq:anon_8:num_mutation_failed: 0
      eq_tapq:anon_8:num_opaque: 102
      eq_tapq:anon_8:num_opaque_failed: 0
      eq_tapq:anon_8:num_unknown: 0
      eq_tapq:anon_8:num_vbucket_set: 0
      eq_tapq:anon_8:num_vbucket_set_failed: 0
      eq_tapq:anon_8:pending_disconnect: false
      eq_tapq:anon_8:reserved: 0
      eq_tapq:anon_8:supports_ack: true
      eq_tapq:anon_8:type: consumer
      eq_tapq:replication_ns_1@10.2.2.108:ack_log_size: 0
      eq_tapq:replication_ns_1@10.2.2.108:ack_seqno: 108
      eq_tapq:replication_ns_1@10.2.2.108:ack_window_full: false
      eq_tapq:replication_ns_1@10.2.2.108:backfill_completed: true
      eq_tapq:replication_ns_1@10.2.2.108:bg_jobs_completed: 0
      eq_tapq:replication_ns_1@10.2.2.108:bg_jobs_issued: 0
      eq_tapq:replication_ns_1@10.2.2.108:bg_result_size: 0
      eq_tapq:replication_ns_1@10.2.2.108:connected: true
      eq_tapq:replication_ns_1@10.2.2.108:created: 344
      eq_tapq:replication_ns_1@10.2.2.108:flags: 85 (ack,backfill,vblist,checkpoints)
      eq_tapq:replication_ns_1@10.2.2.108:has_queued_item: false
      eq_tapq:replication_ns_1@10.2.2.108:idle: true
      eq_tapq:replication_ns_1@10.2.2.108:paused: 1
      eq_tapq:replication_ns_1@10.2.2.108:pending_backfill: false
      eq_tapq:replication_ns_1@10.2.2.108:pending_disconnect: false
      eq_tapq:replication_ns_1@10.2.2.108:pending_disk_backfill: false
      eq_tapq:replication_ns_1@10.2.2.108:qlen: 0
      eq_tapq:replication_ns_1@10.2.2.108:qlen_high_pri: 0
      eq_tapq:replication_ns_1@10.2.2.108:qlen_low_pri: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_backfillremaining: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_backoff: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_drain: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_fill: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_itemondisk: 0
      eq_tapq:replication_ns_1@10.2.2.108:queue_memory: 0
      eq_tapq:replication_ns_1@10.2.2.108:rec_fetched: 71
      eq_tapq:replication_ns_1@10.2.2.108:recv_ack_seqno: 107
      eq_tapq:replication_ns_1@10.2.2.108:reserved: 1
      eq_tapq:replication_ns_1@10.2.2.108:seqno_ack_requested: 107
      eq_tapq:replication_ns_1@10.2.2.108:supports_ack: true
      eq_tapq:replication_ns_1@10.2.2.108:suspended: false
      eq_tapq:replication_ns_1@10.2.2.108:total_backlog_size: 0
      eq_tapq:replication_ns_1@10.2.2.108:total_noops: 14330
      eq_tapq:replication_ns_1@10.2.2.108:type: producer
      eq_tapq:replication_ns_1@10.2.2.108:vb_filter: { 921, [989,1022] }
      eq_tapq:replication_ns_1@10.2.2.108:vb_filters: 35
      eq_tapq:replication_ns_1@10.2.2.60:ack_log_size: 0
      eq_tapq:replication_ns_1@10.2.2.60:ack_seqno: 156
      eq_tapq:replication_ns_1@10.2.2.60:ack_window_full: false
      eq_tapq:replication_ns_1@10.2.2.60:backfill_completed: true
      eq_tapq:replication_ns_1@10.2.2.60:bg_jobs_completed: 0
      eq_tapq:replication_ns_1@10.2.2.60:bg_jobs_issued: 0
      eq_tapq:replication_ns_1@10.2.2.60:bg_result_size: 0
      eq_tapq:replication_ns_1@10.2.2.60:connected: true
      eq_tapq:replication_ns_1@10.2.2.60:created: 122
      eq_tapq:replication_ns_1@10.2.2.60:flags: 85 (ack,backfill,vblist,checkpoints)
      eq_tapq:replication_ns_1@10.2.2.60:has_queued_item: false
      eq_tapq:replication_ns_1@10.2.2.60:idle: true
      eq_tapq:replication_ns_1@10.2.2.60:paused: 1
      eq_tapq:replication_ns_1@10.2.2.60:pending_backfill: false
      eq_tapq:replication_ns_1@10.2.2.60:pending_disconnect: false
      eq_tapq:replication_ns_1@10.2.2.60:pending_disk_backfill: false
      eq_tapq:replication_ns_1@10.2.2.60:qlen: 0
      eq_tapq:replication_ns_1@10.2.2.60:qlen_high_pri: 0
      eq_tapq:replication_ns_1@10.2.2.60:qlen_low_pri: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_backfillremaining: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_backoff: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_drain: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_fill: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_itemondisk: 0
      eq_tapq:replication_ns_1@10.2.2.60:queue_memory: 0
      eq_tapq:replication_ns_1@10.2.2.60:rec_fetched: 103
      eq_tapq:replication_ns_1@10.2.2.60:recv_ack_seqno: 155
      eq_tapq:replication_ns_1@10.2.2.60:reserved: 1
      eq_tapq:replication_ns_1@10.2.2.60:seqno_ack_requested: 155
      eq_tapq:replication_ns_1@10.2.2.60:supports_ack: true
      eq_tapq:replication_ns_1@10.2.2.60:suspended: false
      eq_tapq:replication_ns_1@10.2.2.60:total_backlog_size: 0
      eq_tapq:replication_ns_1@10.2.2.60:total_noops: 14351
      eq_tapq:replication_ns_1@10.2.2.60:type: producer
      eq_tapq:replication_ns_1@10.2.2.60:vb_filter: { [410,426], [478,510], 1023 }
      eq_tapq:replication_ns_1@10.2.2.60:vb_filters: 51
      eq_tapq:replication_ns_1@10.2.2.63:ack_log_size: 0
      eq_tapq:replication_ns_1@10.2.2.63:ack_seqno: 156
      eq_tapq:replication_ns_1@10.2.2.63:ack_window_full: false
      eq_tapq:replication_ns_1@10.2.2.63:backfill_completed: true
      eq_tapq:replication_ns_1@10.2.2.63:bg_jobs_completed: 0
      eq_tapq:replication_ns_1@10.2.2.63:bg_jobs_issued: 0
      eq_tapq:replication_ns_1@10.2.2.63:bg_result_size: 0
      eq_tapq:replication_ns_1@10.2.2.63:connected: true
      eq_tapq:replication_ns_1@10.2.2.63:created: 8
      eq_tapq:replication_ns_1@10.2.2.63:flags: 85 (ack,backfill,vblist,checkpoints)
      eq_tapq:replication_ns_1@10.2.2.63:has_queued_item: false
      eq_tapq:replication_ns_1@10.2.2.63:idle: true
      eq_tapq:replication_ns_1@10.2.2.63:paused: 1
      eq_tapq:replication_ns_1@10.2.2.63:pending_backfill: false
      eq_tapq:replication_ns_1@10.2.2.63:pending_disconnect: false
      eq_tapq:replication_ns_1@10.2.2.63:pending_disk_backfill: false
      eq_tapq:replication_ns_1@10.2.2.63:qlen: 0
      eq_tapq:replication_ns_1@10.2.2.63:qlen_high_pri: 0
      eq_tapq:replication_ns_1@10.2.2.63:qlen_low_pri: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_backfillremaining: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_backoff: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_drain: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_fill: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_itemondisk: 0
      eq_tapq:replication_ns_1@10.2.2.63:queue_memory: 0
      eq_tapq:replication_ns_1@10.2.2.63:rec_fetched: 103
      eq_tapq:replication_ns_1@10.2.2.63:recv_ack_seqno: 155
      eq_tapq:replication_ns_1@10.2.2.63:reserved: 1
      eq_tapq:replication_ns_1@10.2.2.63:seqno_ack_requested: 155
      eq_tapq:replication_ns_1@10.2.2.63:supports_ack: true
      eq_tapq:replication_ns_1@10.2.2.63:suspended: false
      eq_tapq:replication_ns_1@10.2.2.63:total_backlog_size: 0
      eq_tapq:replication_ns_1@10.2.2.63:total_noops: 14348
      eq_tapq:replication_ns_1@10.2.2.63:type: producer
      eq_tapq:replication_ns_1@10.2.2.63:vb_filter: { [239,255], [308,340], 511 }
      eq_tapq:replication_ns_1@10.2.2.63:vb_filters: 51
      eq_tapq:replication_ns_1@10.2.2.64:ack_log_size: 0
      eq_tapq:replication_ns_1@10.2.2.64:ack_seqno: 156
      eq_tapq:replication_ns_1@10.2.2.64:ack_window_full: false
      eq_tapq:replication_ns_1@10.2.2.64:backfill_completed: true
      eq_tapq:replication_ns_1@10.2.2.64:bg_jobs_completed: 0
      eq_tapq:replication_ns_1@10.2.2.64:bg_jobs_issued: 0
      eq_tapq:replication_ns_1@10.2.2.64:bg_result_size: 0
      eq_tapq:replication_ns_1@10.2.2.64:connected: true
      eq_tapq:replication_ns_1@10.2.2.64:created: 135
      eq_tapq:replication_ns_1@10.2.2.64:flags: 85 (ack,backfill,vblist,checkpoints)
      eq_tapq:replication_ns_1@10.2.2.64:has_queued_item: false
      eq_tapq:replication_ns_1@10.2.2.64:idle: true
      eq_tapq:replication_ns_1@10.2.2.64:paused: 1
      eq_tapq:replication_ns_1@10.2.2.64:pending_backfill: false
      eq_tapq:replication_ns_1@10.2.2.64:pending_disconnect: false
      eq_tapq:replication_ns_1@10.2.2.64:pending_disk_backfill: false
      eq_tapq:replication_ns_1@10.2.2.64:qlen: 0
      eq_tapq:replication_ns_1@10.2.2.64:qlen_high_pri: 0
      eq_tapq:replication_ns_1@10.2.2.64:qlen_low_pri: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_backfillremaining: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_backoff: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_drain: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_fill: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_itemondisk: 0
      eq_tapq:replication_ns_1@10.2.2.64:queue_memory: 0
      eq_tapq:replication_ns_1@10.2.2.64:rec_fetched: 103
      eq_tapq:replication_ns_1@10.2.2.64:recv_ack_seqno: 155
      eq_tapq:replication_ns_1@10.2.2.64:reserved: 1
      eq_tapq:replication_ns_1@10.2.2.64:seqno_ack_requested: 155
      eq_tapq:replication_ns_1@10.2.2.64:supports_ack: true
      eq_tapq:replication_ns_1@10.2.2.64:suspended: false
      eq_tapq:replication_ns_1@10.2.2.64:total_backlog_size: 0
      eq_tapq:replication_ns_1@10.2.2.64:total_noops: 14349
      eq_tapq:replication_ns_1@10.2.2.64:type: producer
      eq_tapq:replication_ns_1@10.2.2.64:vb_filter: { [154,170], [820,853] }
      eq_tapq:replication_ns_1@10.2.2.64:vb_filters: 51
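
      As a sanity check on the producer stats above, each `vb_filter` value can be expanded and compared with the matching `vb_filters` count. The sketch below assumes that `[a,b]` denotes an inclusive vbucket range and a bare number a single vbucket. That reading is an assumption, although it matches the counts in this dump. `count_vbuckets` is a hypothetical helper, not part of cbstats.

```python
import re

def count_vbuckets(filter_text):
    """Count vbuckets in a vb_filter string such as '{ 921, [989,1022] }'.

    Assumption: '[a,b]' is an inclusive range and a bare number is one vbucket.
    """
    total = 0
    # Each match is either a bracketed range (low, high) or a single id.
    for low, high, single in re.findall(r"\[(\d+),\s*(\d+)\]|(\d+)", filter_text):
        if single:
            total += 1
        else:
            total += int(high) - int(low) + 1
    return total

# vb_filter values copied from the stats above, keyed by producer node.
filters = {
    "10.2.2.108": "{ 921, [989,1022] }",
    "10.2.2.60": "{ [410,426], [478,510], 1023 }",
    "10.2.2.63": "{ [239,255], [308,340], 511 }",
    "10.2.2.64": "{ [154,170], [820,853] }",
}
for node, text in filters.items():
    print(node, count_vbuckets(text))
```

      Under that assumption the four filters expand to 35, 51, 51 and 51 vbuckets, which agrees with the reported `vb_filters` values.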

          People

            iryna iryna
            iryna iryna