Couchbase Server / MB-3770

TAP connection is paused during rebalance and resumes after 5 mins (tap_keep_alive)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • 1.6.5.4
    • 1.6.5.4
    • Component: couchbase-bucket
    • Security Level: Public
    • None

    Description

      Analysis by Chiyoung and Aliaksey.

      The TAP connection is paused during the rebalance of 2 nodes and is never resumed, so the rebalance gets stuck but does not time out.

      According to the logs:
      memcached<0.104.0>: Vbucket <eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814641.860081> is going dead.

      TAP stats from the master node:

      root@ubuntu:~# /opt/membase/bin/ep_engine/management/stats 10.1.4.243:11210 tap
      ep_tap_ack_grace_period: 300
      ep_tap_ack_interval: 1000
      ep_tap_ack_window_size: 10
      ep_tap_backoff_period: 1
      ep_tap_bg_fetch_requeued: 0
      ep_tap_bg_fetched: 400
      ep_tap_bg_max_pending: 500
      ep_tap_count: 2
      ep_tap_deletes: 0
      ep_tap_fg_fetched: 400
      ep_tap_keepalive: 300
      ep_tap_noop_interval: 20
      ep_tap_throttled: 0
      ep_tap_total_fetched: 806
      ep_tap_total_queue: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:ack_log_size: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:ack_playback_size: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:ack_seqno: 403
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:ack_window_full: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_backlog_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_jobs_completed: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_jobs_issued: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_queue_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_queued: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_result_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_results: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:bg_wait_for_results: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:complete: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:connected: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:disconnects: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:empty: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:flags: 28 (ack,vblist,takeover)
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:has_item: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:has_queued_item: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:idle: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:last_walk: 332
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:num_tap_nack: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:num_tap_tmpfail_survivors: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:paused: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:pending_backfill: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:pending_disconnect: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:pending_disk_backfill: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:qlen: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:qlen_high_pri: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:qlen_low_pri: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:rec_fetched: 204
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:recv_ack_seqno: 402
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:suspended: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:vb_filter: { 0 }
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:vb_filters: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:ack_log_size: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:ack_playback_size: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:ack_seqno: 403
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:ack_window_full: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_backlog_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_jobs_completed: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_jobs_issued: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_queue_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_queued: 200
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_result_size: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_results: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:bg_wait_for_results: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:complete: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:connected: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:empty: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:flags: 28 (ack,vblist,takeover)
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:has_item: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:has_queued_item: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:idle: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:last_walk: 571
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:num_tap_nack: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:num_tap_tmpfail_survivors: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:paused: 1
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:pending_backfill: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:pending_disconnect: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:pending_disk_backfill: true
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:qlen: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:qlen_high_pri: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:qlen_low_pri: 0
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:rec_fetched: 204
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:recv_ack_seqno: 402
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:suspended: false
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:vb_filter: { 1 }
      eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:vb_filters: 1
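
For anyone triaging a similar hang, the stats output above can be grouped per TAP connection and scanned for the pattern seen here. The following is a minimal Python sketch, not part of the original report. The function names are hypothetical, and the "stalled" heuristic (paused: 1 with pending_disk_backfill: true and an empty queue) is an assumption drawn only from the values shown in this ticket, not a documented ep-engine rule. It also assumes no stat value itself contains a colon.

```python
from collections import defaultdict

# A trimmed excerpt of the master node's output quoted in this ticket.
SAMPLE = """\
ep_tap_count: 2
ep_tap_keepalive: 300
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:connected: false
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:paused: 1
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:pending_disk_backfill: true
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239:qlen: 0
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:connected: true
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:paused: 1
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:pending_disk_backfill: true
eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615:qlen: 0
"""


def parse_tap_stats(text):
    """Split the text into engine-wide stats and per-connection stats."""
    global_stats = {}
    connections = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the last colon: connection names contain colons themselves.
        key, _, value = line.rpartition(":")
        key, value = key.strip(), value.strip()
        if key.startswith("eq_tapq:"):
            name, _, stat = key.rpartition(":")
            connections[name][stat] = value
        elif key.startswith("ep_"):
            global_stats[key] = value
        # Anything else (shell prompt, stray lines) is ignored.
    return global_stats, dict(connections)


def stalled_connections(connections):
    """Return names matching the symptom in this ticket (assumed heuristic)."""
    return sorted(
        name
        for name, stats in connections.items()
        if stats.get("paused") == "1"
        and stats.get("pending_disk_backfill") == "true"
        and stats.get("qlen") == "0"
    )


if __name__ == "__main__":
    global_stats, connections = parse_tap_stats(SAMPLE)
    print("ep_tap_count:", global_stats.get("ep_tap_count"))
    for name in stalled_connections(connections):
        print("possibly stalled:", name, "connected =", connections[name].get("connected"))
```

Run against the full output above, both connections on the master match the heuristic, while the node being added reports no TAP connections at all.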

      TAP stats from the node being added:

      root@ubuntu:~# /opt/membase/bin/ep_engine/management/stats 127.0.0.1:11210 tap
      ep_tap_ack_grace_period: 300
      ep_tap_ack_interval: 1000
      ep_tap_ack_window_size: 10
      ep_tap_backoff_period: 1
      ep_tap_bg_fetch_requeued: 0
      ep_tap_bg_fetched: 0
      ep_tap_bg_max_pending: 500
      ep_tap_count: 0
      ep_tap_deletes: 0
      ep_tap_fg_fetched: 0
      ep_tap_keepalive: 300
      ep_tap_noop_interval: 20
      ep_tap_throttled: 0
      ep_tap_total_fetched: 0
      ep_tap_total_queue: 0
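
The three connection names quoted in this ticket (one in the log line, two in the master's stats) end in what look like Unix timestamps. Assuming that reading is right, which the ticket does not confirm, the gaps between successive connections can be compared with ep_tap_keepalive (300). The sketch below is illustrative only and its helper names are hypothetical.

```python
import re

# Connection names copied from the log line and the master's stats in this ticket.
CONNECTION_NAMES = [
    "eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814641.860081",
    "eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304814975.734239",
    "eq_tapq:t-ns_1@10.1.4.244-ns_1@10.1.4.243-1304815281.063615",
]
EP_TAP_KEEPALIVE = 300  # seconds, as reported by both nodes


def creation_time(name):
    """Extract the trailing number, assumed to be a Unix timestamp."""
    match = re.search(r"-(\d+\.\d+)$", name)
    if match is None:
        raise ValueError("no trailing timestamp in %r" % name)
    return float(match.group(1))


def gaps(names):
    """Seconds between successive connections, ordered by timestamp."""
    times = sorted(creation_time(n) for n in names)
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps(CONNECTION_NAMES):
        print("gap: %.1f s (keepalive %d s)" % (gap, EP_TAP_KEEPALIVE))
```

Under that assumption the gaps come to roughly 334 and 305 seconds, each slightly above the 300-second keepalive, which would be consistent with the "resumes after 5 mins" in the title.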

      Diags are also attached ...

      Master: 10.1.4.243 (root / the usual password)

          People

            dustin Dustin Sallings (Inactive)
            farshid Farshid Ghods (Inactive)