Couchbase Server / MB-7260

[windows] A node is in pending state in the UI, but its API is available

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0.1
    • Component/s: UI
    • Security Level: Public
    • Labels:
      None
    • Environment:

      Description

      4-node cluster, 10M items, 1 default bucket, 1 ddoc with 2 views, autofailover enabled
      The node is marked as down on the home page and as pending on the servers page, but I can access it and query it.
      Steps to reproduce:
      1. Stop a node (.135)
      2. Reboot another node(.151)
      3. Node (.135) is down and auto failed over
      4. Warmup is completed (stats from .151 are below), but in the UI the node is in pending state

      accepting_conns: 1
      auth_cmds: 0
      auth_errors: 0
      bucket_active_conns: 1
      bucket_conns: 6
      bytes: 637628416
      bytes_read: 323575
      bytes_written: 50171063
      cas_badval: 0
      cas_hits: 0
      cas_misses: 0
      cmd_flush: 0
      cmd_get: 0
      cmd_set: 0
      conn_yields: 0
      connection_structures: 5000
      curr_connections: 10
      curr_conns_on_port_11209: 6
      curr_conns_on_port_11210: 2
      curr_items: 0
      curr_items_tot: 0
      curr_temp_items: 0
      daemon_connections: 4
      decr_hits: 0
      decr_misses: 0
      delete_hits: 0
      delete_misses: 0
      ep_access_scanner_last_runtime: 0
      ep_access_scanner_num_items: 0
      ep_access_scanner_task_time: 2012-11-27 10:00:00
      ep_allow_data_loss_during_shutdown: 1
      ep_alog_block_size: 4096
      ep_alog_path: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default/access.log
      ep_alog_sleep_time: 1440
      ep_alog_task_time: 10
      ep_backend: couchdb
      ep_bg_fetch_delay: 0
      ep_bg_fetched: 0
      ep_bg_meta_fetched: 0
      ep_bg_remaining_jobs: 0
      ep_chk_max_items: 5000
      ep_chk_period: 1800
      ep_chk_persistence_remains: 0
      ep_chk_persistence_timeout: 10
      ep_chk_remover_stime: 5
      ep_commit_num: 0
      ep_commit_time: 0
      ep_commit_time_total: 0
      ep_concurrentDB: 1
      ep_config_file:
      ep_couch_bucket: default
      ep_couch_host: localhost
      ep_couch_port: 11213
      ep_couch_reconnect_sleeptime: 250
      ep_couch_response_timeout: 180000
      ep_data_age: 0
      ep_data_age_highwat: 0
      ep_data_traffic_enabled: 0
      ep_dbinit: 1
      ep_dbname: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default
      ep_degraded_mode: 1
      ep_diskqueue_drain: 0
      ep_diskqueue_fill: 0
      ep_diskqueue_items: 0
      ep_diskqueue_memory: 0
      ep_diskqueue_pending: 0
      ep_exp_pager_stime: 3600
      ep_expired_access: 0
      ep_expired_pager: 0
      ep_expiry_window: 3
      ep_failpartialwarmup: 0
      ep_flush_all: false
      ep_flush_duration_total: 0
      ep_flushall_enabled: 0
      ep_flusher_state: running
      ep_flusher_todo: 0
      ep_getl_default_timeout: 15
      ep_getl_max_timeout: 30
      ep_ht_locks: 5
      ep_ht_size: 3079
      ep_inconsistent_slave_chk: 0
      ep_initfile:
      ep_io_num_read: 0
      ep_io_num_write: 0
      ep_io_read_bytes: 0
      ep_io_write_bytes: 0
      ep_item_begin_failed: 0
      ep_item_commit_failed: 0
      ep_item_flush_expired: 0
      ep_item_flush_failed: 0
      ep_item_num_based_new_chk: 1
      ep_items_rm_from_checkpoints: 0
      ep_keep_closed_chks: 0
      ep_klog_block_size: 4096
      ep_klog_compactor_queue_cap: 500000
      ep_klog_compactor_stime: 3600
      ep_klog_flush: commit2
      ep_klog_max_entry_ratio: 10
      ep_klog_max_log_size: 2147483647
      ep_klog_path:
      ep_klog_sync: commit2
      ep_kv_size: 545218597
      ep_max_checkpoints: 2
      ep_max_data_size: 1048576000
      ep_max_item_size: 20971520
      ep_max_size: 1048576000
      ep_max_txn_size: 10000
      ep_max_vbuckets: 1024
      ep_mem_high_wat: 786432000
      ep_mem_low_wat: 629145600
      ep_mem_tracker_enabled: true
      ep_min_data_age: 0
      ep_mlog_compactor_runs: 0
      ep_mutation_mem_threshold: 0
      ep_num_access_scanner_runs: 0
      ep_num_eject_failures: 0
      ep_num_expiry_pager_runs: 0
      ep_num_non_resident: 0
      ep_num_not_my_vbuckets: 0
      ep_num_ops_del_meta: 0
      ep_num_ops_get_meta: 0
      ep_num_ops_set_meta: 0
      ep_num_pager_runs: 0
      ep_num_value_ejects: 101
      ep_oom_errors: 0
      ep_overhead: 27122008
      ep_pager_active_vb_pcnt: 40
      ep_pager_unbiased_period: 60
      ep_pending_ops: 0
      ep_pending_ops_max: 0
      ep_pending_ops_max_duration: 0
      ep_pending_ops_total: 0
      ep_postInitfile:
      ep_queue_age_cap: 900
      ep_queue_size: 0
      ep_startup_time: 1353928779
      ep_storage_age: 0
      ep_storage_age_highwat: 0
      ep_store_max_concurrency: 10
      ep_store_max_readers: 9
      ep_store_max_readwrite: 1
      ep_stored_val_type:
      ep_tap_ack_grace_period: 300
      ep_tap_ack_initial_sequence_number: 1
      ep_tap_ack_interval: 1000
      ep_tap_ack_window_size: 10
      ep_tap_backfill_resident: 0.9
      ep_tap_backlog_limit: 5000
      ep_tap_backoff_period: 5
      ep_tap_bg_fetch_requeued: 0
      ep_tap_bg_fetched: 0
      ep_tap_bg_max_pending: 500
      ep_tap_keepalive: 300
      ep_tap_noop_interval: 20
      ep_tap_requeue_sleep_time: 0.1
      ep_tap_throttle_cap_pcnt: 10
      ep_tap_throttle_queue_cap: 1000000
      ep_tap_throttle_threshold: 90
      ep_tmp_oom_errors: 0
      ep_too_old: 0
      ep_too_young: 0
      ep_total_cache_size: 0
      ep_total_del_items: 0
      ep_total_enqueued: 0
      ep_total_new_items: 0
      ep_total_persisted: 0
      ep_uncommitted_items: 0
      ep_value_size: 141477046
      ep_vb0: 0
      ep_vb_snapshot_total: 1
      ep_vb_total: 512
      ep_vbucket_del: 0
      ep_vbucket_del_fail: 0
      ep_version: 2.0.0r_140_gde42c8c
      ep_waitforwarmup: 0
      ep_warmup: 1
      ep_warmup_batch_size: 1000
      ep_warmup_dups: 0
      ep_warmup_min_items_threshold: 100
      ep_warmup_min_memory_threshold: 100
      ep_warmup_oom: 0
      ep_warmup_thread: complete
      ep_warmup_time: 39791138
      get_hits: 0
      get_misses: 0
      incr_hits: 0
      incr_misses: 0
      libevent: 2.0.11-stable
      limit_maxbytes: 67108864
      listen_disabled_num: 0
      max_conns_on_port_11209: 1000
      max_conns_on_port_11210: 9000
      mem_used: 637628416
      pid: 1060
      pointer_size: 64
      rejected_conns: 0
      threads: 4
      time: 1353931933
      total_connections: 10
      uptime: 3157
      vb_active_curr_items: 0
      vb_active_eject: 0
      vb_active_expired: 0
      vb_active_ht_memory: 0
      vb_active_itm_memory: 0
      vb_active_meta_data_memory: 0
      vb_active_num: 0
      vb_active_num_non_resident: 0
      vb_active_num_ref_ejects: 0
      vb_active_num_ref_items: 0
      vb_active_ops_create: 0
      vb_active_ops_delete: 0
      vb_active_ops_reject: 0
      vb_active_ops_update: 0
      vb_active_perc_mem_resident: 0
      vb_active_queue_age: 0
      vb_active_queue_drain: 0
      vb_active_queue_fill: 0
      vb_active_queue_memory: 0
      vb_active_queue_pending: 0
      vb_active_queue_size: 0
      vb_dead_num: 512
      vb_pending_curr_items: 0
      vb_pending_eject: 0
      vb_pending_expired: 0
      vb_pending_ht_memory: 0
      vb_pending_itm_memory: 0
      vb_pending_meta_data_memory: 0
      vb_pending_num: 0
      vb_pending_num_non_resident: 0
      vb_pending_num_ref_ejects: 0
      vb_pending_num_ref_items: 0
      vb_pending_ops_create: 0
      vb_pending_ops_delete: 0
      vb_pending_ops_reject: 0
      vb_pending_ops_update: 0
      vb_pending_perc_mem_resident: 0
      vb_pending_queue_age: 0
      vb_pending_queue_drain: 0
      vb_pending_queue_fill: 0
      vb_pending_queue_memory: 0
      vb_pending_queue_pending: 0
      vb_pending_queue_size: 0
      vb_replica_curr_items: 0
      vb_replica_eject: 0
      vb_replica_expired: 0
      vb_replica_ht_memory: 0
      vb_replica_itm_memory: 0
      vb_replica_meta_data_memory: 0
      vb_replica_num: 0
      vb_replica_num_non_resident: 0
      vb_replica_num_ref_ejects: 0
      vb_replica_num_ref_items: 0
      vb_replica_ops_create: 0
      vb_replica_ops_delete: 0
      vb_replica_ops_reject: 0
      vb_replica_ops_update: 0
      vb_replica_perc_mem_resident: 0
      vb_replica_queue_age: 0
      vb_replica_queue_drain: 0
      vb_replica_queue_fill: 0
      vb_replica_queue_memory: 0
      vb_replica_queue_pending: 0
      vb_replica_queue_size: 0
      version: 1.4.4_600_g7ea975a
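A dump like the one above is easier to read once the handful of warmup-related counters are pulled out. The following is a minimal sketch, not part of the original report: it assumes the stats are available as plain `name: value` text, and the helper names `parse_stats` and `warmup_summary` are hypothetical. In this dump `ep_warmup_thread` is `complete`, while `ep_data_traffic_enabled` is 0, `vb_active_num` is 0 and `vb_dead_num` is 512, which appears consistent with a node that has finished warmup but has not been activated.

```python
def parse_stats(text):
    """Parse 'name: value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so values such as
        # "c:/Program Files/..." or "2012-11-27 10:00:00" stay intact.
        name, _, value = line.partition(":")
        stats[name.strip()] = value.strip()
    return stats


def warmup_summary(stats):
    """Pick out the counters that relate to warmup and vbucket state."""
    return {
        "warmup_complete": stats.get("ep_warmup_thread") == "complete",
        "data_traffic_enabled": stats.get("ep_data_traffic_enabled") == "1",
        "degraded_mode": stats.get("ep_degraded_mode") == "1",
        "active_vbuckets": int(stats.get("vb_active_num", "0")),
        "dead_vbuckets": int(stats.get("vb_dead_num", "0")),
    }


# A few lines copied from the dump in this ticket.
SAMPLE = """
ep_alog_path: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default/access.log
ep_data_traffic_enabled: 0
ep_degraded_mode: 1
ep_warmup_thread: complete
vb_active_num: 0
vb_dead_num: 512
"""

summary = warmup_summary(parse_stats(SAMPLE))
for key, value in summary.items():
    print(f"{key}: {value}")
```

Splitting on the first colon only is a deliberate choice here, because several values in the dump (Windows paths, timestamps) contain colons themselves.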

      1. home_page.png
        166 kB
      2. servers_page.png
        170 kB

        Activity

        iryna iryna added a comment -

        https://s3.amazonaws.com/bugdb/jira/MB-7260/fccb2891/diag-146.txt.gz
        https://s3.amazonaws.com/bugdb/jira/MB-7260/fccb2891/diag-147.txt.gz
        https://s3.amazonaws.com/bugdb/jira/MB-7260/fccb2891/diag-151.txt.gz

        iryna iryna added a comment -

        screenshots attached

        iryna iryna added a comment -

        The .151 node stayed pending for about 1 hour, until I started the failed-over node.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Not Windows-specific and not a bug.

        You can see that .135 was not in fact failed over. There's a known problem where all nodes need to be warmed up in order to complete the warmup of even a single node.

        In this case .135 is still part of the cluster, but it's down, and so it prevents the janitor from activating .151.
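The rule described in this comment can be written down as a toy model. This is only an illustration of the stated behaviour, not the janitor's actual code: the `Node` type, its fields and both functions are hypothetical, and treating .146 and .147 as the other two cluster members is an assumption based on the diag file names attached to this ticket.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    in_cluster: bool  # False once the node has been failed over
    warmed_up: bool   # False if the node is down or still warming up


def nodes_blocking_activation(nodes):
    """Names of nodes that are still cluster members but not warmed up."""
    return [n.name for n in nodes if n.in_cluster and not n.warmed_up]


def can_activate(nodes):
    """Activation proceeds only when no cluster member is blocking it."""
    return not nodes_blocking_activation(nodes)


# State as described in the comment: .135 is down but was never failed over.
reported = [
    Node(".135", in_cluster=True, warmed_up=False),
    Node(".151", in_cluster=True, warmed_up=True),
    Node(".146", in_cluster=True, warmed_up=True),  # assumed member
    Node(".147", in_cluster=True, warmed_up=True),  # assumed member
]
print(nodes_blocking_activation(reported))  # ['.135']
print(can_activate(reported))               # False

# Either of these would unblock .151 in this model:
# failing .135 over, or starting it again so it warms up.
failed_over = [Node(".135", False, False)] + reported[1:]
restarted = [Node(".135", True, True)] + reported[1:]
print(can_activate(failed_over))  # True
print(can_activate(restarted))    # True
```

The second case matches what the reporter observed: .151 left the pending state once the stopped node was started again.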


          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            iryna iryna
