Details

    • Type: Technical task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.1.0
    • Component/s: ns_server
    • Security Level: Public

      Description

      steps:

      1. cluster #1
      2 nodes: 10.3.121.112, 10.3.121.113
      3 buckets with ~700K items in each:
      default,
      sasl:sasl( replica=2)
      standard

      cluster #2
      10.3.121.115:
      3 buckets:
      default1:default1( replica=2)
      sasl( no password, replica is disabled)

      2. cluster#2 rebalance in 10.3.121.116

      3. during rebalance create on 10.3.121.112 create xdcr replications:

      default this cluster bucket "default1" on cluster "http://10.3.121.115/" Replicating Last 10 errors on change
      sasl this cluster bucket "sasl" on cluster "http://10.3.121.115/" Starting Up on change
      standard this cluster bucket "standard" on cluster "http://10.3.121.115/" Replicating Last 10 errors on change

      and on 10.3.121.115 create:
      standard this cluster bucket "standard" on cluster "http://10.3.121.112:8091/" Replicating on change

      see screenshot with details

      result:

      [ns_server:debug,2012-11-18T4:22:25.816,ns_1@10.3.121.115:ns_config_log<0.338.0>:ns_config_log:log_common:111]config change:
      rebalance_status ->

      {none,<<"Rebalance stopped by janitor.">>}
      [ns_server:info,2012-11-18T4:22:25.992,ns_1@10.3.121.115:<0.5807.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/209">>
      [ns_server:info,2012-11-18T4:22:26.405,ns_1@10.3.121.115:<0.5839.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/210">>
      [ns_server:info,2012-11-18T4:22:26.725,ns_1@10.3.121.115:<0.5860.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/211">>
      [ns_server:info,2012-11-18T4:22:27.334,ns_1@10.3.121.115:<0.5915.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/212">>
      [ns_server:info,2012-11-18T4:22:27.810,ns_1@10.3.121.115:<0.5937.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/213">>
      [ns_server:info,2012-11-18T4:22:28.086,ns_1@10.3.121.115:<0.5980.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/214">>
      [ns_server:error,2012-11-18T4:22:28.602,ns_1@10.3.121.115:ns_memcached-standard<0.21850.0>:ns_memcached:handle_info:630]handle_info(ensure_bucket,..) took too long: 853680 us
      [ns_server:info,2012-11-18T4:22:28.785,ns_1@10.3.121.115:<0.6043.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/215">>
      [ns_server:error,2012-11-18T4:22:29.450,ns_1@10.3.121.115:<0.9097.0>:ns_memcached:verify_report_long_call:297]call {get_meta,<<"default111-5c34112c98cb69d2">>,883} took too long: 551954 us
      [ns_server:error,2012-11-18T4:22:29.451,ns_1@10.3.121.115:<0.21854.0>:ns_memcached:verify_report_long_call:297]call {get_meta,<<"standard-5e1b18c4c6a6d316">>,751} took too long: 549644 us
      [ns_server:info,2012-11-18T4:22:29.652,ns_1@10.3.121.115:<0.6111.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/216">>
      [ns_server:info,2012-11-18T4:22:29.918,ns_1@10.3.121.115:<0.6151.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/217">>
      [ns_server:error,2012-11-18T4:22:29.968,ns_1@10.3.121.115:ns_memcached-sasl<0.15429.0>:ns_memcached:handle_info:630]handle_info(ensure_bucket,..) took too long: 1629801 us
      [ns_server:info,2012-11-18T4:22:30.052,ns_1@10.3.121.115:<0.6192.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/218">>
      [ns_server:info,2012-11-18T4:22:30.400,ns_1@10.3.121.115:<0.6233.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/219">>
      [ns_server:info,2012-11-18T4:22:30.762,ns_1@10.3.121.115:<0.6261.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/220">>
      [ns_server:info,2012-11-18T4:22:30.900,ns_1@10.3.121.115:<0.6285.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/221">>
      [ns_server:info,2012-11-18T4:22:31.548,ns_1@10.3.121.115:<0.6348.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/222">>
      [ns_server:error,2012-11-18T4:22:31.882,ns_1@10.3.121.115:<0.9099.0>:ns_memcached:verify_report_long_call:297]call {stats,<<>>} took too long: 627784 us
      [ns_server:info,2012-11-18T4:22:31.920,ns_1@10.3.121.115:<0.6451.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/223">>
      [ns_server:error,2012-11-18T4:22:31.947,ns_1@10.3.121.115:<0.21855.0>:ns_memcached:verify_report_long_call:297]call topkeys took too long: 5994193 us
      [stats:warn,2012-11-18T4:22:32.332,ns_1@10.3.121.115:<0.21861.0>:stats_collector:latest_tick:223]Dropped 6 ticks
      [ns_server:info,2012-11-18T4:22:32.661,ns_1@10.3.121.115:<0.6494.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/224">>
      [ns_server:info,2012-11-18T4:22:33.516,ns_1@10.3.121.115:<0.6536.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/225">>
      [ns_server:error,2012-11-18T4:22:33.848,ns_1@10.3.121.115:<0.21855.0>:ns_memcached:verify_report_long_call:297]call {stats,<<>>} took too long: 692405 us
      [ns_server:info,2012-11-18T4:22:34.464,ns_1@10.3.121.115:<0.6579.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/226">>
      [ns_server:error,2012-11-18T4:22:34.947,ns_1@10.3.121.115:<0.21855.0>:ns_memcached:verify_report_long_call:297]call list_vbuckets took too long: 700928 us
      [stats:warn,2012-11-18T4:22:35.000,ns_1@10.3.121.115:<0.21861.0>:stats_collector:latest_tick:223]Dropped 1 ticks
      [ns_server:info,2012-11-18T4:22:35.363,ns_1@10.3.121.115:<0.6615.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/227">>
      [ns_server:info,2012-11-18T4:22:35.662,ns_1@10.3.121.115:<0.6639.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/228">>
      [ns_server:error,2012-11-18T4:22:36.373,ns_1@10.3.121.115:<0.15433.0>:ns_memcached:verify_report_long_call:297]call topkeys took too long: 4081382 us
      [ns_server:info,2012-11-18T4:22:36.576,ns_1@10.3.121.115:<0.6716.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/229">>
      [ns_server:info,2012-11-18T4:22:37.330,ns_1@10.3.121.115:<0.6747.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/230">>
      [ns_server:info,2012-11-18T4:22:37.680,ns_1@10.3.121.115:<0.6772.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/231">>
      [ns_server:info,2012-11-18T4:22:37.960,ns_1@10.3.121.115:<0.6809.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/232">>
      [ns_server:error,2012-11-18T4:22:37.843,ns_1@10.3.121.115:ns_memcached-default1<0.9095.0>:ns_memcached:handle_info:630]handle_info(ensure_bucket,..) took too long: 1210948 us
      [ns_server:info,2012-11-18T4:22:38.306,ns_1@10.3.121.115:<0.6865.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/233">>
      [ns_server:info,2012-11-18T4:22:38.777,ns_1@10.3.121.115:<0.6932.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/234">>
      [ns_server:error,2012-11-18T4:22:38.955,ns_1@10.3.121.115:<0.15434.0>:ns_memcached:verify_report_long_call:297]call {stats,<<>>} took too long: 843491 us
      [ns_server:error,2012-11-18T4:22:39.098,ns_1@10.3.121.115:ns_memcached-standard<0.21850.0>:ns_memcached:handle_info:630]handle_info(ensure_bucket,..) took too long: 1347335 us
      [ns_server:error,2012-11-18T4:22:39.272,ns_1@10.3.121.115:ns_memcached-sasl<0.15429.0>:ns_memcached:handle_info:630]handle_info(ensure_bucket,..) took too long: 913668 us




      {cluster_compat_version,
      [{'_vclock',[{'ns_1@127.0.0.1',{1,63520456425}}]},
      2,0]},
      {counters,
      [{'_vclock',
      [{'ns_1@10.3.121.115',{1,63520459705}}]},
      {rebalance_start,1}]},
      {dynamic_config_version,
      [{'_vclock',[{'ns_1@127.0.0.1',{4,63520456425}}]},
      2,0]},
      {email_alerts,
      [{'_vclock',[{'ns_1@127.0.0.1',{1,63520456424}}]},
      {recipients,["root@localhost"]},
      {sender,"couchbase@localhost"},
      {enabled,false},
      {email_server,
      [{user,[]},
      {pass,[]},
      {host,"localhost"},
      {port,25},
      {encrypt,false}]},
      {alerts, [auto_failover_node, auto_failover_maximum_reached, auto_failover_other_nodes_down, auto_failover_cluster_too_small,ip,disk, overhead,ep_oom_errors, ep_item_commit_failed]}]},
      {fast_warmup,
      [{fast_warmup_enabled,true},
      {min_memory_threshold,10},
      {min_items_threshold,10}]},
      {index_aware_rebalance_disabled,false},
      {max_bucket_count,10},
      {memory_quota,
      [{'_vclock',
      [{'ns_1@127.0.0.1',{1,63520456426}}]}|
      3082]},
      {nodes_wanted,
      [{'_vclock',
      [{'ns_1@10.3.121.115',{2,63520459703}}]},
      'ns_1@10.3.121.115','ns_1@10.3.121.116']},
      {otp,
      [{'_vclock',
      [{'ns_1@10.3.121.116',{2,63520459703}},
      {'ns_1@127.0.0.1',{1,63520456425}}]},
      {cookie,khxnkzsiiskgxwef}]},
      {rebalance_status,
      [{'_vclock',
      [{'ns_1@10.3.121.115',{2,63520460544}}]}|
      {none,<<"Rebalance stopped by janitor.">>}

      ]},
      {rebalancer_pid,
      [{'_vclock',
      [{'ns_1@10.3.121.115',{2,63520460544}}]}|
      undefined]},
      {remote_clusters,
      [{'_vclock',
      [{'ns_1@10.3.121.115',{1,63520459931}}]},
      [

      {uuid,<<"274a6cf7338d4686038a03878a2a176d">>}

      ,

      {name,"http://10.3.121.112:8091/"}

      ,

      {hostname,"10.3.121.112"}

      ,

      {username,"Administrator"}

      ,

      {password,"password"}

      ]]},
      {replication,[

      {enabled,true}

      ]},
      {rest,[

      {port,8091}

      ]},

      [ns_server:info,2012-11-18T4:22:23.761,ns_1@10.3.121.115:<0.5617.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/203">>
      [ns_server:info,2012-11-18T4:22:24.425,ns_1@10.3.121.115:<0.5681.4>:compaction_daemon:spawn_vbucket_compactor:639]Compacting <<"default1/204">>
      [user:info,2012-11-18T4:22:24.496,ns_1@10.3.121.115:ns_config<0.331.0>:ns_janitor:maybe_stop_rebalance_status:147]Resetting rebalance status since it's not really running

      after failure XDCR sasl bucket replication status is Starting Up, and only 871 items were transfered

      1. 10.3.121.112-8091-diag.txt.gz
        5.12 MB
        Andrei Baranouski
      2. 10.3.121.113-8091-diag.txt.gz
        10.92 MB
        Andrei Baranouski
      3. 10.3.121.115-8091-diag.txt.gz
        9.87 MB
        Andrei Baranouski
      4. 10.3.121.116-8091-diag.txt.gz
        2.66 MB
        Andrei Baranouski
      1. replication_buckets.png
        325 kB
      2. replication.png
        288 kB
      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        andreibaranouski Andrei Baranouski created issue -
        farshid Farshid Ghods (Inactive) made changes -
        Field Original Value New Value
        Fix Version/s 2.0 [ 10114 ]
        Component/s cross-datacenter-replication [ 10136 ]
        Component/s ns_server [ 10019 ]
        farshid Farshid Ghods (Inactive) made changes -
        Labels 2.0-release-notes
        Hide
        steve Steve Yen added a comment -

        moved to 2.0.1 per bug scrub

        Show
        steve Steve Yen added a comment - moved to 2.0.1 per bug scrub
        steve Steve Yen made changes -
        Fix Version/s 2.0.1 [ 10399 ]
        Fix Version/s 2.0 [ 10114 ]
        dipti Dipti Borkar made changes -
        Fix Version/s 2.0.2 [ 10418 ]
        Fix Version/s 2.0.1 [ 10399 ]
        farshid Farshid Ghods (Inactive) made changes -
        Parent MB-6595 [ 19671 ]
        Issue Type Bug [ 1 ] Technical task [ 7 ]
        Hide
        kzeller kzeller added a comment -

        Added to RN as:

        One of the internal maintenance processes within Couchbase Server for checking vBuckets
        will stop node rebalance when you create replication via XDCR on a source cluster.

        Show
        kzeller kzeller added a comment - Added to RN as: One of the internal maintenance processes within Couchbase Server for checking vBuckets will stop node rebalance when you create replication via XDCR on a source cluster.
        Hide
        junyi Junyi Xie (Inactive) added a comment -

        Looks to me this is a rebalance issue (rebalance stopped when the system is under some workload) and unrelated to XDCR component. XDCR just exposed the issue.

        Remove xdcr from the component and watch this bug.

        Show
        junyi Junyi Xie (Inactive) added a comment - Looks to me this is a rebalance issue (rebalance stopped when the system is under some workload) and unrelated to XDCR component. XDCR just exposed the issue. Remove xdcr from the component and watch this bug.
        junyi Junyi Xie (Inactive) made changes -
        Component/s cross-datacenter-replication [ 10136 ]
        Hide
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Agree with Junyi. It's about timeouts, not directly about xdcr.

        However XDCR evidently pushes system more heavily into timeoutful state.

        Let me also note that it's perhaps fixed already by +A and other timeout-fixing commits.

        Show
        alkondratenko Aleksey Kondratenko (Inactive) added a comment - Agree with Junyi. It's about timeouts, not directly about xdcr. However XDCR evidently pushes system more heavily into timeoutful state. Let me also note that it's perhaps fixed already by +A and other timeout-fixing commits.
        Hide
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Worth re-testing

        Show
        alkondratenko Aleksey Kondratenko (Inactive) added a comment - Worth re-testing
        alkondratenko Aleksey Kondratenko (Inactive) made changes -
        Assignee Aleksey Kondratenko [ alkondratenko ] Farshid Ghods [ farshid ]
        farshid Farshid Ghods (Inactive) made changes -
        Assignee Farshid Ghods [ farshid ] Ketaki Gangal [ ketaki ]
        farshid Farshid Ghods (Inactive) made changes -
        Planned Start 2013-03-01 12:00 (re-scheduled: set planned start date to today)
        Planned End 2013-03-01 12:00 (generated: based on start date + estimation * velocity)
        Hide
        maria Maria McDuff (Inactive) added a comment -

        QE — pls retest in latest 2.0.2 build. Thanks.

        Show
        maria Maria McDuff (Inactive) added a comment - QE — pls retest in latest 2.0.2 build. Thanks.
        maria Maria McDuff (Inactive) made changes -
        Assignee Ketaki Gangal [ ketaki ] Andrei Baranouski [ andreibaranouski ]
        Hide
        maria Maria McDuff (Inactive) added a comment -

        andrei, can u pls re-test and update this bug? Thanks.

        Show
        maria Maria McDuff (Inactive) added a comment - andrei, can u pls re-test and update this bug? Thanks.
        Hide
        andreibaranouski Andrei Baranouski added a comment - - edited

        rebalance passed with the same steps on 2.0.2-789

        Show
        andreibaranouski Andrei Baranouski added a comment - - edited rebalance passed with the same steps on 2.0.2-789
        andreibaranouski Andrei Baranouski made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        andreibaranouski Andrei Baranouski added a comment -

        Maria, can we close it now?

        Show
        andreibaranouski Andrei Baranouski added a comment - Maria, can we close it now?
        andreibaranouski Andrei Baranouski made changes -
        Assignee Andrei Baranouski [ andreibaranouski ] Maria McDuff [ maria ]
        Hide
        maria Maria McDuff (Inactive) added a comment -

        yes, pls close. thanks.

        Show
        maria Maria McDuff (Inactive) added a comment - yes, pls close. thanks.
        maria Maria McDuff (Inactive) made changes -
        Assignee Maria McDuff [ maria ] Andrei Baranouski [ andreibaranouski ]
        andreibaranouski Andrei Baranouski made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            andreibaranouski Andrei Baranouski
            Reporter:
            andreibaranouski Andrei Baranouski
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes