Couchbase Server / MB-24480

Rebalance of FTS indexes really slow in system test

Details

    • Bug
    • Resolution: Won't Do
    • Critical
    • 5.0.0
    • 5.0.0
    • fts
    • Untriaged
    • Unknown

    Description

      Rebalance got stuck for 10 hours in the FTS component system test against build 5.0.0-2915 (http://qa.sc.couchbase.com/job/centos-systest-launcher/932/console). The following report was seen in the logs around the same time:

      per_node_processes('ns_1@172.23.108.103') =
           {<0.16826.1>,
            [{registered_name,[]},
             {status,waiting},
             {initial_call,{proc_lib,init_p,5}},
             {backtrace,[<<"Program counter: 0x00007fbc74de9f28 (gen_server:loop/6 + 264)">>,
                         <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,
                         <<>>,
                         <<"0x00007fbc275af580 Return addr 0x00007fbc74dc4528 (proc_lib:init_p_do_apply/3 + 56)">>,
                         <<"y(0)     []">>,<<"y(1)     10000">>,
                         <<"y(2)     dcp_proxy">>,
                         <<"y(3)     {state,#Port<0.10293>,{producer,\"replication:ns_1@172.23.108.97->ns_1@172.23.108.103:other-2\",'ns_1@172.23.108.9">>,
                         <<"y(4)     <0.16826.1>">>,<<"y(5)     <0.16776.1>">>,<<>>,
                         <<"0x00007fbc275af5b8 Return addr 0x0000000000891848 (<terminate process normally>)">>,
                         <<"y(0)     Catch 0x00007fbc74dc4548 (proc_lib:init_p_do_apply/3 + 88)">>,
                         <<>>]},
             {error_handler,error_handler},
             {garbage_collection,[{min_bin_vheap_size,46422},
                                  {min_heap_size,233},
                                  {fullsweep_after,512},
                                  {minor_gcs,431}]},
             {heap_size,17731},
             {total_heap_size,139267},
             {links,[<0.16776.1>,#Port<0.10293>]},
             {monitors,[]},
             {monitored_by,[<0.2770.0>]},
             {memory,1115088},
             {messages,[]},
             {message_queue_len,0},
             {reductions,4127536},
             {trap_exit,false},
             {current_location,{gen_server,loop,6,
                                           [{file,"gen_server.erl"},{line,358}]}},
             {dictionary,[{'$ancestors',['dcp_replicator-other-2-ns_1@172.23.108.97',
                                         'dcp_sup-other-2',
                                         'single_bucket_kv_sup-other-2',
                                         ns_bucket_sup,ns_bucket_worker_sup,
                                         ns_server_sup,ns_server_nodes_sup,
                                         <0.171.0>,ns_server_cluster_sup,<0.89.0>]},
                          {'$initial_call',{dcp_proxy,init,1}}]}]}
           {<0.16777.1>,
            [{registered_name,'dcp_consumer_conn-other-2-ns_1@172.23.108.97'},
             {status,waiting},
             {initial_call,{proc_lib,init_p,5}},
             {backtrace,[<<"Program counter: 0x00007fbc74de9f28 (gen_server:loop/6 + 264)">>,
                         <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,
                         <<>>,
                         <<"0x00007fbc6ceea4a8 Return addr 0x00007fbc74dc4528 (proc_lib:init_p_do_apply/3 + 56)">>,
                         <<"y(0)     []">>,<<"y(1)     10000">>,
                         <<"y(2)     dcp_proxy">>,
                         <<"y(3)     {state,#Port<0.10291>,{consumer,\"replication:ns_1@172.23.108.97->ns_1@172.23.108.103:other-2\",'ns_1@172.23.108.1">>,
                         <<"y(4)     <0.16777.1>">>,<<"y(5)     <0.16776.1>">>,<<>>,
                         <<"0x00007fbc6ceea4e0 Return addr 0x0000000000891848 (<terminate process normally>)">>,
                         <<"y(0)     Catch 0x00007fbc74dc4548 (proc_lib:init_p_do_apply/3 + 88)">>,
                         <<>>]},
             {error_handler,error_handler},
             {garbage_collection,[{min_bin_vheap_size,46422},
                                  {min_heap_size,233},
                                  {fullsweep_after,512},
                                  {minor_gcs,47}]},
             {heap_size,987},
             {total_heap_size,3573},
             {links,[<0.16776.1>,#Port<0.10291>]},
             {monitors,[]},
             {monitored_by,[<0.2770.0>]},
             {memory,29536},
             {messages,[]},
             {message_queue_len,0},
             {reductions,5773477},
             {trap_exit,false},
             {current_location,{gen_server,loop,6,
                                           [{file,"gen_server.erl"},{line,358}]}},
             {dictionary,[{'$ancestors',['dcp_replicator-other-2-ns_1@172.23.108.97',
                                         'dcp_sup-other-2',
                                         'single_bucket_kv_sup-other-2',
                                         ns_bucket_sup,ns_bucket_worker_sup,
                                         ns_server_sup,ns_server_nodes_sup,
                                         <0.171.0>,ns_server_cluster_sup,<0.89.0>]},
                          {'$initial_call',{dcp_proxy,init,1}}]}]}
           {<0.16776.1>,
            [{registered_name,'dcp_replicator-other-2-ns_1@172.23.108.97'},
             {status,waiting},
             {initial_call,{proc_lib,init_p,5}},
             {backtrace,[<<"Program counter: 0x00007fbc74de9f28 (gen_server:loop/6 + 264)">>,
                         <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,
                         <<>>,
                         <<"0x00007fbc2754d3c0 Return addr 0x00007fbc74dc4528 (proc_lib:init_p_do_apply/3 + 56)">>,
                         <<"y(0)     []">>,<<"y(1)     infinity">>,
                         <<"y(2)     dcp_replicator">>,
                         <<"y(3)     {state,[{<0.16777.1>,#Port<0.10291>},{<0.16826.1>,#Port<0.10293>}],<0.16777.1>,\"replication:ns_1@172.23.108.97->">>,
                         <<"y(4)     'dcp_replicator-other-2-ns_1@172.23.108.97'">>,
                         <<"y(5)     <0.4725.0>">>,<<>>,
                         <<"0x00007fbc2754d3f8 Return addr 0x0000000000891848 (<terminate process normally>)">>,
                         <<"y(0)     Catch 0x00007fbc74dc4548 (proc_lib:init_p_do_apply/3 + 88)">>,
                         <<>>]},
             {error_handler,error_handler},
             {garbage_collection,[{min_bin_vheap_size,46422},
                                  {min_heap_size,233},
                                  {fullsweep_after,512},
                                  {minor_gcs,149}]},
             {heap_size,1598},
             {total_heap_size,3196},
             {links,[<0.16777.1>,<0.16826.1>,<0.4725.0>]},
             {monitors,[]},
             {monitored_by,[<0.2502.0>]},
             {memory,26560},
             {messages,[]},
             {message_queue_len,0},
             {reductions,13737},
             {trap_exit,true},
             {current_location,{gen_server,loop,6,
                                           [{file,"gen_server.erl"},{line,358}]}},
             {dictionary,[{'$ancestors',['dcp_sup-other-2',
                                         'single_bucket_kv_sup-other-2',
                                         ns_bucket_sup,ns_bucket_worker_sup,
                                         ns_server_sup,ns_server_nodes_sup,
                                         <0.171.0>,ns_server_cluster_sup,<0.89.0>]},
                          {'$initial_call',{dcp_replicator,init,1}}]}]}
           {<0.11771.1>,
            [{registered_name,[]},
             {status,waiting},
             {initial_call,{proc_lib,init_p,5}},
             {backtrace,[<<"Program counter: 0x00007fbc74de9f28 (gen_server:loop/6 + 264)">>,
                         <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,
                         <<>>,
                         <<"0x00007fbc1fa9ff10 Return addr 0x00007fbc74dc4528 (proc_lib:init_p_do_apply/3 + 56)">>,
                         <<"y(0)     []">>,<<"y(1)     10000">>,
                         <<"y(2)     dcp_proxy">>,
                         <<"y(3)     {state,#Port<0.10162>,{producer,\"replication:ns_1@172.23.98.135->ns_1@172.23.108.103:other-2\",'ns_1@172.23.98.13">>,
                         <<"y(4)     <0.11771.1>">>,<<"y(5)     <0.11706.1>">>,<<>>,
                         <<"0x00007fbc1fa9ff48 Return addr 0x0000000000891848 (<terminate process normally>)">>,
                         <<"y(0)     Catch 0x00007fbc74dc4548 (proc_lib:init_p_do_apply/3 + 88)">>,
                         <<>>]},
             {error_handler,error_handler},
             {garbage_collection,[{min_bin_vheap_size,46422},
                                  {min_heap_size,233},
                                  {fullsweep_after,512},
                                  {minor_gcs,432}]},
             {heap_size,17731},
             {total_heap_size,139267},
             {links,[<0.11706.1>,#Port<0.10162>]},
             {monitors,[]},
             {monitored_by,[<0.2770.0>]},
             {memory,1115088},
             {messages,[]},
             {message_queue_len,0},
             {reductions,4134661},
             {trap_exit,false},
             {current_location,{gen_server,loop,6,
                                           [{file,"gen_server.erl"},{line,358}]}},
             {dictionary,[{'$ancestors',['dcp_replicator-other-2-ns_1@172.23.98.135',
                                         'dcp_sup-other-2',
                                         'single_bucket_kv_sup-other-2',
                                         ns_bucket_sup,ns_bucket_worker_sup,
                                         ns_server_sup,ns_server_nodes_sup,
                                         <0.171.0>,ns_server_cluster_sup,<0.89.0>]},
                          {'$initial_call',{dcp_proxy,init,1}}]}]}
           {<0.11739.1>,
            [{registered_name,'dcp_consumer_conn-other-2-ns_1@172.23.98.135'},
             {status,waiting},
             {initial_call,{proc_lib,init_p,5}},
             {backtrace,[<<"Program counter: 0x00007fbc74de9f28 (gen_server:loop/6 + 264)">>,
                         <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,
                         <<>>,
                         <<"0x00007fbc015c5a08 Return addr 0x00007fbc74dc4528 (proc_lib:init_p_do_apply/3 + 56)">>,
                         <<"y(0)     []">>,<<"y(1)     10000">>,
                         <<"y(2)     dcp_proxy">>,
                         <<"y(3)     {state,#Port<0.10144>,{consumer,\"replication:ns_1@172.23.98.135->ns_1@172.23.108.103:other-2\",'ns_1@172.23.108.1">>,
                         <<"y(4)     <0.11739.1>">>,<<"y(5)     <0.11706.1>">>,<<>>,
                         <<"0x00007fbc015c5a40 Return addr 0x0000000000891848 (<terminate process normally>)">>,
                         <<"y(0)     Catch 0x00007fbc74dc4548 (proc_lib:init_p_do_apply/3 + 88)">>,
                         <<>>]},
             {error_handler,error_handler},
             {garbage_collection,[{min_bin_vheap_size,46422},
                                  {min_heap_size,233},
                                  {fullsweep_after,512},
                                  {minor_gcs,264}]},
             {heap_size,376},
             {total_heap_size,2962},
             {links,[<0.11706.1>,#Port<0.10144>]},
             {monitors,[]},
             {monitored_by,[<0.2770.0>]},
             {memory,24648},
             {messages,[]},
             {message_queue_len,0},
             {reductions,5783053},
             {trap_exit,false},
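
      In the excerpt above, every listed process reports status waiting and an empty message queue. A minimal sketch of how such a dump could be condensed to one line per process is given here, assuming the per_node_processes layout shown above. The function name summarize_processes and the abbreviated SAMPLE text are illustrative only and are not part of ns_server or the collectinfo tooling.

```python
import re

# Start of one process entry in the dump, e.g. "{<0.16826.1>,"
ENTRY_RE = re.compile(r"^\s*\{(<\d+\.\d+\.\d+>),\s*$", re.MULTILINE)

# Abbreviated, hand-shortened excerpt of the dump above (illustrative only).
SAMPLE = """\
per_node_processes('ns_1@172.23.108.103') =
     {<0.16826.1>,
      [{registered_name,[]},
       {status,waiting},
       {messages,[]},
       {message_queue_len,0}]}
     {<0.16777.1>,
      [{registered_name,'dcp_consumer_conn-other-2-ns_1@172.23.108.97'},
       {status,waiting},
       {message_queue_len,0}]}
"""


def summarize_processes(dump: str) -> list:
    """Return pid, registered name, status and queue length per process entry."""
    starts = list(ENTRY_RE.finditer(dump))
    summaries = []
    for i, match in enumerate(starts):
        # An entry runs until the next entry header or the end of the text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(dump)
        body = dump[match.end():end]
        name = re.search(r"\{registered_name,(?:'([^']*)'|\[\])\}", body)
        status = re.search(r"\{status,(\w+)\}", body)
        qlen = re.search(r"\{message_queue_len,(\d+)\}", body)
        summaries.append({
            "pid": match.group(1),
            # Unregistered processes show "[]" and are reported as None.
            "name": name.group(1) if name and name.group(1) else None,
            "status": status.group(1) if status else None,
            "queue_len": int(qlen.group(1)) if qlen else None,
        })
    return summaries


for proc in summarize_processes(SAMPLE):
    print(proc["pid"], proc["name"], proc["status"], proc["queue_len"])
```

      The sketch relies on regular expressions rather than a real Erlang term parser, so it only handles the fields it names. Fields missing from a truncated entry are reported as None.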
      

      Logs can be found here:

      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.108.103.zip (master)
      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.108.104.zip
      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.108.107.zip
      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.108.108.zip
      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.108.97.zip (fts node)
      https://s3.amazonaws.com/bugdb/jira/may22/collectinfo-2017-05-22T061905-ns_1%40172.23.98.135.zip
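
      The archive names above carry the node name in URL-encoded form (%40 stands for @). A small sketch for recovering the node name from one of these URLs is given here, assuming the collectinfo-<timestamp>-<node>.zip naming seen in the list. The helper name node_from_url is hypothetical.

```python
import re
from urllib.parse import unquote, urlsplit


def node_from_url(url: str):
    """Extract the ns_1@<ip> node name from a collectinfo archive URL."""
    # Take the last path segment and decode percent-escapes such as %40.
    filename = unquote(urlsplit(url).path.rsplit("/", 1)[-1])
    match = re.search(r"(ns_1@[\d.]+)\.zip$", filename)
    return match.group(1) if match else None


url = ("https://s3.amazonaws.com/bugdb/jira/may22/"
       "collectinfo-2017-05-22T061905-ns_1%40172.23.108.97.zip")
print(node_from_url(url))
```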

          People

            Steve Yen
            Arunkumar Senthilnathan (Inactive)
            Votes: 0
            Watchers: 8
