Couchbase Server / MB-39542

[Collections] Rebalance in/out with collections TTL data load fails in volume testing


Details

    Description

      Script to repro:

      ./testrunner -i /tmp/durability_volume.ini  -t volumetests.Collections.volume.test_volume_taf,nodes_init=4,replicas=1,num_failed_nodes=1,new_replica=1,graceful=True,bucket_spec=multi_bucket.buckets_for_volume_tests_with_ttl,iterations=1,doc_and_collection_ttl=True,data_load_spec=volume_test_load_with_doc_ttl,sdk_client_pool=True,quota_percent=100,rerun=False,skip_collections_cleanup=True,skip_cleanup=True
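
To make the parameters of the repro command easier to review, the `-t` argument can be split into the test path and its key=value options. This is only a sketch that assumes the argument is a comma-separated list with the test path first; the helper name `parse_test_spec` is hypothetical and not part of testrunner.

```python
def parse_test_spec(spec):
    """Split a '-t' argument into (test_path, params).

    Assumes the format 'module.Class.method,key=value,key=value,...'.
    Values are kept as strings; no type conversion is attempted.
    """
    test_path, _, raw_params = spec.partition(",")
    params = {}
    for item in raw_params.split(","):
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        key, _, value = item.partition("=")
        params[key] = value
    return test_path, params


# The -t argument from the repro command above, copied verbatim.
spec = (
    "volumetests.Collections.volume.test_volume_taf,nodes_init=4,replicas=1,"
    "num_failed_nodes=1,new_replica=1,graceful=True,"
    "bucket_spec=multi_bucket.buckets_for_volume_tests_with_ttl,iterations=1,"
    "doc_and_collection_ttl=True,data_load_spec=volume_test_load_with_doc_ttl,"
    "sdk_client_pool=True,quota_percent=100,rerun=False,"
    "skip_collections_cleanup=True,skip_cleanup=True"
)

test_path, params = parse_test_spec(spec)
for key in sorted(params):
    print(f"{key} = {params[key]}")
```

Listing the options this way makes it easier to spot the TTL-related settings (`doc_and_collection_ttl`, `data_load_spec`, `bucket_spec`) that distinguish this run.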
      

      Steps to Repro
      1) Create a 4-node cluster
      2020-05-22 21:47:10,243 | test | INFO | pool-2-thread-7 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.121.81  | kv       | Cluster node |
      | 172.23.121.83  | None     | <--- IN ---  |
      | 172.23.121.85  | None     | <--- IN ---  |
      | 172.23.121.105 | None     | <--- IN ---  |
      +----------------+----------+--------------+

      2) Create scopes and collections, then load data
      2020-05-22 22:00:03,246 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
      +---------+-----------+----------+-----+----------+-------------+-------------+-------------+
      | Bucket  | Type      | Replicas | TTL | Items    | RAM Quota   | RAM Used    | Disk Used   |
      +---------+-----------+----------+-----+----------+-------------+-------------+-------------+
      | bucket1 | membase   | 3        | 0   | 20000    | 419430400   | 148296240   | 113683122   |
      | bucket2 | ephemeral | 3        | 0   | 30000    | 419430400   | 119000368   | 136         |
      | default | membase   | 3        | 350 | 20041547 | 71303168000 | 17780179472 | 13332354608 |
      +---------+-----------+----------+-----+----------+-------------+-------------+-------------+

      3) 2020-05-22 22:00:06,082 | test | INFO | MainThread | [Collections:test_volume_taf:122] Step 5: Rebalance in with Loading of docs
      2020-05-22 22:00:09,119 | test | INFO | pool-2-thread-30 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.121.81  | kv       | Cluster node |
      | 172.23.121.83  | kv       | Cluster node |
      | 172.23.121.105 | kv       | Cluster node |
      | 172.23.121.85  | kv       | Cluster node |
      | 172.23.121.138 | None     | <--- IN ---  |
      +----------------+----------+--------------+

      4) 2020-05-22 22:30:09,706 | test | INFO | MainThread | [Collections:test_volume_taf:134] Step 6: Rebalance Out with Loading of docs
      2020-05-22 22:30:09,796 | test | INFO | pool-2-thread-10 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.121.81  | kv       | Cluster node |
      | 172.23.121.83  | [u'kv']  | --- OUT ---> |
      | 172.23.121.105 | kv       | Cluster node |
      | 172.23.121.85  | kv       | Cluster node |
      | 172.23.121.138 | kv       | Cluster node |
      +----------------+----------+--------------+

      5) 2020-05-22 23:03:08,331 | test | INFO | MainThread | [Collections:test_volume_taf:146] Step 7: Rebalance In_Out with Loading of docs
      2020-05-22 23:03:13,733 | test | INFO | pool-2-thread-13 | [table_view:display:72] Rebalance Overview
      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 172.23.121.81  | kv       | Cluster node |
      | 172.23.121.105 | kv       | Cluster node |
      | 172.23.121.85  | kv       | Cluster node |
      | 172.23.121.138 | [u'kv']  | --- OUT ---> |
      | 172.23.121.83  | None     | <--- IN ---  |
      | 172.23.121.114 | None     | <--- IN ---  |
      +----------------+----------+--------------+

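
The node movements in the four rebalance overviews above can be replayed as plain set operations to check the intended cluster membership after each step. This is a sketch for reasoning about the sequence only; `apply_rebalance` is a hypothetical helper, not a testrunner or Couchbase API, and the last entry shows the membership the final rebalance was meant to produce.

```python
def apply_rebalance(cluster, nodes_in=(), nodes_out=()):
    """Return the membership after adding nodes_in and removing nodes_out."""
    missing = set(nodes_out) - set(cluster)
    if missing:
        raise ValueError(f"cannot remove nodes not in cluster: {sorted(missing)}")
    return (set(cluster) | set(nodes_in)) - set(nodes_out)


# (label, nodes in, nodes out), taken from the rebalance overviews above.
steps = [
    ("initial rebalance in",
     ["172.23.121.83", "172.23.121.85", "172.23.121.105"], []),
    ("Step 5: rebalance in", ["172.23.121.138"], []),
    ("Step 6: rebalance out", [], ["172.23.121.83"]),
    ("Step 7: rebalance in/out",
     ["172.23.121.83", "172.23.121.114"], ["172.23.121.138"]),
]

cluster = {"172.23.121.81"}
sizes = []
for label, nodes_in, nodes_out in steps:
    cluster = apply_rebalance(cluster, nodes_in, nodes_out)
    sizes.append(len(cluster))
    print(f"{label}: {len(cluster)} nodes -> {sorted(cluster)}")
```

In the last step 172.23.121.138 is the only node leaving the cluster, and it is also the node named in the `bulk_set_vbucket_state_failed` entry of the rebalance error.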

      Rebalance in/out fails with the following error.

      2020-05-22 23:05:41,430 | test  | ERROR   | pool-2-thread-13 | [rest_client:print_UI_logs:2537] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.121.81', u'tstamp': 1590213937888L, u'shortText': u'message', u'serverTime': u'2020-05-22T23:05:37.888Z', u'text': u'Rebalance exited with reason {mover_crashed,\n                              {unexpected_exit,\n                               {\'EXIT\',<0.12092.35>,\n                                {{bulk_set_vbucket_state_failed,\n                                  [{\'ns_1@172.23.121.138\',\n                                    {\'EXIT\',\n                                     {{{{{{{badmatch,{error,einval}},\n                                           [{dcp_proxy,handle_packet,2,\n                                             [{file,"src/dcp_proxy.erl"},\n                                              {line,189}]},\n                                            {dcp_proxy,process_data_loop,3,\n                                             [{file,"src/dcp_proxy.erl"},\n                                              {line,368}]},\n                                            {dcp_proxy,handle_info,2,\n                                             [{file,"src/dcp_proxy.erl"},\n                                              {line,103}]},\n                                            {gen_server,try_dispatch,4,\n                                             [{file,"gen_server.erl"},\n                                              {line,616}]},\n                                            {gen_server,handle_msg,6,\n                                             [{file,"gen_server.erl"},\n                                              {line,686}]},\n                                            {proc_lib,init_p_do_apply,3,\n                                             [{file,"proc_lib.erl"},\n                                              {line,247}]}]},\n                                          
{gen_server,call,\n                                           [<25249.12306.1>,get_partitions,\n                                            infinity]}},\n                                         {gen_server,call,\n                                          [<25249.12305.1>,get_partitions,\n                                           infinity]}},\n                                        {gen_server,call,\n                                         [\'dcp_replication_manager-default\',\n                                          {manage_replicators,\n                                           [\'ns_1@172.23.121.105\',\n                                            \'ns_1@172.23.121.81\',\n                                            \'ns_1@172.23.121.85\']},\n                                          infinity]}},\n                                       {gen_server,call,\n                                        [\'replication_manager-default\',\n                                         {change_vbucket_replication,665,\n                                          undefined},\n                                         infinity]}},\n                                      {gen_server,call,\n                                       [{\'janitor_agent-default\',\n                                         \'ns_1@172.23.121.138\'},\n                                        {if_rebalance,<0.1134.34>,\n                                         {update_vbucket_state,995,replica,\n                                          undefined,undefined}},\n                                        infinity]}}}}]},\n                                 [{janitor_agent,bulk_set_vbucket_state,4,\n                                   [{file,"src/janitor_agent.erl"},\n                                    {line,403}]},\n                                  {ns_single_vbucket_mover,\n                                   \'-cleanup_old_streams/4-fun-1-\',4,\n                                   
[{file,"src/ns_single_vbucket_mover.erl"},\n                                    {line,353}]},\n                                  {proc_lib,init_p,3,\n                                   [{file,"proc_lib.erl"},{line,232}]}]}}}}.\nRebalance Operation Id = 3593e7b121cb66b0003e4e31bc1f612c'}
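
The UI log entry above is a Python 2 dict repr with escaped newlines, which makes the Erlang crash reason hard to read. A minimal sketch for unpacking it in Python 3 follows. It assumes the entry has been pasted as a single string; the `sample` below is an abbreviated excerpt of the entry (its `text` field is cut short and marked as truncated), and `parse_ui_log` is a hypothetical helper.

```python
import ast
import re


def parse_ui_log(entry_repr):
    """Parse a Python 2 style dict repr into a Python 3 dict.

    Python 2 long literals such as 1590213937888L are not valid in
    Python 3, so the trailing 'L' after a digit is removed first.
    Caveat: this simple regex would also alter a digit followed by 'L'
    inside a string value.
    """
    cleaned = re.sub(r"(?<=\d)L\b", "", entry_repr)
    return ast.literal_eval(cleaned)


# Abbreviated excerpt of the log entry; raw strings keep '\n' escaped,
# exactly as it appears in the pasted log.
sample = (
    r"{u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', "
    r"u'node': u'ns_1@172.23.121.81', u'tstamp': 1590213937888L, "
    r"u'shortText': u'message', u'serverTime': u'2020-05-22T23:05:37.888Z', "
    r"u'text': u'Rebalance exited with reason {mover_crashed,\n"
    r"                              {unexpected_exit, (truncated)'}"
)

entry = parse_ui_log(sample)
print(entry["module"], entry["node"])
print(entry["text"])  # printed with real line breaks
```

Run against the full entry, the same approach prints the nested `gen_server:call` chain one term per line, which makes it easier to follow the failure from `janitor_agent` down to `dcp_proxy:handle_packet/2`.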
      

      cbcollect_info attached. This is the first time we are running this test.

      Detailed steps available @ https://hub.internal.couchbase.com/confluence/pages/viewpage.action?pageId=50135893

            People

              richard.demellow Richard deMellow
              Balakumaran.Gopal Balakumaran Gopal