Couchbase Server / MB-25272

[FTS] Rebalance out of FTS nodes is stuck for >10mins

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • 5.5.0
    • 5.0.0
    • fts
    • Untriaged
    • Unknown

    Description

      Build
      5.0.0-3298

      Testcase
      Seen in the teardown phase of this test:

      ./testrunner -i INI_FILE.ini get-cbcollect-info=True,get-logs=False,stop-on-failure=False,level_compaction=True,GROUP=P1 -t fts.moving_topology_fts.MovingTopFTS.update_index_during_failover,items=200000,cluster=D+F+F,GROUP=P1

      This test has never failed in previous runs, but I do not think this is a regression. It is probably timing-related, or a race in index deletion.

      1. Cluster: D, D+F, D+F
      2. The test indexes 200k docs. In the middle of indexing, it fails over one FTS node and updates the index definition to use 512 pindexes (oops! note to self: change to 8 or 16).
      3. Indexing completes.
      4. In the teardown phase, we start by deleting the index, then the buckets, and then try to de-cluster the nodes. At that point the rebalance gets stuck at 75%:

      [2017-07-14 07:00:32,562] - [fts_base:885] INFO - Deleting fulltext-index default_index_1 on 172.23.105.190
      [2017-07-14 07:00:32,631] - [rest_client:1907] INFO - Node 172.23.106.66 not part of cluster inactiveFailed
      [2017-07-14 07:00:32,672] - [remote_util:197] INFO - connecting to 172.23.105.190 with username:root password:couchbase ssh_key:
      [2017-07-14 07:00:33,018] - [remote_util:231] INFO - Connected to 172.23.105.190
      [2017-07-14 07:00:37,963] - [remote_util:2779] INFO - running command.raw on 172.23.105.190: ls /opt/couchbase/var/lib/couchbase/data/@fts |grep default_index_1*.pindex | wc -l
      [2017-07-14 07:00:38,332] - [remote_util:2816] INFO - command executed successfully
      [2017-07-14 07:00:38,333] - [fts_base:1618] INFO - 0
      [2017-07-14 07:00:38,335] - [fts_base:894] INFO - Validated: all index files for default_index_1 deleted from disk
      [2017-07-14 07:00:38,337] - [fts_base:1709] INFO - removing nodes from cluster ...
      [2017-07-14 07:00:38,354] - [fts_base:1711] INFO - cleanup [ip:172.23.105.96 port:8091 ssh_username:root, ip:172.23.105.190 port:8091 ssh_username:root, ip:172.23.106.66 port:8091 ssh_username:root]
      [2017-07-14 07:00:38,406] - [bucket_helper:142] INFO - deleting existing buckets [u'default'] on 172.23.105.96
      [2017-07-14 07:00:38,407] - [bucket_helper:144] INFO - remove bucket default ...
      [2017-07-14 07:00:47,794] - [bucket_helper:158] INFO - deleted bucket : default from 172.23.105.96
      [2017-07-14 07:00:47,795] - [bucket_helper:234] INFO - waiting for bucket deletion to complete....
      [2017-07-14 07:00:47,802] - [rest_client:133] INFO - node 172.23.105.96 existing buckets : []
      [2017-07-14 07:00:47,856] - [cluster_helper:254] INFO - rebalancing all nodes in order to remove nodes
      [2017-07-14 07:00:47,862] - [rest_client:1395] INFO - rebalance params : {'password': 'password', 'ejectedNodes': u'ns_1@172.23.106.66,ns_1@172.23.105.190', 'user': 'Administrator', 'knownNodes': u'ns_1@172.23.106.66,ns_1@172.23.105.96,ns_1@172.23.105.190'}
      [2017-07-14 07:00:47,876] - [rest_client:1400] INFO - rebalance operation started
      [2017-07-14 07:00:47,887] - [rest_client:1548] INFO - rebalance percentage : 0.00 %
      [2017-07-14 07:00:57,906] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:07,932] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:17,952] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:27,980] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:38,008] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:48,031] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:01:58,051] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:08,073] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:18,091] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:28,110] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:38,136] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:48,155] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:02:58,175] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:08,196] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:18,215] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:28,234] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:38,253] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:48,273] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:03:58,291] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:08,319] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:18,338] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:28,357] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:38,377] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:48,401] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:04:58,422] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:08,441] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:18,459] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:28,478] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:38,498] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:48,518] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:05:58,538] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:08,557] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:18,575] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:28,597] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:38,620] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:48,641] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:06:58,663] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:08,682] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:18,701] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:28,716] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:38,735] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:48,761] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:07:58,783] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:08,799] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:18,818] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:28,833] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:38,861] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:48,882] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:08:58,902] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:09:08,921] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:09:18,939] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:09:28,959] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
      [2017-07-14 07:09:28,959] - [rest_client:1474] ERROR - apparently rebalance progress code in infinite loop: 75.0
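
For triage, the length of the stall can be read straight off the progress lines: in the excerpt above, the 75.00 % readings run from 07:00:57 to 07:09:28. The following is a minimal sketch, not part of testrunner. The function name longest_stall and the regular expression are assumptions based only on the line format shown in this log.

```python
import re
from datetime import datetime

# Assumed line format, taken from the log excerpt in this ticket:
# [2017-07-14 07:00:57,906] - [rest_client:1548] INFO - rebalance percentage : 75.00 %
PROGRESS_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\].*"
    r"rebalance percentage : (?P<pct>\d+(?:\.\d+)?) %"
)


def longest_stall(lines):
    """Return (percentage, seconds) for the longest run of unchanged progress.

    Lines that are not progress reports are ignored. Returns (None, 0.0)
    when no percentage was reported more than once in a row.
    """
    best = (None, 0.0)
    run_pct = None    # percentage of the current run of identical readings
    run_start = None  # timestamp of the first reading in that run
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match is None:
            continue
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        pct = float(match.group("pct"))
        if pct != run_pct:
            # Progress changed, so a new run starts at this reading.
            run_pct, run_start = pct, ts
        stalled = (ts - run_start).total_seconds()
        if stalled > best[1]:
            best = (run_pct, stalled)
    return best
```

Passing the lines of a saved test log (for example, an open file object) returns the percentage at which progress sat longest and for how many seconds, which makes it easy to compare runs of the same test.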
      

          People

            Sreekanth Sivasankaran (Inactive)
            Aruna Piravi (Inactive)