Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-19122

[FTS] MCP: Rebalance % complete indicator incorrectly bounces up to 100% when >1 FTS indexes

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 4.5.0
    • 4.5.0
    • cbft
    • None
    • Untriaged
    • Yes

    Description

      Build
      4.5.0-2047

      I'm seeing this intermittently in the last few builds. Looks like a recent regression.

      Testcase
      test_2 in P0 mcp suite

      ./testrunner -i INI_FILE.ini -p get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,GROUP=P0 -t fts.moving_topology_fts.MovingTopFTS.rebalance_in_during_index_building,items=100000,cluster=D,F,standard_buckets=2,GROUP=P0

      Steps
      1. Create a D,F cluster.
      2. Create 3 buckets - 1 default, 2 standard_buckets(buckets on a dedicated port)
      3. Load 100K docs into each bucket
      4. As index building starts on all 3 indexes, rebalance in a D+F node.
      5. At the end of rebalance, check the index distribution. For one of the indexes, although the index is distributed on both nodes, only 32 partitions are being serviced by one node as opposed to the expected 512.

      I'm not too sure if index distribution is still underway or it's just incorrect.

      [2016-04-07 15:12:34,391] - [fts_base:1621] INFO - Starting rebalance-in nodes:[ip:172.23.105.54 port:8091 ssh_username:root] at C1 cluster 172.23.105.96
      [2016-04-07 15:12:35,108] - [task:404] INFO - adding node 172.23.105.54:8091 to cluster
      [2016-04-07 15:12:35,109] - [rest_client:988] INFO - adding remote node @172.23.105.54:8091 to this cluster @172.23.105.96:8091
      [2016-04-07 15:12:41,838] - [rest_client:1206] INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%40172.23.105.54%2Cns_1%40172.23.105.96%2Cns_1%40172.23.105.190
      [2016-04-07 15:12:41,859] - [rest_client:1210] INFO - rebalance operation started
      [2016-04-07 15:12:41,870] - [rest_client:1338] INFO - rebalance percentage : 0.00 %
      [2016-04-07 15:12:51,904] - [rest_client:1338] INFO - rebalance percentage : 2.18 %
      [2016-04-07 15:13:01,923] - [rest_client:1338] INFO - rebalance percentage : 4.67 %
      [2016-04-07 15:13:11,981] - [rest_client:1338] INFO - rebalance percentage : 7.34 %
      [2016-04-07 15:13:22,007] - [rest_client:1338] INFO - rebalance percentage : 10.00 %
      [2016-04-07 15:13:32,027] - [rest_client:1338] INFO - rebalance percentage : 12.53 %
      [2016-04-07 15:13:42,052] - [rest_client:1338] INFO - rebalance percentage : 15.02 %
      [2016-04-07 15:13:52,070] - [rest_client:1338] INFO - rebalance percentage : 17.28 %
      [2016-04-07 15:14:02,090] - [rest_client:1338] INFO - rebalance percentage : 19.79 %
      [2016-04-07 15:14:12,110] - [rest_client:1338] INFO - rebalance percentage : 22.44 %
      [2016-04-07 15:14:22,129] - [rest_client:1338] INFO - rebalance percentage : 24.96 %
      [2016-04-07 15:14:32,156] - [rest_client:1338] INFO - rebalance percentage : 27.40 %
      [2016-04-07 15:14:42,175] - [rest_client:1338] INFO - rebalance percentage : 29.70 %
      [2016-04-07 15:14:52,197] - [rest_client:1338] INFO - rebalance percentage : 31.74 %
      [2016-04-07 15:15:02,216] - [rest_client:1338] INFO - rebalance percentage : 33.80 %
      [2016-04-07 15:15:12,242] - [rest_client:1338] INFO - rebalance percentage : 36.07 %
      [2016-04-07 15:15:22,261] - [rest_client:1338] INFO - rebalance percentage : 38.31 %
      [2016-04-07 15:15:32,280] - [rest_client:1338] INFO - rebalance percentage : 40.64 %
      [2016-04-07 15:15:42,306] - [rest_client:1338] INFO - rebalance percentage : 43.04 %
      [2016-04-07 15:15:52,339] - [rest_client:1338] INFO - rebalance percentage : 45.04 %
      [2016-04-07 15:16:02,359] - [rest_client:1338] INFO - rebalance percentage : 47.09 %
      [2016-04-07 15:16:12,394] - [rest_client:1338] INFO - rebalance percentage : 49.35 %
      [2016-04-07 15:16:22,415] - [rest_client:1338] INFO - rebalance percentage : 50.00 %
      [2016-04-07 15:16:32,434] - [rest_client:1338] INFO - rebalance percentage : 50.00 %
      [2016-04-07 15:16:42,453] - [rest_client:1338] INFO - rebalance percentage : 50.00 %
      [2016-04-07 15:16:52,472] - [rest_client:1338] INFO - rebalance percentage : 53.13 %
      [2016-04-07 15:17:02,491] - [rest_client:1338] INFO - rebalance percentage : 53.13 %
      [2016-04-07 15:17:12,520] - [rest_client:1338] INFO - rebalance percentage : 53.13 %
      [2016-04-07 15:17:22,540] - [rest_client:1338] INFO - rebalance percentage : 53.13 %
      [2016-04-07 15:17:32,559] - [rest_client:1338] INFO - rebalance percentage : 53.13 %
      [2016-04-07 15:17:42,577] - [rest_client:1338] INFO - rebalance percentage : 54.69 %
      [2016-04-07 15:17:52,595] - [rest_client:1338] INFO - rebalance percentage : 56.25 %
      [2016-04-07 15:18:02,615] - [rest_client:1338] INFO - rebalance percentage : 56.25 %
      [2016-04-07 15:18:12,634] - [rest_client:1338] INFO - rebalance percentage : 56.25 %
      [2016-04-07 15:18:22,654] - [rest_client:1338] INFO - rebalance percentage : 56.25 %
      [2016-04-07 15:18:32,674] - [rest_client:1338] INFO - rebalance percentage : 56.25 %
      [2016-04-07 15:18:42,693] - [rest_client:1338] INFO - rebalance percentage : 59.38 %
      [2016-04-07 15:18:52,712] - [rest_client:1338] INFO - rebalance percentage : 59.38 %
      [2016-04-07 15:19:02,734] - [rest_client:1338] INFO - rebalance percentage : 59.38 %
      [2016-04-07 15:19:12,753] - [rest_client:1338] INFO - rebalance percentage : 59.38 %
      [2016-04-07 15:19:22,771] - [rest_client:1338] INFO - rebalance percentage : 59.38 %
      [2016-04-07 15:19:32,789] - [rest_client:1338] INFO - rebalance percentage : 60.94 %
      [2016-04-07 15:19:42,809] - [rest_client:1338] INFO - rebalance percentage : 62.50 %
      [2016-04-07 15:19:52,827] - [rest_client:1338] INFO - rebalance percentage : 62.50 %
      [2016-04-07 15:20:02,846] - [rest_client:1338] INFO - rebalance percentage : 62.50 %
      [2016-04-07 15:20:12,870] - [rest_client:1338] INFO - rebalance percentage : 62.50 %
      [2016-04-07 15:20:22,892] - [rest_client:1338] INFO - rebalance percentage : 62.50 %
      [2016-04-07 15:20:32,913] - [rest_client:1338] INFO - rebalance percentage : 65.63 %
      [2016-04-07 15:20:42,935] - [rest_client:1338] INFO - rebalance percentage : 65.63 %
      [2016-04-07 15:20:52,954] - [rest_client:1338] INFO - rebalance percentage : 65.63 %
      [2016-04-07 15:21:02,981] - [rest_client:1338] INFO - rebalance percentage : 65.63 %
      [2016-04-07 15:21:12,999] - [rest_client:1338] INFO - rebalance percentage : 65.63 %
      [2016-04-07 15:21:23,018] - [rest_client:1338] INFO - rebalance percentage : 73.44 %
      [2016-04-07 15:21:33,036] - [rest_client:1338] INFO - rebalance percentage : 75.00 %
      [2016-04-07 15:21:43,053] - [rest_client:1338] INFO - rebalance percentage : 75.00 %
      [2016-04-07 15:21:53,071] - [rest_client:1338] INFO - rebalance percentage : 75.00 %
      [2016-04-07 15:22:03,090] - [rest_client:1338] INFO - rebalance percentage : 75.00 %
      [2016-04-07 15:22:13,109] - [rest_client:1338] INFO - rebalance percentage : 78.13 %
      [2016-04-07 15:22:23,131] - [rest_client:1338] INFO - rebalance percentage : 84.38 %
      [2016-04-07 15:22:33,161] - [rest_client:1338] INFO - rebalance percentage : 84.38 %
      [2016-04-07 15:22:43,201] - [rest_client:1338] INFO - rebalance percentage : 90.63 %
      [2016-04-07 15:22:53,222] - [rest_client:1338] INFO - rebalance percentage : 93.75 %
      [2016-04-07 15:23:03,239] - [rest_client:1338] INFO - rebalance percentage : 93.75 %
      [2016-04-07 15:23:13,258] - [rest_client:1338] INFO - rebalance percentage : 96.88 %
      [2016-04-07 15:23:23,487] - [rest_client:1338] INFO - rebalance percentage : 96.88 %
      [2016-04-07 15:23:33,508] - [rest_client:1338] INFO - rebalance percentage : 96.88 %
      [2016-04-07 15:23:44,185] - [rest_client:1338] INFO - rebalance percentage : 100.00 %
      [2016-04-07 15:23:44,185] - [task:520] INFO - rebalancing was completed with progress: 100% in 662.326179028 sec
      [2016-04-07 15:23:45,240] - [fts_base:2476] INFO - Validated: Number of PIndexes = 32
      [2016-04-07 15:23:45,262] - [fts_base:2487] INFO - Validated: Every pIndex serves 32 partitions or lesser
      [2016-04-07 15:23:45,262] - [fts_base:2503] INFO - Validated: pIndexes are distributed across [u'f6d723ce6a6db6b8953149ce031c0582', u'8ba52f25bae402a1eea017331050a6f6'] 
      [2016-04-07 15:23:45,262] - [fts_base:2509] INFO - Expecting num of partitions in each node in range 480-544
      [2016-04-07 15:23:45,262] - [fts_base:2524] INFO - Validated: Node f6d723ce6a6db6b8953149ce031c0582 houses 16 pindexes which serve 512 partitions
      [2016-04-07 15:23:45,263] - [fts_base:2524] INFO - Validated: Node 8ba52f25bae402a1eea017331050a6f6 houses 16 pindexes which serve 512 partitions
      [2016-04-07 15:23:45,487] - [fts_base:2476] INFO - Validated: Number of PIndexes = 32
      [2016-04-07 15:23:45,504] - [fts_base:2487] INFO - Validated: Every pIndex serves 32 partitions or lesser
      [2016-04-07 15:23:45,504] - [fts_base:2370] INFO - sleep for 10 secs. pIndexes not distributed across 2 nodes yet ...
      [2016-04-07 15:23:55,831] - [fts_base:2503] INFO - Validated: pIndexes are distributed across [u'f6d723ce6a6db6b8953149ce031c0582', u'8ba52f25bae402a1eea017331050a6f6'] 
      [2016-04-07 15:23:55,832] - [fts_base:2509] INFO - Expecting num of partitions in each node in range 480-544
      FAIL
      

      AssertionError: The source partitions are not evenly distributed among nodes, seeing 32 on f6d723ce6a6db6b8953149ce031c0582

      Testrunnner hasn't logged which of the 3 indexes is imbalanced. I've added that logging now, but I wasn't sure if it is reproducible again so I'm attaching the logs from which we can decipher the same.

      Attaching cbcollect info.

      Attachments

        Activity

          People

            apiravi Aruna Piravi (Inactive)
            apiravi Aruna Piravi (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              PagerDuty