Couchbase Server / MB-25768

MOI: Rebalance fails with "Validation fails: Index exceeding quota"

    Description

      Build 5.0.0-3497
      Observed a rebalance failure in a longevity test running on AWS with storage mode MOI, after ~16 hours, with the error: <<"Validation fails: Index exceeding quota. Index=o2_rating (replica 1) Bucket=other-2 Memory=3766327203048588950 Cpu=0.0000 MemoryQuota=25652363264 CpuQuota=4">>
      Note: subsequent rebalances succeeded, but similar failures were seen multiple times.
      Logs are linked below. They may have been rotated, so the indexer log from a node that hit this error was collected separately and is attached as indexer.log.gz.
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.17.253.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.18.91.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.21.9.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.24.215.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.27.43.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T051638-ns_1%40172.31.31.44.zip
      https://s3.amazonaws.com/bugdb/jira/exceed_quota/collectinfo-2017-08-22T060143-ns_1%40172.31.26.156.zip

      Here is a snippet from the attached indexer.log.gz in which the index total memory is reported as implausibly high for several indexes:

      2017-08-22T04:46:35.525+00:00 [Warn] Validation error after adjusting solution for planner. Restore to original plan. Error=Total memory usage of all indexes (10945309697576528603) exceed aggregated memory quota of all indexer nodes (76957089792)
       2017-08-22T04:46:35.525+00:00 [Info] 
       2017-08-22T04:46:35.525+00:00 [Info] Indexer serverGroup:Group 1, nodeId:172.31.24.215:8091, useLiveData:true
       2017-08-22T04:46:35.525+00:00 [Info] Indexer total memory:0 (0), data:0 (0), overhead:0 (0), cpu:0.0000, number of indexes:0 isDeleted:false isNew:true
       2017-08-22T04:46:35.525+00:00 [Info] 
       2017-08-22T04:46:35.525+00:00 [Info] Indexer serverGroup:Group 1, nodeId:172.31.26.156:8091, useLiveData:true
       2017-08-22T04:46:35.525+00:00 [Info] Indexer total memory:0 (0), data:0 (0), overhead:0 (0), cpu:0.0000, number of indexes:0 isDeleted:false isNew:true
       2017-08-22T04:46:35.525+00:00 [Info] 
       2017-08-22T04:46:35.525+00:00 [Info] Indexer serverGroup:Group 1, nodeId:172.31.27.43:8091, useLiveData:true
       2017-08-22T04:46:35.525+00:00 [Info] Indexer total memory:1721332547 (1.60312G), data:1735054023 (1.61589G), overhead:18446744073695830140 (1.67772e+07T), cpu:0.0000, number of indexes:7 isDeleted:true isNew:false
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:#primary, bucket:default, defnId:13063694492764666537, instId:6573787491621757265, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:972379141227 (905.599G), data:91 (91), overhead:972379141136 (905.599G), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o2_rating, bucket:other-2, defnId:1521072769113754533, instId:18113273178930424593, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:3766327841847923705 (3.42546e+06T), data:354251257 (337.84M), overhead:3766327841493672448 (3.42546e+06T), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_result, bucket:other-1, defnId:15737661634471338961, instId:5706012952424690469, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:3729984411773327606 (3.3924e+06T), data:350832886 (334.58M), overhead:3729984411422494720 (3.3924e+06T), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o3_result, bucket:other-3, defnId:16176266099460296785, instId:13595983295097802510, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:20632821837169946 (18765.4T), data:1940670 (1.85077M), overhead:20632821835229276 (18765.4T), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:default_result, bucket:default, defnId:4408012275783107923, instId:18213872847099380327, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:972379141227 (905.599G), data:91 (91), overhead:972379141136 (905.599G), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_claim, bucket:other-1, defnId:5976823437345942146, instId:1073478395789900487, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:7158348057071587032 (6.51048e+06T), data:673296088 (642.105M), overhead:7158348056398290944 (6.51048e+06T), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_rating, bucket:other-1, defnId:9898788758053623356, instId:6982586279300685279, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:3771448998142593420 (3.43011e+06T), data:354732940 (338.3M), overhead:3771448997787860480 (3.43011e+06T), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] 
       2017-08-22T04:46:35.525+00:00 [Info] Indexer serverGroup:Group 1, nodeId:172.31.31.44:8091, useLiveData:true
       2017-08-22T04:46:35.525+00:00 [Info] Indexer total memory:2260748284 (2.10549G), data:1736641658 (1.61737G), overhead:524106626 (499.827M), cpu:0.0000, number of indexes:7 isDeleted:false isNew:false
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_claim (replica 1), bucket:other-1, defnId:5976823437345942146, instId:14531243601985371701, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:876589620 (835.981M), data:673370875 (642.177M), overhead:203218745 (193.804M), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o2_result (replica 1), bucket:other-2, defnId:14650487287916820191, instId:12136620868549336270, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:456064501 (434.937M), data:350335602 (334.106M), overhead:105728899 (100.831M), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:default_rating, bucket:default, defnId:11837043824781621035, instId:105278524309748258, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:118 (118), data:91 (91), overhead:27 (27), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o3_result (replica 1), bucket:other-3, defnId:16176266099460296785, instId:2194007151899798148, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:2530488 (2.41326M), data:1943848 (1.8538M), overhead:586640 (572.891K), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_rating (replica 1), bucket:other-1, defnId:9898788758053623356, instId:9202806715494862580, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:461845697 (440.45M), data:354776550 (338.341M), overhead:107069147 (102.109M), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o3_rating (replica 1), bucket:other-3, defnId:6759744410731956647, instId:6762783329415653226, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:2561243 (2.44259M), data:1967473 (1.87633M), overhead:593770 (579.854K), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ------------------------------------------------------------------------------------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] Index name:o2_rating (replica 2), bucket:other-2, defnId:1521072769113754533, instId:17775040623170605968, new/moved:false defer:false ignoreEquivCheck:false
       2017-08-22T04:46:35.525+00:00 [Info] Index total memory:461156617 (439.793M), data:354247219 (337.836M), overhead:106909398 (101.957M), cpu:0.0000
       2017-08-22T04:46:35.525+00:00 [Info] ************ Indexer Layout *************
       2017-08-22T04:46:35.525+00:00 [Info] Score: 0
       2017-08-22T04:46:35.525+00:00 [Info] ElapsedTime: 0ns
       2017-08-22T04:46:35.525+00:00 [Info] ConvergenceTime: 0ns
       2017-08-22T04:46:35.525+00:00 [Info] Iteration: 0
       2017-08-22T04:46:35.525+00:00 [Info] Move: 0
       2017-08-22T04:46:35.525+00:00 [Info] ----------------------------------------
       2017-08-22T04:46:35.525+00:00 [Info] ****************************************
       2017-08-22T04:46:35.525+00:00 [Error] ServiceMgr::startRebalance Planner Error Validation fails: Index exceeding quota. Index=o1_rating Bucket=other-1 Memory=3771448998142593420 Cpu=0.0000 MemoryQuota=25652363264 CpuQuota=4
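
      The numbers logged for node 172.31.27.43 are consistent with unsigned 64-bit wraparound: the indexer-level data figure is larger than the total, and the logged overhead equals total minus data taken modulo 2^64. The seven per-index totals on that node also sum, modulo 2^64, to the indexer-level total. The sketch below only reproduces that arithmetic from the values in the snippet. It is an observation about the logged numbers, not a confirmed root cause, and the assumption that these counters are unsigned 64-bit integers is not stated anywhere in this ticket.

```python
# Arithmetic check on values copied from the log snippet above
# (node 172.31.27.43). Assumption: the counters are unsigned 64-bit.
MASK64 = (1 << 64) - 1  # emulate uint64 wraparound in Python

indexer_total = 1721332547            # "Indexer total memory"
indexer_data = 1735054023             # "data"
logged_overhead = 18446744073695830140  # "overhead" as logged

# data > total, so total - data is negative and wraps as uint64.
wrapped_overhead = (indexer_total - indexer_data) & MASK64

# "Index total memory" for the seven indexes listed under that node.
index_totals = [
    972379141227,         # #primary
    3766327841847923705,  # o2_rating
    3729984411773327606,  # o1_result
    20632821837169946,    # o3_result
    972379141227,         # default_result
    7158348057071587032,  # o1_claim
    3771448998142593420,  # o1_rating
]
wrapped_sum = sum(index_totals) & MASK64

print("overhead matches wrapped (total - data):", wrapped_overhead == logged_overhead)
print("per-index totals wrap to indexer total: ", wrapped_sum == indexer_total)
```

      If this reading is right, the per-index values would be an artifact of how a negative overhead was distributed across indexes rather than real memory usage, but that needs confirmation against the planner code.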

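      To find the affected indexes in the full indexer.log.gz, a small scan along the following lines may help. It is a minimal sketch that assumes the log keeps the "Index name:" / "Index total memory:" line pairs shown in the snippet; `find_over_quota` and the regular expressions are hypothetical helpers written for this ticket, not part of any Couchbase tool. The quota value is the MemoryQuota quoted in the planner error.

```python
import re

# Per-node quota quoted in the planner error (MemoryQuota=25652363264).
MEMORY_QUOTA = 25652363264

# Patterns based on the line layout seen in the snippet (an assumption).
NAME_RE = re.compile(r"Index name:(?P<name>.+?), bucket:(?P<bucket>[^,]+),")
MEM_RE = re.compile(r"Index total memory:(?P<total>\d+) ")


def find_over_quota(lines, quota=MEMORY_QUOTA):
    """Return (name, bucket, total) for each index whose total exceeds quota."""
    flagged = []
    current = None  # (name, bucket) from the most recent "Index name:" line
    for line in lines:
        m = NAME_RE.search(line)
        if m:
            current = (m.group("name"), m.group("bucket"))
            continue
        m = MEM_RE.search(line)
        if m and current is not None:
            total = int(m.group("total"))
            if total > quota:
                flagged.append((current[0], current[1], total))
            current = None
    return flagged


# Four lines copied from the snippet above.
SAMPLE = [
    "2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_rating, bucket:other-1, defnId:9898788758053623356, instId:6982586279300685279, new/moved:false defer:false ignoreEquivCheck:false",
    "2017-08-22T04:46:35.525+00:00 [Info] Index total memory:3771448998142593420 (3.43011e+06T), data:354732940 (338.3M), overhead:3771448997787860480 (3.43011e+06T), cpu:0.0000",
    "2017-08-22T04:46:35.525+00:00 [Info] Index name:o1_rating (replica 1), bucket:other-1, defnId:9898788758053623356, instId:9202806715494862580, new/moved:false defer:false ignoreEquivCheck:false",
    "2017-08-22T04:46:35.525+00:00 [Info] Index total memory:461845697 (440.45M), data:354776550 (338.341M), overhead:107069147 (102.109M), cpu:0.0000",
]

flagged = find_over_quota(SAMPLE)
for name, bucket, total in flagged:
    print(f"{name} (bucket {bucket}): {total} > {MEMORY_QUOTA}")
```

      For the real log, the lines could be read with `gzip.open("indexer.log.gz", "rt")` and passed to the same function.
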
          People

            jliang John Liang
            mahesh.mandhare Mahesh Mandhare (Inactive)