Couchbase Server / MB-7785

Rebalance very very slow (almost stuck) with views (indexing, compaction and querying)


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Fix Version/s: 2.1.0
    • Affects Version/s: 2.0.1
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels: None
    • Environment: 2.0.1-160 Linux
      4 Core 30GB SSD machines
      Cluster: 8 nodes
      default (bucket quota: 12000MB per node) and saslbucket (bucket quota: 7000MB per node)

    Description

      • With continuous mixed front-end load on both buckets
      • 2 views per design doc, one design doc per bucket
      • Continuous querying of views
      • Indexing, compaction of views running
      • Rebalance out (2 nodes) progressing at an extremely slow rate.
      • Rebalance progress says that rebalance is ticking, but the rate is very slow.
      • Rebalance started at:
        @13:26:24 - Tue Feb 19, 2013
        Starting rebalance, KeepNodes = ['ns_1@10.6.2.37','ns_1@10.6.2.38',
        'ns_1@10.6.2.39','ns_1@10.6.2.40',
        'ns_1@10.6.2.42','ns_1@10.6.2.43'], EjectNodes = ['ns_1@10.6.2.44',
        'ns_1@10.6.2.45']

      @19:15
      Rebalance still running, with progress at:
      {"status":"running",
       "ns_1@10.6.2.40":{"progress":0.06956521739130433},
       "ns_1@10.6.2.42":{"progress":0.06896551724137934},
       "ns_1@10.6.2.43":{"progress":0.004310344827586188},
       "ns_1@10.6.2.44":{"progress":0.0625},
       "ns_1@10.6.2.45":{"progress":0.00390625},
       "ns_1@10.6.2.37":{"progress":0.31304347826086953},
       "ns_1@10.6.2.38":{"progress":0.25217391304347825},
       "ns_1@10.6.2.39":{"progress":0.06956521739130433}}
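      For anyone comparing snapshots, the progress payload above can be ranked per node with a short script. This is only a sketch: the helper name per_node_progress is made up for illustration, and the elapsed-time figure assumes the 19:15 snapshot was taken on the same day as the rebalance start (Feb 19, 2013), which the report does not state explicitly.

```python
import json
from datetime import datetime

# Progress payload as pasted in this report (snapshot taken at 19:15).
payload = """
{"status":"running",
 "ns_1@10.6.2.40":{"progress":0.06956521739130433},
 "ns_1@10.6.2.42":{"progress":0.06896551724137934},
 "ns_1@10.6.2.43":{"progress":0.004310344827586188},
 "ns_1@10.6.2.44":{"progress":0.0625},
 "ns_1@10.6.2.45":{"progress":0.00390625},
 "ns_1@10.6.2.37":{"progress":0.31304347826086953},
 "ns_1@10.6.2.38":{"progress":0.25217391304347825},
 "ns_1@10.6.2.39":{"progress":0.06956521739130433}}
"""


def per_node_progress(raw):
    """Return {node: progress} for every entry that carries a progress value."""
    doc = json.loads(raw)
    return {
        node: info["progress"]
        for node, info in doc.items()
        if isinstance(info, dict) and "progress" in info
    }


progress = per_node_progress(payload)

# Assumption: the 19:15 snapshot is from the same day as the rebalance start.
started = datetime(2013, 2, 19, 13, 26, 24)
observed = datetime(2013, 2, 19, 19, 15, 0)
elapsed = observed - started

print(f"elapsed since start: {elapsed}")
# Slowest nodes first, so stragglers are easy to spot.
for node, value in sorted(progress.items(), key=lambda kv: kv[1]):
    print(f"{node}: {value:.2%}")
```

      Sorting ascending puts the slowest nodes first; the "status" key is skipped because it is not a per-node entry.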

      • Note: system under heavy swap.
      • CPU scheduling delays, calculated with git://github.com/alk/measure-sched-delays.git (a few snapshots):
        10.6.2.42
        Timestamp                   Delay (in ns)
        Core0:
        1361331302.593882561 149947
        1361331303.593873739 141985
        1361331304.593876600 145064
        Core1:
        1361331302.593858242 125967
        1361331303.593853951 121343
        1361331304.593852758 120741
        Core2:
        1361331302.593874931 142799
        1361331303.594290495 558828
        1361331304.593849897 117718
        Core3:
        1361331302.593858242 125937
        1361331303.593850613 118879
        1361331304.593876600 144151

        10.6.2.43
        Timestamp                   Delay (in ns)
        Core0:
        1361331720.056307554 105684
        1361331721.056310654 109774
        1361331722.056315899 114700
        Core1:
        1361331726.056282997 82602
        1361331727.056329966 128091
        1361331728.056305408 103860
        Core2:
        1361331725.058000803 1799409
        1361331726.056292295 90883
        1361331727.056301594 100907
        Core3:
        1361331729.056302309 100871
        1361331730.056302309 101271
        1361331731.056336880 135469
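      The delay snapshots are easier to compare when reduced to a per-core maximum and mean. The sketch below does that for the 10.6.2.43 sample as pasted in this report; parse_delays and summarize are hypothetical helper names, and the parser assumes the "CoreN:" header followed by "timestamp delay" lines layout shown here, not any documented output format of the measurement tool.

```python
from statistics import mean

# Snapshot for 10.6.2.43, copied from the report (delays in nanoseconds).
sample = """\
Core0:
1361331720.056307554 105684
1361331721.056310654 109774
1361331722.056315899 114700
Core1:
1361331726.056282997 82602
1361331727.056329966 128091
1361331728.056305408 103860
Core2:
1361331725.058000803 1799409
1361331726.056292295 90883
1361331727.056301594 100907
Core3:
1361331729.056302309 100871
1361331730.056302309 101271
1361331731.056336880 135469
"""


def parse_delays(text):
    """Group delay values (ns) by the 'CoreN:' header that precedes them."""
    delays = {}
    core = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            core = line[:-1]
            delays[core] = []
        elif core is not None:
            _timestamp, delay_ns = line.split()
            delays[core].append(int(delay_ns))
    return delays


def summarize(delays):
    """Return per-core max and mean delay, converted to microseconds."""
    return {
        core: {"max_us": max(values) / 1000, "mean_us": mean(values) / 1000}
        for core, values in delays.items()
    }


delays = parse_delays(sample)
summary = summarize(delays)
worst_core = max(summary, key=lambda c: summary[c]["max_us"])

for core, stats in summary.items():
    print(f"{core}: max {stats['max_us']:.1f} us, mean {stats['mean_us']:.1f} us")
print(f"largest single delay on {worst_core}")
```

      Reporting the maximum next to the mean keeps a single spike, such as the one on Core2, from being averaged away.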


          People

            Assignee: Abhi Dangeti (abhinav)
            Reporter: Abhi Dangeti (abhinav)
            Votes: 0
            Watchers: 9
