  1. Couchbase Server
  2. MB-35889

Replication can get stuck when checkpoint memory overhead is very high


Details

    • Untriaged
    • No
    • KV-Engine MH 2nd Beta, KV Sprint 2020-April

    Description

      Build 6.5.0-4218

      Replication gets stuck when the data service drops to a low resident ratio (RR).
      We hit this issue while running HiDD tests on a Couchbase bucket.
      The test uses 2 data nodes and loads 250M docs, which brings RR down to 0.43%. After the load phase we wait for "ep_dcp_replica_items_remaining" to reach zero, but it stays at ~19K and never reaches zero.
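
      The wait described above can be sketched as a polling loop that separates a drained replication queue from a stalled one. This is only an illustrative sketch, not the perf framework's actual code: `wait_for_drain`, `read_stat`, and all thresholds are hypothetical, and `read_stat` stands for any callable that returns the current value of "ep_dcp_replica_items_remaining". Treating several identical consecutive readings as a stall is an assumption; a stat that hovers around a value without draining would need a tolerance instead.

```python
import time
from typing import Callable, Optional


def wait_for_drain(
    read_stat: Callable[[], int],          # hypothetical: returns the current stat value
    timeout_s: float = 600.0,              # assumed overall deadline
    poll_s: float = 5.0,                   # assumed polling interval
    stall_polls: int = 12,                 # assumed: identical readings that count as a stall
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the stat reaches zero; return "drained", "stalled" or "timeout"."""
    deadline = clock() + timeout_s
    last: Optional[int] = None
    unchanged = 0
    while True:
        remaining = read_stat()
        if remaining == 0:
            return "drained"
        if remaining == last:
            # Same non-zero value as the previous poll: no progress was made.
            unchanged += 1
            if unchanged >= stall_polls:
                return "stalled"
        else:
            unchanged = 0
            last = remaining
        if clock() >= deadline:
            return "timeout"
        sleep(poll_s)
```

      Reporting "stalled" separately from "timeout" makes a run like this one, where the stat sits at a fixed non-zero value, fail fast instead of waiting out the full deadline.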

      Job: http://perf.jenkins.couchbase.com/job/magma-hidd/441
      Logs:
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_issue_couchbase/collectinfo-2019-09-10T055022-ns_1%40172.23.97.38.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/replica_issue_couchbase/collectinfo-2019-09-10T055022-ns_1%40172.23.97.39.zip
       


          Activity

            mahesh.mandhare Mahesh Mandhare (Inactive) created issue -
            raju Raju Suravarjjala made changes -
            Field Original Value New Value
            Fix Version/s Mad-Hatter [ 15037 ]
            sarath Sarath Lakshman made changes -
            Labels Performance Performance hidd
            owend Daniel Owen made changes -
            Assignee Daniel Owen [ owend ] Jim Walker [ jwalker ]
            jwalker Jim Walker made changes -
            Assignee Jim Walker [ jwalker ] Mahesh Mandhare [ mahesh.mandhare ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]
            jwalker Jim Walker made changes -
            Assignee Mahesh Mandhare [ mahesh.mandhare ] Jim Walker [ jwalker ]
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            jwalker Jim Walker made changes -
            Status Reopened [ 4 ] In Progress [ 3 ]
            drigby Dave Rigby made changes -
            Sprint KV-Engine MH 2nd Beta [ 872 ]
            drigby Dave Rigby made changes -
            Rank Ranked higher
            james.harrison James Harrison made changes -
            Link This issue is duplicated by MB-35970 [ MB-35970 ]
            jwalker Jim Walker made changes -
            Assignee Jim Walker [ jwalker ] Mahesh Mandhare [ mahesh.mandhare ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            mahesh.mandhare Mahesh Mandhare (Inactive) made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            mahesh.mandhare Mahesh Mandhare (Inactive) made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            mahesh.mandhare Mahesh Mandhare (Inactive) made changes -
            Assignee Mahesh Mandhare [ mahesh.mandhare ] Jim Walker [ jwalker ]
            jwalker Jim Walker made changes -
            Fix Version/s Cheshire-Cat [ 15915 ]
            Fix Version/s Mad-Hatter [ 15037 ]
            drigby Dave Rigby made changes -
            Sprint KV-Engine MH 2nd Beta [ 872 ] KV-Engine MH 2nd Beta, KV-Engine MH 2nd Beta 2 [ 872, 910 ]
            drigby Dave Rigby made changes -
            Sprint KV-Engine MH 2nd Beta, KV-Engine Mad-Hatter GA [ 872, 910 ] KV-Engine MH 2nd Beta [ 872 ]
            drigby Dave Rigby made changes -
            Rank Ranked higher
            owend Daniel Owen made changes -
            Epic Link MB-30659 [ 88207 ]
            ben.huddleston Ben Huddleston made changes -
            Assignee Jim Walker [ jwalker ] Ben Huddleston [ ben.huddleston ]
            ben.huddleston Ben Huddleston made changes -
            Link This issue relates to MB-38012 [ MB-38012 ]
            ben.huddleston Ben Huddleston made changes -
            Summary Replication stuck if data service in low resident ratio Replication can get stuck when checkpoint memory overhead is very high
            ben.huddleston Ben Huddleston made changes -
            Sprint KV-Engine MH 2nd Beta [ 872 ] KV-Engine MH 2nd Beta, KV Spint 2020-March [ 872, 1002 ]
            ben.huddleston Ben Huddleston made changes -
            Rank Ranked lower
            owend Daniel Owen made changes -
            Sprint KV-Engine MH 2nd Beta, KV Spint 2020-March [ 872, 1002 ] KV-Engine MH 2nd Beta, KV Sprint 2020-April [ 872, 1044 ]
            owend Daniel Owen made changes -
            Rank Ranked lower
            owend Daniel Owen made changes -
            Link This issue relates to CBSE-8284 [ CBSE-8284 ]
            drigby Dave Rigby made changes -
            Labels Performance hidd Performance candidate-for-6.6 hidd
            drigby Dave Rigby made changes -
            Fix Version/s 6.6.0 [ 16787 ]
            ben.huddleston Ben Huddleston made changes -
            Link This issue is duplicated by MB-35970 [ MB-35970 ]
            ben.huddleston Ben Huddleston made changes -
            Link This issue relates to MB-35970 [ MB-35970 ]
            owend Daniel Owen made changes -
            Affects Version/s 6.5.1 [ 16622 ]
            drigby Dave Rigby made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            till Till Westmann made changes -
            Link This issue blocks MB-38724 [ MB-38724 ]
            till Till Westmann made changes -
            Labels Performance candidate-for-6.6 hidd Performance approved-for-6.6.0 candidate-for-6.6 hidd
            drigby Dave Rigby made changes -
            Fix Version/s Cheshire-Cat [ 15915 ]
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            ben.huddleston Ben Huddleston made changes -
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            ben.huddleston Ben Huddleston made changes -
            Link This issue causes MB-39435 [ MB-39435 ]
            ben.huddleston Ben Huddleston made changes -
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            wayne Wayne Siu made changes -
            Assignee Ben Huddleston [ ben.huddleston ] Bo-Chun Wang [ bo-chun.wang ]
            ben.huddleston Ben Huddleston made changes -
            Link This issue is duplicated by MB-39440 [ MB-39440 ]
            wayne Wayne Siu made changes -
            Labels Performance approved-for-6.6.0 candidate-for-6.6 hidd Performance affects-cc-testing approved-for-6.6.0 candidate-for-6.6 hidd
            bo-chun.wang Bo-Chun Wang made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            richard.demellow Richard deMellow made changes -
            Link This issue relates to MB-41283 [ MB-41283 ]

            People

              Assignee: Bo-Chun Wang
              Reporter: Mahesh Mandhare (Inactive)
              Votes: 0
              Watchers: 11
