Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-43055

ItemPager available flag not reset if run() early exits

    XMLWordPrintable

    Details

    • Triage:
      Triaged
    • Story Points:
      1
    • Is this a Regression?:
      Yes
    • Sprint:
      KV-Engine 2021-Jan

      Description

      There is a bug which, if triggered, will stop a bucket from successfully evicting ever again (until restart/bucket deletion).


      Bug introduced in http://review.couchbase.org/c/kv_engine/+/133197 (6.6.1 pre-GA).

      item_pager.cc,ItemPager::run

      112
          if (((current > upper) || doEvict || wasNotified) &&
      113
              (*available).compare_exchange_strong(inverse, false)) {
      114
              if (kvBucket->getItemEvictionPolicy() == EvictionPolicy::Value) {
      115
                  doEvict = true;
      116
              }
      117
       
      118
              ++stats.pagerRuns;
      119
       
      120
              if (current <= lower) {
      121
                  // early exit - no need to run a paging visitor
      122
                  return true;
      123
              }
      

      The above change introduced the highlighted early exit; in the event that the item pager is notified to run, but the memory usage has dropped below the low watermark, don't evict. However, this is not safe to do -

      item_pager.cc,ItemPager::run

      112
          if (((current > upper) || doEvict || wasNotified) &&
      113
              (*available).compare_exchange_strong(inverse, false)) {
      

      Available is an atomic bool used to track if an item pager is currently running. It has been set to false here, but is only reset once the created PagingVisitor is completed. If the visitor creation is skipped, the flag is never reset, so no further item pager runs will occur.

        Attachments

          Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

            Activity

            Hide
            james.harrison James Harrison added a comment -

            The 6.0.5 behaviour is correct despite failing the test, ep_num_pager_runs should not increase while mem_used < low_wm, confirmed locally and in unit tests.

            It appears 6.5.2 passes the test with the incorrect expectation on ep_num_pager_runs due to unrelated allocations raising mem_used above low_wm again at the time the item pager runs - unfortunately this situation does not test the fix in this MB.

            A test change has been discussed with Ashwin Govindarajulu that should lead to consistent results across both versions. No further KV code changes should be necessary.

            Show
            james.harrison James Harrison added a comment - The 6.0.5 behaviour is correct despite failing the test, ep_num_pager_runs should not increase while mem_used < low_wm , confirmed locally and in unit tests. It appears 6.5.2 passes the test with the incorrect expectation on ep_num_pager_runs due to unrelated allocations raising mem_used above low_wm again at the time the item pager runs - unfortunately this situation does not test the fix in this MB. A test change has been discussed with Ashwin Govindarajulu that should lead to consistent results across both versions. No further KV code changes should be necessary.
            Hide
            ashwin.govindarajulu Ashwin Govindarajulu added a comment -

            As discussed with James Harrison updated the test case.

            Validated the test patch - http://review.couchbase.org/c/TAF/+/143266 and below are the test results,

            Test failed on "6.6.1-9202-enterprise" as expected

            Test passed on following build:

            6.0.5-3340-enterprise

            6.6.1-9203-enterprise

            6.5.2-6614-enterprise

            Show
            ashwin.govindarajulu Ashwin Govindarajulu added a comment - As discussed with James Harrison updated the test case. Validated the test patch - http://review.couchbase.org/c/TAF/+/143266 and below are the test results, Test failed on "6.6.1-9202-enterprise" as expected Test passed on following build: 6.0.5-3340-enterprise 6.6.1-9203-enterprise 6.5.2-6614-enterprise
            Hide
            ashwin.govindarajulu Ashwin Govindarajulu added a comment -

            Updated test passes on latest builds.

            Closing the ticket.

            Show
            ashwin.govindarajulu Ashwin Govindarajulu added a comment - Updated test passes on latest builds. Closing the ticket.
            Hide
            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.6.2-9429 contains kv_engine commit ec21fca with commit message:
            MB-43055: [BP] Ensure ItemPager available is not left set to false

            Show
            build-team Couchbase Build Team added a comment - Build couchbase-server-6.6.2-9429 contains kv_engine commit ec21fca with commit message: MB-43055 : [BP] Ensure ItemPager available is not left set to false
            Hide
            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.0.0-4258 contains kv_engine commit ec21fca with commit message:
            MB-43055: [BP] Ensure ItemPager available is not left set to false

            Show
            build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.0-4258 contains kv_engine commit ec21fca with commit message: MB-43055 : [BP] Ensure ItemPager available is not left set to false

              People

              Assignee:
              ashwin.govindarajulu Ashwin Govindarajulu
              Reporter:
              james.harrison James Harrison
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  PagerDuty