Couchbase Server / MB-7053

Expired items are not excluded from production/dev views until the expiry pager runs (which permanently deletes the items from the database)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0
    • Component/s: None
    • Security Level: Public
    • Environment:

      Description

      Expired items are not excluded from production/dev views; they are only excluded after the item is fetched with a get (which triggers the actual deletion).

      Steps to reproduce:

      1. Create default bucket and create 1 production view
      curl -X PUT -H 'Content-Type: application/json' 'http://Administrator:asdasd@127.0.0.1:9500/default/_design/d1' -d '{"views":{"v1":{"map":"function(doc,meta) {\nemit(meta.id,doc);\n}"}}}'

      {"ok":true,"id":"_design/d1"}

      2. Insert 2 items from memcached client with expiry set to 5 seconds

      >>> import mc_bin_client
      >>> mc = mc_bin_client.MemcachedClient(port=12001)
      >>> mc.set("ab", 5, 0, "val")
      (3306888435, 7973028514358, '')
      >>> mc.set("ab1", 5, 0, "val")
      (2896330942, 7997559610975, '')

      3. Query the view with stale=false to build the index
      curl -X GET 'http://127.0.0.1:9500/default/_design/d1/_view/v1?stale=false'
      {"total_rows":2,"rows":[

      {"id":"ab","key":"ab","value":"dmFs"}

      ,

      {"id":"ab1","key":"ab1","value":"dmFs"}

      ]
      }

      4. Query the view again after a couple of minutes; the expired items are still returned
      curl -X GET 'http://127.0.0.1:9500/default/_design/d1/_view/v1?stale=false'
      {"total_rows":2,"rows":[

      {"id":"ab","key":"ab","value":"dmFs"}

      ,

      {"id":"ab1","key":"ab1","value":"dmFs"}

      ]
      }

      If a memcached get is issued for these items, they are excluded from the views; otherwise they are always returned in the view results.
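The behaviour above can be illustrated with a small self-contained sketch of lazy expiration. This is a toy model, not ep-engine code: items past their TTL stay in the store (and hence in any index built from it) until a get() touches them or a pager sweep deletes them.

```python
import time

class LazyExpiryStore:
    """Toy key-value store with lazy expiration (illustration only)."""

    def __init__(self):
        self._items = {}  # key -> (expiry_timestamp, value)

    def set(self, key, ttl, value, now=None):
        now = time.time() if now is None else now
        self._items[key] = (now + ttl, value)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        expiry, value = self._items[key]
        if now >= expiry:
            del self._items[key]   # the get() performs the actual deletion
            raise KeyError(key)    # caller sees "not found"
        return value

    def index_keys(self):
        # Naive "view": indexes every stored item, expired or not.
        return sorted(self._items)

    def run_expiry_pager(self, now=None):
        # Periodic sweep: permanently removes all expired items.
        now = time.time() if now is None else now
        for key in [k for k, (exp, _) in self._items.items() if now >= exp]:
            del self._items[key]

store = LazyExpiryStore()
store.set("ab", 5, "val", now=0)
store.set("ab1", 5, "val", now=0)
assert store.index_keys() == ["ab", "ab1"]  # both items indexed

# Minutes later both items are expired but still show up in the "view".
assert store.index_keys() == ["ab", "ab1"]

# A get() on one item deletes it, so it drops out of the index.
try:
    store.get("ab", now=120)
except KeyError:
    pass
assert store.index_keys() == ["ab1"]

# The expiry pager removes the rest.
store.run_expiry_pager(now=120)
assert store.index_keys() == []
```

This mirrors the repro: the index keeps returning expired rows until either a get or the pager does the deletion.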

      Diagnostic is attached.

      Checked with Mike on this and ep-engine seems to be doing things correctly:

      "This does look like an issue, but not in the ep-engine side. Since ep-engine might take an hour to actually remove an expired item, it should be up to the view engine to filter out any expired items too. The reaon why doing a get will cause the item to disappear from the view results is that ep-engine will actually do the deletion."


        Activity

        steve Steve Yen added a comment -

        options...

        • idea from Damien (might repeat something Filipe had on MB-6219?): track expiration time in secondary index.
        • run expiry pager more often - this scans the entire hashtable, locking one hashtable partition at a time. This only reduces the window of the issue, but doesn't fundamentally solve it. Some customers already set their expiry pager to run more often (e.g., once every 10 minutes for a small/medium cluster).
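The first option above can be sketched in a few lines: store each document's expiration time alongside its value in the secondary index, and filter at query time. The names and structure here are illustrative, not the actual view-engine design.

```python
# Hypothetical index rows: (key, value, expiration_timestamp); by the usual
# memcached convention, an expiration of 0 means "never expires".
index = [
    ("ab",  "dmFs", 100),
    ("ab1", "dmFs", 100),
    ("cd",  "dmFs", 0),
]

def query(index, now):
    """Return only rows whose documents have not yet expired."""
    return [(k, v) for (k, v, exp) in index if exp == 0 or exp > now]

assert query(index, now=50)  == [("ab", "dmFs"), ("ab1", "dmFs"), ("cd", "dmFs")]
assert query(index, now=150) == [("cd", "dmFs")]  # expired rows filtered out
```

This works cleanly for map views; Filipe's comment below explains why the same idea is much more costly for reduce views.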
        steve Steve Yen added a comment -

        For 2.0, options discussed in bug-scrub...

        Recommend to impacted customers to run the expiry pager more often – e.g., once every 10 minutes for small/medium clusters. This can help mitigate (but not fully solve) the issue. Running the expiry pager more often also has a disk I/O tradeoff, since it would actually write deletes to disk. Users should also be aware of the tradeoff between the expiry pager and the other dispatcher tasks.

        Frank, Dipti, Yaseen to discuss today (2012/11/02) outside of bug-scrub mtg and resolve.

        FilipeManana Filipe Manana (Inactive) added a comment - - edited

        Perhaps I was not fully clear before.

        Tracking the expiration times in the indexes (values at the leaf nodes) would solve the issue for map view queries. True.
        However, we already track plenty of metadata in the indexes (a vbucket id for each value, 1024-bit/128-byte bitmasks), which costs us some performance (deeper trees, smaller branching factor per tree node). At the moment, querying Apache CouchDB is faster than Couchbase Server (single node of course, to be fair).

        Now, for reduce views... Excluding values that were contributed by now-expired documents means going down to all the leaf nodes to find out which values come from expired documents and which come from non-expired documents - then grabbing all the values from non-expired documents, applying the reduce function against those values, and going up the tree applying re-reduces until reaching the root. In other words, we would be doing no better than a linear scan, defeating the whole purpose of the intermediary reductions for which CouchDB trees are known.
        Basically, doing this would at the very best give query response times of a few seconds (being very optimistic here) for any reasonably sized index (perhaps even with fewer than 1M items).

        mccouch MC Brown (Inactive) added a comment -

        The documentation has been updated at multiple points to highlight the inclusion of documents that may have expired but not yet been fully deleted.

        kzeller kzeller added a comment -

        Added to Release Notes as:

        Couchbase Server does lazy expiration; that is, expired items are flagged as
        deleted rather than being immediately erased. Couchbase Server has
        a maintenance process that periodically looks through all information and erases expired items.
        This means expired items may still be indexed and appear in the result sets of views. Workarounds
        are described in <ulink url="http://www.couchbase.com/docs/couchbase-devguide-2.0/about-ttl-values.html">
        About Document Expiration</ulink>.


          People

          • Assignee: mccouch MC Brown (Inactive)
          • Reporter: deepkaran.salooja Deepkaran Salooja
          • Votes: 0
          • Watchers: 3

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes