Couchbase Server / MB-4518

deleting items during rebalance results in desynced view

    Details

      Description

      1) cluster 2 nodes together
      2) add 100 JSON items
      3) create a simple view with a _count reduce
      4) verify that all items are returned in the view
      5) start rebalancing in a 3rd node while, at the same time, deleting all the keys
      6) wait until both the rebalance and the deletes are done
      7) verify that all items are deleted from memcached
      8) verify that all items are gone from the view
      At this point in my test, I still had 7 items left in the view.
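
      The verification in steps 7 and 8 can be sketched as a small polling helper. This is only an illustration, not the actual test code: `count_kv_items` and `count_view_rows` are hypothetical callables that the caller would implement against a real cluster (for example, a key lookup loop and a query of the _count view), and the timeout values are arbitrary.

```python
import time
from typing import Callable, Dict


def wait_for_empty(count_fn: Callable[[], int],
                   timeout_s: float = 60.0,
                   interval_s: float = 1.0) -> int:
    """Poll count_fn until it returns 0 or the timeout expires.

    Returns the last count that was observed.
    """
    deadline = time.monotonic() + timeout_s
    remaining = count_fn()
    while remaining != 0 and time.monotonic() < deadline:
        time.sleep(interval_s)
        remaining = count_fn()
    return remaining


def check_deletes_reached_view(count_kv_items: Callable[[], int],
                               count_view_rows: Callable[[], int],
                               timeout_s: float = 60.0,
                               interval_s: float = 1.0) -> Dict[str, object]:
    """Steps 7 and 8: both the key-value store and the view should drain to 0.

    count_kv_items and count_view_rows are hypothetical, caller-supplied
    functions; nothing here talks to a real server.
    """
    kv_left = wait_for_empty(count_kv_items, timeout_s, interval_s)
    view_left = wait_for_empty(count_view_rows, timeout_s, interval_s)
    return {
        "kv_left": kv_left,
        "view_left": view_left,
        "in_sync": kv_left == 0 and view_left == 0,
    }
```

      With the behaviour reported here, the key-value count would drain to 0 while the view count stays above 0, so `in_sync` would be False.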

      1. logs_deletion.tar.bz2
        316 kB
        Aliaksey Artamonau

        Activity

        Aliaksey Artamonau added a comment:

        It seems like an ep-engine issue. I was able to reproduce it with 2- and 3-node clusters. I created 10k items and then deleted them while rebalancing in a new node, then waited until the rebalance was complete. After this, from time to time cbstats reported a non-zero number of active items on the old node. Sometimes it reported zero active items but non-zero replica items. None of the items were accessible via memcached, but they were still present in couchdb and thus visible to views. Mike, please sync up with Chiyoung on this issue.

        Karan Kumar (Inactive) added a comment:

        Thanks, Aliaksey. This is definitely an issue currently.
        I was able to reproduce this easily. After rebalance, active_items != replica_items.

        Will open another bug.
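
        The active/replica comparison mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-node stats are passed in as plain dictionaries (how they are collected is left to the caller), the stat names `vb_active_curr_items` and `vb_replica_curr_items` are assumed to be the relevant cbstats counters, and the bucket is assumed to have `num_replicas` replicas (default 1).

```python
from typing import Dict, Mapping, Tuple


def replica_mismatch(stats_by_node: Mapping[str, Mapping[str, object]],
                     num_replicas: int = 1) -> Tuple[int, int, bool]:
    """Sum active and replica item counts across nodes and compare them.

    stats_by_node maps a node name to that node's stats dictionary.
    The stat names used here are assumptions about what cbstats reports.
    Returns (active_total, replica_total, mismatch).
    """
    active = sum(int(s["vb_active_curr_items"]) for s in stats_by_node.values())
    replica = sum(int(s["vb_replica_curr_items"]) for s in stats_by_node.values())
    # With num_replicas copies of every item, the totals should line up.
    return active, replica, replica != active * num_replicas


# Made-up sample values, only to show the shape of the input.
sample: Dict[str, Dict[str, str]] = {
    "node1": {"vb_active_curr_items": "0", "vb_replica_curr_items": "4"},
    "node2": {"vb_active_curr_items": "0", "vb_replica_curr_items": "3"},
}
active_total, replica_total, mismatch = replica_mismatch(sample)
```

        After all deletes have completed, both totals would be expected to be 0; the sample values above are invented and only mimic the "zero active, non-zero replica" case described earlier in this thread.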

        Karan Kumar (Inactive) added a comment:

        To reproduce this:
        1) Keep a delete workload going in parallel.
        2) Rebalance nodes in.

        Mike Wiederhold added a comment:

        This issue is caused by an error in views. I was able to verify that there were no items in couchdb after deleting everything, but my view still reported having items. However, Aliaksey was able to produce a scenario where the active/replica item counts were not 0, so I will look into that issue. It is filed as MB-4661.

        Steve Yen added a comment:

        Aliaksey A. is looking at this right now.

        Aliaksey Artamonau added a comment:

        Attaching more log files. From the ns_server perspective everything looks fine. On all the nodes the correct vbuckets are indexed. I also verified that the update_seqs reported by couch_set_view:get_group_info and by couch_db:get_update_seq are the same.

        Aliaksey Artamonau added a comment:

        Assigning to Filipe for further investigation.

        Steve Yen added a comment:

        Hi Damien,
        Any news/status on this one?
        Thanks

        Steve Yen added a comment:

        Hi Damien,
        Any news/status on this one? Ping #2.
        Thanks

        damien added a comment:

        Was able to reproduce, but haven't been able to spend a lot of time on it yet. Filipe is also looking at it.

        damien added a comment:

        This appears to be a duplicate of MB-4692. It appears that some vbuckets aren't properly cleaned/omitted from view indexes.
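
        One way to picture the suspected failure mode is the toy model below. It is purely illustrative and makes no claim about how the view engine is implemented: the index is a plain dictionary from vbucket id to document ids, and all function names and data are hypothetical. If a node keeps index entries for a vbucket it no longer owns, deletes for that vbucket never reach those entries, and a _count-style reduce over the whole index stays above zero.

```python
from typing import Dict, Optional, Set


def view_count(index_by_vbucket: Dict[int, Set[str]],
               active_vbuckets: Optional[Set[int]] = None) -> int:
    """_count-style reduce over a toy index.

    If active_vbuckets is given, only entries from those vbuckets are counted.
    """
    return sum(
        len(doc_ids)
        for vb, doc_ids in index_by_vbucket.items()
        if active_vbuckets is None or vb in active_vbuckets
    )


def apply_deletes(index_by_vbucket: Dict[int, Set[str]],
                  active_vbuckets: Set[int],
                  deleted: Set[str]) -> None:
    """Remove deleted doc ids, but only from vbuckets this node still owns."""
    for vb in active_vbuckets:
        index_by_vbucket[vb] = index_by_vbucket.get(vb, set()) - deleted


# Hypothetical data: vbucket 1 moved to another node during the rebalance,
# but its entries were never cleaned out of this node's index.
index = {0: {"a", "b"}, 1: {"c", "d", "e"}}
active = {0}
apply_deletes(index, active, {"a", "b", "c", "d", "e"})

stale_count = view_count(index)          # counts leftover entries from vbucket 1
clean_count = view_count(index, active)  # counts only vbuckets still owned
```

        In this toy model the unfiltered count stays non-zero after every document has been deleted, while the count restricted to owned vbuckets drops to zero.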

        filipe manana added a comment:

        http://review.couchbase.org/#change,12767 fixes it

        Farshid Ghods (Inactive) added a comment:

        viewtests.ViewRebalanceTests.test_delete_x_docs_rebalance_in 1 min 43 sec Fixed
        viewtests.ViewRebalanceTests.test_delete_x_docs_rebalance_out 1 min 53 sec Fixed
        viewtests.ViewRebalanceTests.test_load_x_during_rebalance 6 min 17 sec Fixed
        viewtests.ViewRebalanceTests.test_view_stop_start_incremental_rebalance

        build : 2.0.0r-643-g4e529d3


          People

          • Assignee: damien
          • Reporter: Keith Batten (Inactive)