  1. Couchbase Server
  2. MB-46050

(Index reset on rollback) Partitioned Indexes online stay at state "deferred" for a long time after flushing bucket (2000 indexes)


Details

    Description

      Steps to reproduce
      1. Create a 2-node initial cluster (kv, n1ql).
      2. Create a bucket with 1K collections in total. Each collection has 10 documents.
      3. Rebalance in 18 index nodes.
      4. Rebalance out 14 index nodes, so 4 index nodes remain in the cluster.
      5. Create 2K partitioned indexes (no replicas, partitioned by hash) with defer_build=true, 2 indexes per collection, so that each index covers 10 documents. Build them once they have all reached the deferred state.
      The total count of online indexes is 2K at this point.
      6. Rebalance in 14 index nodes.
      The rebalance takes close to 10 hours to complete.
      7. Flush the entire bucket.
      The indexes now go from state "online" to "deferred" and stay there for a long time. The Indexes UI page shows them as "created" rather than "ready".

      SELECT count(*) FROM system:indexes where state = 'deferred'

      gives 2000
      Eventually, after roughly 1 hour (I think), the indexes started getting rebuilt.
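To watch the recovery, the count above can be broken down per state and polled. The following is a minimal sketch, assuming the query service REST endpoint at /query/service on port 8093; the host, credentials, and helper names are placeholders and not part of any Couchbase tool.

```python
import base64
import json
import urllib.parse
import urllib.request
from collections import Counter

# Same system:indexes keyspace as the query above, grouped by state.
STATEMENT = "SELECT state, COUNT(*) AS n FROM system:indexes GROUP BY state"


def tally_states(rows):
    """Turn [{"state": ..., "n": ...}, ...] into a Counter keyed by state."""
    counts = Counter()
    for row in rows:
        counts[row["state"]] += row["n"]
    return counts


def fetch_state_counts(host, user, password):
    """Run STATEMENT against the query service (host and credentials are placeholders)."""
    body = urllib.parse.urlencode({"statement": STATEMENT}).encode()
    req = urllib.request.Request(f"http://{host}:8093/query/service", data=body)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return tally_states(json.load(resp)["results"])
```

Calling `fetch_state_counts` periodically after the flush would show indexes moving out of "deferred" as the builder picks them up.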

      Logs attached. 

      Attachments

        1. after_flush.png (393 kB)
        2. before_flush.png (428 kB)
        3. rebalance_and_flush.png (545 kB)
        4. servers.png (489 kB)


          Activity

            varun.velamuri Varun Velamuri added a comment - - edited

            Kevin Cherkauer, Assigning this MB to you as this might be similar to the other bug you are working on MB-45919 (Deleting a bucket with 3000 indexes). Please feel free to assign it back if you think it is different.

            [Kevin: The current MB is not related to MB-45919.]


            deepkaran.salooja Deepkaran Salooja added a comment

            From the logs on 172.23.107.44

            1. Indexes begin to reset as part of bucket flush

            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 3
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 5881558935989112103 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 427123876923058581 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 17373471062032708796 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 9174387992023340348 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 18373287532344483662 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 16439633184987265989 to reset list
            2021-05-02T06:32:56.580-07:00 [Info] Indexer::resetIndexesOnRollback MAINT_STREAM 3CThnWN%0jVNsI-47-507000 Adding 15145705994105998272 to reset list

            2. Metadata gets reset

            2021-05-02T06:32:56.907-07:00 [Info] LifecycleMgr.handleResetIndexOnRollback() : Reset Index 2525901244719920838
            2021-05-02T06:32:56.908-07:00 [Info] Indexer::resetSingleIndexOnRollback Reset done 3CThnWN%0jVNsI-47-507000 bAF76oHnU%Rbc3JIyTE-47-600000 CIvV--47-605000 gsi-set-0-1864
            2021-05-02T06:32:56.914-07:00 [Info] lifecycleMgr.dispatchRequest: op OPCODE_RESET_INDEX_ON_ROLLBACK elapsed 6.852667ms len(expediates) 0 len(incomings) 730 len(outgoings) 0
            2021-05-02T06:32:56.914-07:00 [Info] LifecycleMgr.handleResetIndexOnRollback() : Reset Index 15343515144886787986
            2021-05-02T06:32:56.916-07:00 [Info] lifecycleMgr.dispatchRequest: op OPCODE_RESET_INDEX_ON_ROLLBACK elapsed 2.213312ms len(expediates) 0 len(incomings) 729 len(outgoings) 0

            3. Builder is slowly building indexes for 30 minutes.

            2021-05-02T06:32:58.416-07:00 [Info] Indexer::handleBuildIndex [427123876923058581]
            2021-05-02T06:32:58.442-07:00 [Info] Indexer::handleBuildIndex [17373471062032708796]
            2021-05-02T07:01:39.297-07:00 [Info] Indexer::handleBuildIndex Added Index: [8349798319779408284] to Stream: INIT_STREAM State: INDEX_STATE_INITIAL

            The mechanism for rebuilding the indexes needs to be much faster; otherwise, with a large number of indexes, it can lead to index unavailability for an extended period of time. Resetting indexes on rollback to 0 / bucket flush has been disabled as part of MB-45920. MB-46125 tracks the work to enable it for 7.0.1.

            Moving this ticket to 7.0.1.
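The "30 minutes" figure in step 3 can be checked against the first and last handleBuildIndex timestamps quoted in the comment. A small sketch using only those two timestamps:

```python
from datetime import datetime

# Timestamps copied from the first and last handleBuildIndex log lines quoted above.
first_build = datetime.fromisoformat("2021-05-02T06:32:58.416-07:00")
last_build = datetime.fromisoformat("2021-05-02T07:01:39.297-07:00")

elapsed = last_build - first_build
print(f"{elapsed.total_seconds() / 60:.1f} minutes")  # prints "28.7 minutes"
```

This only covers the excerpt shown; the full logs may span a longer window.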
            kevin.cherkauer Kevin Cherkauer added a comment - - edited

            The current MB is for a partial rewrite of the index builder (lifecycle.go builder class) so that it adaptively parallelizes index builds instead of building in batches of 5 (actually batches of at most config.go indexer.settings.build.batch_size, which defaults to 5). Instead it should maximize the parallelism w.r.t. available resources on the box.

            From Deepkaran Salooja: "The builder needs to decide the number based on resource usage on the box(e.g. resident ratio of active indexes). Also, for indexes on same collection, those can be built together as it will need a single DCP stream."

            "Without reset, the stream will restart DCP from 0 at bucket level. If there are 1M docs in the bucket, it will stream everything, even if only 1 collection(and 100 docs) are required."
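The two ideas quoted above (a fixed batch cap, and building indexes on the same collection together so they share one DCP stream) can be illustrated with a toy planner. This is a sketch only, not the lifecycle.go builder: `plan_builds`, the `max_streams` budget, and the tuple layout are hypothetical, the cluster is treated as a single builder for simplicity, and a real builder would derive the budget from resource usage such as resident ratio.

```python
from collections import defaultdict
from math import ceil


def fixed_batch_rounds(num_indexes, batch_size=5):
    """Rounds needed when at most batch_size indexes build at a time."""
    return ceil(num_indexes / batch_size)


def plan_builds(pending, max_streams):
    """Group pending (index_name, collection) pairs by collection, then
    pack up to max_streams collections into each build round, so every
    index on a collection is built from that collection's single stream."""
    by_collection = defaultdict(list)
    for name, collection in pending:
        by_collection[collection].append(name)
    groups = list(by_collection.items())
    return [groups[i:i + max_streams] for i in range(0, len(groups), max_streams)]


# Scenario from this ticket: 1K collections with 2 indexes each.
pending = [(f"idx{i}_{c}", f"c{c}") for c in range(1000) for i in range(2)]
print(fixed_batch_rounds(len(pending)))           # 400 rounds of 5 indexes
print(len(plan_builds(pending, max_streams=50)))  # 20 rounds of 50 collections
```

The value 50 for `max_streams` is arbitrary; the point is that the number of rounds scales with the resource budget rather than with a fixed batch size.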

            After the algorithm is changed (current MB) and MB-46124 is completed, index reset on rollback needs to be re-enabled (MB-46125). This is controlled by the config.go setting "indexer.recovery.reset_index_on_rollback", added by MB-45920.

            Reset on rollback changes the reset indexes from active to created, so they get built again using collection-specific INIT_STREAMs. When it is disabled, rolled-back indexes instead cause the MAINT_STREAM for the bucket to be restarted from 0, replaying all documents in the bucket. In the current MB's case this is actually very efficient, since all indexes on the bucket needed rebuilding and they could all share that single stream. However, it becomes inefficient if only one or a few indexes are rolled back, as they could be rebuilt using collection-specific INIT_STREAMs with much less data movement (only the documents in the target collection(s) rather than all those in the bucket).

            Reset on rollback was disabled because the current 5-at-a-time builder algorithm is much slower when thousands of indexes are reset en masse, and because MB-46124 still needs to address some more RState corner cases.
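The data-movement trade-off described above comes down to simple counting. A sketch with hypothetical helper names, using the figures quoted earlier in this comment (1M documents in a bucket versus 100 in one collection) plus an illustrative case where every collection needs rebuilding:

```python
def docs_streamed_without_reset(bucket_docs):
    """MAINT_STREAM restarts from 0 at bucket level: every document is replayed."""
    return bucket_docs


def docs_streamed_with_reset(docs_per_collection, rolled_back_collections):
    """Collection-specific INIT_STREAMs: only the target collections are replayed."""
    return sum(docs_per_collection[c] for c in rolled_back_collections)


# Quoted example: 1M docs in the bucket, one 100-doc collection needs rebuilding.
print(docs_streamed_without_reset(1_000_000))         # 1000000
print(docs_streamed_with_reset({"c1": 100}, ["c1"]))  # 100

# Illustrative case where every collection is rolled back (sizes are made up):
# both approaches replay the same documents, so one shared stream costs nothing extra.
per_collection = {f"c{i}": 10 for i in range(1000)}
print(docs_streamed_with_reset(per_collection, per_collection))   # 10000
print(docs_streamed_without_reset(sum(per_collection.values())))  # 10000
```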

            kevin.cherkauer Kevin Cherkauer added a comment - - edited

            Jeelan Poola per discussion with Deepkaran Salooja this MB is a candidate to move to 7.1.0 instead of 7.0.1 due to its scope. I will move it there for now and if you disagree you can move it back. Scope is a significant rewrite of the builder logic.


            kevin.cherkauer Kevin Cherkauer added a comment

            Changed from improvement to bug per Slack message from Jeelan Poola.

            yogendra.acharya Yogendra Acharya (Inactive) added a comment

            Ideally, we would first review MB-46124 and get index reset working, then fix MB-46050. With both done, index reset functionality can be enabled (MB-46125), after which we can pick up MB-47544.

            MB-46124 and MB-46050 are high-effort items. Given that we have been living without the ability to reset indexes for a long time and it has not been a major issue, we will defer all 4 issues to the next release.

            Details: https://docs.google.com/document/d/1gskoBXjRmXgeMo0W2Ct6rkOJf8MAvF5ERw7ImKWNqe8/edit?usp=sharing

            People

              kevin.cherkauer Kevin Cherkauer
              sumedh.basarkod Sumedh Basarkod (Inactive)