Couchbase Server / MB-41641

[BP to 6.6.1 of MB-38631] - Optimize ComputeArrayEntriesWithCount method


    Details

    • Triage:
      Untriaged
    • Is this a Regression?:
      Unknown

      Description

      Currently, the computational complexity of the method ComputeArrayEntriesWithCount is O(len(newKey) * len(oldKey)).

      The purpose of this method appears to be removing the entries common to both arrays. The same result can be achieved in O(len(newKey) + len(oldKey)) by building a hash map of one array's entries and probing it with the other's.
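The linear-time approach could look like the following sketch in Go (the indexer's language). This is a simplified, hypothetical stand-in for the real ComputeArrayEntriesWithCount, assuming the method's job is to find the entries unique to each array while cancelling out common occurrences:

```go
package main

import "fmt"

// diffArrayEntries sketches the optimized idea: instead of comparing every
// new entry against every old entry (O(m*n)), build a count map of the old
// entries once and probe it with the new entries (O(m+n)).
func diffArrayEntries(newKey, oldKey [][]byte) (added, removed [][]byte) {
	oldCounts := make(map[string]int, len(oldKey))
	for _, e := range oldKey {
		oldCounts[string(e)]++
	}
	for _, e := range newKey {
		if oldCounts[string(e)] > 0 {
			oldCounts[string(e)]-- // common entry: cancel one occurrence
		} else {
			added = append(added, e) // present only in the new array
		}
	}
	for _, e := range oldKey {
		if oldCounts[string(e)] > 0 {
			oldCounts[string(e)]--
			removed = append(removed, e) // present only in the old array
		}
	}
	return added, removed
}

func main() {
	newKey := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	oldKey := [][]byte{[]byte("b"), []byte("c"), []byte("d")}
	added, removed := diffArrayEntries(newKey, oldKey)
	fmt.Println(len(added), len(removed)) // prints: 1 1 ("a" added, "d" removed)
}
```

Keying the map on string(e) copies the bytes, but each entry is touched a constant number of times, so the overall cost stays linear in the two array lengths.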
       


            Activity

            Mihir Kamdar added a comment -

            @Jeelan Poola Varun Velamuri can you please describe the impact of this fix, or share some thoughts on how it should be functionally tested? Is perf testing required?

            Varun Velamuri added a comment -

            Mihir Kamdar This fix is targeted towards array indexing optimisations. As a part of testing, we should run through the existing array indexing functional test suite. Perf testing for array indexing is also required.

            Varun Velamuri added a comment -

            Mihir Kamdar, we should also target this test:

            a. Have a continuous scan load - low-latency scans

            b. Have an array index and documents with large arrays (i.e. 100 documents with around 5000 array entries or more)

            c. Update the array entries and check the latency of scans while the updates are happening

            Without the fix, the scans should take more time when array updates happen. During this time, if we monitor the indexer pause due to GC, there should be a spike.

            With the fix, that should not be the case - scan latencies should be normal, with no spikes in indexer pause due to GC.
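The GC-pause signal described above can be sampled from inside any Go process via the runtime package. This is a minimal, hypothetical sketch, not the indexer's actual stats plumbing: a spike in the delta between samples while array updates are in flight would indicate the GC pressure the comment refers to.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// samplePauseDelta reads the cumulative GC pause time and returns both the
// delta since the previous sample and the new cumulative total.
func samplePauseDelta(prev uint64) (delta time.Duration, total uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m) // PauseTotalNs is cumulative since process start
	return time.Duration(m.PauseTotalNs - prev), m.PauseTotalNs
}

func main() {
	var total uint64
	for i := 0; i < 3; i++ { // a few samples for illustration
		var delta time.Duration
		delta, total = samplePauseDelta(total)
		fmt.Printf("sample %d: GC pause delta = %v\n", i, delta)
		time.Sleep(50 * time.Millisecond)
	}
}
```

In a real test run the sampling interval and duration would be tuned to cover the whole update phase, and the deltas compared between builds with and without the fix.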

             

            Couchbase Build Team added a comment -

            Build couchbase-server-6.6.1-9103 contains indexing commit 905ad81 with commit message:
            MB-41641 Array Indexing performance improvements

            Hemant Rajput added a comment -

            Validated on 6.6.1-9133

             

            Load the bucket provided by Varun for the array index, containing an array field with 5000 entries.
            a. Create the array index like this - CREATE INDEX `idx_arr` ON `default`(`name`,(distinct (array `f` for `f` in `friends` end))). Also create a GSI index on the age field.
            b. Use the update statement: update default set name = "%v" where name is not null, with String(1000) supplying the value for %v.
            c. Apply the update to all documents only once, i.e. run the statement above a single time.
            d. Initiate a large number of scans - 20000 - on the age field using the query below, sequentially:
            SELECT age from test_bucket where age > 0 order by age desc
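The steps above can be sketched as a small Go driver. Everything here is a hedged reconstruction: executeN1QL is a hypothetical stand-in for whatever client actually runs the statements (cbq, an SDK, etc.), and String(1000) from the comment is read as a 1000-character value, which is an assumption rather than the original helper:

```go
package main

import (
	"fmt"
	"strings"
)

// buildUpdate formats the update statement from step b, substituting an
// n-character value for %v (assumption: String(1000) produced such a value).
func buildUpdate(n int) string {
	value := strings.Repeat("x", n)
	return fmt.Sprintf(`update default set name = "%v" where name is not null`, value)
}

// executeN1QL is a hypothetical placeholder; here it only prints the
// statement instead of sending it to a query service.
func executeN1QL(stmt string) {
	fmt.Println(stmt)
}

func main() {
	// Steps b/c: run the update exactly once across all documents.
	executeN1QL(buildUpdate(1000))

	// Step d: 20000 sequential scans on the age field.
	for i := 0; i < 20000; i++ {
		executeN1QL("SELECT age from test_bucket where age > 0 order by age desc")
	}
}
```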


              People

              Assignee:
              hemant.rajput Hemant Rajput
              Reporter:
              varun.velamuri Varun Velamuri
              Votes:
              0
              Watchers:
              5

