Couchbase Server / MB-62910

Consider distinct array elements before formulating array index keys

Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • Morpheus
    • 6.6.6, 7.1.6, 7.2.5, 7.6.1
    • secondary-index
    • None
    • 0

    Description

      Consider a distinct covering array index where the array has a large number of entries, e.g. 100K, but only 100 of them are distinct. In such cases, the indexer first creates all 100K entries and only then deduplicates them. If the index is non-distinct, the count of each entry is also computed during that same deduplication pass.

      Entries are formulated in splitSecondaryArrayKey
      https://github.com/couchbase/indexing/blob/master/secondary/indexer/array.go#L36

      and later the distinct/count is done:
      https://github.com/couchbase/indexing/blob/master/secondary/indexer/array.go#L147

      But creating all entries up front can lead to a large memory allocation request here:
      https://github.com/couchbase/indexing/blob/master/secondary/indexer/array.go#L125

      It is better to do the distinct/count first, on the raw array elements, and only then formulate the index entries. This avoids the peak memory allocation.
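
      A minimal sketch of the proposed ordering, for illustration only. The helper names `distinctWithCount` and `buildEntries` are hypothetical and are not taken from array.go; `buildEntries` merely stands in for the key formulation done in splitSecondaryArrayKey, and the real key encoding is assumed to be more involved than a simple prefix join.

      ```go
      package main

      import "fmt"

      // distinctWithCount is a hypothetical helper (not the indexer's API).
      // It deduplicates the raw array elements first and records how many
      // times each distinct element occurred, preserving first-seen order.
      func distinctWithCount(items [][]byte) (distinct [][]byte, counts []int) {
      	seen := make(map[string]int) // element -> position in distinct
      	for _, it := range items {
      		k := string(it)
      		if idx, ok := seen[k]; ok {
      			counts[idx]++
      			continue
      		}
      		seen[k] = len(distinct)
      		distinct = append(distinct, it)
      		counts = append(counts, 1)
      	}
      	return distinct, counts
      }

      // buildEntries is a hypothetical stand-in for key formulation: it
      // allocates one entry per element by joining a prefix and the element.
      func buildEntries(prefix []byte, items [][]byte) [][]byte {
      	entries := make([][]byte, 0, len(items))
      	for _, it := range items {
      		e := make([]byte, 0, len(prefix)+len(it))
      		e = append(e, prefix...)
      		e = append(e, it...)
      		entries = append(entries, e)
      	}
      	return entries
      }

      func main() {
      	// 100K array elements, of which only 100 are distinct.
      	items := make([][]byte, 0, 100000)
      	for i := 0; i < 100000; i++ {
      		items = append(items, []byte(fmt.Sprintf("v%03d", i%100)))
      	}

      	// Deduplicate first, then formulate entries only for distinct elements.
      	distinct, counts := distinctWithCount(items)
      	entries := buildEntries([]byte("key:"), distinct)

      	fmt.Println(len(items), len(entries), counts[0]) // prints "100000 100 1000"
      }
      ```

      With this ordering, entry allocation scales with the number of distinct elements (100 here) rather than with the array length (100K), and the counts needed for a non-distinct index are already available from the same pass.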

          People

            varun.velamuri Varun Velamuri
            deepkaran.salooja Deepkaran Salooja