Couchbase Server / MB-29137

Slow initial indexing of multiple partitioned indexes

Details

    Description

      Build 5.5.0-2414.

      Setup:

      • 4 KV nodes, 1 index node
      • 1 bucket
      • 100M items
      • Plasma (25GB quota / 32GB total RAM)
      • indexer.numSliceWriters=3 due to MB-29132

      Initial indexing of 10 non-partitioned indexes was reasonably fast and took roughly one hour. Initial indexing of the 10 partitioned indexes listed below was much slower (fewer than 5K documents indexed per second) and was eventually interrupted by an OOM failure (see MB-29132).

      Partitioned Indexes:

          CREATE INDEX ag1 ON `bucket-1`(DATE_PART_STR(sold_date, 'year'), customer.state, store, quantity) PARTITION BY HASH(store) WITH {"defer_build": true};
       
          CREATE INDEX ag2 ON `bucket-1`(DATE_PART_STR(sold_date, 'year'), DATE_PART_STR(sold_date, 'week'), WEEKDAY_STR(sold_date), sales_price) PARTITION BY HASH(sold_date) WITH {"defer_build": true};
       
          CREATE INDEX ag3 ON `bucket-1`(item.manufacturer_id, DATE_PART_STR(sold_date, 'month'), DATE_PART_STR(sold_date, 'year'), item.brand, sales_price) PARTITION BY HASH(sold_date) WITH {"defer_build": true};
       
          CREATE INDEX ag4 ON `bucket-1`(DATE_PART_STR(sold_date, 'year'), customer.preferred_flag, customer.birth_country, wholesale_cost, sales_price) PARTITION BY HASH(customer.birth_country) WITH {"defer_build": true};
       
          CREATE INDEX ag5 ON `bucket-1`(DATE_PART_STR(sold_date, 'year'), DATE_PART_STR(sold_date, 'month'), sales_price, customer.state) PARTITION BY HASH(customer.state) WITH {"defer_build": true};
       
          CREATE INDEX ag6 ON `bucket-1`(customer.gender, customer.marital_status, customer.education_status, item.id, quantity, list_price, coupon_amt, sales_price) PARTITION BY HASH(item.id) WITH {"defer_build": true};
       
          CREATE INDEX ag7 ON `bucket-1`(customer.zip, customer.preferred_flag) PARTITION BY HASH(customer.zip) WITH {"defer_build": true};
       
          CREATE INDEX ag8 ON `bucket-1`(customer.county, DATE_PART_STR(sold_date, 'year'), DATE_PART_STR(sold_date, 'month'), customer.gender, customer.marital_status, customer.education_status, customer.purchase_estimate, customer.credit_rating, customer.dep_count, customer.dep_employed_count, customer.dep_college_count) PARTITION BY HASH(customer.county) WITH {"defer_build": true};
       
          CREATE INDEX ag9 ON `bucket-1`(item.category, DATE_PART_STR(sold_date, 'day_of_year'), sales_price) PARTITION BY HASH(item.category) WITH {"defer_build": true};
       
          CREATE INDEX ag10 ON `bucket-1`(customer.zip, DATE_PART_STR(sold_date, 'year'), DATE_PART_STR(sold_date, 'quarter'), sales_price) PARTITION BY HASH(customer.zip) WITH {"defer_build": true};
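
      Because every statement above sets "defer_build": true, none of the indexes starts building until a BUILD INDEX statement is issued. The report does not show how the builds were started. The following is a minimal sketch, assuming all ten indexes were built together in one statement and that progress was checked through system:indexes:

```sql
-- Assumption: all ten deferred indexes are built in a single statement.
-- The original report does not say whether they were built together or one at a time.
BUILD INDEX ON `bucket-1`(ag1, ag2, ag3, ag4, ag5, ag6, ag7, ag8, ag9, ag10);

-- Check the build state of each index on the bucket.
SELECT name, state
FROM system:indexes
WHERE keyspace_id = "bucket-1";
```

      Building the indexes in one statement lets the indexer share a single initial data stream across them, which is why the grouping may matter when comparing build times.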
      

            People

              sundar Sundar Sridharan (Inactive)
              pavelpaulau Pavel Paulau (Inactive)