Couchbase Server: MB-38743

Failure when creating a secondary index on dataset fields that have large values


Details

    Description

      Build : 6.5.1-6298

      While testing datasets containing large documents, we found that creating a secondary index fails when the indexed fields hold large values.

      Dataset details
      100,000 docs * 2 KB each
      20,000 docs  * 20 KB each
      10,000 docs  * 200 KB each
      10,000 docs  * 100 KB each
      20,000 docs  * 500 KB each
      1,000 docs   * 2 MB each
      TOTAL: 161,000 docs

      Steps to reproduce
      1. Set up a 4-node cluster - 2 KV nodes and 2 Analytics nodes.
      2. Load the above dataset into a bucket.
      3. Create a compressed Analytics dataset on the bucket.
      4. Create a secondary index on the dataset, including the field that holds large values. Make sure a sufficiently long timeout is set for the DDL query, since index creation can take a while. (A sketch of these statements follows.)
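
      For reference, the statement sequence looks roughly like the sketch below. The bucket name (bucket1) and the sample document are assumptions for illustration; the field names age and body come from the index definition in this report, and the compression option follows the Analytics storage-block-compression DDL syntax (verify against your server version).

      -- Hypothetical sample document (N1QL, run against the Query service);
      -- "body" stands in for the field with large string values.
      UPSERT INTO `bucket1` (KEY, VALUE)
      VALUES ("doc::1", {"age": 30, "body": "<large string, up to ~2 MB>"});

      -- Analytics DDL (run against the Analytics service):
      CREATE DATASET ds1 ON `bucket1`
          WITH {"storage-block-compression": {"scheme": "snappy"}};
      CONNECT LINK Local;

      -- Secondary index that includes the large field:
      CREATE INDEX ds1_idx ON ds1(age:BIGINT, body:STRING);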

      Index creation failed:

      [root@localhost logs]# /opt/couchbase/bin/cbq -e localhost:8091 -a -u Administrator -p password -t 5h
       Connected to : http://localhost:8091/. Type Ctrl-D or \QUIT to exit.
       
       Path to history file for the shell : /root/.cbq_history
      cbq> CREATE INDEX ds1_idx ON ds1(age:BIGINT, body:STRING);
      {
      	"requestID": "db6b991a-a998-45e1-b056-b6331cc3af9a",
      	"signature": {
      		"*": "*"
      	},
      	"errors": [{
      		"code": 25000,		"msg": "Internal error"	}
      	],
      	"status": "fatal",
      	"metrics": {
      		"elapsedTime": "839.4507128s",
      		"executionTime": "839.417042246s",
      		"resultCount": 0,
      		"resultSize": 0,
      		"processedObjects": 0,
      		"errorCount": 1
      	}
      }
      

      Errors like the following appear in the Analytics logs:

      2020-04-14T11:52:19.419-07:00 WARN CBAS.nc.Task [org.apache.hyracks.api.rewriter.runtime.SuperActivity:JID:0.19:TAID:TID:ANID:ODID:5:1:1:0:0] Task failed with exception
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0043: Record size (102429) larger than maximum acceptable record size (65523)
              at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:60) ~[hyracks-api.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeBulkLoader.propagateBulk(BTree.java:1121) ~[hyracks-storage-am-btree.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeBulkLoader.add(BTree.java:1047) ~[hyracks-storage-am-btree.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexBulkLoader.add(LSMIndexBulkLoader.java:55) ~[hyracks-storage-am-lsm-common.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.lsm.common.impls.ChainedLSMDiskComponentBulkLoader.add(ChainedLSMDiskComponentBulkLoader.java:68) ~[hyracks-storage-am-lsm-common.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexDiskComponentBulkLoader.add(LSMIndexDiskComponentBulkLoader.java:54) ~[hyracks-storage-am-lsm-common.jar:6.5.1-6298]
              at org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.nextFrame(IndexBulkLoadOperatorNodePushable.java:98) ~[hyracks-storage-am-common.jar:6.5.1-6298]
              at org.apache.hyracks.dataflow.common.comm.util.FrameUtils.flushFrame(FrameUtils.java:50) ~[hyracks-dataflow-common.jar:6.5.1-6298]
              at org.apache.hyracks.dataflow.std.sort.AbstractExternalSortRunMerger.merge(AbstractExternalSortRunMerger.java:204) ~[hyracks-dataflow-std.jar:6.5.1-6298]
              at org.apache.hyracks.dataflow.std.sort.AbstractExternalSortRunMerger.process(AbstractExternalSortRunMerger.java:133) ~[hyracks-dataflow-std.jar:6.5.1-6298]
              at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:196) ~[hyracks-dataflow-std.jar:6.5.1-6298]
              at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:228) ~[hyracks-api.jar:6.5.1-6298]
              at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
              at java.lang.Thread.run(Unknown Source) [?:?]
      2020-04-14T11:52:19.422-07:00 WARN CBAS.work.NotifyTaskFailureWork [Worker:f221b397d0b1d90ce37348dec1716477] task TAID:TID:ANID:ODID:5:1:1:0 has failed
      org.apache.hyracks.api.exceptions.HyracksDataException: HYR0043: Record size (102429) larger than maximum acceptable record size (65523)
              (stack trace identical to the one above)
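
      A note on the numbers (an inference from the figures, not something stated in the logs): the failing record size of 102429 bytes is roughly 100 KB, matching the ~100 KB documents in the dataset above. The limit of 65523 bytes is consistent with half of a 128 KB storage page minus a small per-tuple overhead:

          131072 / 2 = 65536
          65536 - 13 = 65523

      If that reading is right, any index entry larger than about 64 KB cannot be written into a B-tree page during the index bulk load, which would explain why indexing the large body field fails while smaller documents index fine.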
      

      People

        Assignee: Mihir Kamdar (Inactive)
        Reporter: Mihir Kamdar (Inactive)