Couchbase Server / MB-14960

[system test] Queries hang, or return "Index not found - cause: Stale metadata" errors during rebalance


Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • 4.0.0
    • 4.0.0
    • query
    • Security Level: Public
    • 400-2093
      6 nodes
      10 buckets, 20 GSI indexes
      Screenshots attached for:
      - node services
      - items in buckets
    • Untriaged
    • Unknown

    Description

      1. Load TPC-C data into the buckets, with 100 warehouses.
      2. Run queries for ~4 hours; they run OK.
      3. Rebalance out the index node (10.6.2.164).
      4. Continue querying the buckets.

      • Expected: queries whose indexes reside on the rebalanced-out node fail with an error.
      • Expected: queries on the other buckets return results as usual.

      Actual: almost all queries return no results.
      Example 1: Bucket default has its index on node 10.6.2.167 (which is still part of the cluster)

      • All queries to it return an error:
        cbq> select * from default where SequenceNumber>200 limit 1;
        {
            "requestID": "91796a4b-5078-42c0-81a2-239d810f8b45",
            "signature": { "*": "*" },
            "results": [
            ],
            "errors": [
                { "code": 5000, "msg": "Index not found - cause: Stale metadata" }
            ],
            "status": "errors",
            "metrics": { "elapsedTime": "118.110364ms", "executionTime": "108.729726ms", "resultCount": 0, "resultSize": 0, "errorCount": 1 }
        }
        The index files exist on the healthy node 10.6.2.167:
        Ketakis-MacBook-Pro:~ ketaki$ ssh root@10.6.2.167
        root@10.6.2.167's password:
        Last login: Wed May 13 10:56:08 2015 from 10.17.2.141
        [root@centos-64-x64 ~]# cd /index/
        [root@centos-64-x64 index]# ls
        @2i
        [root@centos-64-x64 index]# cd @2i/
        [root@centos-64-x64 @2i]# ls
        CUSTOMER_CU_ID_D_ID_W_ID_1548575999650553693_0.index MetadataStore
        default_arrv_dest_1298302833338186941_0.index NEW_ORDER_NO_D_ID_W_ID_16688672414267496920_0.index
        default_seq_num_6384134797828746110_0.index ORDER_LINE_PX_ORDER_LINE_14768459523065367802_0.index
        DISTRICT_PX_DISTRICT_13599845997881702209_0.index ORDERS_OR_O_ID_D_ID_W_ID_10742907495691224233_0.index
        HISTORY_px_history_13957383377823401760_0.index STOCK_PX_STOCK_3326173210389418617_0.index
        ITEM_I_ID_15784363429392692370_0.index WAREHOUSE_PX_WAREHOUSE_5432008840470267055_0.index
        [root@centos-64-x64 @2i]#
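When sifting through many query responses collected during the rebalance, a small helper along these lines could separate the stale-metadata failure from other outcomes. This is only a sketch and not part of the original test harness: the function name `classify_response` and the three labels are hypothetical. The matched error code and message are the ones in the response above.

```python
import json

# Error text copied from the failing response in this report.
STALE_METADATA = "Index not found - cause: Stale metadata"


def classify_response(raw):
    """Classify a query response body (JSON text).

    Returns 'stale_metadata', 'other_error', or 'ok'.
    These labels are illustrative, not a Couchbase convention.
    """
    body = json.loads(raw)
    errors = body.get("errors", [])
    for err in errors:
        if err.get("code") == 5000 and STALE_METADATA in err.get("msg", ""):
            return "stale_metadata"
    # Assumption: a healthy response carries "status": "success" and no errors.
    if errors or body.get("status") != "success":
        return "other_error"
    return "ok"
```

Fed the response shown for bucket default, the helper would report `stale_metadata`. A hung query, as in the STOCK case, produces no response at all and needs a client-side timeout instead.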

      Example 2: Bucket STOCK, with indexes on 10.6.2.167 (a healthy node that is still part of the cluster)

      • Queries to it hang and never recover:
        cbq> select * from STOCK where S_W_ID>0 and S_I_ID>0 limit 1;
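To tell a hang apart from a slow query when reproducing this, the statement could be sent with a client-side timeout instead of through the interactive shell. The sketch below assumes the query service's REST endpoint is reachable on its default port 8093. The host argument is a placeholder, since the query node's address is not given in this report, and `run_query` and its `opener` parameter are hypothetical names.

```python
import json
import socket
import urllib.error
import urllib.parse
import urllib.request


def run_query(host, statement, timeout_s=30.0, opener=urllib.request.urlopen):
    """POST a statement to the query service.

    Returns ('ok', parsed_json) or ('timeout', None).
    `opener` is injectable so the function can be exercised without a cluster.
    """
    # Assumption: the query service listens on its default port, 8093.
    url = "http://%s:8093/query/service" % host
    data = urllib.parse.urlencode({"statement": statement}).encode("utf-8")
    request = urllib.request.Request(url, data=data)
    try:
        with opener(request, timeout=timeout_s) as resp:
            return "ok", json.loads(resp.read().decode("utf-8"))
    except socket.timeout:
        # Read timeouts surface directly as socket.timeout.
        return "timeout", None
    except urllib.error.URLError as exc:
        # Connect timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout", None
        raise
```

A `'timeout'` result for the STOCK statement after a generous limit would match the hang described here. HTTP-level errors are re-raised unchanged so they are not mistaken for hangs.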

      Attaching logs.

          People

            manik Manik Taneja (Inactive)
            ketaki Ketaki Gangal (Inactive)