  1. Couchbase Server
  2. MB-14957

Indexer should recover from failure

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 4.0.0
    • 4.0.0
    • secondary-index
    • Security Level: Public
    • 4.0.0-2098

    Description

      Steps to reproduce are similar to defect MB-14954. The indexer appears to be stuck and has not processed all pending indexes.
      The expected result is for the indexer to abort the troublesome index after a few retries and then move on to the next pending index, or to the mutations that are pending for an index.
      One possible approach: calculate the size of the index from the data during the initial scan, then use a combination of time and data size to estimate how long the index build should take. This may apply to initial index builds.
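
      The recovery policy requested above could be sketched roughly as follows. This is only an illustration of the idea, not the indexer's actual code: estimate_deadline_secs, drain_pending, try_build and all the default values (throughput, slack, retry count) are hypothetical names and numbers chosen for the example.

```python
from collections import deque


def estimate_deadline_secs(data_size_bytes, bytes_per_sec=1_000_000,
                           floor_secs=60.0, slack=3.0):
    """Hypothetical time budget for one build attempt.

    Expected build time is derived from the data size seen during the
    initial scan, padded by a slack factor and bounded below by a floor.
    """
    return max(floor_secs, slack * data_size_bytes / bytes_per_sec)


def drain_pending(pending, try_build, max_retries=3):
    """Work through every pending index instead of blocking on one.

    pending   -- iterable of (index_name, data_size_bytes)
    try_build -- callable(index_name, deadline_secs) returning True if the
                 build finished within the deadline, False otherwise
    Returns (built, aborted) lists of index names.
    """
    queue = deque(pending)
    built, aborted = [], []
    while queue:
        name, size = queue.popleft()
        deadline = estimate_deadline_secs(size)
        for _attempt in range(max_retries):
            if try_build(name, deadline):
                built.append(name)
                break
        else:
            # Every retry failed: give up on this index and move on.
            aborted.append(name)
    return built, aborted


if __name__ == "__main__":
    stuck = {"default:simple_index_2"}  # pretend this build never completes
    print(drain_pending(
        [("default:simple_index_2", 5_000_000),
         ("bucket2:simple_index_2", 5_000_000)],
        lambda name, deadline: name not in stuck,
    ))
```

      The point of the sketch is that a failing build ends up in an aborted list after a bounded number of attempts, so the remaining pending indexes still get processed.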

      ./testrunner -i temp.ini -t 2i.indexscans_2i.SecondaryIndexingScanTests.test_multi_create_query_explain_drop_index_scan_consistency_with_where_clause,groups=simple,dataset=default,doc-per-day=20,use_gsi_for_primary=True,index_quota_percent=60,skip_cleanup=True,nodes_init=4,services_init=n1ql:kv-kv:index-index-index,doc-per-day=20,sasl_buckets=3,standard_buckets=3,use_gsi_for_secondary=True

      1. Create a 4-node cluster (1:n1ql+kv, 2:kv+index, 3:index, 4:index). The index RAM quota is set to 60% of the data RAM quota.
      2. Create 7 buckets.
      3. Create 40320 documents in the 7 buckets.
      4. Create a primary index using GSI.
      5. Create secondary indexes using GSI, 3 for each bucket.
      6. Wait for indexing to finish. I have waited for over an hour and the index builds have still not finished.

      a) Check the indexing status. The indexer has been stuck building indexes on these 6 buckets for over an hour.

      Ritams-MacBook-Pro:testrunner rsharma$ curl -u Administrator:password 172.23.106.75:8093/query -d 'statement=select * from system:indexes where state="pending"'
      {
        "requestID": "d64b958b-38b4-4b28-8374-231bde330200",
        "signature": { "*": "*" },
        "results": [
          {
            "indexes": { "condition": "(`job_title` = \"Sales\")", "datastore_id": "http://127.0.0.1:8091", "id": "5f828150a90ced90", "index_key": [ "`job_title`" ], "keyspace_id": "bucket1", "name": "simple_index_2", "namespace_id": "default", "state": "pending", "using": "gsi" }
          },
          {
            "indexes": { "condition": "(1999 \u003c `join_yr`)", "datastore_id": "http://127.0.0.1:8091", "id": "d26549137a7d3bc3", "index_key": [ "`join_yr`" ], "keyspace_id": "bucket2", "name": "simple_index_3", "namespace_id": "default", "state": "pending", "using": "gsi" }
          },
          {
            "indexes": { "condition": "(`job_title` = \"Sales\")", "datastore_id": "http://127.0.0.1:8091", "id": "c58cd278921046ad", "index_key": [ "`job_title`" ], "keyspace_id": "standard_bucket2", "name": "simple_index_2", "namespace_id": "default", "state": "pending", "using": "gsi" }
          },
          {
            "indexes": { "condition": "(not (`job_title` = \"Sales\"))", "datastore_id": "http://127.0.0.1:8091", "id": "2f18a27a7dd5d5bd", "index_key": [ "`job_title`" ], "keyspace_id": "bucket0", "name": "simple_index_1", "namespace_id": "default", "state": "pending", "using": "gsi" }
          },
          {
            "indexes": { "condition": "(not (`job_title` = \"Sales\"))", "datastore_id": "http://127.0.0.1:8091", "id": "d06fd1d86b5b1bd3", "index_key": [ "`job_title`" ], "keyspace_id": "standard_bucket0", "name": "simple_index_1", "namespace_id": "default", "state": "pending", "using": "gsi" }
          },
          {
            "indexes": { "condition": "(`job_title` = \"Sales\")", "datastore_id": "http://127.0.0.1:8091", "id": "af849ff1e999b8a9", "index_key": [ "`job_title`" ], "keyspace_id": "default", "name": "simple_index_2", "namespace_id": "default", "state": "pending", "using": "gsi" }
          }
        ],
        "status": "success",
        "metrics": { "elapsedTime": "1.771416449s", "executionTime": "1.771079185s", "resultCount": 6, "resultSize": 2936 }
      }
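
      For repeated checks, the same pending-index query can be scripted. The sketch below is a minimal example that assumes the response shape shown above (rows wrapped in an "indexes" object) and the same /query endpoint and statement used in the curl command; pending_indexes and fetch_pending are hypothetical helper names.

```python
import base64
import json
import urllib.parse
import urllib.request

STATEMENT = 'select * from system:indexes where state="pending"'


def pending_indexes(response):
    """Return sorted (keyspace_id, name) pairs for rows in state 'pending'."""
    return sorted(
        (row["indexes"]["keyspace_id"], row["indexes"]["name"])
        for row in response.get("results", [])
        if row["indexes"].get("state") == "pending"
    )


def fetch_pending(host, user, password, port=8093):
    """POST the statement to the query service, as the curl command does."""
    body = urllib.parse.urlencode({"statement": STATEMENT}).encode()
    request = urllib.request.Request(f"http://{host}:{port}/query", data=body)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=30) as reply:
        return pending_indexes(json.load(reply))
```

      Calling fetch_pending("172.23.106.75", "Administrator", "password") would issue the same request as the curl command and return one (keyspace_id, name) pair per pending index.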

      b) A few more indexing stats:
      Ritams-MBP:testrunner rsharma$ curl -u Administrator:password http://172.23.106.77:9102/stats | json_pp | grep build_p
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  6870    0  6870    0     0  11070      0 --:--:-- --:--:-- --:--:-- 11062
      "standard_bucket2:#primary:build_progress" : 100,
      "bucket1:#primary:build_progress" : 100,
      "standard_bucket2:simple_index_2:build_progress" : 50,
      "default:#primary:build_progress" : 100,
      "bucket2:simple_index_2:build_progress" : 100,
      "standard_bucket1:simple_index_2:build_progress" : 100,
      "standard_bucket0:simple_index_2:build_progress" : 100,
      "default:simple_index_2:build_progress" : 50,
      "bucket0:simple_index_2:build_progress" : 100,
      "bucket1:simple_index_2:build_progress" : 50,
      Ritams-MBP:testrunner rsharma$ date
      Wed May 13 16:57:56 IST 2015
      Ritams-MBP:testrunner rsharma$ curl -u Administrator:password http://172.23.106.77:9102/stats | json_pp | grep build_p
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  6870    0  6870    0     0   9920      0 --:--:-- --:--:-- --:--:--  9913
      "standard_bucket2:simple_index_2:build_progress" : 50,
      "standard_bucket2:#primary:build_progress" : 100,
      "default:simple_index_2:build_progress" : 50,
      "default:#primary:build_progress" : 100,
      "standard_bucket1:simple_index_2:build_progress" : 100,
      "bucket1:#primary:build_progress" : 100,
      "bucket0:simple_index_2:build_progress" : 100,
      "bucket2:simple_index_2:build_progress" : 100,
      "standard_bucket0:simple_index_2:build_progress" : 100,
      "bucket1:simple_index_2:build_progress" : 50

      Observe:
      "default:simple_index_2:build_progress" : 50
      "bucket1:simple_index_2:build_progress" : 50

      Both have been stuck at 50% for over 40 minutes.
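
      The manual comparison of two stats samples can be automated. The sketch below assumes two snapshots of the indexer /stats output parsed into dicts, with keys of the form "bucket:index:build_progress" as in the output above; stalled_builds is a hypothetical helper name, and how far apart the samples are taken is left to the caller.

```python
def stalled_builds(earlier, later, suffix=":build_progress"):
    """Return index names whose build progress is below 100 and did not
    change between two snapshots of the indexer stats."""
    stalled = []
    for key, progress in later.items():
        if not key.endswith(suffix):
            continue  # ignore stats other than build progress
        if progress < 100 and earlier.get(key) == progress:
            stalled.append(key[: -len(suffix)])
    return sorted(stalled)


if __name__ == "__main__":
    # Values taken from the two samples in this report.
    snapshot = {
        "standard_bucket2:simple_index_2:build_progress": 50,
        "default:simple_index_2:build_progress": 50,
        "bucket1:simple_index_2:build_progress": 50,
        "bucket2:simple_index_2:build_progress": 100,
        "default:#primary:build_progress": 100,
    }
    print(stalled_builds(snapshot, dict(snapshot)))
```

      A build that is merely slow would show a higher value in the later sample and would not be reported; only indexes with identical, incomplete progress in both samples are flagged.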

          People

            prataprc Pratap Chakravarthy (Inactive)
            ritam.sharma Ritam Sharma