Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-50991

Cbindexperf MetaKV Errors on Cloud Tests

    XMLWordPrintable

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • 7.1.0
    • 7.1.0
    • secondary-index
    • Machines: AWS, m5.4xlarge (16vCPU, 64GB Memory, 1TB Disk) x 4

      Cluster: 4 nodes - 3 KV, 1 Indexer

      OS: Amazon Linux 2

      Replicated on: 7.1.0-1831, 7.1.0-2284
    • Untriaged
    • 1
    • Unknown

    Description

      When running initial scan performance testing for Capella GSI, we are seeing the CPU thrashed on the index node, and running into issues with 'MetaKV' in the run:

      Rev rpc URLĀ 

      20:17:22 [ec2-3-238-21-92.compute-1.amazonaws.com] run:  echo 'export CBAUTH_REVRPC_URL=http://Administrator:password@ec2-3-237-178-20.compute-1.amazonaws.com:8091/query' >> ~/.bashrc 
       
       
      20:30:49 2022-02-15T07:00:49 [INFO] To be applied: ./opt/couchbase/bin/cbindexperf -cluster ec2-3-237-178-20.compute-1.amazonaws.com:8091 -auth="Administrator:password" -configfile tests/gsi/new_plasma/config/config_scan_multiple_plasma.json -resultfile result.json -statsfile /root/statsfile -cpuprofile cpuprofile.prof -memprofile memprofile.prof  -use_tls -cacert ./root.pem

      15:50:50 [ec2-3-238-21-92.compute-1.amazonaws.com] out: 2022-02-15T15:50:50.252+00:00 [Error] Fail to get indexer version from metakv. Internal Error = Get /_metakv/indexing/info/versionToken: missing port in address 15:50:50 [ec2-3-238-21-92.compute-1.amazonaws.com] out: 2022-02-15T15:50:50.252+00:00 [Fatal] MetakvSet Failed to set /indexing/info/versionToken: Put /_metakv/indexing/info/versionToken: missing port in address

      The 7.1.0-1831 was run with

      "Repeat": 1999999,

      and

      "Concurrency": 128,

      http://perf.jenkins.couchbase.com/job/Cloud-Tester-Dev/251/

      The 7.1.0-2284 was toned down to try and ease CPU pressure to:

      "Repeat": 19999,

      and

      "Concurrency": 64,

      but the CPU was still at around 100% usage for the duration of the scan phase, here is also the job for that run:
      http://perf.jenkins.couchbase.com/job/Cloud-Tester-Dev/252/

      Both jobs were able to generate cbmonitor reports as below, and the CPU usage can be seen:

      7.1.0-1831
      http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=couchbase1_710-1831_cloud_apply_scanworkload_ac38

      7.1.0-2284

      http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=couchbase1_710-2284_cloud_apply_scanworkload_0657
      I have also attatched result.json which doesn't seem to look out of the ordinary.

      We use the exact same command for running these commands on both cloud and non-cloud testing, but it seems to only become an issue when running on cloud.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            yogendra.acharya Yogendra Acharya (Inactive)
            sean.corrigan Sean Corrigan
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty