Couchbase Server / MB-36892

KV perf testing failure - [ERROR] Some nodes are not healthy: {'triton-srv-02-ip6.perf.couchbase.com:8091', 'triton-srv-01-ip6.perf.couchbase.com:8091'}

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master
    • Fix Version/s: master
    • Component/s: couchbase-bucket, ns_server
    • Labels: None

    Description

      Regression when running perf tests. Only 'kv/kv_max_ops_writes.test' was tried, but others presumably fail too.

      The test worked in 7.0.0-1036 but started failing in 7.0.0-1037. There were no kv-engine repo changes in that range (although platform has changes).

      Links to jenkins:

        Activity

          jwalker Jim Walker added a comment -

          1042 was fine

          jwalker Jim Walker added a comment -

          7.0.0-1042 is running and might be ok

          http://perf.jenkins.couchbase.com/job/triton/28568/console

          jwalker Jim Walker added a comment -

          Passing to ns_server for triage, only because ns_server debug log also reports the 'unhealthy' message. What is being observed to trigger the unhealthy issue?

          e.g. from
          https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-triton-28440/triton-srv-01-ip6.perf.couchbase.com.zip

          [ns_server:debug,2019-11-08T07:07:54.025-08:00,ns_1@triton-srv-01-ip6.perf.couchbase.com:<0.2112.0>:auto_failover:log_down_nodes_reason:357]Node 'ns_1@triton-srv-01-ip6.perf.couchbase.com' is considered down. Reason:"All monitors report node is unhealthy."
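
To see which nodes were marked down, and when, entries like the one above can be pulled out of the ns_server debug log with a short script. This is a minimal sketch: the regular expression is modeled only on the single log line quoted in this comment, so other entries may need a different pattern, and the names `DOWN_NODE`, `find_down_nodes`, and `sample` are hypothetical.

```python
import re

# Assumption: the layout matches the one log line quoted in the ticket:
# [component:level,timestamp,...]Node '<node>' is considered down. Reason:"<reason>"
DOWN_NODE = re.compile(
    r"\[(?P<component>[^:\]]+):(?P<level>[^,\]]+),(?P<timestamp>[^,\]]+),"
    r"[^\]]*\]Node '(?P<node>[^']+)' is considered down\. "
    r'Reason:"(?P<reason>[^"]*)"'
)


def find_down_nodes(lines):
    """Yield (timestamp, node, reason) for each 'considered down' entry."""
    for line in lines:
        match = DOWN_NODE.search(line)
        if match:
            yield match.group("timestamp"), match.group("node"), match.group("reason")


# The log line from the ticket, used here as sample input.
sample = (
    "[ns_server:debug,2019-11-08T07:07:54.025-08:00,"
    "ns_1@triton-srv-01-ip6.perf.couchbase.com:<0.2112.0>:"
    "auto_failover:log_down_nodes_reason:357]"
    "Node 'ns_1@triton-srv-01-ip6.perf.couchbase.com' is considered down. "
    'Reason:"All monitors report node is unhealthy."'
)

for timestamp, node, reason in find_down_nodes([sample]):
    print(timestamp, node, reason)
```

To run it against a real log, pass an open file object in place of `[sample]`; the location of the debug log inside the collected zip is not stated in the ticket.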

          People

            jwalker Jim Walker
            jwalker Jim Walker