Couchbase Server / MB-45253

[Collections] - 'Connection::event_callback: unrecoverable error encountered: ["reading","error"], shutting down connection' seen in logs

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Yes
    • KV-Engine 2021-March

    Description

      Script to Repro

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.5901.ini GROUP=sdk_compression,rerun=False,upgrade_version=7.0.0-4774 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests,data_load_stage=during,sdk_compression=True,skip_validations=False,GROUP=sdk_compression'
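
      The `-t` argument above packs the test path and its parameters into one comma-separated string. As a reading aid, the following minimal sketch splits that string into its parts. The helper name `parse_test_spec` is hypothetical, and this is not how testrunner itself parses the argument; it only assumes the `path,key=value,...` layout visible in the command.

```python
def parse_test_spec(spec):
    """Split 'module.Class.method,k=v,...' into (test_path, params).

    Hypothetical helper for illustration; not part of testrunner.
    """
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # partition keeps any '=' inside the value intact
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# The -t value copied from the repro command above.
spec = (
    "bucket_collections.collections_rebalance.CollectionsRebalance."
    "test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=2,"
    "bucket_spec=multi_bucket.buckets_for_rebalance_tests,"
    "data_load_stage=during,sdk_compression=True,skip_validations=False,"
    "GROUP=sdk_compression"
)

test_path, params = parse_test_spec(spec)
print(test_path)
for key, value in sorted(params.items()):
    print(f"  {key} = {value}")
```

      Read this way, the run starts with 3 nodes (`nodes_init=3`), rebalances 2 more in (`nodes_in=2`), and loads data during the rebalance (`data_load_stage=during`) with SDK compression enabled.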
      

      Steps to Repro
      1) Create a 3 node cluster
      2021-03-24 22:10:57,329 | test | INFO | pool-1-thread-7 | [table_view:display:72] Rebalance Overview
      ----------------------------------------------------------------------
      Nodes           Services  Version                CPU            Status
      ----------------------------------------------------------------------
      172.23.123.125  kv        7.0.0-4774-enterprise  4.10615923886  Cluster node
      172.23.123.124  None                                            <--- IN ---
      172.23.106.8    None                                            <--- IN ---
      ----------------------------------------------------------------------

      2) Create buckets/scopes/collections/data
      2021-03-24 22:13:03,823 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
      -------------------------------------------------------------------------
      Bucket   Type       Replicas  Durability  TTL  Items   RAM Quota   RAM Used   Disk Used
      -------------------------------------------------------------------------
      bucket1  couchbase  3         none        0    30000   314572800   106604680  108729454
      bucket2  ephemeral  3         none        0    30000   314572800   77965400   102
      default  couchbase  3         none        0    500000  4718592000  455320888  381511918
      -------------------------------------------------------------------------

      3) Add 2 nodes (172.23.106.6 and 172.23.105.42) and rebalance in
      2021-03-24 22:13:15,101 | test | INFO | pool-1-thread-18 | [table_view:display:72] Rebalance Overview
      ----------------------------------------------------------------------
      Nodes           Services  Version                CPU            Status
      ----------------------------------------------------------------------
      172.23.123.124  kv        7.0.0-4774-enterprise  12.4526395554  Cluster node
      172.23.106.8    kv        7.0.0-4774-enterprise  11.6514690983  Cluster node
      172.23.123.125  kv        7.0.0-4774-enterprise  21.8512898331  Cluster node
      172.23.106.6    None                                            <--- IN ---
      172.23.105.42   None                                            <--- IN ---
      ----------------------------------------------------------------------

      We see the following error on 172.23.123.124:

      2021-03-24 22:15:51,417 | test  | INFO    | MainThread | [basetestcase:check_coredump_exist:789] unwanted messages in /opt/couchbase/var/lib/couchbase/logs/memcached.log.000001.txt
      2021-03-24 22:15:51,418 | test  | CRITICAL | MainThread | [basetestcase:check_coredump_exist:791] 172.23.123.124: Found ' ERROR ' logs - ['2021-03-24T22:13:18.648316-07:00 ERROR 1578: Connection::event_callback: unrecoverable error encountered: ["reading","error"], shutting down connection\n']
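
      To triage how often this message appears across the collected memcached logs, a small scanner along the following lines may help. It is a minimal sketch, not the `check_coredump_exist` implementation in the test framework. The line layout (timestamp, level, optional numeric prefix, message) is inferred from the single log line quoted above, and treating the leading number (`1578`) as a connection id is an assumption.

```python
import re

# Layout inferred from the one memcached.log line quoted in this report:
#   <timestamp> <LEVEL> [<number>: ]<message>
# Assumption: the optional leading number is the connection id.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"(?:(?P<conn_id>\d+):\s+)?(?P<message>.*)$"
)


def find_error_lines(lines):
    """Return one dict of parsed fields per ERROR-level line."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            hits.append(match.groupdict())
    return hits


# The only sample used here is the line reported in this ticket.
sample = [
    '2021-03-24T22:13:18.648316-07:00 ERROR 1578: Connection::event_callback: '
    'unrecoverable error encountered: ["reading","error"], '
    'shutting down connection\n',
]

for hit in find_error_lines(sample):
    print(hit["timestamp"], hit["conn_id"], hit["message"])
```

      To scan a real file, pass an open file handle instead of `sample`, for example `find_error_lines(open(path))` with `path` pointing at a `memcached.log.*.txt` file extracted from the cbcollect_info archive.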
      

      cbcollect_info attached. This was not seen on 7.0.0-4735.

      Daniel Owen: It looks like this bug could affect 50-100 test cases in the weekly run. It seems to affect almost all rebalance+failover test cases.

            People

              Balakumaran.Gopal Balakumaran Gopal