Couchbase Server / MB-53009

/pools/default fails with error "couldnt_connect_to_memcached"

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: User Error
    • Elixir
    • Elixir
    • memcached
    • 7.2.0-1638

    Description

      Seeing multiple instances of this failure in 7.2.0-1638 build sanity. Issue was not present in 7.2.0-1634.

      error.log:

      [ns_server:error,2022-07-16T05:01:44.597-07:00,ns_1@cb.local:<0.6236.40>:menelaus_util:reply_server_error_before_close:209]Server error during processing: ["web request failed",
                                       {path,"/pools/default"},
                                       {method,'GET'},
                                       {type,exit},
                                       {what,
                                        {{{{badmatch,
                                            {error,couldnt_connect_to_memcached}},
                                           [{ns_audit_cfg,notify_memcached,1,
                                             [{file,"src/ns_audit_cfg.erl"},
                                              {line,152}]},
                                            {ns_audit_cfg,handle_info,2,
                                             [{file,"src/ns_audit_cfg.erl"},
                                              {line,127}]},
                                            {gen_server,try_dispatch,4,
                                             [{file,"gen_server.erl"},{line,695}]},
                                            {gen_server,handle_msg,6,
                                             [{file,"gen_server.erl"},{line,771}]},
                                            {proc_lib,init_p_do_apply,3,
                                             [{file,"proc_lib.erl"},{line,226}]}]},
                                          {gen_server,call,[ns_audit_cfg,get_uid]}},
                                         {gen_server,call,
                                          [menelaus_web_cache,
                                           #Fun<menelaus_web_cache.2.105129554>,
                                           infinity]}}},
                                       {trace,
                                        [{gen_server,call,3,
                                          [{file,"gen_server.erl"},{line,247}]},
                                         {menelaus_web_pools,pool_info,6,
                                          [{file,"src/menelaus_web_pools.erl"},
                                           {line,104}]},
                                         {menelaus_web_pools,handle_pool_info,2,
                                          [{file,"src/menelaus_web_pools.erl"},
                                           {line,94}]},
                                         {request_tracker,request,2,
                                          [{file,"src/request_tracker.erl"},
                                           {line,40}]},
                                         {menelaus_util,handle_request,2,
                                         [{file,"src/menelaus_util.erl"},
                                           {line,220}]},
                                         {mochiweb_http,headers,6,
                                          [{file,
                                            "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},
                                           {line,153}]},
                                         {proc_lib,init_p_do_apply,3,
                                          [{file,"proc_lib.erl"},{line,226}]}]}]
      

      babysitter.log:

      [ns_server:info,2022-07-16T06:39:21.176-07:00,babysitter_of_ns_1@cb.local:<0.20935.44>:ns_port_server:log:226]memcached<0.20935.44>: 2022-07-16T06:39:21.161512-07:00 WARNING Unknown key "enforce_tenant_limits_enabled" in config ignored.
      memcached<0.20935.44>: 2022-07-16T06:39:21.172888-07:00 CRITICAL getrandom
       
      [error_logger:error,2022-07-16T06:39:21.177-07:00,babysitter_of_ns_1@cb.local:<0.20935.44>:ale_error_logger_handler:do_log:101]
      =========================ERROR REPORT=========================
      ** Generic server <0.20935.44> terminating
      ** Last message in was {#Port<0.4643>,{exit_status,1}}
      ** When Server state == {state,#Port<0.4643>,25159,
                                  {memcached,"/opt/couchbase/bin/memcached",
                                      ["-C",
                                       "/opt/couchbase/var/lib/couchbase/config/memcached.json"],
                                      [{env,
                                           [{"EVENT_NOSELECT","1"},
                                            {"CBSASL_PWFILE",
                                             "/opt/couchbase/var/lib/couchbase/isasl.pw"}]},
                                       use_stdio,stderr_to_stdout,exit_status,
                                       stream]},
                                  {ringbuffer,154,1024,
                                      {[{<<"2022-07-16T06:39:21.172888-07:00 CRITICAL getrandom">>,
                                         51}],
                                       [{<<"2022-07-16T06:39:21.161512-07:00 WARNING Unknown key \"enforce_tenant_limits_enabled\" in config ignored.">>,
                                         103}]}},
                                  undefined,#Ref<0.3150024439.699400193.94937>,
                                  [<<"2022-07-16T06:39:21.172888-07:00 CRITICAL getrandom">>,
                                   <<"2022-07-16T06:39:21.161512-07:00 WARNING Unknown key \"enforce_tenant_limits_enabled\" in config ignored.">>],
                                  0}
      ** Reason for termination ==
      ** {abnormal,1}
      

      Most likely caused by: https://review.couchbase.org/c/gomemcached/+/177414 (Merged in 1635)

        Activity

          Donald.haggart Donald Haggart added a comment - - edited

          I don't agree with attributing a memcached crash to a gomemcached change. I'm not sure, but memcached is written in C++, so a gomemcached change seems an unlikely cause.

          Your kernel:

          Linux xcp-s12325 3.10.0-123.20.1.el7.x86_64 #1 SMP Thu Jan 29 18:05:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
          

          is too old - from 2015!

          You are [edit: most likely] encountering this:  https://issues.couchbase.com/browse/MB-52896 (and the problem was introduced by MB-35297 but wouldn't have been encountered until you started using sequential scans (in Query) or had an SDK invoke range scans directly).
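
A quick way to check a test VM for the condition discussed above is sketched below. It assumes that the CRITICAL getrandom line means the getrandom(2) system call is unavailable on the host, which this thread suggests but does not confirm. The function names are hypothetical. The version comparison is only a hint, because distribution kernels may backport the call; the direct probe is the more reliable of the two.

```python
import errno
import os
import re

# Mainline Linux gained the getrandom(2) system call in version 3.17.
GETRANDOM_MAINLINE = (3, 17)


def parse_kernel_release(release):
    """Return (major, minor) from a string like '3.10.0-123.20.1.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def predates_getrandom(release):
    """Version-based hint only: vendor kernels may backport the syscall."""
    return parse_kernel_release(release) < GETRANDOM_MAINLINE


def probe_getrandom():
    """Ask the running kernel directly. Returns 'ok', 'missing' or 'unknown'."""
    if not hasattr(os, "getrandom"):
        return "unknown"  # this Python build does not expose the call
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            return "missing"  # the kernel has no getrandom system call
        return "unknown"  # e.g. entropy pool not yet initialised
    return "ok"


if __name__ == "__main__":
    release = os.uname().release
    print(release, "predates mainline getrandom:", predates_getrandom(release))
    print("direct probe:", probe_getrandom())
```

Running this on each sanity VM before a build is installed would show which hosts are likely to hit the crash, without waiting for memcached to fail.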

          dfinlay Dave Finlay added a comment -

          I'm not sure that this is the same issue as MB-52896. Memcached is crashing repeatedly but the CRITICAL log message looks to be different.

          2022-07-16T01:13:32.104020-07:00 INFO ---------- Opening logfile:
          2022-07-16T01:13:32.106569-07:00 INFO Couchbase version 7.2.0-1638 starting.
          2022-07-16T01:13:32.106597-07:00 INFO Process identifier: 32746
          2022-07-16T01:13:32.106599-07:00 INFO Development asserts enabled
          2022-07-16T01:13:32.106628-07:00 INFO recalculate_max_connections: {"engine_fds":133982,"max_connections":65000,"max_fds":200000,"system_connections":5000}
          2022-07-16T01:13:32.106796-07:00 INFO Breakpad enabled. Minidumps will be written to '/opt/couchbase/var/lib/couchbase/crash'
          2022-07-16T01:13:32.107051-07:00 INFO Fine clock overhead: 226ns
          2022-07-16T01:13:32.107081-07:00 INFO Coarse clock overhead: 17ns)
          2022-07-16T01:13:32.107083-07:00 INFO (Clock measurement period: 1ns)
          2022-07-16T01:13:32.107758-07:00 INFO Using SLA configuration: {"COMPACT_DB":{"slow":"1800 s"},"CREATE_BUCKET":{"slow":"5 s"},"DELETE_BUCKET":{"slow":"10 s"},"SELECT_BUCKET":{"slow":"10 ms"},"SEQNO_PERSISTENCE":{"slow":"30 s"},"comment":"\
          Current MCBP SLA configuration","default":{"slow":"500 ms"},"version":1}
          2022-07-16T01:13:32.107772-07:00 INFO Enable standard input listener
          2022-07-16T01:13:32.107912-07:00 INFO NUMA: Set memory allocation policy to 'interleave'
          2022-07-16T01:13:32.107927-07:00 INFO Loading RBAC configuration from [/opt/couchbase/var/lib/couchbase/config/memcached.rbac]
          2022-07-16T01:13:32.108052-07:00 INFO Loading error maps from [/opt/couchbase/etc/couchbase/kv/error_maps]
          2022-07-16T01:13:32.108984-07:00 CRITICAL getrandom
          2022-07-16T01:13:32.109622-07:00 INFO ---------- Closing logfile
          2022-07-16T01:13:46.883063-07:00 INFO ---------- Opening logfile:
          2022-07-16T01:13:46.885317-07:00 INFO Couchbase version 7.2.0-1638 starting.
          2022-07-16T01:13:46.885346-07:00 INFO Process identifier: 624
          2022-07-16T01:13:46.885348-07:00 INFO Development asserts enabled
          2022-07-16T01:13:46.885381-07:00 INFO recalculate_max_connections: {"engine_fds":133982,"max_connections":65000,"max_fds":200000,"system_connections":5000}
          2022-07-16T01:13:46.885485-07:00 INFO Breakpad enabled. Minidumps will be written to '/opt/couchbase/var/lib/couchbase/crash'
          2022-07-16T01:13:46.885695-07:00 INFO Fine clock overhead: 199ns
          2022-07-16T01:13:46.885723-07:00 INFO Coarse clock overhead: 17ns)
          2022-07-16T01:13:46.885724-07:00 INFO (Clock measurement period: 1ns)
          2022-07-16T01:13:46.886364-07:00 INFO Using SLA configuration: {"COMPACT_DB":{"slow":"1800 s"},"CREATE_BUCKET":{"slow":"5 s"},"DELETE_BUCKET":{"slow":"10 s"},"SELECT_BUCKET":{"slow":"10 ms"},"SEQNO_PERSISTENCE":{"slow":"30 s"},"comment":"\
          Current MCBP SLA configuration","default":{"slow":"500 ms"},"version":1}
          2022-07-16T01:13:46.886376-07:00 INFO Enable standard input listener
          2022-07-16T01:13:46.886470-07:00 INFO NUMA: Set memory allocation policy to 'interleave'
          2022-07-16T01:13:46.886477-07:00 INFO Loading RBAC configuration from [/opt/couchbase/var/lib/couchbase/config/memcached.rbac]
          2022-07-16T01:13:46.886596-07:00 INFO Loading error maps from [/opt/couchbase/etc/couchbase/kv/error_maps]
          2022-07-16T01:13:46.887528-07:00 CRITICAL getrandom
          2022-07-16T01:13:46.889323-07:00 INFO ---------- Closing logfile
          ...
          

          Assigning to KV.
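
The repeated start-and-exit pattern in the log above can be summarised mechanically. The following is a minimal sketch, not part of any Couchbase tooling: it groups memcached.log lines by each "Opening logfile" marker and collects the process identifier and any CRITICAL messages for that start. The function name and the assumed line layout (timestamp, level, message) are inferred from the excerpt only.

```python
import re

# Assumed layout, inferred from the excerpt: "<timestamp> <LEVEL> <message>".
LINE = re.compile(r"^\s*(?P<ts>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")

# Lines copied from the log excerpt in this comment.
SAMPLE = [
    "2022-07-16T01:13:32.104020-07:00 INFO ---------- Opening logfile:",
    "2022-07-16T01:13:32.106597-07:00 INFO Process identifier: 32746",
    "2022-07-16T01:13:32.108984-07:00 CRITICAL getrandom",
    "2022-07-16T01:13:32.109622-07:00 INFO ---------- Closing logfile",
    "2022-07-16T01:13:46.883063-07:00 INFO ---------- Opening logfile:",
    "2022-07-16T01:13:46.885346-07:00 INFO Process identifier: 624",
    "2022-07-16T01:13:46.887528-07:00 CRITICAL getrandom",
    "2022-07-16T01:13:46.889323-07:00 INFO ---------- Closing logfile",
]


def summarise_starts(lines):
    """Return one dict per start-up with its timestamp, pid and CRITICAL messages."""
    runs = []
    current = None
    for raw in lines:
        match = LINE.match(raw)
        if match is None:
            continue  # skip blank or unparseable lines
        msg = match.group("msg")
        if "Opening logfile" in msg:
            current = {"opened": match.group("ts"), "pid": None, "critical": []}
            runs.append(current)
        elif current is None:
            continue  # ignore anything before the first start-up marker
        elif msg.startswith("Process identifier:"):
            current["pid"] = int(msg.split(":", 1)[1])
        elif match.group("level") == "CRITICAL":
            current["critical"].append(msg)
    return runs


if __name__ == "__main__":
    for run in summarise_starts(SAMPLE):
        print(run["opened"], run["pid"], run["critical"])
```

If every start-up reports the same CRITICAL message, as it does here, the crash is deterministic at initialisation rather than triggered by a particular workload.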

          jwalker Jim Walker added a comment -

          The underlying cause is this change https://github.com/couchbase/platform/commit/5ffdb83b9e905e86f53d5d7267c70d2865aaac50 which makes use of boost::uuid which we know is incompatible with some older Linux systems.

          It is my understanding that 7.2/8.0 drop support for CentOS 7 entirely, so the options are:

          • upgrade to a newer CentOS 7 (which is still likely unsupported)
          • upgrade to CentOS 8, which is definitely supported

          pavithra.mahamani Pavithra Mahamani (Inactive) added a comment -

          Jim Walker, is CentOS 7 support being discontinued for 7.2/8.0 documented someplace? This will impact quite a large number of test VMs, so just want to be sure before I request bulk upgrades.

          pavithra.mahamani Pavithra Mahamani (Inactive) added a comment - Thank you Dave Rigby

          People

            pavithra.mahamani Pavithra Mahamani (Inactive)
            pavithra.mahamani Pavithra Mahamani (Inactive)