Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-7369

[system test] memcached exited with message "Error initializing sasl"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Environment:
      windows 2008 R2 64bit

      Description

      Install couchbase server 1.8.1 on 4 nodes cluster with default installation path and custom data path.
      10.3.2.10
      10.3.2.11
      10.3.2.12
      10.3.2.16
      10.3.2.75

      Create 3 buckets, one default, one sasl and one dedicated port bucket.
      Load 20K items to each bucket.
      Maintain load about 3K at each bucket.
      Install couchbase server 2.0.0-1974 on node 10.3.2.76 with default installation path and custom data path.
      Swap rebalance (add node 76 and remove node 75). Rebalance done.
      Install couchbase server 2.0.0-1974 on node 10.3.2.77 with default installation path and custom data path.
      Swap rebalance (add node 77 and remove node 16). Rebalance done.

      Uninstall couchbase server 1.8.1 on node 16 and 75
      Install couchbase server 2.0.0-1974 on node 16 and 75 with custom installation path (c:/one) and custom data path (c:/data)
      Swap rebalance (add node 16 and 75 to cluster, remove node 11 and 12)
      Rebalance failed in few minutes

      I see a lot of error in log page with error

      Port server memcached on node 'ns_1@10.3.2.16' exited with status 1. Restarting. Messages: Wed Dec 05 16:16:29.949804 Pacific Standard Time 3: Error initializing sasl.

      [user:info,2012-12-05T14:59:51.730,ns_1@10.3.2.16:ns_node_disco<0.7785.0>:ns_node_disco:handle_info:162]Node 'ns_1@10.3.2.16' saw that node 'ns_1@10.3.2.12' came up.
      [ns_server:debug,2012-12-05T14:59:51.730,ns_1@10.3.2.16:ns_node_disco_events<0.7784.0>:ns_node_disco_rep_events:handle_event:42]Detected a new nodes (['ns_1@10.3.2.12']). Moving config around.
      [ns_server:info,2012-12-05T14:59:51.730,ns_1@10.3.2.16:ns_node_disco_events<0.7784.0>:ns_node_disco_log:handle_event:46]ns_node_disco_log: nodes changed: ['ns_1@10.3.2.10','ns_1@10.3.2.11',
      'ns_1@10.3.2.12','ns_1@10.3.2.16',
      'ns_1@10.3.2.76']
      [ns_server:info,2012-12-05T14:59:51.730,ns_1@10.3.2.16:ns_config_rep<0.7791.0>:ns_config_rep:handle_info:220]Replicating config to/from:
      ['ns_1@10.3.2.12']
      [ns_server:info,2012-12-05T14:59:51.730,ns_1@10.3.2.16:ns_config_rep<0.7791.0>:ns_config_rep:do_pull:341]Pulling config from: 'ns_1@10.3.2.12'

      [user:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_port_memcached<0.7891.0>:ns_port_server:handle_info:107]Port server memcached on node 'ns_1@10.3.2.16' exited with status 1. Restarting. Messages: Wed Dec 05 14:59:51.730078 Pacific Standard Time 3: Error initializing sasl.
      [user:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_node_disco<0.7785.0>:ns_node_disco:handle_info:162]Node 'ns_1@10.3.2.16' saw that node 'ns_1@10.3.2.77' came up.
      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:handle_info:132]doing replicate_newnodes_docs
      [ns_server:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_port_memcached<0.7891.0>:ns_port_server:log:171]memcached<0.7891.0>: Wed Dec 05 14:59:51.730078 Pacific Standard Time 3: Error initializing sasl.

      [ns_server:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:set_view_update_daemon<0.7918.0>:set_view_update_daemon:init:50]Set view update daemon, starting with the following settings:
      update interval: 5000ms
      minimum number of changes: 5000

      [error_logger:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
      =========================PROGRESS REPORT=========================
      supervisor:

      {local,ns_server_sup}

      started: [

      {pid,<0.7914.0>}

      ,

      {name,xdc_rdoc_replication_srv}

      ,
      {mfargs,{xdc_rdoc_replication_srv,start_link,[]}},

      {restart_type,permanent}

      ,

      {shutdown,1000}

      ,

      {child_type,worker}

      ]

      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.3.2.10
      [ns_server:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:<0.7890.0>:supervisor_cushion:handle_info:62]Cushion managed supervisor for memcached failed:

      {abnormal,1}

      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.3.2.11
      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.3.2.12
      [user:warn,2012-12-05T14:59:51.746,ns_1@10.3.2.16:<0.7890.0>:supervisor_cushion:die_slowly:81]Service memcached exited on node 'ns_1@10.3.2.16' in 0.11s

      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.3.2.76
      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_node_disco_events<0.7784.0>:ns_node_disco_rep_events:handle_event:42]Detected a new nodes (['ns_1@10.3.2.77']). Moving config around.
      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.3.2.77
      [ns_server:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:xdc_rdoc_replication_srv<0.7914.0>:xdc_rdoc_replication_srv:handle_info:132]doing replicate_newnodes_docs
      [user:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_cluster<0.776.0>:ns_cluster:perform_actual_join:676]Node ns_1@10.3.2.16 joined cluster
      [ns_server:info,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_node_disco_events<0.7784.0>:ns_node_disco_log:handle_event:46]ns_node_disco_log: nodes changed: ['ns_1@10.3.2.10','ns_1@10.3.2.11',
      'ns_1@10.3.2.12','ns_1@10.3.2.16',
      'ns_1@10.3.2.76','ns_1@10.3.2.77']
      [cluster:debug,2012-12-05T14:59:51.746,ns_1@10.3.2.16:ns_cluster<0.776.0>:ns_cluster:handle_call:133]complete_join([

      {<<"targetNode">>,<<"ns_1@10.3.2.16">>}

      ,
      {<<"availableStorage">>,
      {struct,[{<<"hdd">>,
      [{struct,[

      {<<"path">>,<<"C:\\">>}

      ,

      {<<"sizeKBytes">>,104752124}

      ,

      {<<"usagePercent">>,21}

      ]}]}]}},

      {<<"memoryQuota">>,6553}

      ,
      {<<"storageTotals">>,
      {struct,[{<<"ram">>,
      {struct,[

      {<<"usedByData">>,1324889160}

      ,

      {<<"total">>,8589467648.0}

      ,

      {<<"quotaTotal">>,6871318528.0}

      ,

      {<<"used">>,3064373248.0}

      ]}},
      {<<"hdd">>,
      {struct,[

      {<<"usedByData">>,1331384984}

      ,

      {<<"total">>,107266174976.0}

      ,

      {<<"quotaTotal">>,107266174976.0}

      ,

      {<<"used">>,22525896744.0}

      ,

      {<<"free">>,84740278232.0}

      ]}}]}},
      {<<"storage">>,
      {struct,[

      {<<"ssd">>,[]}

      ,
      {<<"hdd">>,
      [{struct,[

      {<<"path">>,<<"c:/data">>}

      ,

      {<<"quotaMb">>,<<"none">>}

      ,

      {<<"state">>,<<"ok">>}

      ]}]}]}},
      {<<"systemStats">>,
      {struct,[

      {<<"cpu_utilization_rate">>,45.3}

      ,

      {<<"swap_total">>,17176993792.0}

      ,

      {<<"swap_used">>,3208110080.0}

      ]}},
      {<<"interestingStats">>,
      {struct,[

      {<<"curr_items">>,661759}

      ,

      {<<"curr_items_tot">>,1325717}

      ,

      {<<"vb_replica_curr_items">>,663958}

      ]}},

      {<<"uptime">>,<<"52855">>}

      ,

      {<<"memoryTotal">>,8589467648.0}

      ,

      {<<"memoryFree">>,5.5250944e9}

      ,

      {<<"mcdMemoryReserved">>,6553}

      ,

      {<<"mcdMemoryAllocated">>,6553}

      ,

      {<<"otpNode">>,<<"ns_1@10.3.2.10">>}

      ,

      {<<"otpCookie">>,<<"nwuviqsbxzcqnmhv">>}

      ,

      {<<"clusterMembership">>,<<"active">>}

      ,

      {<<"status">>,<<"healthy">>}

      ,

      {<<"hostname">>,<<"10.3.2.10:8091">>}

      ,

      {<<"clusterCompatibility">>,1}

      ,

      {<<"version">>,<<"1.8.1-937-rel-enterprise">>}

      ,

      {<<"os">>,<<"windows">>}

      ,
      {<<"ports">>,
      {struct,[

      {<<"proxy">>,11211}

      ,

      {<<"direct">>,11210}

      ]}}]) ->

      {ok, ok}
      # Subject Project Status CR V
      For Gerrit Dashboard: &For+MB-7369=message:MB-7369

        Activity

        Hide
        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        It seems that since we removed part of the config upgrade code in 2.0, in 1.8.1 some config keys and particularly

        {node, node(), issal}

        will have greater number of changes than on clean 2.0. And hence those keys will be overwritten when 2.0 node is added to a cluster if it used to be a part of the same cluster with 1.8.1.

        Show
        Aliaksey Artamonau Aliaksey Artamonau added a comment - It seems that since we removed part of the config upgrade code in 2.0, in 1.8.1 some config keys and particularly {node, node(), issal} will have greater number of changes than on clean 2.0. And hence those keys will be overwritten when 2.0 node is added to a cluster if it used to be a part of the same cluster with 1.8.1.
        Hide
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Aliaksey discovered true cause of this and is fixing right now

        Show
        alkondratenko Aleksey Kondratenko (Inactive) added a comment - Aliaksey discovered true cause of this and is fixing right now
        Show
        Aliaksey Artamonau Aliaksey Artamonau added a comment - http://review.couchbase.org/23124
        Hide
        farshid Farshid Ghods (Inactive) added a comment -

        per bug scrub
        please upload the fix to 2.0.1 branch

        Show
        farshid Farshid Ghods (Inactive) added a comment - per bug scrub please upload the fix to 2.0.1 branch
        Hide
        farshid Farshid Ghods (Inactive) added a comment -
        Show
        farshid Farshid Ghods (Inactive) added a comment - merged to 2.0.1 http://review.couchbase.org/#/c/23376/

          People

          • Assignee:
            Aliaksey Artamonau Aliaksey Artamonau
            Reporter:
            thuan Thuan Nguyen
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes