  1. Couchbase Server
  2. MB-51208

Node information on servers page displayed as loading in mixed mode cluster

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 7.1.0
    • 7.1.0
    • ns_server
    • Community Edition 7.1.0 build 2383
      Enterprise Edition 7.1.0 build 2383
    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown
    • UI 2022-Feb

    Description

      In a mixed-mode cluster where some nodes run Community Edition (CE), opening node info on the servers page from a node running Enterprise Edition (EE) shows the node info as loading, and the actual info is never displayed. The issue is not observed when the page is accessed from a CE node.

      Steps to reproduce -

      1. Have a 3 node cluster with all nodes having community edition (CE) installed.
      2. Rebalance in a node which has EE installed.
      3. Rebalance out a node which has CE installed. (This might not be required, but since I did it while testing I have mentioned it here)
      4. Check node info on the servers page from both the EE node and a CE node.

      Attachments

        1. node102.zip
          11.19 MB
        2. node103.zip
          8.72 MB
        3. node104.zip
          7.37 MB
        4. Screenshot 2022-02-25 at 3.08.40 PM.png
          363 kB

        Activity

          steve.watanabe Steve Watanabe added a comment -

          I am not able to reproduce this hang on my MacBook. Looking at ns_server.http_access.log on node104, I see the following set of endpoints being called continuously, every 10 seconds, and each call returns 200 (success).

          10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:24 -0800] "GET /nodes/ns_1%4010.112.205.104 HTTP/1.1" 200 3056 "http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 4
          10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:24 -0800] "GET /nodes/ns_1%4010.112.205.102 HTTP/1.1" 200 2956 "http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 3
          10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:24 -0800] "GET /nodes/ns_1%4010.112.205.103 HTTP/1.1" 200 2962 "http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 4
          10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:34 -0800] "GET /pools/default?etag=66757712&waitChange=10000 HTTP/1.1" 200 8347 "http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 10006

          I'm going to forward this ticket to the UI folks so they can comment on what the UI is looking for and why it's spinning in the meantime.
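The access-log check described in this comment can be sketched in code. The following is a minimal, illustrative Python parser for lines shaped like the ones quoted from ns_server.http_access.log. The regular expression, the function names, and the reading of the trailing number as an elapsed time are assumptions made for this sketch, not something the ticket states; the user-agent strings in the sample are abbreviated.

```python
import re
from collections import Counter

# Assumed layout, inferred from the quoted lines only:
# client - user [timestamp] "METHOD path HTTP/x" status bytes "referer" "agent" N
# The trailing number N is treated here as an elapsed time; that is a guess.
LINE_RE = re.compile(
    r'^(?P<client>\S+) - (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" (?P<elapsed>\d+)$'
)


def parse_access_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("status", "size", "elapsed"):
        record[key] = int(record[key])
    return record


def summarize(lines):
    """Count requests per (method, endpoint without query string, status)."""
    counts = Counter()
    for line in lines:
        record = parse_access_line(line)
        if record is not None:
            endpoint = record["path"].split("?", 1)[0]
            counts[(record["method"], endpoint, record["status"])] += 1
    return counts


# Two of the lines quoted in the comment, with the user agent shortened.
SAMPLE = [
    '10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:24 -0800] '
    '"GET /nodes/ns_1%4010.112.205.104 HTTP/1.1" 200 3056 '
    '"http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0" 4',
    '10.112.205.1 - Administrator/UI [25/Feb/2022:01:40:34 -0800] '
    '"GET /pools/default?etag=66757712&waitChange=10000 HTTP/1.1" 200 8347 '
    '"http://10.112.205.104:8091/ui/index.html" "Mozilla/5.0" 10006',
]

if __name__ == "__main__":
    for key, count in sorted(summarize(SAMPLE).items()):
        print(key, count)
```

Grouping by status makes it easy to confirm whether any request in the polling loop returns something other than 200.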
          dfinlay Dave Finlay added a comment -

          Umang: Could you reproduce this, capture a HAR file recording on the client side, and post it along with the server-side logs for us to take a look at?

          (A HAR file recording can be obtained by opening developer tools, selecting the "Network" tab, navigating to the servers page, right-clicking on the network panel and selecting "Save all as HAR file".)
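Once a HAR file has been captured, failing requests can be pulled out of it with a few lines of code. This is a minimal sketch in Python that relies only on the standard HAR layout (`log.entries[]`, each with `request.method`, `request.url` and `response.status`). The function names and the 400 threshold are choices made for this example.

```python
import json


def load_har(path):
    """Load a HAR file (JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failed_requests(har, min_status=400):
    """Return (method, url, status) for every entry whose status is >= min_status."""
    failures = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        status = entry.get("response", {}).get("status", 0)
        if status >= min_status:
            failures.append((request.get("method"), request.get("url"), status))
    return failures


if __name__ == "__main__":
    # Tiny in-memory HAR document standing in for a real capture.
    example = {
        "log": {
            "entries": [
                {"request": {"method": "GET", "url": "http://localhost:9001/pools/default"},
                 "response": {"status": 200}},
                {"request": {"method": "POST", "url": "http://localhost:9001/pools/default/stats/range/"},
                 "response": {"status": 500}},
            ]
        }
    }
    for method, url, status in failed_requests(example):
        print(status, method, url)
```

With a real capture, `failed_requests(load_har("capture.har"))` would list the same kind of tuples; the file name is a placeholder.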

          owend Daniel Owen added a comment -

          Raluca Lupu has managed to reproduce. Therefore reassigning.

          raluca.lupu Raluca Lupu added a comment - - edited

          Hi Umang,

          On the EE node, the UI receives a 500 Internal Server Error on its first attempt to call the POST "pools/default/stats/range/" endpoint.

          Here is the payload the UI is sending:

          [{"nodes":["192.168.164.173:9000"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"index_memory_used_total"}]},{"nodes":["192.168.164.173:9000"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"fts_num_bytes_used_ram"}]},{"nodes":["192.168.164.173:9000"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"sysproc_mem_resident"},{"label":"proc","value":"java|cbas","operator":"=~"}],"applyFunctions":["sum"]},{"nodes":["192.168.164.173:9000"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"cbas_disk_used_bytes_total"}]},{"nodes":["[::1]:9001"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"index_memory_used_total"}]},{"nodes":["[::1]:9001"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"fts_num_bytes_used_ram"}]},{"nodes":["[::1]:9001"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"sysproc_mem_resident"},{"label":"proc","value":"java|cbas","operator":"=~"}],"applyFunctions":["sum"]},{"nodes":["[::1]:9001"],"step":3,"timeWindow":360,"start":-3,"metric":[{"label":"name","value":"cbas_disk_used_bytes_total"}]}]
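The payload above is a flat list of range queries, four per node, that differ only in the metric selector. As an aid to reading it, here is a small Python sketch that rebuilds the same structure. The helper names `node_queries` and `build_payload` are invented for this illustration; the field names and values are copied from the payload itself.

```python
import json


def node_queries(node, step=3, time_window=360, start=-3):
    """Build the four per-node range queries seen in the payload."""

    def query(metric, **extra):
        q = {"nodes": [node], "step": step, "timeWindow": time_window,
             "start": start, "metric": metric}
        q.update(extra)
        return q

    return [
        query([{"label": "name", "value": "index_memory_used_total"}]),
        query([{"label": "name", "value": "fts_num_bytes_used_ram"}]),
        query([{"label": "name", "value": "sysproc_mem_resident"},
               {"label": "proc", "value": "java|cbas", "operator": "=~"}],
              applyFunctions=["sum"]),
        query([{"label": "name", "value": "cbas_disk_used_bytes_total"}]),
    ]


def build_payload(nodes):
    """Concatenate the per-node queries in node order."""
    return [q for node in nodes for q in node_queries(node)]


if __name__ == "__main__":
    payload = build_payload(["192.168.164.173:9000", "[::1]:9001"])
    # Compact separators give the same single-line form as the captured payload.
    print(json.dumps(payload, separators=(",", ":")))
```

Seen this way, every query in the request names a node by its `host:port` string, which is the value the server then has to resolve.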
          

           
          This is the server response:

          ["Unexpected server error, request logged."]
          

          I've also observed this error in the ./cluster_run output:

          [ns_server:error,2022-02-28T16:04:16.374Z,n_1@127.0.0.1:<0.10851.0>:menelaus_util:reply_server_error_before_close:210]Server error during processing: ["web request failed",
                                           {path,"/pools/default/stats/range/"},
                                           {method,'POST'},
                                           {type,error},
                                           {what,badarg},
                                           {trace,
                                            [{erlang,list_to_binary,
                                              [[49,57,50,46,49,54,56,46,49,54,52,46,49,
                                                55,51,58|undefined]],
                                              [{error_info,
                                                #{module => erl_erts_errors}}]},
                                             {menelaus_web_node,build_node_hostname,4,
                                              [{file,"src/menelaus_web_node.erl"},
                                               {line,486}]},
                                             {menelaus_web_node,
                                              '-get_hostnames/3-lc$^0/1-0-',4,
                                              [{file,"src/menelaus_web_node.erl"},
                                               {line,595}]},
                                             {menelaus_web_node,find_node_hostname,3,
                                              [{file,"src/menelaus_web_node.erl"},
                                               {line,637}]},
                                             {menelaus_web_stats,
                                              '-validate_nodes_v2/3-fun-0-',2,
                                              [{file,"src/menelaus_web_stats.erl"},
                                               {line,1344}]},
                                             {misc,'-partitionmap/2-fun-0-',3,
                                              [{file,"src/misc.erl"},{line,3139}]},
                                             {menelaus_web_stats,
                                              '-validate_nodes_v2/3-fun-1-',2,
                                              [{file,"src/menelaus_web_stats.erl"},
                                               {line,1342}]},
                                             {validator,validate,3,
                                              [{file,"src/validator.erl"},
                                               {line,281}]}]}]
          ::1 - - [28/Feb/2022:16:04:16 +0000] "POST /pools/default/stats/range/ HTTP/1.1" 500 44 "http://localhost:9001/ui/index-dev.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36" 2
          

          I think this issue should be resolved on the server side.
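The argument that made `list_to_binary` fail is easier to read once the character codes are decoded. The Python sketch below decodes the proper part of the list from the trace; the variable and function names are made up for this example. The list in the trace ends in `|undefined`, which means an improper list whose tail is the atom `undefined`. That shape is consistent with a host string and ":" being concatenated with a port value that was `undefined`, although the trace alone does not say where that value came from.

```python
# Character codes copied from the list_to_binary argument in the trace
# (everything before the "|undefined" tail).
PROPER_PREFIX = [49, 57, 50, 46, 49, 54, 56, 46, 49, 54, 52, 46, 49, 55, 51, 58]


def decode_erlang_charlist(codes):
    """Turn a list of character codes (an Erlang string) into a Python str."""
    return "".join(chr(code) for code in codes)


if __name__ == "__main__":
    prefix = decode_erlang_charlist(PROPER_PREFIX)
    # Prints the host followed by a colon, with nothing after it:
    # the port position is where the 'undefined' tail sits.
    print(repr(prefix))
```

The decoded prefix is `192.168.164.173:`, the host of the first node named in the payload, with the port missing.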

           


          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.2.0-1001 contains ns_server commit 43e8640 with commit message:
          MB-51208: get_hostnames should not assume the whole cluster is EE

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.1.0-2423 contains ns_server commit 43e8640 with commit message:
          MB-51208: get_hostnames should not assume the whole cluster is EE
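The commit message says only that `get_hostnames` should not assume the whole cluster is EE; the actual change is not shown in this ticket. Purely as an illustration of that idea, and not a rendering of commit 43e8640, the following Python sketch shows one defensive pattern: treat a missing port as "no hostname for this node" instead of concatenating it. The function names reuse names from the trace for readability, but the signatures, data shapes, and node names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def build_node_hostname(host: str, port: Optional[int]) -> Optional[str]:
    """Return 'host:port', or None when the node does not report the port.

    Hypothetical analogue: None stands in for the Erlang atom 'undefined'.
    """
    if port is None:
        return None
    return f"{host}:{port}"


def get_hostnames(nodes: Dict[str, Tuple[str, Optional[int]]]) -> Dict[str, str]:
    """Map node name to hostname, skipping nodes that lack the requested port."""
    hostnames = {}
    for name, (host, port) in nodes.items():
        hostname = build_node_hostname(host, port)
        if hostname is not None:
            hostnames[name] = hostname
    return hostnames


if __name__ == "__main__":
    # Illustrative mixed cluster: one node reports the port, one does not.
    cluster = {
        "ee_node": ("192.168.164.173", 9000),
        "ce_node": ("ce-node.example", None),
    }
    print(get_hostnames(cluster))
```

Whether skipping such nodes is the right behaviour depends on the caller; the point of the sketch is only that the absent value is checked before any string is built.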

          chanabasappa.ghali Chanabasappa Ghali added a comment -

          Verified with Community Edition 7.1.0 build 2423 and Enterprise Edition 7.1.0 build 2423.

          Node information on the servers page is displayed properly.

          People

            umang.agrawal Umang
            umang.agrawal Umang