Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-52385

[Backport MB-46288 to 6.6.6] Optimize cinfo in newMetadataProvider

    XMLWordPrintable

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • Unknown

    Description

      [Backport MB-46288 to 6.6.6] [System test upgrade] : [Error] : Error messages like "DDLServiceMgr:rebalanceDone(): Failed to initialize metadata provider" seen in indexer log during upgrade

      Scripts to Repro
      1. Run the 6.6.2 longevity test for 3 days.

      ./sequoia -client 172.23.96.162:2375 -provider file:centos_third_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.2-9588 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. It had 27 nodes at the end of the test.
      3. Added 6 7.0.0(172.23.105.102,172.23.105.62,172.23.106.232,172.23.106.239,172.23.106.37, 172.23.106.246) nodes and rebalanced in and removed 6 node from 6.6.2(172.23.110.75,172.23.110.76,172.23.105.61,172.23.106.191,172.23.106.209,172.23.106.70)
      and rebalanced out.
      4. Failed over 6 nodes and graceful failover + recovery + rebalance.
      5. Now swap rebalance 6 nodes. 2 data + 2 index + 1 eventing + 1 analytics.
      6. Then again did bunch of rebalances(stop/start and rebalance out eventing) to workaround MB-46274 and MB-46246.

      At this point I saw error messages like the following on indexer.log

      172.23.105.62 : index

      /opt/couchbase/var/lib/couchbase/logs/indexer.log:2021-05-13T03:46:33.130-07:00 [Error] DDLServiceMgr:rebalanceDone(): Failed to initialize metadata provider.  Error=DDLServiceMgr: Failed to initialize metadata provider.  Unable to connect to all indexer nodes within 5 seconds..
      

      cbcollect_info attached. This was not seen in upgrade during 6.6.2->9588 to 7.0.0-5033.
      Possibly a dup of MB-46284?  

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            sai.teja Sai Krishna Teja added a comment - - edited

            Optimisation here in the original is below change from cinfo.Fetch to cinfo.FetchNodesAndSvsInfo to reduce the number of calls to ns_server

            sai.teja Sai Krishna Teja added a comment - - edited Optimisation here in the original is below change from cinfo.Fetch to cinfo.FetchNodesAndSvsInfo to reduce the number of calls to ns_server
            sai.teja Sai Krishna Teja added a comment - - edited

            Does not look like a must have in 6.6.6 as the current watcher timeout in 6.6.6 is 30 secs.
            This patch was not considered for 6.6.5 and only change to increase watcher timeout was considered.

            CC: Deepkaran Salooja Varun Velamuri

            sai.teja Sai Krishna Teja added a comment - - edited Does not look like a must have in 6.6.6 as the current watcher timeout in 6.6.6 is 30 secs. This patch was not considered for 6.6.5 and only change to increase watcher timeout was considered. CC: Deepkaran Salooja Varun Velamuri

            Build couchbase-server-6.6.6-10557 contains indexing commit 52e902a with commit message:
            MB-52385: Fetch only nodes and nodesvs in newMetadataProvider

            build-team Couchbase Build Team added a comment - Build couchbase-server-6.6.6-10557 contains indexing commit 52e902a with commit message: MB-52385 : Fetch only nodes and nodesvs in newMetadataProvider

            Before fix : Calls were made for each bucket.

            127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4162 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/buckets?v=55449572&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 2 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 434 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/serverGroups?v=52184775 HTTP/1.1" 200 1296 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4848 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/buckets?v=54092344&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 51522 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test HTTP/1.1" 200 8322 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test2 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test3 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test4 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test5 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 434 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/serverGroups?v=52184775 HTTP/1.1" 200 1627 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:59 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4865 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:59 -0800] "GET /pools/default/buckets?v=54092344&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 51607 - Go-http-client/1.1-indexer-newMetadataProvider
            

            After fix: Calls were made for cluster

             

            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:13 -0800] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6311 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:13 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:20 -0800] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6294 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:20 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:28 -0800] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6311 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:28 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:39 -0800] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6328 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:19:39 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:20:06 -0800] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6313 - Go-http-client/1.1-indexer-newMetadataProvider
            127.0.0.1 - @index-cbauth [03/Jan/2023:21:20:06 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/1.1-indexer-newMetadataProvider 

             

             lfc@LFCs-MacBook-Pro testrunner % grep newMetadataProvider ~/Desktop/new_issue/http_access_internal_6_6_6_10550.log | wc -l
                  58
            lfc@LFCs-MacBook-Pro testrunner % grep newMetadataProvider ~/Desktop/new_issue/http_access_internal_on6_6_6_10574.log | wc -l
                  10
            lfc@LFCs-MacBook-Pro testrunner % 

            hemant.rajput Hemant Rajput added a comment - Before fix : Calls were made for each bucket. 127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4162 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/buckets?v=55449572&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 2 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 434 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:43:43 -0800] "GET /pools/default/serverGroups?v=52184775 HTTP/1.1" 200 1296 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4848 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/buckets?v=54092344&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 51522 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test HTTP/1.1" 200 8322 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test2 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test3 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test4 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/b/test5 HTTP/1.1" 200 8327 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/nodeServices HTTP/1.1" 200 434 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:52 -0800] "GET /pools/default/serverGroups?v=52184775 HTTP/1.1" 200 1627 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:59 -0800] "GET /pools/default?uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 4865 - Go-http-client/1.1-indexer-newMetadataProvider 127.0.0.1 - @index-cbauth [03/Jan/2023:20:45:59 -0800] "GET /pools/default/buckets?v=54092344&uuid=9f1f8881807cfddf453388d3e16ec423 HTTP/1.1" 200 51607 - Go-http-client/1.1-indexer-newMetadataProvider After fix: Calls were made for cluster   127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 13 - 0800 ] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6311 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 13 - 0800 ] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 20 - 0800 ] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6294 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 20 - 0800 ] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 28 - 0800 ] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6311 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 28 - 0800 ] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 39 - 0800 ] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6328 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 19 : 39 - 0800 ] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 20 : 06 - 0800 ] "GET /pools/default?uuid=f699c78aa8ae4afdb0eb1d282951b55a HTTP/1.1" 200 6313 - Go-http-client/ 1.1 -indexer-newMetadataProvider 127.0 . 0.1 - @index -cbauth [ 03 /Jan/ 2023 : 21 : 20 : 06 - 0800 ] "GET /pools/default/nodeServices HTTP/1.1" 200 792 - Go-http-client/ 1.1 -indexer-newMetadataProvider    lfc@LFCs-MacBook-Pro testrunner % grep newMetadataProvider ~/Desktop/new_issue/http_access_internal_6_6_6_10550.log | wc -l       58 lfc@LFCs-MacBook-Pro testrunner % grep newMetadataProvider ~/Desktop/new_issue/http_access_internal_on6_6_6_10574.log | wc -l       10 lfc@LFCs-MacBook-Pro testrunner % 

            People

              hemant.rajput Hemant Rajput
              amit.kulkarni Amit Kulkarni
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty