Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-43486

Leak due to verbose logging MultiScanInternal()

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 7.0.0
    • Cheshire-Cat
    • query
    • Untriaged
    • 1
    • Unknown

    Description

      During the analysis of MB-43070, we encountered a query OOM as follows:

      cbcollects:

      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.106.134.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.106.136.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.58.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.73.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.74.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.75.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.77.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.81.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.120.86.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.121.77.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.24.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.25.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.26.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.31.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.32.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.123.33.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.122.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.14.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.18.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.243.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.254.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.96.48.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.105.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.110.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.112.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.149.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.150.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.151.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1608351004/collectinfo-2020-12-19T041007-ns_1%40172.23.97.74.zip

      Query nodes are 172.23.97.148 & 149.

      On inspection of query node 149, this is where memory goes:

      (pprof) lines
      (pprof) top
      Showing nodes accounting for 9.81GB, 98.02% of 10.01GB total
      Dropped 462 nodes (cum <= 0.05GB)
      Showing top 10 nodes out of 28
      flat flat% sum% cum cum%
      9.58GB 95.71% 95.71% 9.58GB 95.72% github.com/couchbase/indexing/secondary/queryport/client.(*GsiClient).MultiScanInternal /home/marco/query/src/github.com/couchbase/indexing/secondary/logging/logging.go:297
      0.09GB 0.9% 96.61% 0.09GB 0.95% [cbq-engine]
      0.08GB 0.75% 97.35% 0.08GB 0.75% strings.Map /usr/local/go/src/strings/builder.go:40
      0.07GB 0.67% 98.02% 0.07GB 0.67% github.com/couchbase/gometa/protocol.createConnection /home/marco/query/src/github.com/couchbase/gometa/protocol/followerServer.go:220
      0 0% 98.02% 0.07GB 0.67% github.com/couchbase/gometa/protocol.createConnection /home/marco/query/src/github.com/couchbase/gometa/protocol/followerServer.go:218
      0 0% 98.02% 0.07GB 0.67% github.com/couchbase/gometa/protocol.syncWithPeer /home/marco/query/src/github.com/couchbase/gometa/protocol/discovery.go:852
      0 0% 98.02% 0.06GB 0.59% github.com/couchbase/indexing/secondary/manager/client.(*watcher).updateServiceMap /home/marco/query/src/github.com/couchbase/indexing/secondary/manager/client/metadata_provider.go:5084
      0 0% 98.02% 0.06GB 0.63% github.com/couchbase/indexing/secondary/planner.(*IndexerConstraint).CanAddIndex /home/marco/query/src/github.com/couchbase/indexing/secondary/planner/planner.go:3519
      0 0% 98.02% 0.06GB 0.63% github.com/couchbase/indexing/secondary/planner.(*RandomPlacement).findLeastUsedAndPopulatedTargetNode /home/marco/query/src/github.com/couchbase/indexing/secondary/planner/planner.go:5340
      0 0% 98.02% 0.06GB 0.63% github.com/couchbase/indexing/secondary/planner.(*SAPlanner).DropReplica /home/marco/query/src/github.com/couchbase/indexing/secondary/planner/planner.go:659
      In particular, the memory goes in this bit of logging:

      https://github.com/couchbase/indexing/blob/dec9e932a90e3339eb787d317609dfac2a281b85/secondary/queryport/client/client.go#L909

      The verbose formatting placed in a local variable causes the string to be allocated on the heap, because the variable is passed higher up the stack.
      Lazy logging would help in this case, or at least pass the constant directly to logging.Verbosef().

      Attachments

        Issue Links

          For Gerrit Dashboard: MB-43486
          # Subject Branch Project Status CR V

          Activity

            People

              marco.greco Marco Greco (Inactive)
              marco.greco Marco Greco (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There is 1 open Gerrit change

                  PagerDuty