Couchbase Gateway / CBG-3554

[3.1.2 backport] Increasing memory usage when failing to apply a database config from the bucket


Details

    • Bug
    • Resolution: Fixed
    • Major
    • 3.1.2
    • 3.1.2, 3.0.9
    • SyncGateway
    • Security Level: Public
    • None
    • CBG Sprint 138
    • 3

    Description

      When Sync Gateway polls the bucket for database configs, finds one, and attempts to load and apply it, a failure that repeats on every poll causes memory usage to keep increasing.

      Repro steps:

      1. Create a db
      2. Write some docs to allocate sequences
      3. Take down Sync Gateway and manually edit the _sync:seq doc so that its value is a string
      4. Bring sync gateway back up
      5. Let the config fetch interval elapse repeatedly and observe increasing memory usage
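Step 3 can be illustrated in isolation. The following is a minimal sketch, assuming the sequence document is decoded into an unsigned integer with `encoding/json`; the helper name `unmarshalSequence` is hypothetical and is not taken from the Sync Gateway source. It shows why a string value makes the decode fail on every attempt.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unmarshalSequence is a hypothetical helper (not Sync Gateway code): it
// decodes the raw body of a counter document into a uint64 sequence value.
func unmarshalSequence(raw []byte) (uint64, error) {
	var seq uint64
	if err := json.Unmarshal(raw, &seq); err != nil {
		return 0, fmt.Errorf("unmarshal sequence doc: %w", err)
	}
	return seq, nil
}

func main() {
	// A numeric body decodes; a string body (as in repro step 3) does not.
	for _, body := range []string{`42`, `"42"`} {
		seq, err := unmarshalSequence([]byte(body))
		fmt.Printf("body=%s seq=%d err=%v\n", body, seq, err)
	}
}
```

Because the stored document does not change between polls, the same error recurs each time the config is fetched.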

      We seem to create agents when we fetch the config:

      2023-10-20T13:53:44.415+01:00 [DBG] gocb+: SDK Version: gocbcore/v10.0.2
      2023-10-20T13:53:44.415+01:00 [DBG] gocb+: Creating new agent group: &{AgentConfig:{BucketName: UserAgent:gocb/v2.3.1 SeedConfig:{HTTPAddrs:[127.0.0.1:8091] MemdAddrs:[127.0.0.1:11210]} SecurityConfig:{UseTLS:false TLSRootCAProvider:0x102e9aea0 InitialBootstrapNonTLS:false Auth:0x140140d7750 AuthMechanisms:[]} CompressionConfig:{Enabled:false DisableDecompression:false MinSize:0 MinRatio:0} ConfigPollerConfig:{HTTPRedialPeriod:0s HTTPRetryDelay:0s HTTPMaxWait:0s CccpMaxWait:0s CccpPollPeriod:0s} IoConfig:{NetworkType: UseMutationTokens:true UseDurations:true UseOutOfOrderResponses:true DisableXErrorHello:false DisableJSONHello:false DisableSyncReplicationHello:false EnablePITRHello:false UseCollections:true} KVConfig:{ConnectTimeout:10s PoolSize:2 MaxQueueSize:0} HTTPConfig:{MaxIdleConns:64000 MaxIdleConnsPerHost:256 IdleConnectionTimeout:1m30s} DefaultRetryStrategy:0x140140d75c0 CircuitBreakerConfig:{Enabled:true VolumeThreshold:0 ErrorThresholdPercentage:0 SleepWindow:0s RollingWindow:0s CompletionCallback:<nil> CanaryTimeout:0s} OrphanReporterConfig:{Enabled:true ReportInterval:0s SampleSize:0} TracerConfig:{Tracer:0x140140d75e0 NoRootTraceSpans:true} MeterConfig:{Meter:<nil>}}}
      2023-10-20T13:53:44.415+01:00 [DBG] gocb+: SDK Version: gocbcore/v10.0.2
      2023-10-20T13:53:44.415+01:00 [DBG] gocb+: Creating new agent: &{BucketName: UserAgent:gocb/v2.3.1 SeedConfig:{HTTPAddrs:[127.0.0.1:8091] MemdAddrs:[127.0.0.1:11210]} SecurityConfig:{UseTLS:false TLSRootCAProvider:0x102e9aea0 InitialBootstrapNonTLS:false Auth:0x140140d7750 AuthMechanisms:[]} CompressionConfig:{Enabled:false DisableDecompression:false MinSize:0 MinRatio:0} ConfigPollerConfig:{HTTPRedialPeriod:0s HTTPRetryDelay:0s HTTPMaxWait:0s CccpMaxWait:0s CccpPollPeriod:0s} IoConfig:{NetworkType: UseMutationTokens:true UseDurations:true UseOutOfOrderResponses:true DisableXErrorHello:false DisableJSONHello:false DisableSyncReplicationHello:false EnablePITRHello:false UseCollections:true} KVConfig:{ConnectTimeout:10s PoolSize:2 MaxQueueSize:0} HTTPConfig:{MaxIdleConns:64000 MaxIdleConnsPerHost:256 IdleConnectionTimeout:1m30s} DefaultRetryStrategy:0x140140d75c0 CircuitBreakerConfig:{Enabled:true VolumeThreshold:0 ErrorThresholdPercentage:0 SleepWindow:0s RollingWindow:0s CompletionCallback:<nil> CanaryTimeout:0s} OrphanReporterConfig:{Enabled:true ReportInterval:0s SampleSize:0} TracerConfig:{Tracer:0x140140d75e0 NoRootTraceSpans:true} MeterConfig:{Meter:<nil>}}
      2023-10-20T13:53:44.418+01:00 [DBG] gocb+: Aggregate metrics: {"kv":{"get":{"percentiles_us":{"100.0":"<= 1500.00","50.0":"<= 1500.00","90.0":"<= 1500.00","99.0":"<= 1500.00","99.9":"<= 1500.00"},"total_count":1}},"meta":{"emit_interval_s":600000000000},"query":{"query":{"percentiles_us":{"100.0":"<= 38443.36","50.0":"<= 17085.94","90.0":"<= 38443.36","99.0":"<= 38443.36","99.9":"<= 38443.36"},"total_count":9}}}
       
      2023-10-20T13:53:44.420+01:00 [DBG] gocb+: CCCP Looper starting.
      2023-10-20T13:53:44.489+01:00 [DBG] gocb+: SDK Version: gocbcore/v10.0.2
      2023-10-20T13:53:44.489+01:00 [DBG] gocb+: Creating new agent: &{BucketName:test UserAgent:gocb/v2.3.1 SeedConfig:{HTTPAddrs:[127.0.0.1:8091] MemdAddrs:[127.0.0.1:11210]} SecurityConfig:{UseTLS:false TLSRootCAProvider:0x102e9aea0 InitialBootstrapNonTLS:false Auth:0x140140d7750 AuthMechanisms:[]} CompressionConfig:{Enabled:false DisableDecompression:false MinSize:0 MinRatio:0} ConfigPollerConfig:{HTTPRedialPeriod:0s HTTPRetryDelay:0s HTTPMaxWait:0s CccpMaxWait:0s CccpPollPeriod:0s} IoConfig:{NetworkType: UseMutationTokens:true UseDurations:true UseOutOfOrderResponses:true DisableXErrorHello:false DisableJSONHello:false DisableSyncReplicationHello:false EnablePITRHello:false UseCollections:true} KVConfig:{ConnectTimeout:10s PoolSize:2 MaxQueueSize:0} HTTPConfig:{MaxIdleConns:64000 MaxIdleConnsPerHost:256 IdleConnectionTimeout:1m30s} DefaultRetryStrategy:0x140140d75c0 CircuitBreakerConfig:{Enabled:true VolumeThreshold:0 ErrorThresholdPercentage:0 SleepWindow:0s RollingWindow:0s CompletionCallback:<nil> CanaryTimeout:0s} OrphanReporterConfig:{Enabled:true ReportInterval:0s SampleSize:0} TracerConfig:{Tracer:0x140140d75e0 NoRootTraceSpans:true} MeterConfig:{Meter:<nil>}}
      2023-10-20T13:53:44.491+01:00 [DBG] gocb+: CCCP Looper starting. 

      I see no evidence that we actually close these agents after failing to unmarshal the _sync:seq doc.
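The suspected pattern can be sketched as follows. This is an illustration only, not the actual Sync Gateway or gocb code: the `agent` type, `applyConfig`, and `loadSequence` are hypothetical stand-ins. Each `agent` owns a background goroutine, so an error path that returns without closing it leaks one goroutine per poll.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// agent is a hypothetical stand-in for a connection object that owns a
// background goroutine (similar in spirit to a config poller loop).
type agent struct {
	stop chan struct{}
	done chan struct{}
}

var (
	mu   sync.Mutex
	live int // agents whose background goroutine is still running
)

func newAgent() *agent {
	a := &agent{stop: make(chan struct{}), done: make(chan struct{})}
	mu.Lock()
	live++
	mu.Unlock()
	go func() {
		defer close(a.done)
		<-a.stop // a real poller would loop here until stopped
	}()
	return a
}

// Close stops the background goroutine and waits for it to exit.
func (a *agent) Close() {
	close(a.stop)
	<-a.done
	mu.Lock()
	live--
	mu.Unlock()
}

var errBadSeq = errors.New("sequence doc is not a number")

// loadSequence always fails, mimicking the corrupted _sync:seq doc.
func loadSequence() error { return errBadSeq }

// applyConfig opens an agent and then hits the failure. closeOnError
// selects between the leaky error path and one that cleans up.
func applyConfig(closeOnError bool) error {
	a := newAgent()
	if err := loadSequence(); err != nil {
		if closeOnError {
			a.Close()
		}
		return err
	}
	a.Close()
	return nil
}

func liveAgents() int {
	mu.Lock()
	defer mu.Unlock()
	return live
}

func main() {
	for i := 0; i < 5; i++ {
		_ = applyConfig(false)
	}
	fmt.Println("live agents after leaky polls:", liveAgents())
	for i := 0; i < 5; i++ {
		_ = applyConfig(true)
	}
	fmt.Println("live agents after cleaned-up polls:", liveAgents())
}
```

In this sketch, closing the agent on the error path keeps the live count from growing with further polls; whether the real fix takes this form is not stated in the issue.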

      Below are some heap profiles of the repro running for 1 minute, 4 minutes, and 10 minutes respectively:

       

      The number of goroutines also climbs fairly rapidly: I saw 733, then 964 a couple of minutes later, then 1123 a couple of minutes after that.
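A goroutine count like the one above can be sampled from inside a Go process with `runtime.NumGoroutine`. The sketch below is a generic illustration and not how the numbers in this issue were collected; `sampleGoroutines` and `growing` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sampleGoroutines records runtime.NumGoroutine() n times, sleeping for
// interval between consecutive samples.
func sampleGoroutines(n int, interval time.Duration) []int {
	samples := make([]int, 0, n)
	for i := 0; i < n; i++ {
		samples = append(samples, runtime.NumGoroutine())
		if i < n-1 {
			time.Sleep(interval)
		}
	}
	return samples
}

// growing reports whether each sample is strictly larger than the one
// before it. Fewer than two samples is treated as "not growing".
func growing(samples []int) bool {
	if len(samples) < 2 {
		return false
	}
	for i := 1; i < len(samples); i++ {
		if samples[i] <= samples[i-1] {
			return false
		}
	}
	return true
}

func main() {
	// The three counts reported in this issue.
	fmt.Println("issue samples growing:", growing([]int{733, 964, 1123}))
	// Live samples from this (otherwise idle) process.
	fmt.Println("live samples:", sampleGoroutines(3, 100*time.Millisecond))
}
```

A strictly increasing series on an otherwise idle process is a hint, not proof, of a leak; a goroutine profile is needed to see where the goroutines are parked.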

       

            People

              tor.colvin Tor Colvin
              tor.colvin Tor Colvin