Couchbase Server / MB-58546

[7.1.5-MP] - XDCR - ckptMgr stop can hang if start was not called

Details

    • Untriaged
    • 0
    • Yes

    Description

      MB-49559 introduced a change in which the checkpoint manager initiates its connections in the background. As a result, Stop() must wait for that connection initialization to finish. Under normal circumstances, this is not an issue.

      However, when combined with the fix for MB-57129, the checkpoint manager's Stop() can be called before Start(). In that case the wait inside Stop() may never complete, which leaves checkpoint manager goroutines hanging:

      goroutine profile: total 146
      #       0x100332291     net/http.(*persistConn).writeLoop+0xf1  /Users/neil.huang/.cbdepscache/exploded/x86_64/go-1.20.6/go/src/net/http/transport.go:2410
       
      12 @ 0x10003ecd6 0x10000ac3d 0x10000a738 0x100634d17 0x100634d06 0x1006362f2 0x1006af39c 0x100072fe1
      #       0x100634d16     github.com/couchbase/goxdcr/pipeline_svc.(*CheckpointManager).waitForInitConnDone+0x56  /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/pipeline_svc/checkpoint_manager.go:643
      #       0x100634d05     github.com/couchbase/goxdcr/pipeline_svc.(*CheckpointManager).closeConnections+0x45     /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/pipeline_svc/checkpoint_manager.go:718
      #       0x1006362f1     github.com/couchbase/goxdcr/pipeline_svc.(*CheckpointManager).Stop+0x31                 /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/pipeline_svc/checkpoint_manager.go:848
      #       0x1006af39b     github.com/couchbase/goxdcr/pipeline_ctx.(*PipelineRuntimeCtx).stopService+0x9b         /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/pipeline_ctx/pipeline_runtimeCtx.go:196
       
      12 @ 0x10003ecd6 0x1000502cf 0x1000502a6 0x10006e667 0x10009018b 0x1006aec0f 0x1003f77c2 0x100072fe1
      #       0x10006e666     sync.runtime_Semacquire+0x26                                                    /Users/neil.huang/.cbdepscache/exploded/x86_64/go-1.20.6/go/src/runtime/sema.go:62
      #       0x10009018a     sync.(*WaitGroup).Wait+0x4a                                                     /Users/neil.huang/.cbdepscache/exploded/x86_64/go-1.20.6/go/src/sync/waitgroup.go:116
      #       0x1006aec0e     github.com/couchbase/goxdcr/pipeline_ctx.(*PipelineRuntimeCtx).Stop.func2+0x38e /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/pipeline_ctx/pipeline_runtimeCtx.go:170
      #       0x1003f77c1     github.com/couchbase/goxdcr/base.ExecWithTimeout.func1+0x21                     /Users/neil.huang/source/couchbase/goproj/src/github.com/couchbase/goxdcr/base/simple_utils.go:48
      

            People

              ayush.nayyar Ayush Nayyar
              neil.huang Neil Huang
              Votes: 0
              Watchers: 6
