Couchbase Server / MB-55406

[BP 7.1.4] - XDCR - Backfill Request Handler deadlock

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 7.1.4
    • Affects Version/s: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.1.3
    • Component/s: XDCR
    • Triage: Untriaged
    • 0
    • No

    Description

      Looking at the stack trace, we can see that a large number of goroutines are stuck:

      340 @ 0x43d456 0x40a745 0x40a2fd 0xa1eb14 0xa1e9a5 0x46cde1
      #       0xa1eb13        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).handleBackfillRequestWithArgs+0x133      /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:384
      #       0xa1e9a4        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).HandleBackfillRequest+0x24               /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:365
       
      261 @ 0x43d456 0x40a745 0x40a2fd 0xa1eb14 0xa1c10e 0xa1c0a6 0xa1bacd 0xa1ad8c 0xa0c987 0x9fc56a 0x8c78d9 0x8c704a 0x8c6925 0x46cde1
      #       0xa1eb13        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).handleBackfillRequestWithArgs+0x133      /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:384
      #       0xa1c10d        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).HandleBackfillRequest+0x54d              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:365
      #       0xa1c0a5        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).mergeP2PReqAndUnlockCommon+0x4e5                    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:1997
      #       0xa1bacc        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).mergePushRequest+0x42c                              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:1948
      #       0xa1ad8b        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).MergeIncomingPeerNodesBackfill+0x2cb                /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:1863
      #       0xa0c986        github.com/couchbase/goxdcr/backfill_manager.(*pipelineSvcWrapper).UpdateSettings+0x5a6                         /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:188
      #       0x9fc569        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineManager).HandlePeerCkptPush+0x569                        /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1222
      #       0x8c78d8        github.com/couchbase/goxdcr/peerToPeer.(*PeriodicPushHandler).storePushReqInfoByType+0xf8                       /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/peerToPeer/periodicPushHandler.go:229
      #       0x8c7049        github.com/couchbase/goxdcr/peerToPeer.(*PeriodicPushHandler).storePushRequestInfo+0x349                        /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/peerToPeer/periodicPushHandler.go:190
      #       0x8c6924        github.com/couchbase/goxdcr/peerToPeer.(*PeriodicPushHandler).handleRequest.func1+0xa4                          /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/peerToPeer/periodicPushHandler.go:120
       
      261 @ 0x43d456 0x44ded3 0x44dead 0x468ce5 0x484f12 0x8c65cc 0x46cde1
      #       0x468ce4        sync.runtime_Semacquire+0x24                                                            /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/runtime/sema.go:56
      #       0x484f11        sync.(*WaitGroup).Wait+0x51                                                             /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/sync/waitgroup.go:136
      #       0x8c65cb        github.
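
      The layout of these traces (a goroutine count, `@`, the raw program counters, then one `#` line per symbolized frame) matches what Go's runtime/pprof goroutine profile prints at debug level 1. The ticket does not say how this dump was collected, so the following is only a minimal sketch of producing one in a standalone program. The names blockedSender, dumpGoroutines, demoDump and hasFrame are hypothetical and are not part of goxdcr.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// blockedSender parks on an unbuffered channel until someone receives.
// It stands in for the many goroutines stuck in the traces above.
//
//go:noinline
func blockedSender(ch chan<- int) {
	ch <- 1
}

// dumpGoroutines returns the goroutine profile at debug level 1, which
// groups goroutines with identical stacks and prefixes each group with
// its count.
func dumpGoroutines() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// demoDump parks n senders, captures the profile, then releases them.
func demoDump(n int) (string, error) {
	ch := make(chan int) // unbuffered: every send blocks until received
	for i := 0; i < n; i++ {
		go blockedSender(ch)
	}
	time.Sleep(100 * time.Millisecond) // crude: give the senders time to park
	dump, err := dumpGoroutines()
	for i := 0; i < n; i++ {
		<-ch // release the parked senders so nothing leaks
	}
	return dump, err
}

// hasFrame reports whether the dump text mentions fn anywhere.
func hasFrame(dump, fn string) bool {
	return strings.Contains(dump, fn)
}

func main() {
	dump, err := demoDump(5)
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Print(dump)
}
```

      In the printed profile, the parked senders should appear as a single group whose stack includes blockedSender, in the same way that the 340 and 261 goroutines above are grouped under handleBackfillRequestWithArgs.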
      

      It looks like the backfill request handler is blocked in a WaitGroup wait and cannot handle any further requests.

      1 @ 0x43d456 0x44ded3 0x44dead 0x468ce5 0x484f12 0x9f1487 0x9f4c0d 0xa10003 0x931293 0x93043c 0x93041c 0xa2127e 0xa1d1df 0x46cde1
      #       0x468ce4        sync.runtime_Semacquire+0x24                                                                            /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/runtime/sema.go:56
      #       0x484f11        sync.(*WaitGroup).Wait+0x51                                                                             /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/sync/waitgroup.go:136
      #       0x9f1486        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineOpSerializer).StopBackfillWithCb+0x166           /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipelineOpSerializer.go:276
      #       0x9f4c0c        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineManager).HaltBackfillWithCb+0x2c                 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:231
      #       0xa10002        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).backfillReplSpecChangeHandlerCallback+0x4a2 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:548
      #       0x931292        github.com/couchbase/goxdcr/metadata_svc.(*BackfillReplicationService).updateCacheInternal+0x1f2        /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/metadata_svc/backfill_repl_service.go:605
      #       0x93043b        github.com/couchbase/goxdcr/metadata_svc.(*BackfillReplicationService).updateCache+0xfb                 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/metadata_svc/backfill_repl_service.go:587
      #       0x93041b        github.com/couchbase/goxdcr/metadata_svc.(*BackfillReplicationService).SetBackfillReplSpec+0xdb         /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/metadata_svc/backfill_repl_service.go:496
      #       0xa2127d        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).metaKvOp+0x3d                    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:771
      #       0xa1d1de        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).run+0x33e                        /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:308
      

      That job is waiting on:

      1 @ 0x43d456 0x44ded3 0x44dead 0x468ce5 0x484f12 0xa022a6 0x9fa2bc 0x9fa2ad 0x9f2e56 0x46cde1
      #       0x468ce4        sync.runtime_Semacquire+0x24                                                                    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/runtime/sema.go:56
      #       0x484f11        sync.(*WaitGroup).Wait+0x51                                                                     /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/sync/waitgroup.go:136
      #       0xa022a5        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).sendStopBackfillPipeline+0xc5   /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:2076
      #       0x9fa2bb        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).stopBackfillPipeline+0x9b       /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:2043
      #       0x9fa2ac        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineManager).StopBackfillWithStoppedCb+0x8c  /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:948
      #       0x9f2e55        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineOpSerializer).handleJobs+0xf15           /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipelineOpSerializer.go:452
      

      The pipeline updater, meanwhile, is busy running the handler for an explicit-mapping change, and that handler is itself blocked:

      1 @ 0x43d456 0x40b5ec 0x40b018 0xa1ebe5 0xa16fdf 0xa16faa 0x8a6c8a 0xa16e2d 0xa11898 0xcfa00f 0xa03c52 0xa005d8 0x9fe8fd 0x483822 0x9fe307 0x9fe2d5 0x46cde1
      #       0xa1ebe4        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).handleBackfillRequestWithArgs+0x204      /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:397
      #       0xa16fde        github.com/couchbase/goxdcr/backfill_manager.(*BackfillRequestHandler).HandleBackfillRequest+0x5e               /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_request_handler.go:365
      #       0xa16fa9        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).handleExplicitMapChangeBackfillReq.func1+0x29       /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:1365
      #       0x8a6c89        github.com/couchbase/goxdcr/utils.(*Utilities).ExponentialBackoffExecutorWithFinishSignal+0x109                 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/utils/utils.go:2576
      #       0xa16e2c        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).handleExplicitMapChangeBackfillReq+0x26c            /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:1367
      #       0xa11897        github.com/couchbase/goxdcr/backfill_manager.(*BackfillMgr).GetExplicitMappingChangeHandler.func1+0x557         /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/backfill_manager/backfill_manager.go:679
      #       0xcfa00e        github.com/couchbase/goxdcr/replication_manager.needSpecialCallbackUpdate.func1+0x2e                            /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/replication_manager/metakv_change_listener.go:317
      #       0xa03c51        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).executeQueuedCallbacks+0x111                    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:2283
      #       0xa005d7        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).update+0x557                                    /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1839
      #       0x9fe8fc        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run.func1+0x5bc                                 /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1545
      #       0x483821        sync.(*Once).doSlow+0xc1                                                                                        /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/sync/once.go:68
      #       0x9fe306        sync.(*Once).Do+0x46                                                                                            /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.1/go/src/sync/once.go:59
      #       0x9fe2d4        github.com/couchbase/goxdcr/pipeline_manager.(*PipelineUpdater).run+0x14                                        /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/goxdcr/pipeline_manager/pipeline_manager.go:1523
      

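      We’ve got ourselves a deadlock: the request handler’s run loop is waiting for the backfill pipeline to stop, the stop is waiting on the pipeline updater, and the pipeline updater is blocked in HandleBackfillRequest, waiting on the request handler.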
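
      The shape of the cycle in the traces can be reproduced in a few lines. This is a sketch under stated assumptions, not the goxdcr code: it models the request handler and the pipeline updater as two single-goroutine loops fed by unbuffered channels, and it forces the unlucky interleaving with extra channels so the result is deterministic. All names (request, job, handlerLoop, updaterLoop, reproduce, demoTimeout) are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is how long reproduce waits before declaring a deadlock.
const demoTimeout = 200 * time.Millisecond

// request is work for the handler; done is closed once it has been handled.
type request struct {
	name string
	done chan struct{}
}

// job is work for the updater; done is closed once fn has returned.
type job struct {
	fn   func()
	done chan struct{}
}

// handlerLoop is a single goroutine: while it is inside onRequest it
// cannot receive the next request.
func handlerLoop(reqCh <-chan request, onRequest func(request)) {
	for r := range reqCh {
		onRequest(r)
		close(r.done)
	}
}

// updaterLoop is a single goroutine that runs jobs one at a time.
func updaterLoop(jobCh <-chan job) {
	for j := range jobCh {
		j.fn()
		close(j.done)
	}
}

// submit hands a request to the handler and waits for it to finish.
func submit(reqCh chan<- request, name string) {
	r := request{name: name, done: make(chan struct{})}
	reqCh <- r
	<-r.done
}

// runJob hands a job to the updater and waits for it to finish.
func runJob(jobCh chan<- job, fn func()) {
	j := job{fn: fn, done: make(chan struct{})}
	jobCh <- j
	<-j.done
}

// reproduce reports whether the two loops fail to finish within demoTimeout.
// The blocked goroutines are deliberately left behind; this is a demo only.
func reproduce(timeout time.Duration) bool {
	reqCh := make(chan request)
	jobCh := make(chan job)
	cbStarted := make(chan struct{})
	handlerBusy := make(chan struct{})

	go handlerLoop(reqCh, func(r request) {
		if r.name == "A" {
			close(handlerBusy)
			// Handling A needs the updater (think: stop the pipeline).
			runJob(jobCh, func() {})
		}
	})
	go updaterLoop(jobCh)

	allDone := make(chan struct{})
	go func() {
		// The updater runs a callback that needs the handler
		// (think: mapping changed, so raise a backfill request).
		runJob(jobCh, func() {
			close(cbStarted)
			<-handlerBusy // forced interleaving: wait until the handler is in A
			submit(reqCh, "B")
		})
		close(allDone)
	}()

	<-cbStarted
	go submit(reqCh, "A")

	select {
	case <-allDone:
		return false
	case <-time.After(timeout):
		return true
	}
}

func main() {
	fmt.Println("deadlocked:", reproduce(demoTimeout)) // prints "deadlocked: true"
}
```

      In the sketch, the handler is stuck sending a job to the updater while the updater is stuck sending a request to the handler, so neither loop returns to its receive. Any further caller of submit would park on reqCh, which is the same pile-up seen in the 340 and 261 goroutine groups.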

            People

              ayush.nayyar Ayush Nayyar
              neil.huang Neil Huang