  1. Couchbase Server
  2. MB-52403

[System Test][XDCR] Xmem is stuck

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • 7.1.2
    • 7.1.1
    • XDCR
    • 7.1.1-3094

    Description

      Running a longevity test with Magma enabled. The goxdcr log on both the source and target nodes shows several instances of this error:

      172.23.108.141 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.097-07:00 ERRO GOXDCR.GenericSupervisor: PipelineSupervisor_6273636120c99a03336a137500a45f07/bucket4/bucket4 Received error report : Xmem is stuck
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:. errors_seen=map[xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0:Xmem is stuck]
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.097-07:00 INFO GOXDCR.ReplMgr: Supervisor PipelineSupervisor_6273636120c99a03336a137500a45f07/bucket4/bucket4 of type *supervisor.GenericSupervisor reported errors map[xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0:Xmem is stuck]
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.098-07:00 ERRO GOXDCR.XmemNozzle: xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 Raise error condition Xmem is stuck
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.098-07:00 INFO GOXDCR.PipelineMgr: Replication 6273636120c99a03336a137500a45f07/bucket4/bucket4's status experienced changes or errors (r.update_err_ch : xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 : Xmem is stuck), updating now
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.098-07:00 INFO GOXDCR.PipelineMgr: Try to fix Pipeline 6273636120c99a03336a137500a45f07/bucket4/bucket4. Current error(s)=r.update_err_ch : xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 : Xmem is stuck
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.098-07:00 INFO GOXDCR.PipelineMgr: Replication status is updated with error(s) r.update_err_ch : xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 : Xmem is stuck, current status=name={6273636120c99a03336a137500a45f07/bucket4/bucket4}, status={Pending}, errors={[{"time":"2022-05-26T19:11:53.098149708-07:00","errMsg":"xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 : Xmem is stuck"}]}, oldProgress={Pipeline is running}, progress={Received error report : Xmem is stuck}
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2022-05-26T19:11:53.324-07:00 INFO GOXDCR.StatsMgr: 6273636120c99a03336a137500a45f07/bucket4/bucket4-160911592 expvar=Stats for pipeline 6273636120c99a03336a137500a45f07/bucket4/bucket4-160911592 {"Backfill Old Progress": "", "Backfill Progress": "", "Errors": "[{\"time\":\"2022-05-26T19:11:53.098149708-07:00\",\"errMsg\":\"xmem_6273636120c99a03336a137500a45f07/bucket4/bucket4_172.23.97.183:11210_0 : Xmem is stuck\"}]", "OldProgress": "Received error report : Xmem is stuck", "Overview": {"CurrentTime": 1653617513324398678, "add_docs_written": 0, "bandwidth_usage": 0, "changes_left": 15742, "data_merged": 0, "data_replicated": 24877733, "datapool_failed_gets": 0, "dcp_datach_length": 1106, "dcp_dispatch_time": 64, "deletion_docs_written": 14477, "deletion_failed_cr_source": 0, "deletion_filtered": 0, "deletion_received_from_dcp": 14477, "docs_checked": 276262, "docs_cloned": 0, "docs_failed_cr_source": 13673, "docs_failed_cr_target": 13
      , "docs_filtered": 0, "docs_merged": 0, "docs_opt_repd": 14477, "docs_processed": 375681, "docs_received_from_dcp": 74137, "docs_rep_queue": 0, "docs_unable_to_filter": 0, "docs_written": 45376, "expiry_docs_merged": 0, "expiry_docs_written": 45376, "expiry_failed_cr_source": 13673, "expiry_filtered": 0, "expiry_received_from_dcp": 74137, "expiry_stripped": 0, "num_checkpoints": 0, "num_failedckpts": 0, "rate_doc_checks": 0, "rate_doc_opt_repd": 0, "rate_received_from_dcp": 0, "rate_replicated": 0, "resp_wait_time": 14, "set_docs_written": 30899, "set_failed_cr_source": 13673, "set_filtered": 0, "set_received_from_dcp": 59660, "size_rep_queue": 0, "target_docs_skipped": 0, "throttle_latency": 0, "throughput_throttle_latency": 0, "time_committing": 0, "wtavg_docs_latency": 0, "wtavg_get_doc_latency": 0, "wtavg_merge_latency": 0, "wtavg_meta_latency": 0}, "Progress": "Async listeners have been stopped", "Status": "Pending"}
      

      Logs: https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1653620348/collectinfo-2022-05-27T025910-ns_1%40172.23.108.141.zip
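
      For triage, the "Raise error condition" lines in the excerpt above can be tallied per Xmem nozzle to see which target connections report the error most often. The following is a minimal sketch, not part of goxdcr or any Couchbase tooling: the function names are hypothetical, and the pattern assumes the log line layout matches the GOXDCR.XmemNozzle line shown in this report.

```python
import re
from collections import Counter

# Assumption: the line layout matches the GOXDCR.XmemNozzle entry quoted in
# this report, i.e. "<timestamp> ERRO GOXDCR.XmemNozzle: <nozzle id> Raise
# error condition Xmem is stuck".
STUCK_RE = re.compile(
    r"ERRO GOXDCR\.XmemNozzle: (?P<nozzle>\S+) Raise error condition Xmem is stuck"
)


def count_stuck_nozzles(lines):
    """Return a Counter mapping nozzle id -> number of 'Xmem is stuck' reports."""
    counts = Counter()
    for line in lines:
        match = STUCK_RE.search(line)
        if match:
            counts[match.group("nozzle")] += 1
    return counts


def count_stuck_nozzles_in_file(path):
    """Hypothetical helper: scan one goxdcr.log file, tolerating bad bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_stuck_nozzles(handle)
```

      Calling count_stuck_nozzles_in_file on a goxdcr.log extracted from the collectinfo archive and printing the result of most_common() lists the nozzles in descending order of reports. The nozzle id embeds the target address (172.23.97.183:11210 in the excerpt), so the tally also shows which target node each report refers to.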

          People

            pavithra.mahamani Pavithra Mahamani (Inactive)
            pavithra.mahamani Pavithra Mahamani (Inactive)