  1. Couchbase Server
  2. MB-14973

GoXDCR: It takes 10 mins of no replication to detect xmem is stuck

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 4.0.0
    • Affects Version/s: 4.0.0
    • Component/s: XDCR
    • Security Level: Public
    • Environment: centOS 6.x

    Description

      Build


      4.0.0-2109

      Found during manual testing.

      1. C1 [.186] --> C2 [.188], existing default buckets, replication set up.
      2. Rebalance-in .189 on C2.
      3. In parallel, start load on C1.
      Replication stops with error NOT_MY_VBUCKET:

      Replication b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default failed. err=map[xmem_b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default_10.3.4.188:11210_1:Fatal error when receiving responses from memcached in target cluster.]
      

      This error was seen at 16:16:47 - Wed May 13, 2015.

      The pipeline was then constructed:
      Replication b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default started running. xdcr000 ns_1@127.0.0.1 16:17:00 - Wed May 13, 2015

      4. Rebalance completed at 16:19:06 - Wed May 13, 2015

      Rebalance completed successfully.
      ns_orchestrator001	ns_1@10.3.4.188	16:19:06 - Wed May 13, 2015
      

      5. Although the pipeline was constructed at 16:17:00, there was no replication between .186 and .188 for the next 10 mins, until the next error message was reported on .186 at 16:27:04 - Wed May 13, 2015:

      Replication b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default failed. err=map[xmem_b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default_10.3.4.188:11210_1:Xmem is stuck]	xdcr000	ns_1@127.0.0.1	16:27:04 - Wed May 13, 2015
      

      6. Replication then starts again on .186:
      Replication b0a4b2ca4dbe46ff9c9a299b9d21cc19/default/default started running. xdcr000 ns_1@127.0.0.1 16:27:17 - Wed May 13, 2015
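The timeline in steps 3 to 6 can be checked with a small sketch that computes the gaps between the logged timestamps. The event labels and the `gap_seconds` helper are illustrative names only; the timestamps are copied from the log lines above.

```python
from datetime import datetime

# Timestamp layout used by the log lines quoted in the steps above.
FMT = "%H:%M:%S - %a %b %d, %Y"

# Event labels are illustrative; timestamps are copied from the report.
events = {
    "fatal_error": "16:16:47 - Wed May 13, 2015",
    "pipeline_started": "16:17:00 - Wed May 13, 2015",
    "rebalance_completed": "16:19:06 - Wed May 13, 2015",
    "xmem_stuck_reported": "16:27:04 - Wed May 13, 2015",
    "replication_restarted": "16:27:17 - Wed May 13, 2015",
}

parsed = {name: datetime.strptime(ts, FMT) for name, ts in events.items()}


def gap_seconds(start, end):
    """Return the number of whole seconds between two named events."""
    return int((parsed[end] - parsed[start]).total_seconds())


# Time from pipeline start to the "Xmem is stuck" report.
stuck_delay = gap_seconds("pipeline_started", "xmem_stuck_reported")
# Time from rebalance completion to the "Xmem is stuck" report.
after_rebalance = gap_seconds("rebalance_completed", "xmem_stuck_reported")

print(f"pipeline start -> stuck report: {stuck_delay // 60}m{stuck_delay % 60:02d}s")
print(f"rebalance done -> stuck report: {after_rebalance // 60}m{after_rebalance % 60:02d}s")
```

With these values, the "Xmem is stuck" error arrives 10m04s after the pipeline started and 7m58s after the rebalance completed, which matches the "10 mins" and "8 mins" figures quoted in the report.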

      Questions
      ---------
      1. Why is C1 reporting that xmem is stuck, although the target cluster rebalance completed 8 mins earlier?
      2. Why does it take 10 mins to report that Xmem is stuck?
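As background for question 2, the following is a minimal sketch of a generic timeout-based stall detector. It is not GoXDCR's implementation: the `StallDetector` class, its method names, and the 600-second threshold are all assumptions, chosen only to show how a fixed no-progress threshold would produce a delay like the one observed.

```python
class StallDetector:
    """Hypothetical watchdog: reports a stall once no progress has been
    recorded for at least `threshold_s` seconds."""

    def __init__(self, threshold_s, now):
        self.threshold_s = threshold_s
        self.last_progress = now

    def record_progress(self, now):
        # Call whenever the component makes forward progress,
        # for example when a batch is acknowledged by the target.
        self.last_progress = now

    def is_stuck(self, now):
        return now - self.last_progress >= self.threshold_s


# Assumed threshold of 600 s; times are seconds since the pipeline started.
detector = StallDetector(threshold_s=600, now=0)
print(detector.is_stuck(now=478))  # before the threshold has elapsed
print(detector.is_stuck(now=604))  # after the threshold has elapsed
```

In a design like this, the detection delay is bounded below by the threshold itself, so a shorter threshold (or a check that reacts to the target's topology change) would be needed to report the stall sooner.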

      Attaching cbcollect from .186 and .188

          People

            Assignee: Aruna Piravi (Inactive)
            Reporter: Aruna Piravi (Inactive)