  1. Couchbase Server
  2. MB-6379

xdcr - Replication slows down significantly after enabling/disabling firewall on destination cluster nodes.

Details

    • Story
    • Resolution: Fixed
    • Major
    • 2.0-beta
    • 2.0
    • XDCR
    • Security Level: Public
    • None
    • 2.0-1621
      1024 vbuckets
      ubuntu
      unidirectional replication w/ 15M items.
      saslbucket

    Description

      Setup
      -------------------
      - Create 2 clusters, each with a SASL bucket.
      - Load 15M items on the source.
      - Create unidirectional replication from the source to the destination cluster.
      - After 4M items are replicated, enable the firewall on the destination cluster nodes to drop incoming packets, then disable it.

      On Master node
      sudo iptables -A INPUT -p tcp --dport 8092 -j DROP
      sudo iptables -A INPUT -p tcp --dport 8091 -j DROP

      On Non-master node
      sudo iptables -A INPUT -p tcp --dport 8092 -j DROP
      sudo iptables -A INPUT -p tcp --dport 8091 -j DROP

      Then disable the firewall on both the master and non-master nodes using
      sudo iptables --flush
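      The enable/disable steps above can be wrapped in a small script so that they run identically on each destination node. This is only a sketch: the function names and the DRY_RUN switch are hypothetical and not part of the original report. By default it prints the commands instead of running them. Note that `iptables --flush` without a chain argument removes every rule in the filter table, not only the two added here.

```shell
#!/bin/sh
# Sketch only: block_xdcr_ports, unblock_xdcr_ports and DRY_RUN are
# hypothetical names introduced for this example.
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Drop incoming TCP packets on the two ports named in the report.
block_xdcr_ports() {
    for port in 8092 8091; do
        run sudo iptables -A INPUT -p tcp --dport "$port" -j DROP
    done
}

# Remove the rules again, using the same command as the report.
unblock_xdcr_ports() {
    run sudo iptables --flush
}

# Optional command-line entry point; does nothing when no argument is given.
case "${1:-}" in
    block)   block_xdcr_ports ;;
    unblock) unblock_xdcr_ports ;;
esac
```

      Saved under a name of your choice, it would be run with the argument `block` once about 4M items have been replicated, and later with `unblock`, on every destination node.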

      Observation
      -------------------
      Replication is not broken, but the replication rate has dropped significantly.

      The first 4M items were replicated at a rate of about 3k items/sec.

      - After that, replication is much slower, as low as 0 items/sec on some nodes.
      - The rate is frequently 0 on the nodes and picks up again after 10-20 minutes of inactivity.
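      To put the stalls in perspective, here is a rough estimate of how long the remaining items would have taken at the initial rate. It assumes the rate of about 3k items/sec stayed constant, which is a simplification; `catchup_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: catchup_seconds is a hypothetical helper.
# Usage: catchup_seconds TOTAL_ITEMS ALREADY_REPLICATED ITEMS_PER_SEC
# Prints the whole number of seconds needed to replicate the remainder.
catchup_seconds() {
    total=$1
    replicated=$2
    rate=$3
    echo $(( (total - replicated) / rate ))
}

# Figures from the report: 15M items loaded, 4M replicated, ~3k items/sec.
secs=$(catchup_seconds 15000000 4000000 3000)
echo "about $(( secs / 60 )) minutes at the initial rate"
```

      Under that assumption the remaining 11M items need roughly an hour, so repeated pauses of 10-20 minutes are a large fraction of the expected total.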

      Attaching screenshots

      Per the XDCR replication logic, in case of an intermittent network, will replication keep retrying for up to some X amount of time?

      Seeing CRASH reports with timeouts, mainly on the source end:

      [error_logger:error,2012-08-22T10:22:19.434,ns_1@10.3.121.32:error_logger:ale_error_logger_handler:log_report:72]
      =========================CRASH REPORT=========================
      crasher:
      initial call: xdc_vbucket_rep:init/1
      pid: <0.4570.1>
      registered_name: []
      exception exit: {function_clause,
      [{xdc_vbucket_rep_ckpt,source_cur_seq,
      [{rep_state,
      {rep,<<"69d340dbe30a03c5c91d25958a000f73">>,
      <<"saslbucket">>,
      <<"/remoteClusters/a/buckets/saslbucket">>,
      [{connection_timeout,30000},
      {continuous,true},
      {http_connections,20},
      {retries,10},

      ** Reason for termination ==
      ** {function_clause,
      [{xdc_vbucket_rep_ckpt,source_cur_seq,
      [{rep_state,
      {rep,<<"69d340dbe30a03c5c91d25958a000f73">>,
      <<"saslbucket">>,
      <<"/remoteClusters/a/buckets/saslbucket">>,
      [{connection_timeout,30000},
      {continuous,true},
      {http_connections,20},
      {retries,10},
      {socket_options,[{keepalive,true},{nodelay,false}]},
      {worker_batch_size,500},
      {worker_processes,4}]},
      {rep_vb_status,443,<0.4858.1>,idle,0,13149,13149},
      <0.2963.1>,<0.2960.1>,<<"saslbucket/443">>,
      <<"http://Administrator:password@10.3.121.36:8092/saslbucket%2f443%3bcc89df5b3739177398ed813a238513fd">>,
      undefined,undefined,undefined,undefined,[],
      {[{<<"session_id">>,<<"2cbd9134b6f3a00514e8f636d7911b40">>},
      {<<"source_last_seq">>,13179},
      {<<"start_time">>,<<"Wed, 22 Aug 2012 17:02:32 GMT">>},
      {<<"end_time">>,<<"Wed, 22 Aug 2012 17:13:55 GMT">>},
      {<<"docs_checked">>,13149},
      {<<"docs_written">>,13149},
      {<<"history">>,
      [{[{<<"session_id">>,

      • Adding logs from the clusters.

      We would expect the replication rate to eventually catch up after any network disruption (intermittent network on/off ...)?

          People

            junyi Junyi Xie (Inactive)
            ketaki Ketaki Gangal (Inactive)