  1. Couchbase Lite
  2. CBL-131

Continuous replicator does not back off retries


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0, 2.5.2
    • Fix Version/s: 2.5.3
    • Component/s: Java-Android
    • Security Level: Public
    • None

    Description

      A continuous replicator that reaches the stopped state for a reason that the core replicator believes to be network-dependent and transient will attempt to start a new replicator. It should apply exponential backoff between attempts. It does not, because it resets the retry count to 0 as soon as each new replicator reaches the connecting state.
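
A minimal sketch of the intended behavior, assuming a hypothetical `RetryBackoff` helper (not part of the Couchbase Lite API) with an illustrative base delay and cap. The delay doubles with each consecutive retry, and the retry count is reset only once a connection actually succeeds, not when a new replicator merely reaches the connecting state.

```java
/** Hypothetical helper for illustration; names and states are assumptions. */
final class RetryBackoff {
    enum State { STOPPED, CONNECTING, CONNECTED }

    private final long baseDelayMs;
    private final long maxDelayMs;
    private int retryCount = 0;

    RetryBackoff(long baseDelayMs, long maxDelayMs) {
        this.baseDelayMs = baseDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    /** Returns base * 2^retryCount capped at maxDelayMs, then counts the retry. */
    long nextDelayMs() {
        int shift = Math.min(retryCount, 30);
        retryCount++;
        // Compare before shifting so the multiplication cannot overflow.
        if (baseDelayMs > (maxDelayMs >> shift)) {
            return maxDelayMs;
        }
        return baseDelayMs << shift;
    }

    /** Only a successful connection clears the count; CONNECTING does not. */
    void onStateChanged(State state) {
        if (state == State.CONNECTED) {
            retryCount = 0;
        }
    }

    int getRetryCount() {
        return retryCount;
    }
}
```

With a base of 2000 ms and a cap of 60000 ms, successive failures would wait 2000, 4000, 8000 ms and so on until the cap is reached. Resetting on CONNECTING instead would pin every delay at the base value, which matches the symptom described here.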

       

      https://github.com/couchbase/couchbase-lite-android-ce/issues/11

       


        Activity

          pasin Pasin Suriyentrakorn added a comment - - edited

          I was able to reproduce the issue. The behavior is quite subtle and I think I understand the issue now.

          My test:

          1. Ran the replicator against a nonexistent Sync Gateway (SG). The Android emulator was running on a Mac with a working network connection.
          2. Saw that retries were performed without honoring the configured delay.
          3. Sometimes the retry count / delay was reset. This happened at random.

          Cause of #2:

          • Two retry series were running at different rates. One was initiated by the network error and the other by Reachability.
          • Reachability reported the host as reachable, so it started its own retry series.
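
One way to avoid two competing retry series is to route both triggers through a single gate, so that at most one retry is pending at a time. The sketch below is an assumption about how that could look, not the actual fix. `RetryCoordinator`, its method names, and the delay constants are all hypothetical.

```java
/** Hypothetical coordinator; both the network-error path and the
 *  reachability callback would call requestRetry(). */
final class RetryCoordinator {
    private static final long BASE_DELAY_MS = 1000; // illustrative value
    private static final long MAX_DELAY_MS = 60000; // illustrative value

    private boolean retryPending = false;
    private int retryCount = 0;

    /**
     * Returns the delay in ms after which the caller should retry,
     * or -1 if a retry is already pending and this request is dropped.
     */
    synchronized long requestRetry() {
        if (retryPending) {
            return -1;
        }
        retryPending = true;
        long delay = Math.min(MAX_DELAY_MS, BASE_DELAY_MS << Math.min(retryCount, 10));
        retryCount++;
        return delay;
    }

    /** Called when the scheduled retry actually starts a new replicator. */
    synchronized void onRetryStarted() {
        retryPending = false;
    }

    /** Called once the replicator has connected successfully. */
    synchronized void onConnected() {
        retryPending = false;
        retryCount = 0;
    }
}
```

Because the second trigger is dropped while a retry is pending, a reachability notification that arrives during the backoff window cannot start a faster, parallel series.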

          Cause of #3:

          • The LiteCore replicator could sometimes report its status as Busy with an error instead of Stopped with an error. We need to check whether this also happens with 2.5.2 and Cobalt.
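
A defensive way to handle that case is to let the presence of an error, rather than the activity level alone, decide whether the retry count may be reset. The following is a sketch under that assumption. `StatusPolicy` is hypothetical, and its `Level` enum is only loosely modeled on the replicator activity levels.

```java
/** Hypothetical policy; not taken from the Couchbase Lite source. */
final class StatusPolicy {
    enum Level { STOPPED, OFFLINE, CONNECTING, IDLE, BUSY }

    private StatusPolicy() { }

    /**
     * The retry count may be reset only when the replicator is connected
     * (IDLE or BUSY) and the status carries no error. A "Busy with error"
     * status is therefore treated like a failure.
     */
    static boolean shouldResetRetryCount(Level level, boolean hasError) {
        if (hasError) {
            return false;
        }
        return level == Level.IDLE || level == Level.BUSY;
    }
}
```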

           

           


          pasin Pasin Suriyentrakorn added a comment

          Priya Rajagopal, the fix is in build 2.5.3-4.

          pasin Pasin Suriyentrakorn added a comment - - edited

          This issue has been fixed in 2.5.3 and Cobalt. I have verified the fix manually.


          People

            pasin Pasin Suriyentrakorn
            blake.meike Blake Meike