Couchbase Server / MB-8235

error restarting couch_server: {{read_loop_died, {problem_reopening_file,

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 2.1.0
    • Affects Version/s: 2.1.0
    • Component/s: ns_server, XDCR
    • Security Level: Public
    • Labels: None

    Description

      Junyi > For some reason, the CouchDB updater crashed during XDCR. This cascaded: the babysitting process restarted CouchDB multiple times, which in turn caused the XDCR replicator to crash because of the inconsistent instance startup time (Source database out of sync).

      Test topology:
      source:bucket0 <- bidirectional -> dest1:bucket0
      source:bucket0 -> dest2:bucket0
      source:bucket0 -> dest3:bucket0
      source:bucket0 -> dest4:bucket0
      We have 4 outbound streams from source and 1 inbound.
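
The stream count above can be checked with a small sketch. It models each replication as a (source, destination) pair; the dest3 and dest4 edges are an assumption inferred from the stated count of 4 outbound streams, and the names `STREAMS` and `count_streams` are hypothetical, used only for illustration.

```python
# Replication streams as (source_cluster, destination_cluster) pairs.
# Assumption: dest3 and dest4 are one-way targets of source, as implied
# by the reported total of 4 outbound streams.
STREAMS = [
    ("source", "dest1"),
    ("dest1", "source"),  # reverse direction of the bidirectional link
    ("source", "dest2"),
    ("source", "dest3"),
    ("source", "dest4"),
]


def count_streams(streams, node):
    """Return (outbound, inbound) stream counts for the given cluster."""
    outbound = sum(1 for src, _ in streams if src == node)
    inbound = sum(1 for _, dst in streams if dst == node)
    return outbound, inbound


if __name__ == "__main__":
    print(count_streams(STREAMS, "source"))  # (4, 1)
```

Under these assumptions the source cluster carries 4 outbound streams and 1 inbound stream, matching the description.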

      There is a 20k frontend load on bucket0 with
      get:70%,delete:10%,update:10%,set:10%,expire:5%

      The inbound load from the destination is a 4k load with
      get:90%,delete:2%,update:5%,set:5%,expire:5%

      Data is loaded until about 70% DGM.
      No views, no rebalancing.

      In the CouchDB logs we see:
      [couchdb:error,2013-05-09T7:44:46.318,ns_1@172.23.105.55:couch_server<0.8331.208>:couch_log:error:42]Unexpected message, restarting couch_server: {'EXIT',<0.16961.208>,
          {{read_loop_died,
            {problem_reopening_file,
              {error,system_limit},
              {set_close_after,infinity,<0.16958.208>},
              <0.16959.208>,
              "/opt/couchbase/var/lib/couchbase/data/bucket1/345.couch.2",
              10}},
           {gen_server,call,[<0.16958.208>,snapshot_reads,infinity]}}}
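
When triaging many entries like this, it may help to pull out the timestamp, node, error reason, and file path automatically. The following is a minimal sketch that assumes entries look like the one quoted above; the regular expressions and the function name `summarize_crash` are hypothetical and not part of any Couchbase tooling.

```python
import re

# Patterns written against the single log entry quoted in this ticket;
# other entries may need different patterns.
HEADER_RE = re.compile(
    r"\[couchdb:error,(?P<timestamp>[^,]+),(?P<node>[^:]+):couch_server"
)
REASON_RE = re.compile(r"\{error,\s*(?P<reason>\w+)\}")
PATH_RE = re.compile(r'"(?P<path>/[^"]+\.couch\.\d+)"')

SAMPLE = (
    "[couchdb:error,2013-05-09T7:44:46.318,ns_1@172.23.105.55:"
    "couch_server<0.8331.208>:couch_log:error:42]Unexpected message, "
    "restarting couch_server: {'EXIT',<0.16961.208>,"
    "{{read_loop_died,{problem_reopening_file,{error,system_limit},"
    "{set_close_after,infinity,<0.16958.208>},<0.16959.208>,"
    '"/opt/couchbase/var/lib/couchbase/data/bucket1/345.couch.2",10}},'
    "{gen_server,call,[<0.16958.208>,snapshot_reads,infinity]}}}"
)


def summarize_crash(log_text):
    """Extract the key fields from one couch_server restart entry."""
    header = HEADER_RE.search(log_text)
    reason = REASON_RE.search(log_text)
    path = PATH_RE.search(log_text)
    return {
        "timestamp": header.group("timestamp") if header else None,
        "node": header.group("node") if header else None,
        "reason": reason.group("reason") if reason else None,
        "path": path.group("path") if path else None,
    }


if __name__ == "__main__":
    for key, value in summarize_crash(SAMPLE).items():
        print(f"{key}: {value}")
```

Fields that cannot be found are returned as `None`, so the sketch degrades gracefully on entries with a different shape.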

      In XDCR there are sync errors suggesting we increase max_dbs_open, but it seems we are already hitting a system limit, since CouchDB is restarting:

      Replication `a1c985cbafac10e773b130f01d1ba85c/bucket0/bucket0` ...failed: Source database out of sync. Try to increase max_dbs_open at the source's server.

      Attaching logs here from the time of the crash. Full logs were copied up and left here:

      172.23.105.55:/0509_couchdb/ (use rsa key)

          People

            Assignee: Tommie McAfee (Inactive)
            Reporter: Tommie McAfee (Inactive)