Couchbase Server / MB-4546

EPERF - Protocol Error between ep-engine and mccouch causes resetConnection

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-beta
    • Fix Version/s: None
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels:
    • Environment:
      eperf cluster testing on ec2

      Description

      When running the eperf test on an EC2 cluster, we occasionally see this in the server logs:

      [ns_server:info] [2011-12-16 19:38:29] [ns_1@10.68.154.26:<0.848.0>:ns_port_server:log:155] memcached<0.848.0>: Rubbish received on the backend stream. closing it
      memcached<0.848.0>: Trying to connect to mccouch: "localhost:11213"
      memcached<0.848.0>: Connected to mccouch: "localhost:11213"

      This means the protocol between mccouch and mc-kvstore is somehow getting mixed up. Fortunately, it looks like no data is lost, because the connection is reset and the data is resent. Still, there is a bug here that needs to be fixed: it is a performance problem and could be a symptom of a more serious, undiagnosed bug.
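
To illustrate the kind of check that can produce a "rubbish on the stream" result, here is a minimal Python sketch. It assumes the backend stream carries memcached binary protocol responses (fixed 24-byte big-endian header, response magic byte 0x81). The names `parse_response_header` and `ProtocolError` are hypothetical, and this is not the actual mc-kvstore code.

```python
import struct

# Memcached binary protocol header: 24 bytes, big-endian.
# magic, opcode, key length, extras length, data type, status,
# total body length, opaque, cas
HEADER_FORMAT = ">BBHBBHIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes
RESPONSE_MAGIC = 0x81


class ProtocolError(Exception):
    """Raised when the bytes on the stream are not a valid response header."""


def parse_response_header(data: bytes) -> dict:
    """Validate and decode one response header (hypothetical helper)."""
    if len(data) < HEADER_SIZE:
        raise ProtocolError("short header")
    (magic, opcode, key_len, extras_len, data_type,
     status, body_len, opaque, cas) = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE])
    if magic != RESPONSE_MAGIC:
        # The stream is out of sync; the caller should drop the
        # connection, reconnect, and resend outstanding requests.
        raise ProtocolError(f"unexpected magic byte 0x{magic:02x}")
    if key_len + extras_len > body_len:
        raise ProtocolError("key and extras exceed total body length")
    return {
        "opcode": opcode,
        "status": status,
        "body_len": body_len,
        "opaque": opaque,
        "cas": cas,
    }
```

A caller would treat `ProtocolError` as the signal to close the socket and reconnect, which matches the reset-and-resend behavior seen in the logs above.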


        Activity

        karan Karan Kumar (Inactive) added a comment -

        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.1763.0>:couch_log:info:39] 10.6.153.208 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.1765.0>:couch_log:info:39] 10.6.151.112 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.3289.0>:couch_log:info:39] 10.6.153.208 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.1739.0>:couch_log:info:39] 10.188.213.11 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.1777.0>:couch_log:info:39] 10.6.117.231 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:18] [ns_1@10.6.87.130:<0.1868.0>:couch_log:info:39] 10.191.7.242 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:19] [ns_1@10.6.87.130:<0.1725.0>:couch_log:info:39] 10.190.157.18 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3453.0>:couch_log:info:39] 10.6.117.231 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3500.0>:couch_log:info:39] 10.68.154.26 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3381.0>:couch_log:info:39] 10.6.129.242 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3526.0>:couch_log:info:39] 10.191.7.242 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3407.0>:couch_log:info:39] 10.188.213.11 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:20] [ns_1@10.6.87.130:<0.3487.0>:couch_log:info:39] 10.190.157.18 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:21] [ns_1@10.6.87.130:<0.3383.0>:couch_log:info:39] 10.6.129.242 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:21] [ns_1@10.6.87.130:<0.1655.0>:couch_log:info:39] 10.6.151.112 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:21] [ns_1@10.6.87.130:<0.640.0>:couch_log:info:39] 10.6.177.27 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:22] [ns_1@10.6.87.130:<0.1434.0>:couch_log:info:39] 10.68.154.26 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1724.0>:couch_log:info:39] 10.6.177.27 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1763.0>:couch_log:info:39] 10.6.153.208 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1765.0>:couch_log:info:39] 10.6.151.112 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.3289.0>:couch_log:info:39] 10.6.153.208 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1739.0>:couch_log:info:39] 10.188.213.11 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1777.0>:couch_log:info:39] 10.6.117.231 - - GET /default%2Fmaster/ 200
        [ns_server:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1378.0>:ns_port_server:log:155] memcached<0.1378.0>: Rubbish received on the backend stream. closing it
        memcached<0.1378.0>: Trying to connect to mccouch: "localhost:11213"
        memcached<0.1378.0>: Connected to mccouch: "localhost:11213"

        [couchdb:info] [2011-12-16 19:38:23] [ns_1@10.6.87.130:<0.1868.0>:couch_log:info:39] 10.191.7.242 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:24] [ns_1@10.6.87.130:<0.1725.0>:couch_log:info:39] 10.190.157.18 - - GET /default%2Fmaster/ 200
        [ns_server:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:ns_config_rep:ns_config_rep:do_pull:257] Pulling config from: 'ns_1@10.6.151.112'

        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3453.0>:couch_log:info:39] 10.6.117.231 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3500.0>:couch_log:info:39] 10.68.154.26 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3381.0>:couch_log:info:39] 10.6.129.242 - - GET /default%2Fmaster/ 200
        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3526.0>:couch_log:info:39] 10.191.7.242 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3407.0>:couch_log:info:39] 10.188.213.11 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:25] [ns_1@10.6.87.130:<0.3487.0>:couch_log:info:39] 10.190.157.18 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:26] [ns_1@10.6.87.130:<0.3383.0>:couch_log:info:39] 10.6.129.242 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:26] [ns_1@10.6.87.130:<0.1655.0>:couch_log:info:39] 10.6.151.112 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:26] [ns_1@10.6.87.130:<0.640.0>:couch_log:info:39] 10.6.177.27 - - GET /_replicator/ 200
        [couchdb:info] [2011-12-16 19:38:27] [ns_1@10.6.87.130:<0.1434.0>:couch_log:info:39] 10.68.154.26 - - GET /default%2Fmaster/ 200

        karan Karan Kumar (Inactive) added a comment -

        Attaching diags

        karan Karan Kumar (Inactive) added a comment -

        Assigning to Sharon, for triaging this appropriately.

        farshid Farshid Ghods (Inactive) added a comment -

        Steve,

        Can we get someone from ep-engine to comment on this bug so we can understand this crash?

        This has been seen a few times under load and has also been reported in the forum.

        We haven't seen it in the recent large cluster runs, but I am not sure whether it is completely fixed.

        We also don't need to spend time fixing this, since we are replacing this path with couchstore.

        steve Steve Yen added a comment -

        Jin,
        Since we're likely replacing this pathway with couchstore, this seems low priority. However, there might be some shared mccouch bug on the Erlang side that would bite us even with the switch to couchstore.

        jin Jin Lim added a comment -

        More info/clarification needed.

        chiyoung Chiyoung Seo added a comment -

        Fixed by Liang as part of extending mccouch mock server and testing various network communication failures.
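
As a rough illustration of testing against a fake backend that misbehaves, the Python sketch below feeds valid, garbage, and truncated payloads to a client over a local socket pair and classifies what the client sees. It assumes the memcached binary protocol framing (24-byte header, response magic 0x81). `run_scenario` and `read_exact` are hypothetical names, and this is not the actual mccouch mock server or its test suite.

```python
import socket
import struct

HEADER_FORMAT = ">BBHBBHIIQ"  # memcached binary protocol header layout
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes
RESPONSE_MAGIC = 0x81


def read_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the stream")
        buf += chunk
    return buf


def run_scenario(payload: bytes) -> str:
    """Send payload from a fake backend and classify the client's view."""
    client, backend = socket.socketpair()
    try:
        backend.sendall(payload)
        backend.close()  # simulate the backend going away after sending
        try:
            header = read_exact(client, HEADER_SIZE)
        except ConnectionError:
            return "truncated"
        # A wrong magic byte means the stream is out of sync.
        return "ok" if header[0] == RESPONSE_MAGIC else "rubbish"
    finally:
        client.close()
        backend.close()
```

Each scenario returns "ok", "rubbish", or "truncated", so a test can assert that the client would reset the connection in the last two cases.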


          People

          • Assignee:
            karan Karan Kumar (Inactive)
            Reporter:
            damien damien
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes