  1. Couchbase Server
  2. MB-3575

Moxi should automatically skip removed or uninitialized nodes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.5.3
    • Fix Version/s: 1.7.0
    • Component/s: moxi
    • Security Level: Public
    • Flagged:
      Release Note

      Description

      -Configure Moxi with a list of URLs:
      ./moxi http://node1:8091/pools/default/bucketsStreaming/default, http://node2:8091/pools/default/bucketsStreaming/default, http://node3:8091/pools/default/bucketsStreaming/default

      -Rebalance node 1 out of the cluster (for maintenance)
      -Restart Moxi
      -Moxi will continuously spin on node1 with the following errors:
      2011-04-08 15:30:21: (agent_config.c.389) configuration received
      2011-04-08 15:30:21: (agent_config.c.448) ERROR: could not parse JSON from REST server: Requested resource not found.

      Two things need to happen here:
      -Moxi's logging needs to be improved to tell the user which node it is connecting to and which nodes are giving it problems (not just once at the top, but in every log message)
      -Moxi needs to realize that this node has an invalid config (it knows it got an error, at least) and move on to the next one.
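
The two requests above can be illustrated with a minimal Python sketch. This is not moxi's code (moxi is written in C) and it says nothing about how the eventual fix was implemented. The function name `first_valid_config` and the injectable `fetch` callable are hypothetical, introduced only for illustration. The sketch tries each configured URL in order, names the URL in every error message, and moves on to the next URL when the response cannot be used.

```python
import json
import logging
from typing import Callable, Iterable, Optional, Tuple

log = logging.getLogger("config_failover_sketch")


def first_valid_config(
    urls: Iterable[str],
    fetch: Callable[[str], str],
) -> Optional[Tuple[str, dict]]:
    """Return (url, config) for the first URL that yields a JSON object.

    `fetch` is a hypothetical callable that returns the response body for a
    URL, or raises OSError if the node cannot be reached.
    """
    for url in urls:
        try:
            config = json.loads(fetch(url))
        except (OSError, ValueError) as exc:
            # Name the node in every message, then try the next URL.
            log.error("%s: could not get a usable config: %s", url, exc)
            continue
        if not isinstance(config, dict):
            log.error("%s: response is valid JSON but not an object", url)
            continue
        return url, config
    return None
```

With a `fetch` that returns "Requested resource not found." for node1 and a JSON object for node2, the sketch logs one error that names node1 and returns node2's config instead of spinning on node1.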

        Activity

        perry Perry Krug created issue -
        perry Perry Krug added a comment -

        Workaround is to stop the service on the node that was removed from the cluster.

        perry Perry Krug made changes -
        Field Original Value New Value
        Flagged [Release Note]
        sharon Sharon Barr (Inactive) made changes -
        Assignee Steve Yen [ steve ]
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        This also happens with fast bucket deletion patches. Steps are:

        • create cluster
        • load data
        • delete bucket
        • create bucket with same name
        • try loading data
        • moxi fails because it keeps invalid connections in pool
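
One way to picture the last step is a pool that tags each connection with a generation number and refuses to reuse connections from an older generation. The Python sketch below is hypothetical throughout: the `Pool` class, its `invalidate_all` method, and the `generation` attribute are illustrative names, not moxi's data structures, and the comment above does not say how moxi's pool works internally.

```python
from collections import deque
from typing import Any, Callable, Deque


class Pool:
    """Connection pool that drops connections made before the last reset."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        self._connect = connect          # hypothetical factory for new connections
        self._idle: Deque[Any] = deque()
        self._generation = 0

    def invalidate_all(self) -> None:
        """Call when the bucket is deleted or recreated."""
        self._generation += 1
        while self._idle:
            self._idle.popleft().close()

    def acquire(self) -> Any:
        # Skip and close any idle connection from an older generation.
        while self._idle:
            conn = self._idle.popleft()
            if conn.generation == self._generation:
                return conn
            conn.close()
        conn = self._connect()
        conn.generation = self._generation
        return conn

    def release(self, conn: Any, ok: bool = True) -> None:
        # Only healthy connections from the current generation are kept.
        if ok and conn.generation == self._generation:
            self._idle.append(conn)
        else:
            conn.close()
```

In this sketch, a connection that was checked out before the bucket was deleted is closed on release rather than returned to the pool, so a later `acquire` always opens a connection against the recreated bucket.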
        perry Perry Krug made changes -
        Assignee Steve Yen [ steve ]
        farshid Farshid Ghods (Inactive) made changes -
        Fix Version/s 1.7 alpha 2 [ 10180 ]
        farshid Farshid Ghods (Inactive) made changes -
        Assignee Steve Yen [ steve ]
        Fix Version/s 1.7 beta 1 [ 10110 ]
        Fix Version/s 1.7 alpha 2 [ 10180 ]
        steve Steve Yen added a comment -

        This requires some internal API changes between moxi and libconflate to pass the error knowledge across several function invocations to the right place. Not risky.

        perry Perry Krug added a comment -

        A Pivotal Tracker story has been created for this Issue: http://www.pivotaltracker.com/story/show/13245071

        steve Steve Yen made changes -
        Summary Moxi fails to realize that a node is invalid Moxi should automatically skip removed or uninitialized nodes
        sharon Sharon Barr (Inactive) made changes -
        Fix Version/s 1.7 Release Candidate [ 10182 ]
        Fix Version/s 1.7 beta [ 10110 ]
        steve Steve Yen added a comment -

        Some fixes on the way to the real fix...

        http://review.membase.org/6338
        http://review.membase.org/6339

        steve Steve Yen added a comment -

        Unfortunately, the code that discovers a problem is in an asynchronous thread that's "far away" from the REST HTTP code.

        Will bend code to our will.
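
The general shape of that problem, a worker thread that discovers the error while a different piece of code has to act on it, can be sketched in Python with a thread and a queue. This is only an illustration under assumptions: `validate_async` and `pick_url` are hypothetical names, and nothing here reflects the actual moxi or libconflate API or the approach taken in the linked reviews.

```python
import json
import queue
import threading
from typing import Dict, Optional


def validate_async(url: str, body: str, results: "queue.Queue") -> threading.Thread:
    """Validate a config body on a worker thread and report the outcome."""

    def work() -> None:
        try:
            json.loads(body)
            results.put((url, None))          # no error
        except ValueError as exc:
            results.put((url, str(exc)))      # hand the error back to the caller

    worker = threading.Thread(target=work, daemon=True)
    worker.start()
    return worker


def pick_url(bodies: Dict[str, str]) -> Optional[str]:
    """Return the first URL whose body the worker thread accepted."""
    results: "queue.Queue" = queue.Queue()
    for url, body in bodies.items():
        validate_async(url, body, results)
        # The fetching side waits for the verdict before deciding to move on.
        reported_url, error = results.get(timeout=5)
        if error is None:
            return reported_url
    return None
```

The queue is the only channel between the two sides here, which keeps the fetching loop free of any knowledge about how validation is done.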

        steve Steve Yen added a comment - http://review.membase.org/6342 http://review.membase.org/6343
        steve Steve Yen made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        farshid Farshid Ghods (Inactive) made changes -
        Labels 1.7.0-release-notes
        perry Perry Krug added a comment -

        Perry Krug deleted the linked story in Pivotal Tracker

        ingenthr Matt Ingenthron made changes -
        Fix Version/s 1.7 GA [ 10111 ]
        Fix Version/s 1.7 Release Candidate [ 10182 ]
        farshid Farshid Ghods (Inactive) added a comment -

        This test is failing now (1.8.1 manifest):

        OK
        ./t/issue-MB-3575.sh: line 212: 19957 Terminated: 15 ./moxi z http://127.0.0.1:22100/bad,http://127.0.0.1:4567/pools/default/bucketsStreaming/default,http://127.0.0.1:22101/bad -Z port_listen=11266,downstream_conn_max=1,downstream_max=0,downstream_timeout=300,wait_queue_timeout=300,downstream_conn_queue_timeout=300,connect_timeout=300,auth_timeout=300 2>> /tmp/issueMB-3575.out
        No matching processes belonging to you were found
        ----------------------
        FAIL count expect 0, got 20
        make: *** [test] Error 1

        farshid Farshid Ghods (Inactive) made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        farshid Farshid Ghods (Inactive) added a comment -

        Never mind. I didn't have the sinatra gem installed.

        farshid Farshid Ghods (Inactive) made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee:
            steve Steve Yen
            Reporter:
            perry Perry Krug
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes