  1. Couchbase Server
  2. MB-7337

[system test] node shown as pending for a long time after index path change

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment:
      Windows 2008 R2 64bit

      Description

      Online upgrade of a 5-node cluster from 1.8.1.
      The cluster has one default bucket with 20 million items. The data path is set to c:/data. Nodes before the upgrade:
      10.3.2.11
      10.3.2.12
      10.3.2.16
      10.3.2.10
      10.3.2.75

      Nodes after the upgrade to 2.0.0-1971:
      10.3.2.11 (data path and index path were set to the default path when 2.0.0-1971 was installed)
      10.3.2.16
      10.3.2.75
      10.3.2.76
      10.3.2.77

      Changed the index path on node 11 to a new path (c:/index); Couchbase Server on node 11 restarted.

      curl -i -v --data "index_path=c:/index" "http://Administrator:password@10.3.2.11:8091/nodes/self/controller/settings"

      * About to connect() to 10.3.2.11 port 8091 (#0)
      * Trying 10.3.2.11... Connection refused
      * couldn't connect to host
      * Closing connection #0
      curl: (7) couldn't connect to host
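For readers who script this call instead of using curl, the request above can be sketched in Python. This is a minimal sketch, not part of the original report: the function name `build_index_path_request` is hypothetical, and it only builds the request (same endpoint, same `index_path` form field, same Basic auth header as the curl command) without sending it.

```python
import base64
import urllib.parse
import urllib.request


def build_index_path_request(host, index_path, user, password, port=8091):
    """Build (but do not send) the POST issued by the curl command in the report."""
    url = f"http://{host}:{port}/nodes/self/controller/settings"
    # safe="/:" keeps the path readable in the body, matching curl's raw --data.
    body = urllib.parse.urlencode({"index_path": index_path}, safe="/:").encode("ascii")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


if __name__ == "__main__":
    req = build_index_path_request("10.3.2.11", "c:/index", "Administrator", "password")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Sending it would be a matter of `urllib.request.urlopen(req, timeout=...)`; a "connection refused" like the one above would surface there as a `urllib.error.URLError`.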

      I tried again with a Cygwin-style path:

      curl -i -v --data "index_path=/cygdrive/c/index" "http://Administrator:password@10.3.2.11:8091/nodes/self/controller/settings"

      * About to connect() to 10.3.2.11 port 8091 (#0)
      * Trying 10.3.2.11... connected
      * Connected to 10.3.2.11 (10.3.2.11) port 8091 (#0)
      * Server auth using Basic with user 'Administrator'
      > POST /nodes/self/controller/settings HTTP/1.1
      > Authorization: Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==
      > User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18
      > Host: 10.3.2.11:8091
      > Accept: */*
      > Content-Length: 28
      > Content-Type: application/x-www-form-urlencoded
      >
      < HTTP/1.1 400 Bad Request
      HTTP/1.1 400 Bad Request
      < Server: Couchbase Server 2.0.0-1971-rel-enterprise
      Server: Couchbase Server 2.0.0-1971-rel-enterprise
      < Pragma: no-cache
      Pragma: no-cache
      < Date: Mon, 03 Dec 2012 21:14:06 GMT
      Date: Mon, 03 Dec 2012 21:14:06 GMT
      < Content-Type: application/json
      Content-Type: application/json
      < Content-Length: 47
      Content-Length: 47
      < Cache-Control: no-cache
      Cache-Control: no-cache
      <
      * Connection #0 to host 10.3.2.11 left intact
      * Closing connection #0
      ["An absolute path is required for index_path"]
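One plausible reading of the 400 response is that, on Windows, a path without a drive letter is not treated as absolute. The sketch below only illustrates that rule using Python's `pathlib`; it is an assumption about why the Cygwin-style path was rejected, not the server's actual validation code (which lives in ns_server and may apply different rules). The helper name is hypothetical.

```python
from pathlib import PureWindowsPath


def looks_like_windows_absolute(path):
    """Return True if `path` has both a drive and a root under Windows path rules."""
    return PureWindowsPath(path).is_absolute()


if __name__ == "__main__":
    for candidate in ("c:/index", "/cygdrive/c/index", "index"):
        print(candidate, looks_like_windows_absolute(candidate))
```

Under these rules `c:/index` qualifies, while `/cygdrive/c/index` has a root but no drive and so does not.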

      The log page shows Couchbase Server restarting on node 11:

      Couchbase Server has started on web port 8091 on node 'ns_1@10.3.2.11'. menelaus_sup001 ns_1@10.3.2.11 13:12:38 - Mon Dec 3, 2012
      Shutting down bucket "default" on 'ns_1@10.3.2.11' for server shutdown ns_memcached002 ns_1@10.3.2.11 13:09:28 - Mon Dec 3, 2012
      Setting database directory path to c:/Program Files/Couchbase/Server/var/lib/couchbase/data and index directory path to c:/index ns_storage_conf000 ns_1@10.3.2.11 13:09:28 - Mon Dec 3, 2012

      Trying to connect to memcached on node 11 hangs:

      thuan@ubu-1604:/opt/couchbase/bin$ ./cbstats 10.3.2.11:11210 raw warmup
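When a client tool hangs like this, a probe with an explicit timeout can at least bound the wait. The following is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, not a Couchbase tool. Note its limitation: a successful TCP connect does not prove that memcached will answer a stats request, so this only distinguishes "port unreachable" from "something is listening".

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 11210 is the memcached port used in the cbstats command above.
    print(can_connect("10.3.2.11", 11210, timeout=5.0))
```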

      1. 6337.tar
        14.17 MB
        Ketaki Gangal

        Activity

        thuan Thuan Nguyen added a comment - - edited

        Reproduced on Ubuntu 11.04 64bit with Couchbase Server 2.0.0-1971.
        Installed Couchbase Server 2.0.0-1971 on node 10.3.2.4 with the data and index paths set to the default path.
        Created the default bucket.
        Changed the index path from the default path to /data using the curl command:

        huan@ubu-1604:/opt/couchbase/bin$ curl -i -v --data "index_path=/data" "http://Administrator:password@10.3.2.4:8091/nodes/self/controller/settings"
        * About to connect() to 10.3.2.4 port 8091 (#0)
        * Trying 10.3.2.4... connected
        * Connected to 10.3.2.4 (10.3.2.4) port 8091 (#0)
        * Server auth using Basic with user 'Administrator'
        > POST /nodes/self/controller/settings HTTP/1.1
        > Authorization: Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==
        > User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18
        > Host: 10.3.2.4:8091
        > Accept: */*
        > Content-Length: 16
        > Content-Type: application/x-www-form-urlencoded
        >
        < HTTP/1.1 200 OK
        HTTP/1.1 200 OK
        < Server: Couchbase Server 2.0.0-1971-rel-enterprise
        Server: Couchbase Server 2.0.0-1971-rel-enterprise
        < Pragma: no-cache
        Pragma: no-cache
        < Date: Mon, 03 Dec 2012 23:50:40 GMT
        Date: Mon, 03 Dec 2012 23:50:40 GMT
        < Content-Length: 0
        Content-Length: 0
        < Cache-Control: no-cache
        Cache-Control: no-cache
        <
        * Connection #0 to host 10.3.2.4 left intact
        * Closing connection #0

        Couchbase Server shut down, as shown in the log below.

        Couchbase Server has started on web port 8091 on node 'ns_1@127.0.0.1'. menelaus_sup001 ns_1@127.0.0.1 15:50:39 - Mon Dec 3, 2012
        I'm the only node, so I'm the master. mb_master000 ns_1@127.0.0.1 15:50:39 - Mon Dec 3, 2012
        Shutting down bucket "default" on 'ns_1@127.0.0.1' for server shutdown ns_memcached002 ns_1@127.0.0.1 15:50:30 - Mon Dec 3, 2012
        Setting database directory path to /opt/couchbase/var/lib/couchbase/data and index directory path to /data ns_storage_conf000 ns_1@127.0.0.1 15:50:30 - Mon Dec 3, 2012

        farshid Farshid Ghods (Inactive) added a comment -

        This is not a blocker bug because it does not destroy any data.

        We can add a note to the documentation that resetting the index path restarts Couchbase Server.

        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        It's not really caused by the index path change. The problem is that we introduced a regression that kills the memcached port (not memcached itself) only after a 60-second wait. In some scenarios this could cause data loss. For instance, if someone shuts Couchbase Server down and then reboots the machine, a memcached process may still be alive at the moment of reboot, writing to the databases.

        farshid Farshid Ghods (Inactive) added a comment -

        Aliaksey,

        can you confirm the expected behavior (after your fix):
        1- Should Couchbase Server itself restart?
        2- Should memcached restart?
        3- Does this restart mccouch?
        4- Are the current index files wiped out or kept as they are?
        5- What happens to the ddoc definitions? Do they get copied over from the original location to the new one?
        6- Does this API change the index path for all nodes in the cluster, or is it per node?

        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        1-3. Yes, to apply path changes ns_server restarts itself entirely, including memcached and mccouch.
        4. Current index files are kept intact.
        5. Design document definitions are stored in the master database, which is stored together with the other databases (i.e. in the database directory).
        6. The API is per node.
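Because the API is per node and applying it restarts ns_server on that node, a script that changes the path on several nodes may want to go one node at a time and wait for each to come back. The sketch below is only an illustration of that ordering under stated assumptions: `apply_change` and `wait_until_up` are hypothetical callables supplied by the caller (for example, a POST of `index_path` and a health poll); neither is a Couchbase API.

```python
def change_index_path_one_node_at_a_time(nodes, apply_change, wait_until_up):
    """Apply the change node by node; stop at the first node that does not come back."""
    completed = []
    for node in nodes:
        apply_change(node)           # hypothetical: send the index_path change to this node
        if not wait_until_up(node):  # hypothetical: poll until the node is healthy again
            break
        completed.append(node)
    return completed


if __name__ == "__main__":
    # Dry run with stand-in callables; no network traffic is generated.
    touched = []
    done = change_index_path_one_node_at_a_time(
        ["10.3.2.11", "10.3.2.16"], touched.append, lambda node: True
    )
    print(touched, done)
```

Stopping at the first node that fails to recover keeps the rest of the cluster untouched while the stuck node is investigated.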

        steve Steve Yen added a comment -

        http://review.couchbase.org/#/c/23020/

        steve Steve Yen added a comment -

        moved to 2.0.1 per bug-scrub.

        andreibaranouski Andrei Baranouski added a comment -

        Build 1974, CentOS 5.7.

        Observation when changing the index path:
        Couchbase restarted and the bucket was deleted.

        Couchbase Server has started on web port 8091 on node 'ns_1@127.0.0.1'. menelaus_sup001 ns_1@127.0.0.1 16:11:24 - Wed Dec 5, 2012
        I'm the only node, so I'm the master. mb_master000 ns_1@127.0.0.1 16:11:24 - Wed Dec 5, 2012
        Shutting down bucket "default" on 'ns_1@127.0.0.1' for deletion ns_memcached002 ns_1@127.0.0.1 16:11:16 - Wed Dec 5, 2012
        Setting database directory path to /opt/couchbase/var/lib/couchbase/data and index directory path to /tmp ns_storage_conf000 ns_1@127.0.0.1 16:11:16 - Wed Dec 5, 2012
        Bucket "default" loaded on node 'ns_1@127.0.0.1' in 0 seconds. ns_memcached001 ns_1@127.0.0.1 16:10:01 - Wed Dec 5, 2012

        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        The bucket should not be deleted when only the index path is changed. I cannot reproduce it on my system. Could you please attach logs?

        farshid Farshid Ghods (Inactive) added a comment -

        Ketaki,

        please reproduce and update logs or pass the cluster to Aliaksey A.

        ketaki Ketaki Gangal added a comment -

        Hi Aliaksey,

        I can repro this every time in my tests.

        • Create a 3-node cluster with 2 buckets.
        • Load 10k items.
        • Create 1 view.
        • Change the index path: curl -i -v --data "index_path=/data" "http://Administrator:password@10.1.3.176:8091/nodes/self/controller/settings"

        The index path change above is made on the master node.

        • After the index path change, there is no data or bucket on the cluster.
        • ls -a on the nodes shows an empty @indexes file and an empty data dir.

        [root@grape-003 couchbase]# cd data/
        [root@grape-003 data]# ls
        @indexes isasl.pw ns_log _replicator.couch.1 _users.couch.1

        ketaki Ketaki Gangal added a comment -

        Adding logs here.

        ketaki Ketaki Gangal added a comment -

        Opened another bug to track this behaviour: http://www.couchbase.com/issues/browse/MB-7368

        Not seeing the above in current testing.

        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        We found that the issue was that we didn't wait for memcached termination correctly. ns_server would then start memcached again while the previous instance was still shutting down. Probably because it's Windows, no eaddrinuse errors were reported; ns_server was just unable to connect to memcached. When the old memcached instance finally died, the node returned to a good state.
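The general pattern described here (do not start the replacement until the old process has really exited) can be sketched in Python with `subprocess`. This is an illustration of the idea only, not the ns_server fix, which is in the linked review and written against Erlang ports; `stop_and_wait` and `restart` are hypothetical helpers, and the grace period is an assumed parameter.

```python
import subprocess
import sys


def stop_and_wait(proc, grace_seconds=60.0):
    """Ask `proc` to exit, then block until it is really gone."""
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # The process ignored the polite request; force it and wait again.
        proc.kill()
        proc.wait()
    return proc.returncode


def restart(cmd, old_proc, grace_seconds=60.0):
    """Start a new instance of `cmd` only after `old_proc` has fully exited."""
    stop_and_wait(old_proc, grace_seconds)
    return subprocess.Popen(cmd)


if __name__ == "__main__":
    # Stand-in child process; a real service command would go here.
    cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
    old = subprocess.Popen(cmd)
    new = restart(cmd, old, grace_seconds=5.0)
    print("old exited:", old.poll() is not None, "new running:", new.poll() is None)
    stop_and_wait(new, grace_seconds=5.0)
```

Waiting on the old process before spawning the new one avoids the window in which two instances compete for the same port and data files.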

        Aliaksey Artamonau Aliaksey Artamonau added a comment -

        fix merged


          People

          • Assignee:
            ketaki Ketaki Gangal
            Reporter:
            thuan Thuan Nguyen