  1. Couchbase Kubernetes
  2. K8S-529

system test: fts crashing after failed writes to mount directory using pvc


Details

    • Bug
    • Resolution: Done
    • Critical
    • 1.0.0
    • None
    • None
    • None

    Description

      This failure happens in a system test, so the operator is paused. We create a cluster of 4 Couchbase nodes with all services. FTS has 800 MB of RAM and the data service has 2 GB of RAM. Portworx persistent volumes are used for all mount paths, at 2 GB per claim. We create 10 buckets: default, plus nine other buckets used for tpcc testing. default is loaded with about 1 GB of data from pillow fight, and the other buckets combined use about another 2 GB. I also create a single scorch index on default. The system tests fail at the rebalance stage. When I look at the logs, I see hundreds of these messages:

      Service 'fts' exited with status 1. Restarting. Messages:
      2018-08-10T18:08:00.433+00:00 [INFO] -uuid="bbbde05249420ad7b3999e2dd4d151ae"
      2018-08-10T18:08:00.433+00:00 [INFO] -version="false"
      2018-08-10T18:08:00.433+00:00 [INFO] -weight="1"
      2018-08-10T18:08:00.433+00:00 [INFO] GOMAXPROCS=8
      2018-08-10T18:08:00.433+00:00 [INFO] main: registered bleve stores
      2018-08-10T18:08:00.433+00:00 [INFO] boltdb
      2018-08-10T18:08:00.433+00:00 [INFO] metrics
      2018-08-10T18:08:00.433+00:00 [INFO] goleveldb
      2018-08-10T18:08:00.433+00:00 [INFO] moss
      2018-08-10T18:08:00.433+00:00 [INFO] gtreap
      2018-08-10T18:08:00.433+00:00 [INFO] main: curr dir: "/opt/couchbase/var/lib/couchbase"
      2018-08-10T18:08:00.433+00:00 [INFO] main: data dir: "/mnt/index/@fts"
      2018-08-10T18:08:00.434+00:00 [FATA] main: could not write uuidPath: /mnt/index/@fts/cbft.uuid, err: &os.PathError{Op:"write", Path:"/mnt/index/@fts/cbft.uuid", Err:0x1c}, Please check that your -data/-dataDir parameter ("/mnt/index/@fts") is to a writable directory where cbft can persist data. – main.main() at main.go:149
      [goport(/opt/couchbase/bin/cbft)] 2018/08/10 18:08:00 child process exited with status 1
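
      The Err field in the fatal message looks like a raw errno value printed in hex. 0x1c is decimal 28, which on Linux is ENOSPC ("No space left on device"). The sketch below pulls that field out of the log line and looks it up. It assumes it is run on Linux like the pod, because errno numbering is platform-specific. LOG_LINE is an abbreviated copy of the message, and decode_path_error is a hypothetical helper name, not part of cbft.

```python
import errno
import os
import re

# Abbreviated copy of the fatal line from the log above.
LOG_LINE = (
    'main: could not write uuidPath: /mnt/index/@fts/cbft.uuid, '
    'err: &os.PathError{Op:"write", Path:"/mnt/index/@fts/cbft.uuid", Err:0x1c}'
)


def decode_path_error(line):
    """Return (number, symbolic name, message) for the Err:0x.. field, or None."""
    match = re.search(r"Err:0x([0-9a-fA-F]+)", line)
    if match is None:
        return None
    number = int(match.group(1), 16)
    # errno.errorcode maps numbers to names for the platform running this script.
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


if __name__ == "__main__":
    print(decode_path_error(LOG_LINE))
```

      If the lookup gives ENOSPC, the write failed because the volume was full, not because the path was missing or read-only.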

      I am not totally sure if this is a bug. My guess is that either some data is ballooning beyond the size of the PVC, so fts cannot write to /mnt/index/@fts/cbft.uuid, or there is an issue with setting the mount path to /mnt/index, after which fts tries to write to a non-existent sub-path. I will try increasing the PVC size to see if that fixes it, but in the meantime I wanted to file this in case it is a bigger issue.
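
      One way to tell the two guesses apart is to probe the data directory from inside the pod. This is a minimal sketch, assuming Python is available in the container, which may not be the case. diagnose_data_dir is a hypothetical helper, not a Couchbase tool. It reports whether the directory exists, how much space is free on its filesystem, and whether a small write succeeds.

```python
import os
import shutil
import tempfile


def diagnose_data_dir(path):
    """Report whether path exists, how much space is free, and whether a write works."""
    report = {
        "path": path,
        "exists": os.path.isdir(path),
        "free_bytes": None,
        "writable": False,
        "error": None,
    }
    if not report["exists"]:
        # Matches the "non-existent sub-path" guess.
        return report
    report["free_bytes"] = shutil.disk_usage(path).free
    try:
        # Write and sync a small probe file; it is deleted when the block exits.
        with tempfile.NamedTemporaryFile(dir=path, prefix="probe-") as probe:
            probe.write(b"x" * 4096)
            probe.flush()
            os.fsync(probe.fileno())
        report["writable"] = True
    except OSError as exc:
        # A full volume would typically surface here as "No space left on device".
        report["error"] = exc.strerror
    return report


if __name__ == "__main__":
    print(diagnose_data_dir("/mnt/index/@fts"))
```

      A result with exists set to False points at the mount path setup. A result where the directory exists but free_bytes is near zero and the write fails points at the PVC being full.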

       logs:

      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2018-08-10T172721-ns_1%40test-couchbase-t4k26-0000.test-couchbase-t4k26.default.svc.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2018-08-10T172721-ns_1%40test-couchbase-t4k26-0001.test-couchbase-t4k26.default.svc.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2018-08-10T172721-ns_1%40test-couchbase-t4k26-0002.test-couchbase-t4k26.default.svc.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2018-08-10T172721-ns_1%40test-couchbase-t4k26-0003.test-couchbase-t4k26.default.svc.zip

       


          People

            mikew Mike Wiederhold [X] (Inactive)
            korrigan.clark Korrigan Clark (Inactive)