Details
Bug
Status: Closed
Critical
Resolution: Cannot Reproduce
Cheshire-Cat
7.0.0-2792-enterprise
Untriaged
Centos 64-bit
1
Unknown
Description
Build: 7.0.0-2792
Scenario:
- Two node cluster
+----------------+----------------------+-----------------+------------+------------+----------------------+
| Node | Services | CPU_utilization | Mem_total | Mem_free | Swap_mem_used |
+----------------+----------------------+-----------------+------------+------------+----------------------+
| 172.23.123.162 | cbas | 16.5407854985 | 4201672704 | 3151945728 | 3145728 / 3758092288 |
| 172.23.123.164 | fts, index, kv, n1ql | 1.02963335008 | 4201672704 | 3381792768 | 6815744 / 3758092288 |
+----------------+----------------------+-----------------+------------+------------+----------------------+
- Couchbase bucket with replica=0
- Create CBAS dataset and connect to the couchbase bucket
- Restart KV node
Observation:
After the KV node restart (step 4), continuous indexer and fts service crash messages are seen from node-1 (172.23.123.164)
Cbcollect logs:
https://cb-jira.s3.us-east-2.amazonaws.com/logs/fts_indexer_crash/collectinfo-2020-08-10T143824-ns_1%40172.23.123.162.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/fts_indexer_crash/collectinfo-2020-08-10T143824-ns_1%40172.23.123.164.zip
TAF test case:
./testrunner -i /tmp/testexec.26703.ini replicas=0,durability=MAJORITY,GROUP=P1;durability,upgrade_version=7.0.0-2792 -t cbas.cbas_bucket_operations.CBASBucketOperations.test_restart_kv_server_impact_on_bucket,num_items=100000,create_secondary_indexes=False,cb_bucket_name=default,cbas_bucket_name=default_bucket,cbas_dataset_name=default_ds,GROUP=P1;durability
Test log link: http://qa.sc.couchbase.com/job/test_suite_executor-TAF/49619/consoleText
[<<"Service 'indexer' exited with status 2. Restarting. Messages:\n2020-08-10T07:25:56.775-07:00 [Info] Indexer started with command line: [/opt/couchbase/bin/indexer -adminPort=9100 -scanPort=9101 -httpPort=9102 -streamInitPort=9103 -streamCatchupPort=9104 -streamMaintPort=9105 --httpsPort=19102 --certFile=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem --keyFile=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem -vbuckets=1024 -cluster=127.0.0.1:8091 -storageDir=/data/idx/@2i -diagDir=/opt/couchbase/var/lib/couchbase/crash -logDir=/opt/couchbase/var/lib/couchbase/logs -nodeUUID=fe088e028f2f4b27bf0efe524b9611c3 -ipv6=false -isEnterprise=true]\npanic: mkdir /data/idx: permission denied\n\ngoroutine 1 [running]:\ngithub.com/couchbase/indexing/secondary/common.CrashOnError(...)\n\t/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/common/util.go:413\nmain.main()\n\tgoproj/src/github.com/couchbase/indexing/secondary/cmd/indexer/main.go:111 +0x1658\n">>],
[],info,
{{2020,8,10},{7,25,56}}},
{log_entry,{1597,69556,972091},
'ns_1@172.23.123.164',ns_log,0,
[<<"Service 'fts' exited with status 1. Restarting. Messages:\n2020-08-10T07:25:56.964-07:00 [INFO] -staticDir=\"static\"\n2020-08-10T07:25:56.964-07:00 [INFO] -staticETag=\"\"\n2020-08-10T07:25:56.964-07:00 [INFO] -tags=\"feed,janitor,pindex,queryer,cbauth_service\"\n2020-08-10T07:25:56.964-07:00 [INFO] -tlsCertFile=\"/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem\"\n2020-08-10T07:25:56.964-07:00 [INFO] -tlsKeyFile=\"/opt/couchbase/var/lib/couchbase/config/memcached-key.pem\"\n2020-08-10T07:25:56.964-07:00 [INFO] -uuid=\"fe088e028f2f4b27bf0efe524b9611c3\"\n2020-08-10T07:25:56.964-07:00 [INFO] -version=\"false\"\n2020-08-10T07:25:56.964-07:00 [INFO] -weight=\"1\"\n2020-08-10T07:25:56.964-07:00 [INFO] GOMAXPROCS=4\n2020-08-10T07:25:56.964-07:00 [INFO] main: registered bleve stores\n2020-08-10T07:25:56.964-07:00 [INFO] goleveldb\n2020-08-10T07:25:56.964-07:00 [INFO] moss\n2020-08-10T07:25:56.964-07:00 [INFO] gtreap\n2020-08-10T07:25:56.964-07:00 [INFO] boltdb\n2020-08-10T07:25:56.964-07:00 [INFO] metrics\n2020-08-10T07:25:56.964-07:00 [FATA] main: data directory does not exist, dataDir: /data/idx/@fts -- main.main() at main.go:116\n">>],
[],info,
{{2020,8,10},{7,25,56}}},
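When triaging ns_log entries like the two above, it can help to reduce each "exited with status" message to the service name, exit status, and the first fatal line. The following is a minimal sketch, not part of the test framework: the function name `summarize_crash` is hypothetical, and it assumes the message text keeps newlines as the literal two characters `\n`, as in the dump above.

```python
import re

# Matches the header that ns_server logs when a service process exits.
SERVICE_RE = re.compile(r"Service '(\w+)' exited with status (\d+)")


def summarize_crash(message):
    """Return (service, status, reason) for one ns_log crash message.

    `reason` is the first line containing 'panic:' or '[FATA]', or an
    empty string if neither marker is present. Returns None when the
    message is not a service-exit message.
    """
    match = SERVICE_RE.search(message)
    if match is None:
        return None
    # The dump stores newlines as backslash + 'n'; turn them into real ones.
    lines = message.replace("\\n", "\n").splitlines()
    reasons = [ln.strip() for ln in lines if "panic:" in ln or "[FATA]" in ln]
    reason = reasons[0] if reasons else ""
    return match.group(1), int(match.group(2)), reason


if __name__ == "__main__":
    sample = (
        r"Service 'indexer' exited with status 2. Restarting. Messages:\n"
        r"panic: mkdir /data/idx: permission denied\n\ngoroutine 1 [running]:"
    )
    print(summarize_crash(sample))
```

Applied to the two entries above, this would surface the `mkdir /data/idx: permission denied` panic for the indexer and the `data directory does not exist` fatal line for fts.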
Ashwin Govindarajulu, I had a quick look into the logs, and they report that the FTS data directory does not exist (dataDir: /data/idx/@fts).
So this is expected behaviour for FTS, as there are no directory paths for it to boot from.
I am not sure how this happened in your test environment.
Can you please cross-check this and ensure that the boot directory paths are available for FTS?
Most likely the same issue is also causing the indexer restarts. (FTS shares the same higher-level directory tree with the indexer.)
Also, please note that GSI and FTS are entirely separate products, so two separate tickets may be needed the next time any such issue happens.
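The cross-check requested above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the two paths are copied from the indexer `-storageDir` and FTS `dataDir` values in the log messages, the helper name `check_data_dirs` is hypothetical, and the script would need to run as the same OS user as the Couchbase services for the writability result to be meaningful.

```python
import os

# Paths taken from the crash messages in this ticket:
# indexer -storageDir and the FTS dataDir.
INDEX_DIRS = ["/data/idx/@2i", "/data/idx/@fts"]


def check_data_dirs(paths):
    """Return {path: status} where status is 'ok', 'missing', or 'not_writable'."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = "missing"
        elif not os.access(path, os.W_OK | os.X_OK):
            # The directory exists but the current user cannot create files in it.
            report[path] = "not_writable"
        else:
            report[path] = "ok"
    return report


if __name__ == "__main__":
    for path, status in check_data_dirs(INDEX_DIRS).items():
        print(f"{path}: {status}")
```

A `missing` result for `/data/idx/@fts` would be consistent with the FTS fatal message, and a missing or unwritable `/data/idx` parent would be consistent with the indexer's `mkdir` panic.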