Couchbase Server: MB-32547

Swap FTS node rebalance failed (boltdb panic)


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 6.0.0
    • Fix Version: 6.5.0
    • Component: fts
    • Triage: Untriaged
    • Is this a Regression?: Unknown

    Description

      Build 6.0.0-1693

      Observed that a swap rebalance of an FTS node in the high-bucket-density test failed with the following error:

      Rebalance exited with reason {service_rebalance_failed,fts, {rebalance_failed, {service_error, <<"nodes: sample, res: (*http.Response)(nil), urlUUID: monitor.UrlUUID{Url:\"http://172.23.96.20:8094\", UUID:\"050b9271854fba4a40721562e160e9be\"}, kind: /api/stats, err: Get http://%40fts-cbauth:51f01258286dc0e326539b393b07c7c1@172.23.96.20:8094/api/stats: EOF">>}}}
      

      The setup has 30 buckets, 1 FTS index per bucket, and no FTS queries.

      Tried the swap rebalance again on the same setup and got the same error.
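      The failing call is the rebalance monitor's periodic GET of each node's /api/stats, which aborts the rebalance on the first EOF. As a rough illustration only (a hypothetical sketch in Go, not the actual fts ctl/monitor code), a retry wrapper that treats io.EOF as transient would ride out a briefly unreachable node instead of failing the whole topology change:

      ```go
      package main

      import (
      	"errors"
      	"fmt"
      	"io"
      	"time"
      )

      // retryTransient retries op a few times when it fails with a transient
      // error such as io.EOF (e.g. a stats GET hitting a node whose connection
      // drops mid-rebalance), and fails immediately on any other error.
      // Hypothetical helper for illustration; not Couchbase code.
      func retryTransient(attempts int, backoff time.Duration, op func() error) error {
      	var err error
      	for i := 0; i < attempts; i++ {
      		if err = op(); err == nil {
      			return nil
      		}
      		// Only io.EOF is treated as transient in this sketch.
      		if !errors.Is(err, io.EOF) {
      			return err
      		}
      		time.Sleep(backoff)
      	}
      	return err
      }

      func main() {
      	calls := 0
      	err := retryTransient(5, 0, func() error {
      		calls++
      		if calls < 3 {
      			return io.EOF // fail twice with a transient error
      		}
      		return nil
      	})
      	fmt.Println(calls, err) // prints: 3 <nil>
      }
      ```

      The point of the sketch: a single dropped connection to a node that is being ejected need not be fatal if the next poll succeeds.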

      Logs-
      FTS node which is rebalanced out- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.96.20.zip
      FTS node which is rebalanced in- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.97.15.zip
      Analytics node- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.96.23.zip
      Eventing node- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.97.177.zip
      Index+query node- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.97.19.zip
      Index+query node- https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-arke-multi-bucket-250/172.23.97.20.zip
      KV node- https://s3.amazonaws.com/bugdb/jira/30_bucket_reb_logs/collectinfo-2019-01-09T041824-ns_1%40172.23.97.12.zip
      KV node- https://s3.amazonaws.com/bugdb/jira/30_bucket_reb_logs/collectinfo-2019-01-09T041824-ns_1%40172.23.97.13.zip
      KV node- https://s3.amazonaws.com/bugdb/jira/30_bucket_reb_logs/collectinfo-2019-01-09T041824-ns_1%40172.23.97.14.zip

      Attachments


        Activity

          mahesh.mandhare Mahesh Mandhare (Inactive) added a comment (edited):

          Closing this issue as we did not see a panic in the FTS logs in the last rebalance failure comment.

          The REST API issue is tracked in ticket MB-34931.

          Will reopen this if we see the boltdb panic again.
          steve Steve Yen added a comment:

          > I see this MB-32547 ticket is against 6.0.0? Which is Alice?

          Whoops, that's wrong. The latest cbcollect-infos from Mahesh Mandhare were taken with 6.5.0, so the REST API resilience checks should already be in the 6.5.0 codepaths.

          I've spawned off a separate ticket, MB-34931, to reduce confusion (at least my confusion!).
          steve Steve Yen added a comment:

          I see this MB-32547 ticket is against 6.0.0? Which is Alice?

          One related thing: I recall Sreekanth Sivasankaran had added (I think after Alice shipped) some more resilience codepaths for when FTS receives errors while trying to poll REST /api/stats across the cluster... Related: MB-31258 / http://review.couchbase.org/99672

          If so, this might have been fixed already in Mad-Hatter?

          Assigning to Sreekanth Sivasankaran to get his sanity check on that.
          steve Steve Yen added a comment (edited):

          From the latest *.log files, I'm not seeing any more panic()s, so the original boltdb issue from a while back looks fixed. This might deserve spawning off a new MB ticket to make these issues easier to track.

          From the recent ns_server.fts.log on node *.14, here are some log lines from around the 2019-07-04T10:19:33 timeframe, surrounding the error "/api/stats, err: Get http://%40fts-cbauth:***@172.23.96.20:8094/api/stats: EOF". The log file for node *.20, by the way, doesn't have any useful fts log.

          2019-07-04T10:19:32.631-07:00 [INFO] ctl/manager: GetTaskList, haveTasksRev: 4718, changed, rv: &{Rev:[52 55 50 48] Tasks:[{Rev:[52 55 49 57] ID:rebalance:c7f7247e99cc3d24c3b37\
          3256000b310 Type:task-rebalance Status:task-running IsCancelable:true Progress:1.4555555555555555 DetailedProgress:map[] Description:topology change ErrorMessage: Extra:map[top\
          ologyChange:{ID:c7f7247e99cc3d24c3b373256000b310 CurrentTopologyRev:[] Type:topology-change-rebalance KeepNodes:[{NodeInfo:{NodeID:e982f22ef1a2276d471ce79e1491c68a Priority:0 O\
          paque:<nil>} RecoveryType:recovery-full}] EjectNodes:[{NodeID:57a702e0b3ebec5ca47f209844098537 Priority:0 Opaque:<nil>}]}]}]}
          2019-07-04T10:19:32.660-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-9-fts_less_words_23477b50fc00abbd_6ddbfb54-39d3da9c, worker, looping beg,\
           vbucketState: "running" (has 169 vbuckets), 855-1023
          2019-07-04T10:19:32.740-07:00 [INFO] janitor: pindexes to remove: 0
          2019-07-04T10:19:32.740-07:00 [INFO] janitor: pindexes to add: 0
          2019-07-04T10:19:32.740-07:00 [INFO] janitor: pindexes to restart: 0
          2019-07-04T10:19:33.033-07:00 [INFO] ctl/manager: revNum: 4721, progress: 1.455556
          2019-07-04T10:19:33.033-07:00 [INFO] ctl/manager: GetTaskList, haveTasksRev: 4720, changed, rv: &{Rev:[52 55 50 50] Tasks:[{Rev:[52 55 50 49] ID:rebalance:c7f7247e99cc3d24c3b37\
          3256000b310 Type:task-rebalance Status:task-running IsCancelable:true Progress:1.4555555555555555 DetailedProgress:map[] Description:topology change ErrorMessage: Extra:map[top\
          ologyChange:{ID:c7f7247e99cc3d24c3b373256000b310 CurrentTopologyRev:[] Type:topology-change-rebalance KeepNodes:[{NodeInfo:{NodeID:e982f22ef1a2276d471ce79e1491c68a Priority:0 O\
          paque:<nil>} RecoveryType:recovery-full}] EjectNodes:[{NodeID:57a702e0b3ebec5ca47f209844098537 Priority:0 Opaque:<nil>}]}]}]}
          2019-07-04T10:19:33.041-07:00 [WARN] ctl: ReportProgress, err: nodes: sample, res: (*http.Response)(nil), urlUUID: monitor.UrlUUID{Url:"http://172.23.96.20:8094", UUID:"57a702e\
          0b3ebec5ca47f209844098537"}, kind: /api/stats, err: Get http://%40fts-cbauth:***@172.23.96.20:8094/api/stats: EOF -- ctl.(*Ctl).startCtlLOCKED.func2.2() at ctl.go:807
          2019-07-04T10:19:33.181-07:00 [INFO] janitor: feeds to remove: 0
          2019-07-04T10:19:33.181-07:00 [INFO] janitor: feeds to add: 0
          2019-07-04T10:19:33.181-07:00 [INFO] janitor: awakes, op: kick, msg: cfg changed, key: planPIndexes
          2019-07-04T10:19:33.305-07:00 [INFO] cbdatasource: server: 172.23.97.12:11210, uprOpenName: fts:bucket-13-fts_less_words_6cd3c04d7bcc992e_13aa53f3-62873057, worker, looping beg\
          , vbucketState: "running" (has 87 vbuckets), 0-86
          2019-07-04T10:19:33.314-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-13-fts_less_words_6cd3c04d7bcc992e_13aa53f3-62873057, worker, looping beg\
          , vbucketState: "running" (has 84 vbuckets), 87-170
          2019-07-04T10:19:33.508-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-13-fts_less_words_6cd3c04d7bcc992e_6ddbfb54-55928503, worker, looping beg\
          , vbucketState: "running" (has 169 vbuckets), 855-1023
          2019-07-04T10:19:33.606-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-5-fts_less_words_7d34a0ae348705c4_f4e0a48a-7ddfe2f3, worker, looping beg,\
           vbucketState: "running" (has 86 vbuckets), 256-341
          2019-07-04T10:19:33.606-07:00 [INFO] cbdatasource: server: 172.23.97.12:11210, uprOpenName: fts:bucket-5-fts_less_words_7d34a0ae348705c4_f4e0a48a-7ddfe2f3, worker, looping beg,\
           vbucketState: "running" (has 85 vbuckets), 171-255
          2019-07-04T10:19:33.896-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-14-fts_more_words_7c1051b3fbc3ea31_13aa53f3-373bc0bc, worker, looping beg\
          , vbucketState: "running" (has 84 vbuckets), 87-170
          2019-07-04T10:19:33.896-07:00 [INFO] cbdatasource: server: 172.23.97.12:11210, uprOpenName: fts:bucket-14-fts_more_words_7c1051b3fbc3ea31_13aa53f3-373bc0bc, worker, looping beg\
          , vbucketState: "running" (has 87 vbuckets), 0-86
          2019-07-04T10:19:34.047-07:00 [INFO] ctl/manager: revNum: 4723, progress: 0.000000
          2019-07-04T10:19:34.047-07:00 [INFO] ctl/manager: GetTaskList, haveTasksRev: 4722, changed, rv: &{Rev:[52 55 50 52] Tasks:[{Rev:[52 55 50 51] ID:rebalance:c7f7247e99cc3d24c3b37\
          3256000b310 Type:task-rebalance Status:task-failed IsCancelable:true Progress:0 DetailedProgress:map[] Description:topology change ErrorMessage:nodes: sample, res: (*http.Respo\
          nse)(nil), urlUUID: monitor.UrlUUID{Url:"http://172.23.96.20:8094", UUID:"57a702e0b3ebec5ca47f209844098537"}, kind: /api/stats, err: Get http://%40fts-cbauth:***@172.23.96.20:8\
          094/api/stats: EOF Extra:map[topologyChange:{ID:c7f7247e99cc3d24c3b373256000b310 CurrentTopologyRev:[] Type:topology-change-rebalance KeepNodes:[{NodeInfo:{NodeID:e982f22ef1a22\
          76d471ce79e1491c68a Priority:0 Opaque:<nil>} RecoveryType:recovery-full}] EjectNodes:[{NodeID:57a702e0b3ebec5ca47f209844098537 Priority:0 Opaque:<nil>}]}]}]}
          2019-07-04T10:19:34.047-07:00 [INFO] ctl/manager: GetCurrentTopology, haveTopologyRev: 6, changed, rv: &{Rev:[55] Nodes:[57a702e0b3ebec5ca47f209844098537 e982f22ef1a2276d471ce7\
          9e1491c68a] IsBalanced:false Messages:[error: nodes: sample, res: (*http.Response)(nil), urlUUID: monitor.UrlUUID{Url:"http://172.23.96.20:8094", UUID:"57a702e0b3ebec5ca47f2098\
          44098537"}, kind: /api/stats, err: Get http://%40fts-cbauth:***@172.23.96.20:8094/api/stats: EOF]}
          2019-07-04T10:19:34.054-07:00 [INFO] ctl/manager: CancelTask, taskId: rebalance:c7f7247e99cc3d24c3b373256000b310, taskRev: 
          2019-07-04T10:19:34.054-07:00 [INFO] ctl/manager: stop taskHandle, ctlTopology.Rev: 6
          2019-07-04T10:19:34.054-07:00 [INFO] ctl/manager: CancelTask, taskId: rebalance:c7f7247e99cc3d24c3b373256000b310, taskRev: [], done
          2019-07-04T10:19:34.054-07:00 [INFO] ctl/manager: GetTaskList, haveTasksRev: 4724, changed, rv: &{Rev:[52 55 50 53] Tasks:[]}
          2019-07-04T10:19:34.222-07:00 [INFO] cbdatasource: server: 172.23.97.13:11210, uprOpenName: fts:bucket-14-fts_more_words_7c1051b3fbc3ea31_18572d87-5cd49993, worker, looping beg\
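
          For triaging large cbcollect bundles like the ones attached here, a small filter that pulls out only the [WARN] ctl lines mentioning /api/stats can save scrolling through thousands of INFO lines. This is a hypothetical Go helper written for this ticket's log format, not part of Couchbase:

          ```go
          package main

          import (
          	"bufio"
          	"fmt"
          	"strings"
          )

          // grepStatsErrors scans fts log text and returns the [WARN] lines that
          // mention /api/stats, i.e. the monitor errors seen during the failed
          // rebalance. Hypothetical triage helper, not product code.
          func grepStatsErrors(log string) []string {
          	var hits []string
          	sc := bufio.NewScanner(strings.NewReader(log))
          	for sc.Scan() {
          		line := sc.Text()
          		if strings.Contains(line, "[WARN]") && strings.Contains(line, "/api/stats") {
          			hits = append(hits, line)
          		}
          	}
          	return hits
          }

          func main() {
          	sample := `2019-07-04T10:19:32.740-07:00 [INFO] janitor: pindexes to remove: 0
          2019-07-04T10:19:33.041-07:00 [WARN] ctl: ReportProgress, err: ... kind: /api/stats, err: Get http://...:8094/api/stats: EOF
          2019-07-04T10:19:33.181-07:00 [INFO] janitor: feeds to add: 0`
          	for _, line := range grepStatsErrors(sample) {
          		fmt.Println(line) // only the [WARN] /api/stats line is printed
          	}
          }
          ```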
          


          mahesh.mandhare Mahesh Mandhare (Inactive) added a comment:

          Build 6.5.0-3633

          Observed that in the high-bucket-density tests, the FTS swap rebalance failed again with the same error:

          Rebalance exited with reason {service_rebalance_failed,fts,
          {rebalance_failed,
          {service_error,
          <<"nodes: sample, res: (*http.Response)(nil), urlUUID: monitor.UrlUUID{Url:\"http://172.23.96.20:8094\", UUID:\"57a702e0b3ebec5ca47f209844098537\"}, kind: /api/stats, err: Get http://%40fts-cbauth:***@172.23.96.20:8094/api/stats: EOF">>}}}.
          Rebalance Operation Id = fb7f73c7f76bd2f87888891e0aa3a299

          Logs-

          FTS node coming in- https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.14.zip

          FTS node going out- https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.96.20.zip

          Other nodes-
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.96.23.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.12.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.13.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.15.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.177.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.19.zip
          https://s3.amazonaws.com/bugdb/jira/fts_swap_failure/collectinfo-2019-07-05T030032-ns_1%40172.23.97.20.zip
           


          People

            Assignee: Sreekanth Sivasankaran
            Reporter: mahesh.mandhare Mahesh Mandhare (Inactive)
            Votes: 0
            Watchers: 3


              Gerrit Reviews

                There are no open Gerrit changes
