Couchbase Server
MB-45232

Failed to add node back into cluster after node_failover and add back case


Details

    Description

      Build: 7.0.0-4735

      Scenario:

      • 4 node cluster, 2 Couchbase buckets

        +---------------+----------+-------+------------+------------+------------------------+-------------------+
        | Node          | Services |  CPU  | Mem_total  | Mem_free   | Swap_mem_used          | Active / Replica  |
        +---------------+----------+-------+------------+------------+------------------------+-------------------+
        | 172.23.100.13 | kv       | 58.53 | 4201684992 | 3464732672 | 91488256 / 3758092288  | 3781 / 7529       |
        | 172.23.100.12 | kv       | 58.72 | 4201689088 | 3418853376 | 110215168 / 3758092288 | 3781 / 7800       |
        | 172.23.100.18 | kv       | 59.69 | 4201684992 | 3461459968 | 102236160 / 3758092288 | 3783 / 7491       |
        | 172.23.100.19 | kv       | 59.50 | 4201684992 | 3445219328 | 115343360 / 3758092288 | 3785 / 7438       |
        +---------------+----------+-------+------------+------------+------------------------+-------------------+

        +--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        | Bucket       | Type      | Replicas | Durability | TTL | Items | RAM Quota  | RAM Used  | Disk Used |
        +--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        | couchstore.0 | couchbase | 2        | none       | 0   | 7235  | 2986344448 | 98909336  | 424106110 |
        | couchstore.1 | couchbase | 2        | none       | 0   | 7895  | 2986344448 | 101506488 | 513393138 |
        +--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        

      • Load the initial 10K docs into both buckets through transactions
      • Failover node 172.23.100.12 (Id 02b2ab7412476b02141e1cff571d37ad, success)
      • Failover node 172.23.100.18 (Id 6e28ce7a0762769cd03679e696ebd2de, success)
      • Eject .12 and .18 using a rebalance-out operation (Id b97763b901d7c36e43737d51d7533a7f)
      • Try adding back the .12 node (see the REST sketch after this list)
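
      A minimal sketch of the REST calls this flow boils down to. The endpoints are the standard ns_server REST API; the credentials, hosts and helper names are illustrative assumptions, not taken from the test:

        # Sketch of the failover / rebalance-out / add-back flow via the
        # standard ns_server REST API. Credentials and hosts are placeholders.
        import requests

        AUTH = ("Administrator", "password")      # assumed credentials
        ORCH = "http://172.23.100.13:8091"        # any surviving cluster node

        def failover(otp_node):
            # Hard failover of a node, e.g. otp_node = "ns_1@172.23.100.12"
            requests.post(ORCH + "/controller/failOver",
                          data={"otpNode": otp_node},
                          auth=AUTH).raise_for_status()

        def rebalance_out(known_nodes, ejected_nodes):
            # Rebalance with the failed-over nodes listed for ejection
            requests.post(ORCH + "/controller/rebalance",
                          data={"knownNodes": ",".join(known_nodes),
                                "ejectedNodes": ",".join(ejected_nodes)},
                          auth=AUTH).raise_for_status()

        def add_back(hostname):
            # Re-add the ejected node; this is the call that failed here
            requests.post(ORCH + "/controller/addNode",
                          data={"hostname": hostname, "user": AUTH[0],
                                "password": AUTH[1], "services": "kv"},
                          auth=AUTH).raise_for_status()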

      Observation:

      Adding back node 172.23.100.12 failed with the following reason:

      Failed to connect to http://172.23.100.12:8091. Could not connect to "172.23.100.12" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers.

      172.23.100.13 - ns_server.info.log

      [user:info,2021-03-24T07:03:33.272-07:00,ns_1@172.23.100.13:<0.29517.0>:ns_cluster:add_node_to_group:89]Failed to add node 172.23.100.12:8091 to cluster. Failed to connect to http://172.23.100.12:8091. Could not connect to "172.23.100.12" on port 8091.  This could be due to an incorrect host/port combination or a firewall in place between the servers.

      172.23.100.12 - ns_server.info.log

      [ns_server:info,2021-03-24T07:03:33.022-07:00,ns_1@172.23.100.12:ns_couchdb_port<0.22809.1>:ns_port_server:log:224]ns_couchdb<0.22809.1>: 2021-03-24 07:03:32.818687
      ns_couchdb<0.22809.1>:     args: ['ns_1@172.23.100.12']
      ns_couchdb<0.22809.1>:     format: "** Connection attempt from disallowed node ~w ** ~n"
      ns_couchdb<0.22809.1>:     label: {error_logger,error_msg}
      [ns_server:warn,2021-03-24T07:03:33.023-07:00,ns_1@172.23.100.12:memcached_refresh<0.22736.1>:ns_memcached:connect:1207]Unable to connect: {error,{badmatch,[{inet,{error,econnrefused}}]}}.
      [ns_server:info,2021-03-24T07:03:33.228-07:00,ns_1@172.23.100.12:ns_couchdb_port<0.22809.1>:ns_port_server:log:224]ns_couchdb<0.22809.1>: 2021-03-24 07:03:33.027905
      ns_couchdb<0.22809.1>:     args: ['ns_1@172.23.100.12']
      ns_couchdb<0.22809.1>:     format: "** Connection attempt from disallowed node ~w ** ~n"
      ns_couchdb<0.22809.1>:     label: {error_logger,error_msg}
      

      TAF test:

      rebalance_new.swaprebalancetests.SwapRebalanceFailedTests:
          test_add_back_failed_node,doc_size=512,standard_buckets=2,transaction_timeout=150,swap-orchestrator=True,num-swap=2,atomicity=True,replicas=2,durability=PERSIST_TO_MAJORITY,nodes_init=4,num_items=10000

      Attachments

        Issue Links


          Activity

            meni.hillel Meni Hillel (Inactive) added a comment - This just happens to be one concrete use case related to MB-43381. So probably a DUP, but since it contains logs and info, we'll leave it open to be verified once the other ticket is addressed.
            meni.hillel Meni Hillel (Inactive) added a comment - DUP to MB-45232

            meni.hillel Meni Hillel (Inactive) added a comment - Correction: DUP to MB-45016 due to the similarities: we've actually removed the orchestrator node and Chronicle does not detect that until a bit later.

            Specific note from Dave on that ticket: "The asynchronous realization on the part of the node being removed that it has actually been removed is by design. It helps us avoid races where the node being removed may be responsible for removing itself from the chronicle config on the cluster from which it's departing. However, perhaps we can do better than 20s."
            dfinlay Dave Finlay added a comment -

            Thanks Meni - yes, this is a race. The test attempts to re-add the node before it's ready, due to the delay in the node recognizing that it's been removed from the cluster. The following is from an email exchange between Meni and me detailing what happened in this ticket:

            The orchestrator is leaving the cluster: .12 is removing itself along with .18.

            2021-03-24T07:03:07.646-07:00, ns_orchestrator:0:info:message(ns_1@172.23.100.12) - Starting rebalance, KeepNodes = ['ns_1@172.23.100.13','ns_1@172.23.100.19'], EjectNodes = [], Failed over and being ejected nodes = ['ns_1@172.23.100.12','ns_1@172.23.100.18']; no delta recovery nodes; Operation Id = b97763b901d7c36e43737d51d7533a7f
            ...
            2021-03-24T07:03:19.662-07:00, ns_orchestrator:0:info:message(ns_1@172.23.100.12) - Rebalance completed successfully.
            Rebalance Operation Id = b97763b901d7c36e43737d51d7533a7f
            

            However .12 doesn’t actually leave (i.e. clean itself up) until 13 seconds after the rebalance:

            2021-03-24T07:03:31.879-07:00, ns_cluster:1:info:message(ns_1@172.23.100.12) - Node 'ns_1@172.23.100.12' is leaving cluster.
            2021-03-24T07:03:31.993-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.100.13) - Node 'ns_1@172.23.100.13' saw that node 'ns_1@172.23.100.12' went down. Details: [{nodedown_reason, connection_closed}]
            2021-03-24T07:03:31.995-07:00, ns_node_disco:5:warning:node down(ns_1@172.23.100.19) - Node 'ns_1@172.23.100.19' saw that node 'ns_1@172.23.100.12' went down. Details: [{nodedown_reason, connection_closed}]
            

            During these 13 seconds there's quite a lot of chaos in the ns_server logs as ns_config configuration starts to disappear.

            Then finally the wipe happens: the node removes itself, cleans itself up and starts the web server:

            [ns_server:info,2021-03-24T07:03:34.300-07:00,ns_1@172.23.100.12:<0.22943.1>:menelaus_web:maybe_start_http_server:115]Started web service with [{ip,"0.0.0.0"},{port,8091}]
            

            Meanwhile, the node addition was attempted 1 second before the web server came up:

            [user:info,2021-03-24T07:03:33.272-07:00,ns_1@172.23.100.13:<0.29517.0>:ns_cluster:add_node_to_group:89]Failed to add node 172.23.100.12:8091 to cluster. Failed to connect to http://172.23.100.12:8091. Could not connect to "172.23.100.12" on port 8091.  This could be due to an incorrect host/port combination or a firewall in place between the servers.
            

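            As a test-side mitigation, one could wait until the ejected node's web server is reachable again before retrying the add-back. A minimal sketch, assuming the standard /pools endpoint and an illustrative timeout (not taken from the ticket or the test):

            # Sketch: poll the ejected node's REST port (8091) until the web
            # server answers, then proceed with the add-back. The timeout value
            # and helper name are illustrative assumptions.
            import time
            import requests

            def wait_for_rest(host, port=8091, timeout_s=120):
                url = "http://%s:%d/pools" % (host, port)
                deadline = time.time() + timeout_s
                while time.time() < deadline:
                    try:
                        # Either 200 or 401 means the web server is up and listening
                        if requests.get(url, timeout=5).status_code in (200, 401):
                            return True
                    except requests.RequestException:
                        pass
                    time.sleep(2)
                return False

            # Usage before the add-back attempt:
            # assert wait_for_rest("172.23.100.12"), "node web server never came back"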

            ashwin.govindarajulu Ashwin Govindarajulu added a comment - Closing since this is a duplicate.

            People

              ashwin.govindarajulu Ashwin Govindarajulu