Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
7.6.0
-
Enterprise Edition 7.6.0 build 1668
-
Untriaged
-
Centos 64-bit
-
0
-
Yes
-
KV 2023-4
Description
Script to Repro
./sequoia -client 172.23.104.168:2375 -provider file:centos_second_cluster.yml -test tests/integration/7.6/test_7.6.yml -scope tests/integration/7.6/scope_7.6_magma.yml -scale 1 -repeat 0 -log_level 0 -version 7.6.0-1668 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=0 -show_topology=true
We were at the beginning of the longevity test, loading data and rebalancing out a KV node (172.23.96.14).
[2023-10-19T21:46:13-07:00, sequoiatools/pillowfight:7.0:4dd4d3] -U couchbase://172.23.97.74/default?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:18-07:00, sequoiatools/pillowfight:7.0:f6f552] -U couchbase://172.23.97.74/WAREHOUSE?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:23-07:00, sequoiatools/pillowfight:7.0:f1b5b9] -U couchbase://172.23.97.74/NEW_ORDER?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:28-07:00, sequoiatools/pillowfight:7.0:b660e3] -U couchbase://172.23.97.74/ITEM?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:33-07:00, sequoiatools/pillowfight:7.0:444cf7] -U couchbase://172.23.97.74/bucket4?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:38-07:00, sequoiatools/pillowfight:7.0:7a9690] -U couchbase://172.23.97.74/bucket5?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:43-07:00, sequoiatools/pillowfight:7.0:2b4968] -U couchbase://172.23.97.74/bucket6?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:48-07:00, sequoiatools/pillowfight:7.0:2e3d8a] -U couchbase://172.23.97.74/bucket7?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:53-07:00, sequoiatools/pillowfight:7.0:6d3d0f] -U couchbase://172.23.97.74/bucket8?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/pillowfight:7.0
[2023-10-19T21:46:58-07:00, sequoiatools/pillowfight:7.0:3dcc91] -U couchbase://172.23.97.74/bucket9?select_bucket=true -M 512 -I 2000 -B 200 -t 1 --rate-limit 1000 -P password --durability majority -c -1 --json
[pull] sequoiatools/cmd
[2023-10-19T21:47:03-07:00, sequoiatools/cmd:2da9d5] 600
[pull] sequoiatools/couchbase-cli:7.1
[2023-10-19T21:57:31-07:00, sequoiatools/couchbase-cli:7.1:58a049] rebalance -c 172.23.97.74:8091 --server-remove 172.23.96.14:8091 -u Administrator -p password
The rebalance hung for almost 2 hours. On digging deeper, I saw the following CRITICAL messages.
172.23.96.122
[root@localhost logs]# grep -a CRITICAL memcached.log.000* | head
memcached.log.000003.txt:2023-10-19T22:07:29.899227-07:00 CRITICAL [(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:29.999379-07:00 CRITICAL (WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.000168-07:00 CRITICAL [(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.100288-07:00 CRITICAL (WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.101205-07:00 CRITICAL [(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.201452-07:00 CRITICAL (WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.202160-07:00 CRITICAL [(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.302495-07:00 CRITICAL (WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.303212-07:00 CRITICAL [(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
memcached.log.000003.txt:2023-10-19T22:07:30.403320-07:00 CRITICAL (WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: Duplicate document detected in batch with history mode disabled. Doc:00000000000000000215
[root@localhost logs]#
cbcollect_info is attached. This was not seen in the longevity run we had on 7.6.0-1601.
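For triage, the CRITICAL lines above can be tallied per bucket, vBucket and document key rather than read through `head`. The following is a minimal sketch, not part of the test framework: the regular expressions are written only against the two message shapes quoted in this ticket, and the names `summarize`, `LINE_RE`, `VB_RE` and `SAMPLE` are hypothetical. Other memcached.log lines are assumed to simply not match.

```python
import re
from collections import Counter

# Matches the two CRITICAL message shapes quoted in this ticket.
# Assumption: the bucket name is the first parenthesised token after CRITICAL.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+[+-]\d{2}:\d{2}) CRITICAL "
    r".*?\((?P<bucket>[^)]+)\)"
    r".*?Duplicate document detected in batch with history mode disabled\. "
    r"Doc:(?P<doc>\S+)"
)
# Only the MagmaKVStore::saveDocs lines carry a vBucket id.
VB_RE = re.compile(r"\bvb:(\d+)")


def summarize(lines):
    """Return (counts per (bucket, doc), set of vBucket ids, first timestamp)."""
    counts = Counter()
    vbuckets = set()
    first_seen = None
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not one of the duplicate-document messages
        counts[(m["bucket"], m["doc"])] += 1
        vb = VB_RE.search(line)
        if vb is not None:
            vbuckets.add(int(vb.group(1)))
        if first_seen is None:
            first_seen = m["ts"]  # first match in input order
    return counts, vbuckets, first_seen


# Two lines copied from the grep output in this ticket.
SAMPLE = [
    "memcached.log.000003.txt:2023-10-19T22:07:29.899227-07:00 CRITICAL "
    "[(WAREHOUSE) magma_0]Fatal error: Duplicate document detected in batch "
    "with history mode disabled. Doc:00000000000000000215",
    "memcached.log.000003.txt:2023-10-19T22:07:29.999379-07:00 CRITICAL "
    "(WAREHOUSE) MagmaKVStore::saveDocs vb:48 WriteDocs failed. Status:Invalid: "
    "Duplicate document detected in batch with history mode disabled. "
    "Doc:00000000000000000215",
]

if __name__ == "__main__":
    counts, vbuckets, first_seen = summarize(SAMPLE)
    for (bucket, doc), n in counts.items():
        print(bucket, doc, n)
    print("vbuckets:", sorted(vbuckets), "first seen:", first_seen)
```

To run it over the real logs, the list passed to `summarize` could be replaced with the lines of the `grep -a CRITICAL memcached.log.000*` output, which would show whether the failure is confined to one key and one vBucket or is more widespread.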