Details
Bug
Status: Closed
Major
Resolution: Fixed
Cheshire-Cat
7.0.0-4908
Untriaged
1
Unknown
Magma 2021-April, May 20
Description
Steps:
1. Create a 4-node cluster.
2. Create a magma bucket and load 5M items.
3. Fill up the disk, leaving 200 MB free in the data path on every node.
4. Increase the replica count to 3 and rebalance.
5. The rebalance hangs because the disk is full on all the nodes.
6. Stop the rebalance operation and read the data.
7. Update the bucket replica count to 0 and rebalance. The rebalance completes.
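Step 3 can be sketched roughly as follows. This is only an illustration, not the helper the QE test uses: the function names, the filler file name, and the chunk size are all hypothetical, and it assumes the process may write to the data path.

```python
import os
import shutil

MB = 1024 * 1024


def bytes_to_write(free_bytes, leave_free_bytes):
    """Return how many bytes must be written so that only
    leave_free_bytes remain free (never negative)."""
    return max(0, free_bytes - leave_free_bytes)


def fill_disk(data_path, leave_free_bytes=200 * MB, chunk_bytes=64 * MB):
    """Write a filler file under data_path until roughly
    leave_free_bytes of free space remain. Returns the filler path.

    Hypothetical helper: the file name and chunk size are assumptions.
    """
    filler = os.path.join(data_path, "disk_filler.bin")
    free_now = shutil.disk_usage(data_path).free
    remaining = bytes_to_write(free_now, leave_free_bytes)
    block = b"\0" * chunk_bytes
    with open(filler, "wb") as f:
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            f.write(block[:n])
            remaining -= n
        # Force the data to disk so the free-space figure is accurate.
        f.flush()
        os.fsync(f.fileno())
    return filler
```

Deleting the filler file afterwards frees the space again, which is convenient when the same nodes are reused for later tests.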
The following messages were observed in the logs while the rebalance was hung in step 5:
2021-04-14T04:04:37.411234-07:00 CRITICAL (default) (default) magma_0 MagmaKVStore::prepareToDeleteImpl vb:784 GetKVStoreRevision failed. Status:NotExists: KVStore ID:784 does not exist
2021-04-14T04:04:37.489561-07:00 CRITICAL (default) (default) magma_7 MagmaKVStore::prepareToDeleteImpl vb:263 GetKVStoreRevision failed. Status:NotExists: KVStore ID:263 does not exist
2021-04-14T04:04:37.502326-07:00 CRITICAL (default) (default) magma_3 MagmaKVStore::prepareToDeleteImpl vb:171 GetKVStoreRevision failed. Status:NotExists: KVStore ID:171 does not exist
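To cross-check which vBuckets are affected, the messages above can be collected with a small script. This is a minimal sketch that assumes every relevant line has exactly the layout shown in this ticket; the regular expression and the function name are illustrative and not part of any Couchbase tooling.

```python
import re

# Pattern derived only from the three log lines quoted in this ticket.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) CRITICAL .*?(?P<tag>magma_\d+) "
    r"MagmaKVStore::prepareToDeleteImpl vb:(?P<vb>\d+) "
    r"GetKVStoreRevision failed\. Status:(?P<status>\w+)"
)


def missing_kvstores(lines):
    """Map vBucket ID -> magma_N tag for every NotExists failure."""
    found = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("status") == "NotExists":
            found[int(m.group("vb"))] = m.group("tag")
    return found


sample = [
    "2021-04-14T04:04:37.411234-07:00 CRITICAL (default) (default) magma_0 "
    "MagmaKVStore::prepareToDeleteImpl vb:784 GetKVStoreRevision failed. "
    "Status:NotExists: KVStore ID:784 does not exist",
    "2021-04-14T04:04:37.489561-07:00 CRITICAL (default) (default) magma_7 "
    "MagmaKVStore::prepareToDeleteImpl vb:263 GetKVStoreRevision failed. "
    "Status:NotExists: KVStore ID:263 does not exist",
    "2021-04-14T04:04:37.502326-07:00 CRITICAL (default) (default) magma_3 "
    "MagmaKVStore::prepareToDeleteImpl vb:171 GetKVStoreRevision failed. "
    "Status:NotExists: KVStore ID:171 does not exist",
]
result = missing_kvstores(sample)
print(sorted(result))
```

The resulting vBucket IDs can then be compared against the KVStore creation failures in the same node's logs.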
QE Test
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job4.ini sdk_timeout=60,bucket_eviction_policy=fullEviction,randomize_value=True,doc_size=1024,bucket_storage=magma,enable_dp=true -t magma.magma_disk_full.MagmaDiskFull.test_disk_full_on_increasing_replica,nodes_init=4,num_items=5000000,doc_size=4096,sdk_timeout=60,replicas=1,durability=majority,GROUP=P0'
Test Input params:
{'doc_size': '1024', 'conf_file': 'conf/magma/disk_full.conf', 'spec': 'disk_full', 'num_nodes': 4, 'rerun': False, 'GROUP': 'P0', 'enable_dp': 'true', 'sdk_timeout': '60', 'case_number': 17, 'cluster_name': 'magma_temp_job4', 'ini': '/tmp/magma_temp_job4.ini', 'replicas': '1', 'durability': 'majority', 'bucket_storage': 'magma', 'bucket_eviction_policy': 'fullEviction', 'logs_folder': '/data/workspace/magma_temp_job4/logs/testrunner-21-Apr-13_23-09-33/test_17', 'nodes_init': '4', 'num_items': '5000000', 'randomize_value': 'True'}
I don't think this is an issue. The KVStores reported as not existing failed to create because the disk was full, which is visible in the logs of the node.