Details
-
Bug
-
Resolution: Fixed
-
Test Blocker
-
Columnar 1.0.0
-
Columnar Edition 1.0.0 build 2190
-
Untriaged
-
0
-
Unknown
-
Analytics Sprint 45, Analytics Sprint 46
Description
During day 3 of the system test, the cluster appears to have become unusable and has not returned to a healthy state. It has been in this state for over an hour.
A couple of caveats -
There are a large number of Kafka collections. I'm not sure whether https://issues.couchbase.com/browse/MB-61350 is the cause, but Kafka ingestion happened around 2024-07-03T13:25:04.694+00:00, and the "cluster unusable" messages span from 2024-07-04T06:35:32.116 to 2024-07-04T07:45:13.268+00:00.
All the timestamps are from node-001. The cluster was fine for around 15 hours. In those 15 hours, 3 rebalances were triggered to go from 4 to 8, then 8 to 16, and finally 16 to 32 nodes. Because Kafka ingestion was slow, it is possible that the cluster went into a rebalance while the ingestion was still incomplete. I'm alluding to this comment made by Ali here
The workload is as follows -
Remote collections: 80 (50 million docs per collection)
Standalone collections: 50 (total count across these is 300 * 8 million; some collections have 8 million docs and others hold multiples of 8 million)
Kafka collections: 72 (10 million docs per collection)
Total doc count: about 7.12 billion documents (around 10.6 TB to 14 TB, assuming a doc size of approximately 1.5-2 KB)
Number of links: around 7 (3 remote + 2 Kafka + 2 external)
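As a sanity check, the totals can be re-derived from the per-collection counts listed above. The following is a minimal sketch, not part of the test harness: the function name `estimate_workload` is hypothetical, and it assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and a uniform doc size within the stated 1.5-2 KB range.

```python
# Per-collection figures copied from the workload description above.
REMOTE_DOCS = 80 * 50_000_000        # 80 remote collections, 50 million docs each
STANDALONE_DOCS = 300 * 8_000_000    # standalone collections: 300 * 8 million in total
KAFKA_DOCS = 72 * 10_000_000         # 72 Kafka collections, 10 million docs each


def estimate_workload(doc_size_kb_low=1.5, doc_size_kb_high=2.0):
    """Return (total_docs, low_tb, high_tb) for the listed workload.

    Assumption: decimal units, i.e. 1 KB = 1e3 bytes and 1 TB = 1e12 bytes.
    """
    total_docs = REMOTE_DOCS + STANDALONE_DOCS + KAFKA_DOCS
    low_tb = total_docs * doc_size_kb_low * 1e3 / 1e12
    high_tb = total_docs * doc_size_kb_high * 1e3 / 1e12
    return total_docs, low_tb, high_tb


if __name__ == "__main__":
    total, low, high = estimate_workload()
    print(f"total docs: {total:,}")
    print(f"estimated size: {low:.2f} TB to {high:.2f} TB")
```

The size range depends entirely on the assumed doc size, so it should be read as a rough estimate rather than a measured on-disk figure.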
There are other exceptions, such as NullPointerException and IllegalStateException, on other nodes. I'm not sure whether they are related to this issue, but I'll file separate tickets for them since they seem to occur at a different time.
cbcollect ->
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-001.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-002.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-003.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-004.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-005.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-006.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-007.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-008.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-009.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-010.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-011.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-012.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-013.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-014.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-015.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-016.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-017.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-018.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-019.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-020.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-021.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-022.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-023.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-024.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-025.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-026.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-027.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-028.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-029.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-030.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-031.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJul3/collectinfo-2024-07-04T074358-ns_1%40svc-da-node-032.b2yoytucmykunsrf.sandbox.nonprod-project-avengers.com.zip
Supportal snapshot -> http://supportal.couchbase.com/snapshot/b18e5645ce6e07c85b98850b3221e5e4::31
Issue Links
- is duplicated by
-
MB-62598 [System Test] java.lang.IllegalStateException: Read unexpected number of bytes error seen
- Closed