  1. Couchbase Server
  2. MB-60781

[Rebalance] : Rebalance exited with reason {service_rebalance_failed,fts,{agent_died,<35544.11597.3>,{lost_connection,{'ns_1@172.23.105.199',shutdown}}}}.


Details

    Description

      Steps to reproduce

      1. Created a magma bucket named "test_bucket" with expiry timeout = 1
      2. Loaded 10k documents into it, each with the following fields:
        1. content - a text field
        2. id - doc_id
        3. type - populated with "vector"
        4. vector - a vector of dimension 2048
      3. After that, mutated the documents continuously. At this point both expiry and mutations are in progress
      4. Created a vector search index for the bucket with 6 partitions and 1 replica - All the fields were indexed
      5. Ran queries continuously - Query attached
      6. Failed over a node and added it back
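
For anyone trying to reproduce this, the document shape from step 2 can be sketched as below. This is only an illustration of the four fields listed above: the helper names (`make_doc`, `make_docs`), the text placed in `content`, the `doc_<n>` id format, and the random vector values are all assumptions, and loading the documents into the bucket is left out.

```python
import random

VECTOR_DIM = 2048  # vector dimension from step 2


def make_doc(doc_id: int, rng: random.Random) -> dict:
    """Build one document with the four fields described in step 2.

    The field values are placeholders (assumptions), not the original data.
    """
    return {
        "content": f"sample text for document {doc_id}",  # a text field
        "id": f"doc_{doc_id}",                            # doc_id
        "type": "vector",                                 # always "vector"
        "vector": [rng.random() for _ in range(VECTOR_DIM)],
    }


def make_docs(count: int = 10_000, seed: int = 0):
    """Yield `count` documents; seeded so repeated runs produce the same data."""
    rng = random.Random(seed)
    for i in range(count):
        yield make_doc(i, rng)
```

Re-running `make_docs` with a different seed and writing the results back under the same ids would be one way to approximate the continuous mutations in step 3.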

      Rebalance fails:

      Rebalance exited with reason {service_rebalance_failed,fts,{agent_died,<35544.11597.3>,{lost_connection,{'ns_1@172.23.105.199',shutdown}}}}.

      After that, fts crashes continuously:

      Service 'fts' exited with status 1. Restarting. Messages:
      2024-02-13T23:45:24.033-08:00 [INFO] (GOCBCORE) Creating new dcp agent: &{UserAgent:fts:stats-test_bucket-13704593 BucketName:test_bucket SeedConfig:{HTTPAddrs:[127.0.0.1:8091] MemdAddrs:[] SRVRecord:<nil>} SecurityConfig:{UseTLS:false TLSRootCAProvider:0xc21780 NoTLSSeedNode:true Auth:0x2c5db00 AuthMechanisms:[]} CompressionConfig:{Enabled:false DisableDecompression:false MinSize:0 MinRatio:0} ConfigPollerConfig:{HTTPRedialPeriod:0s HTTPRetryDelay:0s HTTPMaxWait:0s CccpMaxWait:0s CccpPollPeriod:0s} IoConfig:{NetworkType:default UseMutationTokens:false UseDurations:false UseOutOfOrderResponses:false DisableXErrorHello:false DisableJSONHello:false DisableSyncReplicationHello:false EnablePITRHello:false UseCollections:true} KVConfig:{ConnectTimeout:7s ServerWaitBackoff:0s PoolSize:0 MaxQueueSize:0 ConnectionBufferSize:20971520} HTTPConfig:{MaxIdleConns:300 MaxIdleConnsPerHost:100 ConnectTimeout:1m0s IdleConnectionTimeout:0s} DCPConfig:{AgentPriority:0 UseExpiryOpcode:false UseStreamID:true UseOSOBackfill:true BackfillOrder:1 BufferSize:20971520 DisableBufferAcknowledgement:false}}
      2024-02-13T23:45:24.032-08:00 [FATA] scorch AsyncError, path: /opt/couchbase/var/lib/couchbase/data/@fts/test_bucket._default.test_index_59b164db4d1a8905_aa574717.pindex/store, treating this as fatal, err: merging err: merging failed: Error in faiss::Index* faiss::read_index(IOReader*, int) at /home/couchbase/jenkins/workspace/couchbase-server-unix/faiss/faiss/impl/index_read.cpp:1025: Index type 0x00000001 ("\x01\x00\x00\x00") not recognized, stack dump: /opt/couchbase/var/lib/couchbase/data/@fts/dumps/1707896724.fts.stack.dump.txt -- main.initBleveOptions.func2() at init_bleve.go:113 

      Service 'fts' exited with status 1. Restarting. Messages:
      2024-02-13T23:45:35.001-08:00 [INFO] pindex_bleve: started runBatchWorker: 2 for pindex: test_bucket._default.test_index_59b164db4d1a8905_fdd087c2
      2024-02-13T23:45:35.001-08:00 [INFO] pindex_bleve: started runBatchWorker: 3 for pindex: test_bucket._default.test_index_59b164db4d1a8905_fdd087c2
      2024-02-13T23:45:35.004-08:00 [INFO] ctl: cfgEvent, kind: indexDefs
      2024-02-13T23:45:35.004-08:00 [INFO] cfg_metakv: metaKVCallback, path: /fts/cbgt/cfg/lastRebalanceStatusKey, key: lastRebalanceStatusKey, deletion: false
      2024-02-13T23:45:35.037-08:00 [FATA] scorch AsyncError, path: /opt/couchbase/var/lib/couchbase/data/@fts/test_bucket._default.test_index_59b164db4d1a8905_aa574717.pindex/store, treating this as fatal, err: merging err: merging failed: Error in faiss::Index* faiss::read_index(IOReader*, int) at /home/couchbase/jenkins/workspace/couchbase-server-unix/faiss/faiss/impl/index_read.cpp:1025: Index type 0x00000001 ("\x01\x00\x00\x00") not recognized, stack dump: /opt/couchbase/var/lib/couchbase/data/@fts/dumps/1707896735.fts.stack.dump.txt -- main.initBleveOptions.func2() at init_bleve.go:113
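
As a reading aid for the FATA line: the error prints the same value twice, once as a 32-bit integer (0x00000001) and once as the four raw bytes ("\x01\x00\x00\x00"), which is what a little-endian read of those bytes gives. The sketch below only demonstrates that correspondence. It assumes that faiss identifies serialized index types by a four-character ASCII tag packed this way; "IxF2" is used as an example tag on that assumption, and the helper names are hypothetical. It does not say why the bytes at that offset were not a valid tag.

```python
import struct


def fourcc(tag: str) -> int:
    """Pack a four-character ASCII tag into a little-endian 32-bit integer."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("tag must be exactly 4 ASCII characters")
    return struct.unpack("<I", raw)[0]


def raw_bytes(value: int) -> bytes:
    """Return the four bytes that a little-endian reader saw for `value`."""
    return struct.pack("<I", value)


def looks_like_tag(value: int) -> bool:
    """True if all four bytes are printable ASCII, as a type tag would be."""
    return all(32 <= b < 127 for b in raw_bytes(value))


reported = 0x00000001                    # value from the FATA log line
print(raw_bytes(reported))               # the bytes quoted in the error
print(looks_like_tag(reported))          # not a printable four-character tag
print(looks_like_tag(fourcc("IxF2")))    # example tag (assumed), printable
```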
      

       


            People

              raghav.sk Raghav S K