Couchbase Server · MB-59005

Temporal inconsistency between Indexer and Projector when fetching latest active vbucket mappings



    Description

      Indexer and projector each rely on their local ns_server to fetch the latest active vbucket mappings. During a KV rebalance or failover, the active vbucket mappings change, and the updated state can take a while to replicate across all ns_servers. As a result, the projector node's ns_server may receive the update before the indexer node's ns_server does. In general this is harmless, but a specific chain of events with a race condition can leave the MAINT stream stuck in a repair loop, failing any index builds that rely on the affected MAINT stream.
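
      For reference, both components resolve the active vbucket map through their local ns_server's terse bucket endpoint (/pools/default/b/<bucket>). The Go sketch below shows that lookup, with the config struct trimmed to the relevant fields; because each node queries its own ns_server, two nodes running this at the same instant during a rebalance can get different owners for the same vbucket.

      package vbmap

      import (
          "encoding/json"
          "fmt"
          "net/http"
      )

      // Trimmed view of the terse bucket config returned by ns_server's
      // /pools/default/b/<bucket> endpoint.
      type terseBucket struct {
          VBSMap struct {
              ServerList []string `json:"serverList"`
              // vBucketMap[vbno][0] indexes the active node in ServerList;
              // it is -1 while the vbucket has no active owner.
              VBucketMap [][]int `json:"vBucketMap"`
          } `json:"vBucketServerMap"`
      }

      // activeNodeForVB asks the local ns_server which node currently owns
      // vbucket vbno. The answer is only as fresh as that ns_server's
      // replicated state, which is the root of this MB.
      func activeNodeForVB(localNSServer, bucket string, vbno int) (string, error) {
          resp, err := http.Get(fmt.Sprintf("http://%s/pools/default/b/%s", localNSServer, bucket))
          if err != nil {
              return "", err
          }
          defer resp.Body.Close()

          var tb terseBucket
          if err := json.NewDecoder(resp.Body).Decode(&tb); err != nil {
              return "", err
          }
          active := tb.VBSMap.VBucketMap[vbno][0]
          if active < 0 {
              return "", fmt.Errorf("vbucket %d has no active owner yet", vbno)
          }
          return tb.VBSMap.ServerList[active], nil
      }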

       

      The chain of events is as follows:

       

      1. An event triggers a change in active vbucket ownership (e.g., a KV rebalance or a failover).
      2. The indexer receives a MAINT StreamEnd for vbucket X from the old KV node (A).
      3. The indexer attempts a stream repair on the new KV node (B). [Ownership of vbucket X has been updated in the indexer's local ns_server.]
      4. As the rebalance is not yet complete, the indexer receives feed.invalidBucket from B.
      5. The indexer initiates a stream repair of all vbuckets via an MTR (MutationTopicRequest) to all KV nodes.
      6. The indexer receives a MAINT StreamBegin for vbucket Y from B [due to the above MTR] before the StreamEnd from A arrives (first race condition). B sent the StreamBegin because its ns_server already has the latest ownership of vbucket Y.
      7. The indexer treats Y's StreamBegin as a duplicate and tries to repair vbucket Y by setting its repair state to RESTART_VB and its vbucket state to CONN_ERR.
      8. The indexer looks up the projector address for vbucket Y via the terse bucket endpoint (dcp/pool.go, RefreshBucket). This queries the local ns_server, which still returns A's address. [The ownership update is delayed, possibly by disk latency.] (second race condition)
      9. The indexer sends a shutdown request for vbucket Y to A and receives a StreamEnd from A, changing vbucket Y's status to SHUTDOWN_VB.
      10. The indexer's local ns_server is now updated with Y's ownership.
      11. The indexer now tries to start vbucket Y, but as B's projector has already sent the StreamBegin, the request is ignored. Y's SHUTDOWN_VB state prevents it from being picked for shutdown again, so the indexer retries the start every 1 min and the projector keeps ignoring it (repair loop).
      12. After 30 min, as no StreamBegin has arrived, the indexer falls back to an MTR for stream repair. But Y's state is still >= SHUTDOWN_VB, so no shutdown request is issued and the projector keeps ignoring the restart requests (a simplified model of this stuck state is sketched after this list).
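
      A deliberately simplified model of steps 7 through 12 follows; it is illustrative only, not the actual indexer code, and only the state names (CONN_ERR, RESTART_VB, SHUTDOWN_VB) come from this ticket. It shows why SHUTDOWN_VB wedges the loop: from that state the repair path only ever issues restarts, never another shutdown, so a projector that already holds an open stream for the vbucket ignores every restart and the state never advances.

      // Hypothetical, simplified repair-state model (illustrative only).
      type repairState int

      const (
          connErr    repairState = iota // CONN_ERR: needs shutdown + restart
          restartVB                     // RESTART_VB: restart requested
          shutdownVB                    // SHUTDOWN_VB: shutdown done, waiting on StreamBegin
          active                        // stream is flowing again
      )

      // repairTick models one pass of the periodic (~1 min) repair loop.
      // projectorAcceptsRestart is false when the projector already holds an
      // open stream for this vbucket (it previously sent a StreamBegin that
      // the indexer discarded as a duplicate).
      func repairTick(state repairState, projectorAcceptsRestart bool) repairState {
          switch state {
          case connErr, restartVB:
              // A shutdown is sent to whatever node the (possibly stale)
              // local vb map names; the resulting StreamEnd moves the
              // vbucket to SHUTDOWN_VB.
              return shutdownVB
          case shutdownVB:
              // Only a restart is issued from here, never another shutdown.
              if projectorAcceptsRestart {
                  return active // StreamBegin arrives; repair completes
              }
              // The projector ignores the restart, so the state stays
              // SHUTDOWN_VB forever: the loop in steps 11-12.
              return shutdownVB
          }
          return state
      }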

       

      Possible solutions:

      1. Send the vbucket mapping as seen by the indexer node to the projector during the MTR. The projector then sends StreamBegins only for the intersection of the active vbucket mappings from its local ns_server and the mappings sent by the indexer; missing StreamBegins are retried later (see the sketch after this list).
      2. Have the indexer reset the vbucket status to CONN_ERR and the repair state to RESTART_VB after a timeout, so the shutdown/restart cycle can run again.
      3. Have chronicle/ns_server support a "read committed" level on the terse bucket endpoint.
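
      A rough sketch of option 1 on the projector side, assuming a hypothetical helper invoked while handling the MTR (the function name and parameters are illustrative, not the actual projector API):

      // vbucketsToStart returns the intersection of the vbuckets the indexer
      // requested (per the indexer node's ns_server view, carried with the MTR)
      // and the vbuckets this projector's local ns_server currently marks as
      // active on this node. StreamBegins go out only for the intersection;
      // anything missing is left for the indexer to retry later.
      func vbucketsToStart(requested []uint16, localActive map[uint16]bool) []uint16 {
          var out []uint16
          for _, vb := range requested {
              if localActive[vb] {
                  out = append(out, vb)
              }
          }
          return out
      }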

       

      Fixing the repair loop manually: 

      If this issue occurs, an immediate fix is to restart the projector to which the indexer keeps sending vbucket restart requests without ever receiving a StreamBegin back.

      The following log lines repeat every minute while the indexer is stuck in the retry loop caused by the corrupted state.

      indexer.log
      KVSender::sendRestartVbuckets Projector <projector> Topic MAINT_STREAM_TOPIC_<id> <bucket> <bucket>  
      ... every 1 min 

      projector.log
      FEED[<=>MAINT_STREAM_TOPIC_<id>(ip)] <> start-timestamp bucket: <bucket>, scope :, collectionIDs: [], vbuckets: 0 -      {vbno, vbuuid, manifest, seqno, snapshot-start, snapshot-end}
      ... every 1 min
      

       

      Restarting the projector should reset its state and allow the stream to begin. Until the projector is restarted, the indexer node cannot build an index on the affected bucket.

       

      Related MBs:

      1. https://issues.couchbase.com/browse/MB-54667 -- stream repair stuck due to similar state corruption, but with a different chain of events.
      2. https://issues.couchbase.com/browse/MB-51636 -- stream repair stuck due to similar state corruption. In that case the shutdown was also sent to the old KV node (same as this MB), but a stale cache was thought to be the issue and was later disabled as the fix. The core issue still remains: the indexer and projector talk to their local ns_servers, which are not synced.

          People

            harikishore.chaparala Hari Kishore (Inactive)
            harikishore.chaparala Hari Kishore (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue
