Couchbase Server / MB-53763

[Intermittent] Indexes created without replica for serverless databases on Elixir environment

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not a Bug
    • Elixir
    • Elixir
    • Component: secondary-index
    • None
    • Elixir
    • Untriaged
    • 1
    • Unknown

    Description

      This is a fairly simple test: create an index on a serverless database in the dev environment. Intermittently, the index is created without replicas. The log bundles are available at:

      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@auj6inhjnuke9uws.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@dicrvmpucktlvafe.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@gtmlliure9laseoe.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@izbskrqnl-ciuvfx.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@kkiircgwuw9vcdab.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@r9sjyyhha5cilp9h.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@vnyjczjipwgwhpwe.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@xhzvzsohhbahag7c.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip
      s3://cb-customers-secure/indexer_replica/2022-09-21/collectinfo-2022-09-21t160032-ns_1@ypkrknok17lxc3.p1a6-oc9svjjxux.nonprod-project-avengers.com.zip 

      Excerpts from indexer.log on the host dicrvmpucktlvafe.p1a6-oc9svjjxux.nonprod-project-avengers.com (please use these timestamps):

      [root@ip-10-0-18-121 logs]# grep -nr idx0_db0_503 indexer.log 
      88214:2022-09-21T14:27:36.185+00:00 [Info] clustMgrAgent::OnIndexCreate: Notification received for Create Index: instId 8089799397347427878, indexDefn DefnId: 5112599944789795479 Name: idx0_db0_503 Using: plasma Bucket: 772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc Scope/Id: _default/0 Collection/Id: _default/0 IsPrimary: false NumReplica: 1 InstVersion: 0 
      88222:	Defn: DefnId: 5112599944789795479 Name: idx0_db0_503 Using: plasma Bucket: 772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc Scope/Id: _default/0 Collection/Id: _default/0 IsPrimary: false NumReplica: 1 InstVersion: 0 
      88237:2022-09-21T14:27:36.188+00:00 [Info] LSS /var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex(shard14) : LSSCleaner initialized
      88238:2022-09-21T14:27:36.188+00:00 [Info] Shard /var/cb/data/@2i/shards/shard14(14) : LSSCtx Created Successfully. Path=/var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex
      88239:2022-09-21T14:27:36.190+00:00 [Info] LSS /var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex/recovery(shard14) : LSSCleaner initialized for recovery
      88240:2022-09-21T14:27:36.190+00:00 [Info] Shard /var/cb/data/@2i/shards/shard14(14) : LSSCtx Created Successfully. Path=/var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex/recovery
      88241:2022-09-21T14:27:36.190+00:00 [Info] Shard /var/cb/data/@2i/shards/shard14(14) : Map plasma instance (/var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex) to LSS (/var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex) and RecoveryLSS (/var/cb/data/@2i/772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc_idx0_db0_503_8089799397347427878_0.index/docIndex/recovery)
      88242:2022-09-21T14:27:36.190+00:00 [Info] 772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc/idx0_db0_503/Backstore#8089799397347427878:0 Plasma: create new dedicated controller 
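
      The excerpt above shows that the definition for idx0_db0_503 reached the indexer with NumReplica: 1. As a rough aid for comparing the requested replica count against what the Indexes tab shows, the following is a minimal sketch that pulls the index name and NumReplica out of such log lines. The regular expression and the function name `requested_replicas` are assumptions made for illustration; they are based only on the line format visible in this excerpt, not on any documented log format.

```python
import re

# Matches the "Name: <index> ... NumReplica: <n>" fields as they appear in the
# excerpt above. This pattern is an assumption derived from that excerpt only.
DEFN_RE = re.compile(r"Name: (?P<name>\S+) .*?NumReplica: (?P<num_replica>\d+)")


def requested_replicas(log_lines):
    """Return a dict mapping index name -> NumReplica found in the given lines."""
    found = {}
    for line in log_lines:
        match = DEFN_RE.search(line)
        if match:
            found[match.group("name")] = int(match.group("num_replica"))
    return found


# Sample line copied from the excerpt above.
sample = (
    "2022-09-21T14:27:36.185+00:00 [Info] clustMgrAgent::OnIndexCreate: "
    "Notification received for Create Index: instId 8089799397347427878, "
    "indexDefn DefnId: 5112599944789795479 Name: idx0_db0_503 Using: plasma "
    "Bucket: 772309c5-2ccb-42a5-b8cb-505cde519465-qmvhyc Scope/Id: _default/0 "
    "Collection/Id: _default/0 IsPrimary: false NumReplica: 1 InstVersion: 0"
)

print(requested_replicas([sample]))
```

      A non-zero NumReplica here only means that a replica was requested in the definition; it does not by itself show that a replica instance was actually placed on another node.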

      A screenshot of the Indexes tab is attached. I also checked the index metadata on the other dataplane nodes and confirmed that there are indeed no replicas.

        Activity

          varun.velamuri Varun Velamuri added a comment -

          Pavan PB, if two indexers are in the same server group, the replica will not be created. The indexer nodes have to be in different server groups for a replica to be created.
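
          The server-group condition described in this comment can be sketched as a simple check. This is a simplified model, not the indexer's actual planner logic: the function name `can_place_replicas`, the node and group names, and the generalisation to "one distinct server group per copy of the index" are all assumptions made for illustration.

```python
def can_place_replicas(node_to_group, num_replica):
    """Return True if there are enough distinct server groups for all copies.

    node_to_group: dict mapping an index node name to its server group.
    num_replica:   number of replicas requested (copies = num_replica + 1).

    Assumption: each copy of the index must live in a different server group.
    """
    distinct_groups = set(node_to_group.values())
    return len(distinct_groups) >= num_replica + 1


# Hypothetical layouts: two index nodes in one group vs. in separate groups.
same_group = {"index-node-1": "group-1", "index-node-2": "group-1"}
different_groups = {"index-node-1": "group-1", "index-node-2": "group-2"}

print(can_place_replicas(same_group, 1))
print(can_place_replicas(different_groups, 1))
```

          Under this model, the first layout cannot host a replica and the second can, which matches the condition stated in the comment.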
          pavan.pb Pavan PB added a comment - edited

          Varun Velamuri, they do have indexers in different server groups. I just reran the test. Take a look at these screenshots.

          Indexes are getting created without replicas even when there are indexer nodes in different server groups. Amit Kulkarni, could you please ask someone to look into this?

          Thanks,
          Pavan

          deepkaran.salooja Deepkaran Salooja added a comment -

          Pavan PB, the indexer creates sub-clusters of 2 index nodes each. This differs from KV, which treats 3 nodes as a sub-cluster. Accordingly, the control plane needs to allocate indexer nodes in multiples of 2. This cluster has 3 index service nodes. Please file a ticket with the control plane team to get the allocation policy fixed.

          What happens when the number of index nodes is not a multiple of 2? This can happen, for example, when one node has failed over and the control plane (CP) has not yet provided a replacement node. A user can create an index during that window, so the indexer allows the create index request to go through, and once CP provides the replacement node, the replicas are repaired during rebalance.
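
          To illustrate why 3 index nodes leave one node without a partner, here is a minimal sketch that groups nodes into sub-clusters of 2. It only models the counting argument from this comment: the function name `split_into_subclusters`, the node names, and the in-order pairing are assumptions, and the real indexer decides pairings by its own rules.

```python
def split_into_subclusters(index_nodes, subcluster_size=2):
    """Group nodes into full sub-clusters of the given size.

    Returns (sub_clusters, leftover_nodes). Leftover nodes have no partner,
    so in this model an index placed on them would have no replica until
    another node is added.
    """
    usable = len(index_nodes) - len(index_nodes) % subcluster_size
    sub_clusters = [
        index_nodes[i:i + subcluster_size]
        for i in range(0, usable, subcluster_size)
    ]
    leftover = index_nodes[usable:]
    return sub_clusters, leftover


# Hypothetical node names; the ticket's cluster had 3 index service nodes.
nodes = ["index-node-1", "index-node-2", "index-node-3"]
sub_clusters, leftover = split_into_subclusters(nodes)

print(sub_clusters)
print(leftover)
```

          With 3 nodes, one full sub-cluster is formed and one node is left over. With 4 nodes, the leftover list is empty, which corresponds to the "multiples of 2" allocation requested from the control plane.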
          pavan.pb Pavan PB added a comment -

          An AV ticket has been filed to fix the provisioning part. https://couchbasecloud.atlassian.net/browse/AV-44397


          hemant.rajput Hemant Rajput added a comment -

          Closing the ticket because this is not a bug.

          People

            deepkaran.salooja Deepkaran Salooja
            pavan.pb Pavan PB
            Votes: 0
            Watchers: 7
