Couchbase Server / MB-50775

All docs are not ingested into datasets post upgrade from 7.0.3

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Duplicate
    • 7.1.0
    • 7.1.0
    • analytics
    • Enterprise Edition 7.1.0 build 2215

    Description

      Steps to reproduce:

      1. Have a 4 node cluster, with 2 nodes having KV:N1QL:Index and 2 nodes having CBAS services.
      2. Create a bucket and load some data into it.
      3. Create multiple dataverses, datasets and indexes on cbas.
      4. Let ingestion complete into datasets.
      5. Upgrade each node to build 7.1.0-2215 by failing over the node, upgrading it, and adding it back.
      6. After the upgrade, rebalance again so that CBAS becomes active.
      7. Validate all data is present in datasets post upgrade.
      8. Set analytics replica to 3 and rebalance for it to take effect.
      9. Now load more docs in the existing bucket and verify they are ingested by datasets.
      10. Now delete all data from the existing bucket and verify that no data is present in datasets.
      11. Create more buckets with multiple scopes and collections, create multiple scopes and collections in the existing bucket as well, and load data into them.
      12. Create more dataverses, datasets, indexes and synonyms.
      13. Data ingestion did not complete for one of the datasets created post upgrade.

      2022-02-03 15:34:23,497 | test | DEBUG | MainProcess | MainThread | [cbas_utils:execute_statement_on_cbas_util:78] Running query on cbas: create dataset lvBYYAeknEV.OVrtIEOGrM on bucket_3.`xzeoZI6KXga1oJ1d16a7tcnQUc595xsjCOms8Cgl-2O6PcCeQ-faBzSvGyoD5sUynfvR2S6GIPI6ljhw1NvS5lfe16-wGrItCuAhfJlvnHsqucerOGddhcuhVjC2rg5UB0_-55-136000`.`CfhQquKhOAJvlN75Qcgf8kX0S2oe4eEWRhuvHwfdipRl3M8TEd3uV1c5iIIGbgLMvEZtbfIgdqIiwveQD9u3u7KRmc9OB9aq-4IKIZMbEN2M8xvgw_sFnzlprqU%R%pu6I1H7NAuoYeYDrod4DuAZ0D-q0vwwkF2R4rT0Jd-55-339000`;

      14. Dataset lvBYYAeknEV.OVrtIEOGrM has only 9910 docs when it should have 10000.

      15. The ingestion API shows that the ingestion completed.

      {
          "links": [{
                  "name": "Local",
                  "scope": "default/_default",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "H3b673KZ5m3MToCCIm13EsF"
                              },
                              {
                                  "name": "_default"
                              },
                              {
                                  "name": "bBhc3exunbBzDzUDAtiMrt9C1FX"
                              }
                          ],
                          "name": "default/_default"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_3/IuUFx01KAF3FtrwJbZBbjYelE-BYo%sp7KjSKScCPqtDq7-vz%9gRlGEZdiGtKO2EA5%kSgNfmI-VdsDsjcLlf4tLyTvg6ZooJgWhSHR_OzzepWyoTCuS7oNTEgs2mlO8y4o7BBFhRM10SELdM5ZJW7DoX1YZw_6BtIUtLcBqhe5D9RyC0VxBFlF8VGzr00QgtxXwkvfLO2MT1ziDbSVAw_Q6bV26GRzRUI4Yd%bN-7EU-54-323000",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317288,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "9dvzu8"
                              },
                              {
                                  "name": "JUEK4NGM-oOaosgo_opzlnI5Lbt3G82%lhzJfTpHeagJh2Zg%HJ57ZJieuvKJ%wT8y7zB4UNS2v-KuByBdFgEX0CCpUpBaSG22vD9RRd1ueAQrqGCs6fK9SpS-54-484000"
                              },
                              {
                                  "name": "Wh3bh3B3kyOsgklEB6crFLyxBy1"
                              }
                          ],
                          "name": "bucket_3/IuUFx01KAF3FtrwJbZBbjYelE-BYo%sp7KjSKScCPqtDq7-vz%9gRlGEZdiGtKO2EA5%kSgNfmI-VdsDsjcLlf4tLyTvg6ZooJgWhSHR_OzzepWyoTCuS7oNTEgs2mlO8y4o7BBFhRM10SELdM5ZJW7DoX1YZw_6BtIUtLcBqhe5D9RyC0VxBFlF8VGzr00QgtxXwkvfLO2MT1ziDbSVAw_Q6bV26GRzRUI4Yd%bN-7EU-54-323000"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_1/_default",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "epDDISFRVR5o6CITU9xZ6I"
                              },
                              {
                                  "name": "_default"
                              },
                              {
                                  "name": "tfG1gJiwLRG7Fk"
                              }
                          ],
                          "name": "bucket_1/_default"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_2/_default",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317288,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "2oZrcl5LpvHfP2HivLRhu"
                              },
                              {
                                  "name": "0JnDkOlUKoXzZpPf"
                              },
                              {
                                  "name": "_default"
                              }
                          ],
                          "name": "bucket_2/_default"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "FCZuN9joiDooVSBPg",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                              "name": "2gJhSxlLRw74j4eoSTqpgdbwfbh"
                          }],
                          "name": "FCZuN9joiDooVSBPg"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "Fox0nJm5/Ag6",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317286,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                              "name": "bTYU8jAeXrHODOSlNdqFVXnWQmi"
                          }],
                          "name": "Fox0nJm5/Ag6"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_3/_default",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317288,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "_default"
                              },
                              {
                                  "name": "NtBfjx4DS"
                              }
                          ],
                          "name": "bucket_3/_default"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "0MqRPUW8myT",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "gDu4yX5uluwbiCZ1fXLxh"
                              },
                              {
                                  "name": "nbNDRV61hMRgfYVEku4ERJwslg68"
                              },
                              {
                                  "name": "4hbsMhpkXhmodEf9NliE"
                              }
                          ],
                          "name": "0MqRPUW8myT"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "W26mxaofTPSp",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "OvfnPz5WEHfO4aQqOksE"
                              },
                              {
                                  "name": "IkVcUBmt35tOa6hEjqVKS"
                              }
                          ],
                          "name": "W26mxaofTPSp"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "SRPXXZTdX4JAu",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                              "name": "bZ56s9GTXKgbvw"
                          }],
                          "name": "SRPXXZTdX4JAu"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "h3qWzwynl/Jw7z5YnC9nQRAl",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317288,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "OASPVmboOKAOuO"
                              },
                              {
                                  "name": "ms"
                              },
                              {
                                  "name": "Fnl"
                              }
                          ],
                          "name": "h3qWzwynl/Jw7z5YnC9nQRAl"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "Default",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "IHypeKqEA3mxAVzjxyHaV4GirBv9UY"
                              },
                              {
                                  "name": "9ZXIu4"
                              },
                              {
                                  "name": "o3"
                              },
                              {
                                  "name": "M1R9Kz20i2PDGekA9q7ohvgbESgtyN"
                              },
                              {
                                  "name": "YJBVJQQsSPggfWL"
                              },
                              {
                                  "name": "yYHnesRgM3sEMINS0dgKf2yk53"
                              },
                              {
                                  "name": "kmXBcuCLIg"
                              }
                          ],
                          "name": "Default"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_2/-V_%dyayQgCAAI2BncdidMoBgTG0s8ikVD9A-XBnVRrae5LZfMZur2dr3V2A1Ur9oMMLuf6sxZ7pAPzQ55-NOGRhmteizvpdfwZuVSLxalPnGnq3rhORhNf02Xos-VP-IWoIrUDD_m8IHGc-ub7DD9KpX3oA7uNAAfYOPfKD1uE7H%-3-286000",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317284,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                              "name": "6COwRsLVTHrKtQ0kSUbbM7qu9O6YnRlUCUOcKAD9_POd8RH_zeLaHqen9Su9LvkooNUOnY-3-482000"
                          }],
                          "name": "bucket_2/-V_%dyayQgCAAI2BncdidMoBgTG0s8ikVD9A-XBnVRrae5LZfMZur2dr3V2A1Ur9oMMLuf6sxZ7pAPzQ55-NOGRhmteizvpdfwZuVSLxalPnGnq3rhORhNf02Xos-VP-IWoIrUDD_m8IHGc-ub7DD9KpX3oA7uNAAfYOPfKD1uE7H%-3-286000"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "bucket_1/XRXvODt93NE2Uvg3JdK8UHSm9Au2i04LA_XvglcYl8WTbjz%q2iHYN5smIovapzlrRLuGAeOxjTxDI16mfFlmfVC6dtHTOuflY9p62A2AYmw3o6Zymy0X%L0Vap2dmUtgHqlHIE-FL0vB48OUfp7fKAeMJzCsDLbm%uyMRmIQuvpqZXaqdw3KknjA0yAIgPCqkOUIHFSZZzUcjZbS39y-19-684000",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317288,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "0yml8jTsrbb25GM1cije0EaXH"
                              },
                              {
                                  "name": "M-_XaM-xdnOq2-7msiIVCzm8At%2jdf5%RSN6icqRhlFItythxmVkFpZfgSWkAESfrbpPey9gR2YHLSvhnR90O7mKX%-I0vW3rEcyD_zYKZ0dPq9mf%6aQQhv-20-170000"
                              }
                          ],
                          "name": "bucket_1/XRXvODt93NE2Uvg3JdK8UHSm9Au2i04LA_XvglcYl8WTbjz%q2iHYN5smIovapzlrRLuGAeOxjTxDI16mfFlmfVC6dtHTOuflY9p62A2AYmw3o6Zymy0X%L0Vap2dmUtgHqlHIE-FL0vB48OUfp7fKAeMJzCsDLbm%uyMRmIQuvpqZXaqdw3KknjA0yAIgPCqkOUIHFSZZzUcjZbS39y-19-684000"
                      }]
                  }]
              },
              {
                  "name": "Local",
                  "scope": "lvBYYAeknEV",
                  "status": "healthy",
                  "state": [{
                      "timestamp": 1643883317291,
                      "progress": 1.0,
                      "scopes": [{
                          "collections": [{
                                  "name": "4xbAkGi7oZaxeNwxwN67T3t2"
                              },
                              {
                                  "name": "J8ID8dEh2i1nqMA3uQdxcxbE"
                              },
                              {
                                  "name": "OVrtIEOGrM"
                              }
                          ],
                          "name": "lvBYYAeknEV"
                      }]
                  }]
              }
          ]
      }
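      The discrepancy in steps 14 and 15 can be checked mechanically. The following is a minimal sketch, not part of the original test framework: the helper names `incomplete_links` and `count_shortfall` are hypothetical, and the sample payload is a trimmed copy of the response shape shown above. It illustrates that every link can report a progress of 1.0 while a dataset is still short of documents, so the progress value alone is not sufficient validation.

      ```python
      from typing import Any, Dict, List


      def incomplete_links(status: Dict[str, Any]) -> List[str]:
          """Return the scope of every link that reports progress below 1.0."""
          pending = []
          for link in status.get("links", []):
              for state in link.get("state", []):
                  if state.get("progress", 0.0) < 1.0:
                      pending.append(link.get("scope", "<unknown>"))
                      break
          return pending


      def count_shortfall(expected: int, actual: int) -> int:
          """Number of documents missing from a dataset (0 if none are missing)."""
          return max(expected - actual, 0)


      # Trimmed sample shaped like the ingestion status response above.
      sample = {
          "links": [
              {
                  "name": "Local",
                  "scope": "lvBYYAeknEV",
                  "status": "healthy",
                  "state": [{"timestamp": 1643883317291, "progress": 1.0, "scopes": []}],
              }
          ]
      }

      # No link reports progress below 1.0, so this prints an empty list.
      print(incomplete_links(sample))
      # Counts from step 14: 10000 expected, 9910 present, so this prints 90.
      print(count_shortfall(10000, 9910))
      ```

      A validation step that combines both checks would have flagged lvBYYAeknEV.OVrtIEOGrM even though the status response looks healthy.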

          Activity

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.1.0-2289 contains cbas-core commit b3ccad9 with commit message:
            MB-50775: Log indexes when primary index is not found

            murtadha.hubail Murtadha Hubail added a comment -

            The missing data issue is most likely caused by MB-50873. I also found a couple of unexpected failures in the logs that might be related to HA, but those failures were caused by a failover that happened after the steps described in the tests. Unfortunately, the existing logs weren't enough to trace the issues. I have added more logging in case they happen again. Here are the relevant logs:

            2022-02-03T02:35:46.369-08:00 FATA CBAS.context.PrimaryIndexOperationTracker [Executor-167:cfcd88c38c3c406c4f3b5f0774399fe5] Primary index not found in dataset 118 and partition 0 open indexes [{"class" : "LSMBTree", "dir" : "/opt/couchbase/var/lib/couchbase/data/@analytics/v_iodevice_0/storage/partition_0/W26mxaofTPSp/OvfnPz5WEHfO4aQqOksE/0/XsbEd", "memory" : [{"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}, {"class":"LSMBTreeMemoryComponent", "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null"}], "disk" : 0, "num-scheduled-flushes":0, "current-memory-component":0}]; halting to clear memory state

            and

            2022-02-03T02:39:23.716-08:00 ERRO CBAS.message.RegistrationTasksResponseMessage [Executor-12:cfcd88c38c3c406c4f3b5f0774399fe5] Failed during startup task
            org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.asterix.common.exceptions.ACIDException: Primary index not found
            	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:49) ~[hyracks-api-7.1.0-2215.jar:7.1.0-2215]
            	at org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:47) ~[asterix-app-7.1.0-2215.jar:7.1.0-2215]
            	at org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:63) ~[asterix-app-7.1.0-2215.jar:7.1.0-2215]
            	at org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$0(NCMessageBroker.java:108) ~[asterix-app-7.1.0-2215.jar:7.1.0-2215]
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
            	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            	at java.lang.Thread.run(Thread.java:829) [?:?]
            Caused by: org.apache.asterix.common.exceptions.ACIDException: Primary index not found
            	at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.getPrimary(AnalyticsLocalRecoveryManager.java:193) ~[cbas-server-7.1.0-2215.jar:7.1.0-2215]
            	at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.recover(AnalyticsLocalRecoveryManager.java:114) ~[cbas-server-7.1.0-2215.jar:7.1.0-2215]
            	at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.cleanUp(AnalyticsLocalRecoveryManager.java:92) ~[cbas-server-7.1.0-2215.jar:7.1.0-2215]
            	at com.couchbase.analytics.bootstrap.AnalyticsLocalRecoveryManager.startLocalRecovery(AnalyticsLocalRecoveryManager.java:54) ~[cbas-server-7.1.0-2215.jar:7.1.0-2215]
            	at org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:45) ~[asterix-app-7.1.0-2215.jar:7.1.0-2215]
            	... 7 more
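            For triage, the FATA line above can be broken into its dataset id, partition, and list of open indexes. The following is a rough sketch that assumes the message keeps exactly the wording seen in this log; the function name `parse_primary_index_failure` is hypothetical, and the sample line is shortened (the "dir" and "memory" fields are omitted).

            ```python
            import json
            import re
            from typing import Any, Dict, Optional

            # Matches the message text as it appears in the log excerpt above.
            PATTERN = re.compile(
                r"Primary index not found in dataset (?P<dataset>\d+) "
                r"and partition (?P<partition>\d+) "
                r"open indexes (?P<indexes>\[.*\]); halting"
            )


            def parse_primary_index_failure(line: str) -> Optional[Dict[str, Any]]:
                """Extract dataset id, partition and open indexes, or None if no match."""
                match = PATTERN.search(line)
                if match is None:
                    return None
                return {
                    "dataset_id": int(match.group("dataset")),
                    "partition": int(match.group("partition")),
                    # The open-index list in the log is a JSON array.
                    "open_indexes": json.loads(match.group("indexes")),
                }


            # Shortened copy of the FATA line; "dir" and "memory" are left out.
            line = (
                '2022-02-03T02:35:46.369-08:00 FATA CBAS.context.PrimaryIndexOperationTracker '
                '[Executor-167:cfcd88c38c3c406c4f3b5f0774399fe5] Primary index not found in dataset 118 '
                'and partition 0 open indexes [{"class" : "LSMBTree", "disk" : 0, '
                '"num-scheduled-flushes":0, "current-memory-component":0}]; halting to clear memory state'
            )

            info = parse_primary_index_failure(line)
            print(info["dataset_id"], info["partition"], info["open_indexes"][0]["class"])
            ```

            Grouping such failures by dataset id and partition across nodes may help show whether they cluster around the failover mentioned in the comment.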

            murtadha.hubail Murtadha Hubail added a comment -

            I'm resolving this as a duplicate of MB-50873, but we should keep an eye on the possible issues above.
            umang.agrawal Umang added a comment -

            Closing this as it is covered in another issue.


            People

              umang.agrawal Umang
              umang.agrawal Umang