Couchbase Server / MB-38108

cbbackupmgr restore (sqlite) throughput heavy DGM test failing in cc build (post 7.0.0-1180)

Details

    • Untriaged
    • Unknown

    Description

      The cbbackupmgr restore (sqlite) throughput heavy DGM test is failing in cc builds (post 7.0.0-1180), while the same cbbackupmgr restore non-DGM test runs fine for all builds.

       

       

       


          Activity

            sharath.sulochana Sharath Sulochana (Inactive) added a comment -

            Patrick Varley, these tests apparently fail when the test starts taking a backup of 400M docs. Looking at the logs, the backup seems to hang for a very long time. Can someone take a look while I triage it further (to see what is happening during the backup)?

            carlos.gonzalez Carlos Gonzalez Betancort (Inactive) added a comment -

            I believe this is a known issue. James Lee has investigated it and tracked it down to a gocbcore bug, and he is waiting to upgrade to the new release of gocbcore.
            owend Daniel Owen added a comment -

            James Lee, could you confirm? Thanks.

            james.lee James Lee added a comment -

            Carlos Gonzalez Betancort is correct. I've had a glance at some of the failures on Leto; they don't successfully perform the first backup (which is the one we restore). This is due to the collection-aware buffer acknowledgment being broken (GOCBC-774). Although this issue has been fixed, we are blocked on upgrading to gocbcore.v8 (MB-37751) because GOCBC-799 is an API-breaking change (it won't make it into a v8 release).

            james.lee James Lee added a comment -

            I'm just working on getting a patch into both backup and manifest that updates gocbcore to the latest tagged release (v8.0.3). Once it is merged, I will kick off a 400M DGM backup/restore to verify that the tests run successfully.

            james.lee James Lee added a comment -

            Hi Sharath Sulochana,

            I've opened CBPS-750. It's related to this issue because, at the moment, we have no way of telling why cbbackupmgr is crashing during the 400M restore; I am unable to reproduce the behavior locally with a scaled dataset. Could you please provide the logs for an unsuccessful 400M DGM restore?

            Thanks in advance,
            James

            james.lee James Lee added a comment -

            Marking as fixed and closing, since we now have a successful run, which was expected after the update to gocbcore master.


            People

              james.lee James Lee
              sharath.sulochana Sharath Sulochana (Inactive)