Uploaded image for project: 'Couchbase Lite'
  1. Couchbase Lite
  2. CBL-1920

Crash when a query is destructed and unregistered from the database

    XMLWordPrintable

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.8.0
    • 3.0
    • iOS
    • Security Level: Public
    • None
    • Jay 51, Jay 52, Jay 53
    • 3

    Description

      There are a few users reporting the same issue that they experience multiple crashes in the query when the query is released. All of them said that they don't close the database in their application either. Currently, they don't have reproducible steps and mostly the issue happens to their end-users.

      As this happens to multiple users, I think we should at least review the issue and code from our side.

      Issue: https://github.com/couchbase/couchbase-lite-ios/issues/2781

      Crash Log: https://pastebin.com/raw/dBwFGR4d

      Exception : EXC_BAD_ACCESS KERN_INVALID_ADDRESS

      Forum Post: https://forums.couchbase.com/t/c4error-code-2-ios-application-crash/29921/5

      Attachments

        1. bt-all.txt
          13 kB
        2. swift.zip
          32.50 MB

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            pasin Pasin Suriyentrakorn created issue -
            pasin Pasin Suriyentrakorn made changes -
            Field Original Value New Value
            Description There are a few users reporting the same issue that they experience multiple crashes in the query when the query is released. All of them said that they don't close the database in their application either. Currently, they don't have reproducible steps and mostly the issue happens to their end-users.

            As this happens to multiple users, I think we should at least review the issue and code from our side.

            Issue: [https://github.com/couchbase/couchbase-lite-ios/issues/2781]

            Crash Log: [https://pastebin.com/raw/dBwFGR4d
            ]

            Exception : EXC_BAD_ACCESS KERN_INVALID_ADDRESS

            Forum Post: [https://forums.couchbase.com/t/c4error-code-2-ios-application-crash/29921/5]
            There are a few users reporting the same issue that they experience multiple crashes in the query when the query is released. All of them said that they don't close the database in their application either. Currently, they don't have reproducible steps and mostly the issue happens to their end-users.

            As this happens to multiple users, I think we should at least review the issue and code from our side.

            Issue: [https://github.com/couchbase/couchbase-lite-ios/issues/2781]

            Crash Log: [https://pastebin.com/raw/dBwFGR4d]

            Exception : EXC_BAD_ACCESS KERN_INVALID_ADDRESS

            Forum Post: [https://forums.couchbase.com/t/c4error-code-2-ios-application-crash/29921/5]

            From Jim Borden

            > It looks like what happened is that _keyStore->dataFile() returned null

            > Nevermind about the null there, it looks like someone in the middle was posting a different issue
             
            > It's still a possibility though since that last frame is inlined
             
            > It's a reference so it cannot exactly become null but it might have been deleted for some reason....)

            > I get a similar error by calling unregisterQuery on a null data file

            pasin Pasin Suriyentrakorn added a comment - From Jim Borden > It looks like what happened is that  _keyStore->dataFile()  returned null > Nevermind about the null there, it looks like someone in the middle was posting a different issue   > It's still a possibility though since that last frame is inlined   > It's a reference so it cannot exactly become null but it might have been deleted for some reason....) > I get a similar error by calling unregisterQuery on a null data file

            One user suspects that the issue might have something related to the closing of the database own by the replicator. From my understanding is that the replicator has its own copy (reopenned) of the database so the closing of the database by the replicator shouldn't be related. BUT I want to post the info here as I don't know if there is anything that could be shared between the original database and the copied version. 

            pasin Pasin Suriyentrakorn added a comment - One user suspects that the issue might have something related to the closing of the database own by the replicator. From my understanding is that the replicator has its own copy (reopenned) of the database so the closing of the database by the replicator shouldn't be related. BUT I want to post the info here as I don't know if there is anything that could be shared between the original database and the copied version. 
            Jayahari.Vavachan Jay Vavachan made changes -
            Attachment swift.zip [ 139548 ]
            Jayahari.Vavachan Jay Vavachan added a comment - - edited

            Attached the todo app swift.zip to reproduce the issue reported.

            • only on device
            • disable logs
            • if ran via debug build, change the optimizations to match the release build.

            1. logged in
            2. create a list
            3. create some tasks.
            4. keep the tasks list page open

            Jayahari.Vavachan Jay Vavachan added a comment - - edited Attached the todo app swift.zip to reproduce the issue reported. only on device disable logs if ran via debug build, change the optimizations to match the release build. 1. logged in 2. create a list 3. create some tasks. 4. keep the tasks list page open
            Jayahari.Vavachan Jay Vavachan made changes -
            Attachment bt-all.txt [ 139581 ]
            Jayahari.Vavachan Jay Vavachan added a comment -

            Attached the back trace with details bt-all.txt

            Jayahari.Vavachan Jay Vavachan added a comment - Attached the back trace with details bt-all.txt
            Jayahari.Vavachan Jay Vavachan made changes -
            Link This issue relates to CBL-1898 [ CBL-1898 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Component/s iOS [ 15348 ]
            pasin Pasin Suriyentrakorn added a comment - - edited

            As the crash also happens in the unordered_set when creating a new query as well. I feel like the issue happens due to concurrent access to the unordered_set which is not thread-safe by default.

            Jay Vavachan You can try to synchronize this line with the database's lock and see if that helps. If that works, we may need to look to see if there is a better way to fix this or not.

            pasin Pasin Suriyentrakorn added a comment - - edited As the crash also happens in the unordered_set when creating a new query as well. I feel like the issue happens due to concurrent access to the unordered_set which is not thread-safe by default. Jay Vavachan You can try to synchronize this line with the database's lock and see if that helps. If that works, we may need to look to see if there is a better way to fix this or not.
            Jayahari.Vavachan Jay Vavachan made changes -
            Link This issue relates to CBL-1961 [ CBL-1961 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Link This issue relates to CBL-2012 [ CBL-2012 ]
            jimb Jim Borden made changes -
            Rank Ranked higher
            daniel.petersen Daniel Petersen made changes -
            Assignee The Lite [ cbgto ] Jay Vavachan [ jayahari.vavachan ]
            pasin Pasin Suriyentrakorn made changes -
            Component/s iOS [ 15348 ]
            daniel.petersen Daniel Petersen made changes -
            Story Points 3
            daniel.petersen Daniel Petersen made changes -
            Sprint Jay 51 [ 1687 ]
            daniel.petersen Daniel Petersen made changes -
            Rank Ranked lower
            Jayahari.Vavachan Jay Vavachan made changes -
            Component/s LiteCore [ 15347 ]

            I have tried adding a small delay when calling the `performDBQuery` from within, which seems to be resolving the issue.

                private func performDBQuery() {
                     DBManager.shared.runQuery()
             
            //// add a small delay 0.01
                    DispatchQueue.global().asyncAfter(deadline: .now() + 0.01, execute: {
                        self.performDBQuery()
                    })
                 }
            

            Jayahari.Vavachan Jay Vavachan added a comment - I have tried adding a small delay when calling the `performDBQuery` from within, which seems to be resolving the issue. private func performDBQuery() { DBManager.shared.runQuery()   //// add a small delay 0.01 DispatchQueue.global().asyncAfter(deadline: .now() + 0.01, execute: { self.performDBQuery() }) }
            Jayahari.Vavachan Jay Vavachan made changes -
            Resolution Not a Bug [ 10200 ]
            Status Open [ 1 ] Resolved [ 5 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

            Adding performDBQuery() into the queue without delay should be OK for this test. I think it helps to prevent the crash because it makes the calls to performDBQuery() become serial.

            pasin Pasin Suriyentrakorn added a comment - Adding performDBQuery() into the queue without delay should be OK for this test. I think it helps to prevent the crash because it makes the calls to performDBQuery() become serial.

            From the crash log, I think the problem is about concurrent calls to DataFile::registerQuery() and DataFile::unregisterQuery(). Currently, we lock when creating c4query and when releasing c4query so those two operations should be fine. I'm suspecting that we will need to lock when releasing the query's enumerator (https://github.com/couchbase/couchbase-lite-ios/blob/master/Objective-C/CBLQueryResultSet.mm#L46) as well as the C4QueryEnumeratorImpl also retains the Query object.

            pasin Pasin Suriyentrakorn added a comment - From the crash log, I think the problem is about concurrent calls to DataFile::registerQuery() and DataFile::unregisterQuery(). Currently, we lock when creating c4query and when releasing c4query so those two operations should be fine. I'm suspecting that we will need to lock when releasing the query's enumerator ( https://github.com/couchbase/couchbase-lite-ios/blob/master/Objective-C/CBLQueryResultSet.mm#L46 ) as well as the C4QueryEnumeratorImpl also retains the Query object.

            Jay Vavachan I'm reopening this issue as I think this is the real issue not because of the user's test code. Also, there are multiple people getting this issue.

            pasin Pasin Suriyentrakorn added a comment - Jay Vavachan I'm reopening this issue as I think this is the real issue not because of the user's test code. Also, there are multiple people getting this issue.
            pasin Pasin Suriyentrakorn made changes -
            Resolution Not a Bug [ 10200 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            jimb Jim Borden made changes -
            Sprint Jay 51 [ 1687 ] Jay 51, Sandy 53 [ 1687, 1709 ]
            pasin Pasin Suriyentrakorn added a comment - - edited

            From Thread Safety Doc (https://github.com/couchbase/couchbase-lite-core/wiki/Thread-Safety) c4queryenum_release is not included in the database exclusive list. Given that the enumerator also retains the query object, I suspect that c4queryenum_release() might need to be in the database exclusive list as well to prevent concurrent access to DataFile's queries (unsorted_set) object when DataFile::registerQuery() and DataFile::unregisterQuery() are called.

            pasin Pasin Suriyentrakorn added a comment - - edited From Thread Safety Doc ( https://github.com/couchbase/couchbase-lite-core/wiki/Thread-Safety ) c4queryenum_release is not included in the database exclusive list. Given that the enumerator also retains the query object, I suspect that c4queryenum_release() might need to be in the database exclusive list as well to prevent concurrent access to DataFile's queries (unsorted_set) object when DataFile::registerQuery() and DataFile::unregisterQuery() are called.
            Jayahari.Vavachan Jay Vavachan made changes -
            Status Reopened [ 4 ] In Progress [ 3 ]
            daniel.petersen Daniel Petersen made changes -
            Sprint Jay 51, Jay 52 [ 1687, 1709 ] Jay 51, Jay 52, Sandy 54 [ 1687, 1709, 1730 ]
            jianmin.zhao Jianmin Zhao added a comment -

            Agree with Pasin Suriyentrakorn: the exclusive list in the doc should include c4queryenum_release. Consequently, 

            "c4queryenum_release(_enumerator)" needs to be protected.

            jianmin.zhao Jianmin Zhao added a comment - Agree with Pasin Suriyentrakorn : the exclusive list in the doc should include c4queryenum_release. Consequently,  "c4queryenum_release(_enumerator)" needs to be protected.
            Automated transition triggered when Jayahari Vavachan created pull request #2873 in GitHub -
            Status In Progress [ 3 ] In Review [ 10107 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Status In Review [ 10107 ] In Progress [ 3 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

            Build couchbase-lite-ios-3.0.0-238 contains couchbase-lite-ios commit bce4217 with commit message:
            CBL-1920: Protect the c4queryenum_release from multiple threads accessing it (#2873)

            build-team Couchbase Build Team added a comment - Build couchbase-lite-ios-3.0.0-238 contains couchbase-lite-ios commit bce4217 with commit message: CBL-1920 : Protect the c4queryenum_release from multiple threads accessing it (#2873)
            Jayahari.Vavachan Jay Vavachan made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Affects Version/s 2.8.0 [ 16189 ]
            Affects Version/s Lithium [ 16190 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Issue Type Task [ 3 ] Bug [ 1 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Required Mobile Fields Mandatory:
             - CBL / SG Version:
               - SG Config:
             - Steps to Reproduce:
             - Actual Result:
             - Expected Result:
             - Logs :
                  SGW LOGS: sgcollect info
                  CBL LOGS:
                  Logcat LOGS: for Android tickets
             - Github link for the code:
             - Jenkins job failure link:
             - Pytest Command
             - What is the last build this test passed:
            Jayahari.Vavachan Jay Vavachan made changes -
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]
            Jayahari.Vavachan Jay Vavachan made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

            People

              Jayahari.Vavachan Jay Vavachan
              pasin Pasin Suriyentrakorn
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty