Details
Bug
Resolution: Fixed
Critical
7.6.0
M1 Pro (arm64), macOS 12.6.5
Untriaged
0
Yes
Description
After updating tlm deps recently and picking up the new version of OpenSSL via MB-57839, I am no longer able to run the debugger on any programs linking to libcrypto.3.dylib, such as KV-Engine unit tests (ep-engine_ep_unit_tests).
When I try, the debugger stops with an EXC_BAD_INSTRUCTION exception:
$ lldb -- ./ep-engine_ep_unit_tests --gtest_filter=*stream_request_uid* -v -v
(lldb) target create "./ep-engine_ep_unit_tests"
Current executable set to '/Users/dave/repos/couchbase/server/source/build-debug-arm64/kv_engine/ep-engine_ep_unit_tests' (arm64).
(lldb) settings set -- target.run-args "--gtest_filter=*stream_request_uid*" "-v" "-v"
(lldb) b DcpConsumer::handleNoop
Breakpoint 1: where = ep-engine_ep_unit_tests`DcpConsumer::handleNoop(DcpMessageProducersIface&) + 32 at consumer.cc:1522:9, address = 0x00000001000a4db4
(lldb) r
Process 41446 launched: '/Users/dave/repos/couchbase/server/source/build-debug-arm64/kv_engine/ep-engine_ep_unit_tests' (arm64)
Process 41446 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
frame #0: 0x000000010d23d008 libcrypto.3.dylib`_armv8_sve_probe
libcrypto.3.dylib`:
-> 0x10d23d008 <+0>: eor z0.d, z0.d, z0.d
0x10d23d00c <+4>: ret

libcrypto.3.dylib`:
0x10d23d010 <+0>: xar z0.d, z0.d, z0.d, #0x20
0x10d23d014 <+4>: ret
Target 0: (ep-engine_ep_unit_tests) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
* frame #0: 0x000000010d23d008 libcrypto.3.dylib`_armv8_sve_probe
frame #1: 0x000000010d23d7a4 libcrypto.3.dylib`OPENSSL_cpuid_setup + 924
frame #2: 0x000000010bf9df4c dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 164
frame #3: 0x000000010bfc7784 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 340
frame #4: 0x000000010bfbded8 dyld`invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 528
frame #5: 0x000000010bf89f98 dyld`dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 168
frame #6: 0x000000010bfbdc80 dyld`dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192
frame #7: 0x000000010bfc71d4 dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
frame #8: 0x000000010bf9de8c dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 172
frame #9: 0x000000010bf9e038 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 216
frame #10: 0x000000010bf9e014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180
frame #11: 0x000000010bf9e014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180
frame #12: 0x000000010bf9e014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180
frame #13: 0x000000010bf9e104 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
frame #14: 0x000000010bfb33ac dyld`dyld4::APIs::runAllInitializersForMain() + 312
frame #15: 0x000000010bf8ddbc dyld`dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3136
frame #16: 0x000000010bf8d06c dyld`start + 488
i.e. it appears that _armv8_sve_probe in libcrypto.3.dylib executes an instruction which is illegal on this CPU, and that stops the process under the debugger.
If I roll back to the previous version of OpenSSL (git revert ac4049c) and rebuild, the problem goes away.
There's some discussion on SO about this here. The summary is that the latest versions of OpenSSL attempt to use some ARMv8 extension instructions (e.g. SVE2) which are not supported on Apple M1. OpenSSL sets up a signal handler to catch the resulting illegal-instruction signal (SIGILL), notes that the extension is not supported, and continues. Under lldb, the debugger sees the exception before that handler runs, hence the stop shown above.
That links to GitHub issue 20753: EXC_BAD_INSTRUCTION in lib crypto.3.dylib when v3.1.0 is run under the debugger. v3.0.8 does not generate problem lib. (Apple M1/M2).
Following the path through the repo, OpenSSL have disabled this SIGILL-style feature detection for Apple Silicon on the openssl-3.1 branch as of Jun 25 - https://github.com/openssl/openssl/commit/50af7294e514a2aba19c5248a4ed612ba3ba4c1b
However, that fix has not yet been included in a release: OpenSSL 3.1.1 (the latest release) came out on 30th May, and OpenSSL 3.1.2 is scheduled for 1st August (https://mta.openssl.org/pipermail/openssl-announce/2023-July/000266.html).
Workaround: roll back to the previous version of OpenSSL (3.0.7). From the top level of a checkout:
cd tlm
git revert ac4049c
<rebuild as normal>
Potential Workaround (or not...)
According to the above SO post, if we configure lldb to ignore this exception type then we can still debug programs using OpenSSL 3.1:
settings set platform.plugin.darwin.ignored-exceptions EXC_BAD_INSTRUCTION
process handle SIGILL -n false -p true -s false
However, that doesn't work in my environment (macOS 12.6.5, lldb-1400.0.38.17); I get an error when setting ...ignored-exceptions:
error: invalid value path 'platform.plugin.darwin.ignored-exceptions'
For Gerrit Dashboard: MB-58046
# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
194525,2 | MB-58046: OpenSSL 3.1.2 pre-release build | master | manifest | ABANDONED | -1 | 0 |
194527,4 | MB-58046: Rebuild Erlang (openssl 3.1.2 pre-release build) | master | build-tools | ABANDONED | -1 | 0 |
194528,1 | MB-58046: Bump openssl to 3.1.2 pre-release | master | tlm | ABANDONED | -1 | -1 |
194781,2 | MB-58046, openssl 3.1.2 | master | manifest | MERGED | +2 | +1 |
194827,7 | MB-58046: openssl changes * bumped openssl to 3.1.2 * rebuilt erlang | master | tlm | MERGED | +2 | +1 |
194838,2 | MB-58046, rebuild erlang with openssl 3.1.2. | master | build-tools | MERGED | +2 | +1 |