  1. Couchbase Server
  2. MB-52493

Linker errors under Clang+UBsan for Golang targets

Details

    • Bug
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • master
    • master
    • build
    • None
    • Untriaged
    • 1
    • Unknown

    Description

      When attempting to compile with Clang-9 and UndefinedBehaviorSanitizer (UBSan) enabled (related to issues with GCC-10 + UBSan with latest folly - see MB-52475), we see the following linker errors for Golang targets which link native shared libraries:

      TERM='dumb' /usr/bin/clang-9 -I /home/couchbase/server/goproj/src/github.com/couchbase/indexing/secondary/fdb -fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=$WORK/b197=/tmp/go-build -gno-record-gcc-switches -o $WORK/b197/_cgo_.o $WORK/b197/_cgo_main.o $WORK/b197/_x001.o $WORK/b197/_x002.o $WORK/b197/_x003.o $WORK/b197/_x004.o $WORK/b197/_x005.o $WORK/b197/_x006.o $WORK/b197/_x007.o $WORK/b197/_x008.o $WORK/b197/_x009.o $WORK/b197/_x010.o $WORK/b197/_x011.o -L/home/couchbase/server/build/forestdb -L/home/couchbase/server/build/sigar/src -L/home/couchbase/server/build/platform -L/home/couchbase/server/install/lib -L/home/couchbase/server/install/lib -L/home/couchbase/server/install/lib -Wl,-rpath-link=/home/couchbase/server/build/forestdb -Wl,-rpath-link=/home/couchbase/server/build/sigar/src -Wl,-rpath-link=/home/couchbase/server/build/platform -Wl,-rpath-link=/home/couchbase/server/install/lib -Wl,-rpath-link=/home/couchbase/server/install/lib -Wl,-rpath-link=/home/couchbase/server/install/lib -fsanitize=address -fsanitize=undefined -Wl,-rpath=$ORIGIN/../lib -lforestdb -lforestdb -lforestdb
      # github.com/couchbase/indexing/secondary/fdb
      /home/couchbase/server/build/forestdb/libforestdb.so: undefined reference to `__ubsan_handle_function_type_mismatch_v1_abort'
      clang: error: linker command failed with exit code 1 (use -v to see invocation)
      

      What appears to be happening is that, while the linker is told to enable ASan and UBSan (-fsanitize=address -fsanitize=undefined), the UBSan runtime library is not being linked, hence the missing symbols.
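One way to confirm this from the library side is to list the UBSan handler symbols that libforestdb.so leaves undefined. The following is a minimal sketch only: the function name unresolved_ubsan_symbols is hypothetical, and it assumes GNU nm output in which undefined symbols are printed as "U name".

```shell
# Hypothetical helper: list the UBSan handler symbols that a shared library
# expects something else to provide at link time.
# stdin: output of `nm -D --undefined-only <library.so>` (GNU binutils format).
unresolved_ubsan_symbols() {
    # Undefined symbols are printed as "U <name>"; keep only __ubsan_* names.
    awk '$1 == "U" && $2 ~ /^__ubsan_/ { print $2 }' | sort -u
}
```

It could be used as `nm -D --undefined-only /home/couchbase/server/build/forestdb/libforestdb.so | unresolved_ubsan_symbols`. Any name it prints must be supplied by whichever UBSan runtime ends up on the link line.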


          Activity

            drigby Dave Rigby added a comment - - edited

            This seems similar to an issue seen with Clang+TSan previously, where clang links the static version of the sanitizer runtime library, compared to GCC which links the shared one by default - see https://issues.couchbase.com/browse/MB-41896?focusedCommentId=441054&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-441054

            Indeed we observe the same behaviour with UBSan - when compiling with clang-9 we see the static library used:

            $ echo "int main() { return 0;}" > foo.cc
            $ clang++-9 -fsanitize=undefined -v foo.cc 2>&1 |  grep "/usr/bin/ld" | tr ' ' '\n' | grep ubsan
            /usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone-x86_64.a
            --dynamic-list=/usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone-x86_64.a.syms
            /usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.a
            --dynamic-list=/usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.a.syms
            

            Compare this with GCC which uses the dynamic library:

            $ /opt/gcc-10.2.0/bin/g++ -fsanitize=undefined -v foo.cc 2>&1 | tr ' ' '\n' | grep ubsan
            -lubsan
            

            This can be confirmed by looking at which dynamic libraries are linked. GCC:

            $ ldd a.out  | grep ubsan
                    libubsan.so.1 => /opt/gcc-10.2.0/lib64/libubsan.so.1 (0x00007f8262b8e000)
            

            clang-9:

            $ ldd a.out  | grep ubsan
            <EOF>
            

            I don't exactly understand why this is a problem just with Golang programs; we do use clang to link and hence one might assume it would also correctly link the UBSan runtime (statically), however that doesn't seem to be the case.

            Applying the same fix as for TSan (forcing UBSan to be linked dynamically) fixes the missing-symbol issue, but I then run into issues with multiple versions of libubsan being linked.
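The static-versus-shared comparison above can be scripted so that it is easy to repeat for other compilers. This is a minimal sketch, not part of the Couchbase build: the function name classify_ubsan_linkage is hypothetical, and the patterns only cover the runtime names that appear in the output above (libclang_rt.ubsan*.a, -lubsan, libubsan.so).

```shell
# Hypothetical helper: classify how the UBSan runtime appears on a link line.
# stdin: a linker command line, e.g. the line matched by
#        `clang++-9 -fsanitize=undefined -v foo.cc 2>&1 | grep "/usr/bin/ld"`.
# Prints "static", "shared" or "none".
classify_ubsan_linkage() {
    local tokens
    # One argument per line, keeping only those that mention ubsan.
    tokens=$(tr ' ' '\n' | grep ubsan || true)
    if printf '%s\n' "$tokens" | grep -Eq '^[^-].*libclang_rt\.ubsan[^ ]*\.a$'; then
        # A bare path to a .a archive (not a --dynamic-list= option).
        echo static
    elif printf '%s\n' "$tokens" | grep -Eq '^-lubsan$|libclang_rt\.ubsan[^ ]*\.so$|libubsan\.so'; then
        echo shared
    else
        echo none
    fi
}
```

For the two compilers above, this would be expected to report static for clang++-9 and shared for g++ 10.2.0, matching the ldd results.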

            drigby Dave Rigby added a comment - - edited

            So the issue here is different from the TSan problem, although the solution could be the same.

            Clang on Linux uses static linking by default for the sanitizer runtime libraries, including libubsan. Additionally, the static sanitizer runtimes are split into separate .a files for C and C++ code - for example:

            $ ls -l /usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone*x86_64.a
            -rw-r--r-- 1 root root  19510 Jan 31  2020 /usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.a
            -rw-r--r-- 1 root root 841588 Jan 31  2020 /usr/lib/llvm-9/lib/clang/9.0.0/lib/linux/libclang_rt.ubsan_standalone-x86_64.a
            

            The cxx one contains C++ specific support functions (e.g. related to vtable checks). When linking a C program, clang will just link to libclang_rt.ubsan_standalone-x86_64.a; when linking a C++ program it also links to libclang_rt.ubsan_standalone_cxx-x86_64.a.

            When using Clang to link Golang programs which use C++ libraries (e.g. indexer linking to libforestdb.so), Clang only links to the C runtime library, because Golang just invokes ENV(CC), i.e. the C driver (clang-9 in the failing command above) rather than the C++ one. As such, the link fails to find C++ specific symbols such as __ubsan_handle_function_type_mismatch_v1_abort, listed in the original error. This can be seen by examining where that symbol is (and is not) present:

            $ nm --print-file-name libclang_rt.ubsan_standalone-x86_64.a libclang_rt.ubsan_standalone_cxx-x86_64.a | grep __ubsan_handle_function_type_mismatch_v1_abort
            libclang_rt.ubsan_standalone_cxx-x86_64.a:ubsan_handlers_cxx.cc.o:0000000000000000 T __ubsan_handle_function_type_mismatch_v1_abort
            

            The symbol is present in the "cxx" library, not the C one.
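The same lookup can be wrapped up for checking other missing symbols. This is a minimal sketch: archives_defining is a hypothetical name, and it assumes the GNU nm --print-file-name line format shown above (archive:object:address, type, symbol).

```shell
# Hypothetical helper: print the archive(s) that define a given symbol.
# stdin: output of `nm --print-file-name <archive.a> [<archive.a> ...]`
# $1:    symbol name to look for
archives_defining() {
    local sym=$1
    # Keep lines whose last field is the symbol and whose type is T (defined
    # in the text section), then print the archive part before the first ':'.
    awk -v sym="$sym" '$NF == sym && $(NF-1) == "T" { split($1, parts, ":"); print parts[1] }' | sort -u
}
```

It could be used as `nm --print-file-name libclang_rt.ubsan_standalone-x86_64.a libclang_rt.ubsan_standalone_cxx-x86_64.a | archives_defining __ubsan_handle_function_type_mismatch_v1_abort`.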

            drigby Dave Rigby added a comment -

            There is some more discussion of essentially the same issue here: https://github.com/google/oss-fuzz/issues/713

            The solution stated there is to link with CXX when using -fsanitize=undefined if any of the linked objects contain C++ code.
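One possible way to apply that to the Go targets is a small wrapper used as CC that switches to the C++ driver only for UBSan link steps. This is a sketch under stated assumptions and has not been tried against the Couchbase build: choose_driver, CC_REAL and CXX_REAL are hypothetical names, and the clang-9 / clang++-9 defaults are simply the compilers mentioned in this issue.

```shell
# Hypothetical CC wrapper logic: pick the C++ driver when linking with UBSan,
# so that clang also pulls in libclang_rt.ubsan_standalone_cxx.
# CC_REAL / CXX_REAL are assumed variable names, not existing build settings.
choose_driver() {
    local arg ubsan=no linking=yes
    for arg in "$@"; do
        case "$arg" in
            # Compile-only, assemble-only or preprocess-only: no link step.
            -c|-S|-E) linking=no ;;
            # Matches -fsanitize=undefined and lists such as address,undefined.
            -fsanitize=*undefined*) ubsan=yes ;;
        esac
    done
    if [ "$ubsan" = yes ] && [ "$linking" = yes ]; then
        echo "${CXX_REAL:-clang++-9}"
    else
        echo "${CC_REAL:-clang-9}"
    fi
}
```

A wrapper script built on this would end with `exec "$(choose_driver "$@")" "$@"`. Compile steps keep using the C driver so that C sources are not compiled as C++.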


            People

              drigby Dave Rigby
              drigby Dave Rigby