Couchbase Server / MB-45800

ASan-UBSan CV jobs failing due to out-of-memory killer on recent "large" CV machines

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.0.0
    • Cheshire-Cat
    • build
    • Triaged
    • 1
    • Yes

    Description

      Recent ASan-UBSan CV jobs on the newly added "large" CV machines have been failing because the build runs out of memory during linking, for example:

      05:40:02  [1261/1337] Linking CXX executable kv_engine/memcached_mcbp_test
      05:40:02  FAILED: kv_engine/memcached_mcbp_test 
      ...
      05:40:02  clang: error: unable to execute command: Killed
      05:40:02  clang: error: linker command failed due to signal (use -v to see invocation)
      

      URL: http://cv.jenkins.couchbase.com/job/kv_engine.ASan-UBSan/job/master/17886/consoleFull

      The machine in question (ubuntu18-cv-large-07) has 16 cores but only 16GB RAM. With the PARALLELISM env var set to 16, there are potentially 16 ld link processes running concurrently. The oom-killer dmesg logs show multiple instances where the killed linker processes had >1.5GB RSS at the point they were killed; 16 such processes would need more than 24GB in total, well beyond the 16GB available.
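
The arithmetic above can be sketched as a small helper that bounds the number of concurrent link jobs by available memory as well as by core count. This is only an illustration, not the fix that was applied for this issue: the function name `safe_link_jobs`, the 2GB reserve for the rest of the system, and the use of 1.5GB as the per-link figure (the logs only give it as a lower bound) are all assumptions.

```python
def safe_link_jobs(cores: int, ram_gb: float, per_link_gb: float,
                   reserve_gb: float = 2.0) -> int:
    """Return a link parallelism bounded by both cores and memory.

    Hypothetical helper: reserve_gb is an assumed allowance for the OS
    and other build processes, not a measured value.
    """
    if per_link_gb <= 0:
        raise ValueError("per_link_gb must be positive")
    # Memory left for linkers after the assumed reserve.
    usable_gb = max(ram_gb - reserve_gb, 0.0)
    # How many linkers of the given size fit in that memory.
    by_memory = int(usable_gb // per_link_gb)
    # Never exceed the core count, and always allow at least one job.
    return max(1, min(cores, by_memory))


if __name__ == "__main__":
    # Figures from the description: 16 cores, 16GB RAM, >1.5GB RSS per linker.
    print(safe_link_jobs(cores=16, ram_gb=16, per_link_gb=1.5))  # prints 9
```

Because 1.5GB is a lower bound on the observed RSS, the real safe value is likely lower than this estimate. A limit of this kind could be applied only to link steps, for example through a Ninja job pool configured via CMake's `JOB_POOLS` property and `CMAKE_JOB_POOL_LINK` variable, rather than by lowering PARALLELISM for the whole build.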

          People

            drigby Dave Rigby (Inactive)