Couchbase Server / MB-47591

Incorrect cgroup resource detection (core/memory) when running in Container / K8S

Details

    • Triaged
    • Yes

    Description

      We need to calculate the correct number of cores configured by the container environment and ensure that all services use it. Since ns_server bootstraps all services, we'll need to add logic to detect that number and pass on the proper args/env as appropriate to the type of runtime each service uses.

      Most of our services use some flavor of managed virtual machine (Java, Golang, Erlang). These runtimes are the components that automatically detect the number of cores, based on whichever calls they make to the OS. Some offer a way to manually force the number of cores available to the virtual machine. Each service will need to provide bootstrapping guidance on how to limit its own runtime. We can then add a wrapper to the bootstrapping that detects whether cgroups are in use, calculates the cores made available to the container, and passes that number to each service's startup script using the proper arguments to restrict the number of cores. Since most services are started by the babysitter, this can be done centrally by ns_server.
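      The detection step described above could be sketched roughly as follows. This is a minimal illustration, not the ns_server implementation: the function names are hypothetical, the paths are the typical cgroup v2 and v1 mount points inside a container (an assumption that may not hold on every host), and rounding a fractional quota up to a whole core is a design choice made here for the example.

```python
import math
from pathlib import Path
from typing import Optional


def cores_from_cpu_max(text: str) -> Optional[int]:
    """Parse cgroup v2 cpu.max content: '<quota> <period>' or 'max <period>'."""
    fields = text.split()
    if len(fields) != 2 or fields[0] == "max":
        return None  # no CPU limit configured
    quota, period = int(fields[0]), int(fields[1])
    if quota <= 0 or period <= 0:
        return None
    # Round a fractional allowance (e.g. 1.5 CPUs) up, never below one core.
    return max(1, math.ceil(quota / period))


def cores_from_cfs(quota_text: str, period_text: str) -> Optional[int]:
    """Parse cgroup v1 cpu.cfs_quota_us and cpu.cfs_period_us (quota -1 = unlimited)."""
    quota, period = int(quota_text), int(period_text)
    if quota <= 0 or period <= 0:
        return None
    return max(1, math.ceil(quota / period))


def read_text(path: str) -> Optional[str]:
    """Return file content, or None if the file is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def container_cpu_limit(root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the core limit imposed by cgroups, or None if none is detected."""
    v2 = read_text(f"{root}/cpu.max")  # assumed cgroup v2 location
    if v2 is not None:
        return cores_from_cpu_max(v2)
    quota = read_text(f"{root}/cpu/cpu.cfs_quota_us")  # assumed cgroup v1 location
    period = read_text(f"{root}/cpu/cpu.cfs_period_us")
    if quota is not None and period is not None:
        return cores_from_cfs(quota, period)
    return None
```

      When `container_cpu_limit()` returns `None`, a wrapper would presumably fall back to whatever core count the OS reports.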

      Java: command line -XX:ActiveProcessorCount=<cores>
      Golang: environment variable GOMAXPROCS=<cores>
      Erlang: command line +S <cores>:<cores>
      C/C++: No virtual machine. cgroup detection logic needs to be prescribed in code.
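      As an illustration of how a central launcher might apply the per-runtime settings listed above, here is a small sketch. The function name, the runtime labels, and the idea of returning an (args, env) pair are all hypothetical; only the flag and variable names come from the list above.

```python
from typing import Dict, List, Tuple


def runtime_cpu_settings(runtime: str, cores: int) -> Tuple[List[str], Dict[str, str]]:
    """Return (extra command-line args, extra environment) that cap a runtime at `cores`."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if runtime == "java":
        return [f"-XX:ActiveProcessorCount={cores}"], {}
    if runtime == "golang":
        return [], {"GOMAXPROCS": str(cores)}
    if runtime == "erlang":
        # +S <schedulers>:<schedulers online>
        return ["+S", f"{cores}:{cores}"], {}
    if runtime == "native":
        # C/C++ services have no VM switch; they must detect the limit themselves.
        return [], {}
    raise ValueError(f"unknown runtime: {runtime}")


# Example: settings for a Go-based service limited to 4 cores.
args, env = runtime_cpu_settings("golang", 4)
```

      Keeping the mapping in one place matches the suggestion that the babysitter can apply the limit centrally rather than each service reimplementing it.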

      It is important to note that the number of threads each service creates may or may not be derived from the number of available cores, so each service will need to adjust its thread allocation logic. Erlang-based processes (ns_server, couchdb) run one scheduler thread per core by default, so adjusting the number of cores will also change the number of threads.
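      A service that sizes its thread pools from the core count might do something like the sketch below. It is only an illustration: the variable name COUCHBASE_CPU_COUNT is taken from a commit message in this ticket's activity, but its exact semantics are assumed here, and the helper names, the per-core multiplier, and the ceiling of 64 are hypothetical.

```python
import os
from typing import Mapping, Optional


def effective_cpu_count(env: Optional[Mapping[str, str]] = None,
                        var: str = "COUCHBASE_CPU_COUNT") -> int:
    """Prefer a core count passed in by the launcher; otherwise ask the OS."""
    env = os.environ if env is None else env
    raw = env.get(var, "").strip()
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    # Fallback: the host's view, which may ignore container limits.
    return os.cpu_count() or 1


def worker_threads(cores: int, per_core: int = 1,
                   floor: int = 1, ceiling: int = 64) -> int:
    """Derive a thread-pool size from the core count, clamped to [floor, ceiling]."""
    return max(floor, min(ceiling, cores * per_core))
```

      For example, `worker_threads(effective_cpu_count(), per_core=2)` would give two threads per allowed core, up to the ceiling.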

          Activity

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-7.2.0-1120 contains ns_server commit 4878689 with commit message:
            MB-47591: [sigar] Take care of paddings when...

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-7.2.0-1120 contains ns_server commit 739ae47 with commit message:
            MB-47591: [BP] Set the number of schedulers for ...

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-7.2.0-1120 contains ns_server commit af54ca2 with commit message:
            MB-47591: [BP] [babysitter] Set COUCHBASE_CPU_COUNT when starting

            build-team (Couchbase Build Team) added a comment:
            Build couchbase-server-7.2.0-1120 contains ns_server commit b203ea1 with commit message:
            MB-47591: [BP] Move basic sigar functions to sigar.erl ...

            sumedh.basarkod (Sumedh Basarkod, Inactive) added a comment:
            Closing it based on the weekly-build tests' results.

            People

              sumedh.basarkod Sumedh Basarkod (Inactive)
              meni.hillel Meni Hillel (Inactive)