  1. Couchbase Server
  2. MB-8376

[Doc'd] Separate service babysitting into its own Erlang VM, so that memcached (and user data) survives a crash of ns_server/couchdb

    Details

      Description

      "Currently, ns_server spawns memcached, and if ns_server crashes, memcached dies and loses data.

      It's possible to have a separate OS process that spawns ns_server, memcached, and moxi. That process will do minimal work, so it is much less likely to crash.

      It might also allow ns_server to change its Erlang node name on Windows."

      http://www.pivotaltracker.com/story/show/15956287


        Activity

        kzeller added a comment -

        See questions above.

        Dipti Borkar added a comment -

        It's the same as the two beam processes, but I did not see that explained.

        All the details are in the links above. Re-pasting them here: https://github.com/couchbase/ns_server/blob/master/CHANGES#L32 and https://github.com/couchbase/ns_server/blob/2.0.2/doc/some-babysitting-details.txt

        I found the second link quite helpful, but the content cannot be used as is. It needs to be simplified.

        kzeller added a comment -

        Got it. Updating now.

        kzeller added a comment -

        Changed RN to:

        Previously, a single process was responsible for monitoring and managing all the other underlying server processes,
        including Moxi and memcached, as well as statistics gathering.

        Now there are two processes. One is responsible only for Moxi and memcached, and the other is responsible for monitoring all other processes.
        This should help prevent the max_restart_intensity failures that were seen when timeouts occurred and temporarily disrupted the server. The most noticeable
        change with this fix is that there are now two beam.smp processes running on Linux and two erl.exe processes running
        on Windows. For more details, see <xref linkend="couchbase-underlying-processes" />.
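For readers unfamiliar with the term: in Erlang/OTP, a supervisor gives up and shuts down when its children restart more than a set number of times within a set period. The following Python sketch illustrates that sliding-window idea only. The class name `RestartIntensity` and its parameters are hypothetical and are not taken from ns_server.

```python
import time
from collections import deque


class RestartIntensity:
    """Sliding-window restart limiter (illustrative only, not ns_server code)."""

    def __init__(self, max_restarts, period_seconds, clock=time.monotonic):
        self.max_restarts = max_restarts      # restarts tolerated per window
        self.period_seconds = period_seconds  # length of the window in seconds
        self.clock = clock                    # injectable clock, handy for tests
        self._restarts = deque()              # timestamps of recent restarts

    def record_restart(self):
        """Record one restart; return False once the limit is exceeded."""
        now = self.clock()
        self._restarts.append(now)
        # Forget restarts that have fallen out of the window.
        while self._restarts and now - self._restarts[0] > self.period_seconds:
            self._restarts.popleft()
        return len(self._restarts) <= self.max_restarts
```

With `RestartIntensity(3, 10)`, the fourth restart inside ten seconds makes `record_restart()` return `False`, which is the point at which a supervisor would stop trying and fail itself.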

        Updated Content in "Monitoring Couchbase: Underlying Server Processes" to:

        There are several server processes that run constantly in Couchbase Server, whether or not the server is actively handling reads and writes or other operations from a client application. Right after you start up a node, you may notice a spike in CPU utilization, after which the utilization rate plateaus at some level greater than zero. The following processes run continuously on your node:

        beam.smp on Linux, erl.exe on Windows: These processes are responsible for monitoring and managing all other underlying server processes, such as ongoing XDCR replications, cluster operations, and views. Prior to 2.1.0, a single process handled memcached, Moxi, and the monitoring of all server processes. This resulted in server disruption and crashes due to lack of memory.
        As of Couchbase Server 2.1.0, there is a separate monitoring/babysitting process running on each node. The process is small and simple and therefore unlikely to crash due to lack of memory. It is responsible for spawning and monitoring the second, larger process for cluster management, XDCR, and views. It also spawns and monitors the processes for Moxi and memcached. If any of these three processes fails, the monitoring process re-spawns it.
        The main benefit of this approach is that an Erlang VM crash will not cause the Moxi and memcached processes to crash as well. You will also see two beam.smp processes running on Linux, or two erl.exe processes running on Windows.
        The log file for this monitoring process is ns_server.babysitter.log, which you can collect with cbcollect_info. See .

        memcached: This process is responsible for caching items in RAM and persisting them to disk.

        moxi: This process enables third-party memcached clients to connect to the server.
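The spawn-and-respawn behavior described above can be sketched in a few lines of Python. This is an illustration of the general pattern only, not the actual babysitter (which is an Erlang VM). The function `babysit`, the `max_respawns` cap, and `DEMO_COMMAND` are hypothetical names introduced here, and the cap exists only so the sketch terminates.

```python
import subprocess
import sys
import time

# Stand-in child command that exits immediately (assumption, for demonstration).
DEMO_COMMAND = [sys.executable, "-c", "pass"]


def babysit(commands, max_respawns=3, poll_seconds=0.05):
    """Spawn each command and respawn it whenever it exits.

    `commands` maps a service name to an argv list. A real babysitter
    would loop forever; this sketch stops respawning a service after
    `max_respawns` and returns the respawn counts per service.
    """
    children = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    respawns = {name: 0 for name in commands}
    while children:
        for name, child in list(children.items()):
            if child.poll() is None:
                continue  # child is still running, nothing to do
            if respawns[name] < max_respawns:
                respawns[name] += 1
                children[name] = subprocess.Popen(commands[name])
            else:
                del children[name]  # give up on this service
        time.sleep(poll_seconds)
    return respawns
```

Calling `babysit({"memcached": DEMO_COMMAND}, max_respawns=2)` starts the stand-in child, restarts it twice as it exits, and then returns. Because the supervising loop does so little work, it has few ways to fail, which is the design argument made in the text above.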

        kzeller added a comment -

        In production for publish:
        RN: http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-server-rn_2-1-0a.html
        and http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-underlying-processes.html

          People

          • Assignee:
            kzeller
            Reporter:
            Dipti Borkar