Couchbase Server / MB-4765

Erlang dump created on membase server restart

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2, 1.8.0
    • Fix Version/s: 1.8.1
    • Component/s: None
    • Security Level: Public
    • Environment:
      1.7.2

      Description

      After a recent OS update and reboot, the 2-server clusters did not reload from disk successfully, and a membase-server restart resulted in an Erlang error.

      While the cluster was warming up, both nodes responded to a membase-server restart command with:

      {"init terminating in do_boot",badmatch,{error,{shutdown,{ns_server,start,[normal,[]]}},[{init,start_it,1},{init,start_em,1}]}}

      Erlang has closed /opt/membase/lib/erlang/lib/os_mon-2.2.6/priv/bin/memsup: Erlang has closed.
      Crash dump was written to: erl_crash.dump

      Once the warmup finished, the cluster was back in a working state.

      Logs and crash file attached.

      1. erl_crash.dump
        487 kB
        James Mauss
      2. JDPCachePrimary_01112012a.log
        11.88 MB
        James Mauss

        Activity

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Pivotal link is broken.

        Anyway, this is done: 1.8.1 has the reliable shutdown backported from 2.0.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Added a Pivotal story to address this for 1.8.1: https://www.pivotaltracker.com/projects/212245

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Another copy of Erlang was still running.

        This is a known issue, fixed in 2.0: the initscript stop action merely sends a shutdown signal to ns_server without waiting for the actual shutdown. The actual shutdown waits until memcached finishes persisting its data, so it may take some time. As a result, initscript restart doesn't really work in most real-world cases in 1.7 and current 1.8.
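
The difference between "send the shutdown signal and return" and "send the signal and wait for the process to exit" can be illustrated with a minimal Python sketch. This is not the actual Membase/Couchbase initscript or the 2.0 fix. The function names, the default timeout, and the polling approach are all assumptions made for illustration.

```python
import os
import signal
import time


def process_exists(pid):
    """Return True if a process with this PID is still present."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_and_wait(pid, timeout=600.0, poll_interval=1.0,
                  exists=process_exists, kill=os.kill, sleep=time.sleep):
    """Send SIGTERM, then block until the process is gone or the timeout expires.

    Returns True if the process exited, False if it was still running at the
    deadline. The exists/kill/sleep hooks are hypothetical injection points
    so the logic can be exercised without a real server process.
    """
    if not exists(pid):
        return True  # nothing to stop
    kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not exists(pid):
            return True
        sleep(poll_interval)
    return not exists(pid)
```

A restart action built on this idea would start the new server instance only after `stop_and_wait` returns True, which avoids starting a second Erlang VM while the first one is still waiting for data to be persisted.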

        james.mauss James Mauss added a comment -

        The customer would like to know why the erlang dump was being created.


          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            james.mauss James Mauss