  1. Couchbase Server
  2. MB-4765

Erlang dump created on membase server restart

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2, 1.8.0
    • Fix Version/s: 1.8.1
    • Component/s: None
    • Security Level: Public
    • Environment:
      1.7.2

      Description

      After a recent OS update and reboot, the 2-server clusters did not reload from disk successfully, and a membase-server restart resulted in an Erlang error.

      While the cluster was warming up, both nodes responded to a membase-server restart command with:

      {"init terminating in do_boot",badmatch,{error,{shutdown,{ns_server,start,[normal,[]]}},[{init,start_it,1},{init,start_em,1}]}}

      Erlang has closed /opt/membase/lib/erlang/lib/os_mon-2.2.6/priv/bin/memsup: Erlang has closed.
      Crash dump was written to: erl_crash.dump

      Once the warmup finished the cluster was back in a working state.

      Logs and crash file attached.

      1. erl_crash.dump
        487 kB
        James Mauss
      2. JDPCachePrimary_01112012a.log
        11.88 MB
        James Mauss

        Activity

        james.mauss James Mauss created issue -
        james.mauss James Mauss made changes -
        Field Original Value New Value
        Assignee Aleksey Kondratenko [ alkondratenko ]
        james.mauss James Mauss added a comment -

        The customer would like to know why the Erlang dump was created.

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Another copy of Erlang was still running.

        We have a known issue, fixed in 2.0, where the initscript stop action merely sends a shutdown signal to ns_server without waiting for the actual shutdown. The actual shutdown waits until memcached finishes persisting its data, so it may take time. Thus initscript restart doesn't really work in most real-world cases in 1.7 and the current 1.8.
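
The stop-then-wait behaviour described in this comment can be sketched as follows. This is only an illustration of the general idea, not the actual Couchbase initscript or the 2.0 fix: the function names (`pid_is_running`, `stop_and_wait`, `restart`), the 600-second default timeout, and the polling approach are all assumptions made for the example.

```python
import os
import signal
import time


def pid_is_running(pid):
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_and_wait(pid, timeout=600.0, poll_interval=1.0,
                  send_signal=os.kill, is_running=pid_is_running):
    """Send SIGTERM, then block until the process exits or the timeout expires.

    Returns True if the process is gone, False if it is still running.
    The timeout default is an arbitrary assumption; persisting data to disk
    may take longer on a real node.
    """
    if not is_running(pid):
        return True
    send_signal(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not is_running(pid):
            return True
        time.sleep(poll_interval)
    return not is_running(pid)


def restart(pid, start, **wait_options):
    """Start the new instance only after the old one has really exited."""
    if not stop_and_wait(pid, **wait_options):
        raise RuntimeError("old process %d is still shutting down" % pid)
    return start()
```

A stop action that returns right after sending the signal corresponds to skipping the polling loop, which is how a restart can end up launching a second Erlang VM while the first one is still alive.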

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Added a Pivotal story to address that for 1.8.1: https://www.pivotaltracker.com/projects/212245

        dipti Dipti Borkar made changes -
        Project Couchbase Support/Engineering Tasks [ 10061 ] Couchbase Server [ 10010 ]
        Key CBSE-93 MB-4765
        Security Public [ 10011 ]
        dipti Dipti Borkar made changes -
        Fix Version/s 1.8.1 [ 10249 ]
        Affects Version/s 1.8.0 [ 10248 ]
        Affects Version/s 1.7.2 [ 10203 ]
        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        The Pivotal link is broken.

        Anyway, this is done: 1.8.1 has the reliable shutdown backported from 2.0.

        alkondratenko Aleksey Kondratenko (Inactive) made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 1.8.1 [ 10295 ]
        Fix Version/s 1.8.2 [ 10249 ]
        Resolution Fixed [ 1 ]
        farshid Farshid Ghods (Inactive) made changes -
        Labels 1.8.1-release-notes

          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            james.mauss James Mauss
          • Votes:
            0
            Watchers:
            3
