Couchbase Server / MB-7596

[system test] memsup crashed due to os_mon_sysinfo timeout

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment:
      Windows 2008 R2 64bit

      Description

      Environment:

      • 9 nodes running Windows 2008 R2 64-bit.
      • Each server has 4 CPUs, 8 GB RAM, and an SSD disk.
      • The cluster has 2 buckets, default and a sasl bucket, with consistent views enabled.
      • Loaded 26 million items into the default bucket and 16 million items into the sasl bucket. Each key is 128 to 512 bytes in size.
      • Each bucket has one design doc, with 2 views per design doc.
      • Rebalanced out 2 nodes, 10.3.121.173 and 10.3.121.243.
      • Rebalance failed after it finished the default bucket and moved on to the second (sasl) bucket.
      • Filed bug MB-7590 for that failure.
      • Rebalanced again. Rebalance completed.
      • Added node 10.3.121.243 back to the cluster and rebalanced. Rebalance failed again with error "Resetting rebalance status since it's not really running"

      Checked diags and saw many crashes in memsup:

      =========================CRASH REPORT=========================
      crasher:
      initial call: memsup:init/1
      pid: <0.23937.236>
      registered_name: memsup
      exception exit: {timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}
      in function gen_server:terminate/6
      ancestors: [os_mon_sup,<0.31883.45>]
      messages: []
      links: [<0.31884.45>]
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 377
      stack_size: 24
      reductions: 199
      neighbours:
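To put a number on "many crashes", the crash reports in a diag dump can be tallied by process name and exit reason. The following is a minimal sketch, not part of the original report: the function name `count_crashes` and the `SAMPLE` text are hypothetical, and it assumes each report starts with a `CRASH REPORT` header line and carries single-line `registered_name:` and `exception exit:` fields like the report above.

```python
import re
from collections import Counter

# Patterns are based only on the crash report format shown in this issue.
CRASH_HEADER = re.compile(r"=+CRASH REPORT=+")
REGISTERED_NAME = re.compile(r"^\s*registered_name:\s*(\S+)", re.MULTILINE)
EXIT_REASON = re.compile(r"^\s*exception exit:\s*(.+)$", re.MULTILINE)


def count_crashes(diag_text):
    """Return a Counter keyed by (registered_name, first line of exit reason)."""
    counts = Counter()
    # Everything before the first header is not a crash report, so skip it.
    for chunk in CRASH_HEADER.split(diag_text)[1:]:
        name = REGISTERED_NAME.search(chunk)
        reason = EXIT_REASON.search(chunk)
        key = (
            name.group(1) if name else "<unknown>",
            # Only the first line of a multi-line exit reason is kept.
            reason.group(1).strip() if reason else "<unknown>",
        )
        counts[key] += 1
    return counts


# Hypothetical input: the report from this issue, repeated twice.
REPORT = """=========================CRASH REPORT=========================
crasher:
initial call: memsup:init/1
pid: <0.23937.236>
registered_name: memsup
exception exit: {timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}
in function gen_server:terminate/6
ancestors: [os_mon_sup,<0.31883.45>]
"""
SAMPLE = REPORT * 2

if __name__ == "__main__":
    for (name, reason), n in count_crashes(SAMPLE).most_common():
        print(n, name, reason)
```

Grouping by exit reason as well as name separates the `os_mon_sysinfo` timeouts from any unrelated memsup crashes in the same dump.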

      Link to manifest file: http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.1-140-rel.setup.exe.manifest.xml

      Link to collect_info of all nodes: https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_1/201301/9nodes-col-201-140-rebalance-failed-not-really-running-20130124-141814.tgz


          Activity

          farshid Farshid Ghods (Inactive) added a comment -

          Per bug scrub with Alk and Siri: Tony is going to set up a 4-node Windows physical cluster to see if these issues are reproducible in that environment.

          farshid Farshid Ghods (Inactive) added a comment -

          MB-7658

          maria Maria McDuff (Inactive) added a comment -

          Closing as dupes.

            People

            • Assignee: ketaki Ketaki Gangal
            • Reporter: thuan Thuan Nguyen
            • Votes: 0
            • Watchers: 2
