  1. Couchbase Server
  2. MB-6873

exhausted limit of open fds during system tests

Details

    • Bug
    • Resolution: Incomplete
    • Major
    • 2.0
    • 2.0
    • ns_server
    • Security Level: Public
    • build 1808

    Description

      While running 300 queries/sec against 2 buckets in parallel during a rebalance, the rebalance fails. The logs contain the following, along with Mnesia core dumps from the time of the crash.

      =========================CRASH REPORT=========================
      crasher:
      initial call: mochiweb_acceptor:init/3
      pid: <0.32223.614>
      registered_name: []
      exception exit: {error,accept_failed}
      in function mochiweb_acceptor:init/3
      ancestors: [couch_httpd,couch_secondary_services,couch_server_sup,
      cb_couch_sup,ns_server_cluster_sup,<0.59.0>]
      messages: []
      links: [<0.6398.0>]
      dictionary: []
      trap_exit: false
      status: running
      heap_size: 377
      stack_size: 24
      reductions: 218
      neighbours:

      [couchdb:error,2012-10-09T18:29:03.298,ns_1@10.6.2.68:<0.32453.614>:couch_log:error:42]Set view `default`, main group `_design/d1`, doc loader error
      error: {case_clause,{error,emfile}}
      stacktrace: [{couch_db,fast_reads,2},
      {couch_set_view_updater,'-load_changes/7-fun-2-',6},
      {lists,foldl,3},
      {couch_set_view_updater,load_changes,7},
      {couch_set_view_updater,'-update/7-fun-2-',10}]

      [couchdb:error,2012-10-09T18:29:03.299,ns_1@10.6.2.68:<0.32498.614>:couch_log:error:42]Set view `saslbucket`, main group `_design/d11`, doc loader error
      error: {case_clause,{error,emfile}}
      stacktrace: [{couch_db,fast_reads,2},
      {couch_set_view_updater,'-load_changes/7-fun-2-',6},
      {lists,foldl,3},
      {couch_set_view_updater,load_changes,7},
      {couch_set_view_updater,'-update/7-fun-2-',10}]

      [couchdb:error,2012-10-09T18:29:03.303,ns_1@10.6.2.68:<0.6803.0>:couch_log:error:42]Set view `default`, main group `_design/d1`, received error from updater: {case_clause,{error,emfile}}

      .....
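
The `emfile` reason in the errors above is the POSIX `EMFILE` error ("too many open files"), so one way to confirm the diagnosis is to compare the number of descriptors a process holds with its limit. The following is a minimal sketch, assuming a Linux host with `/proc` mounted; the helper name `fd_usage` is hypothetical and not part of Couchbase or ns_server.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fds, soft_limit) for a process, read from Linux /proc.

    soft_limit is None when the limit is reported as "unlimited".
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    # For pid="self" the count includes the descriptor used for this listing.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft limit, hard limit, units.
                field = line.split()[3]
                soft_limit = None if field == "unlimited" else int(field)
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"open fds: {used}, soft limit: {limit}")
```

For the node in this report it could be called as `fd_usage(21800)`, using the `beam.smp` PID from the atop output; reading another process's `/proc/<pid>/fd` normally requires running as the same user or as root.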

      At the time of the crash, atop shows beam.smp at 5.0G (atop log attached):
      atop -m -r /var/log/atop/atop_20121009 -b 18:30 -e 18:30

      ATOP - pine-11803 2012/10/09 18:30:02 ------ 10m0s elapsed
      PRC | sys 12m08s | user 17m59s | #proc 136 | #zombie 0 | #exit 378 |
      CPU | sys 81% | user 148% | irq 20% | idle 108% | wait 7% |
      cpu | sys 23% | user 28% | irq 17% | idle 20% | cpu000 w 2% |
      cpu | sys 20% | user 40% | irq 1% | idle 29% | cpu002 w 2% |
      cpu | sys 20% | user 40% | irq 1% | idle 29% | cpu001 w 2% |
      cpu | sys 20% | user 40% | irq 1% | idle 30% | cpu003 w 2% |
      CPL | avg1 1.79 | avg5 2.06 | avg15 2.56 | csw 29981103 | intr 13722e3 |
      MEM | tot 31.0G | free 152.2M | cache 6.3G | buff 132.7M | slab 614.4M |
      SWP | tot 2.0G | free 2.0G | | vmcom 25.2G | vmlim 17.5G |
      PAG | scan 1469e3 | stall 0 | | swin 0 | swout 0 |
      LVM | Group02-Data | busy 15% | read 18653 | write 393987 | avio 0.27 ms |
      LVM | roup01-Index | busy 1% | read 3048 | write 34554 | avio 0.10 ms |
      LVM | roup-lv_root | busy 0% | read 281 | write 59417 | avio 0.05 ms |
      DSK | xvdc | busy 15% | read 18653 | write 390125 | avio 0.27 ms |
      DSK | xvdb | busy 1% | read 3048 | write 5084 | avio 0.47 ms |
      DSK | xvda | busy 0% | read 281 | write 11485 | avio 0.25 ms |
      NET | transport | tcpi 4961710 | tcpo 5569073 | udpi 0 | udpo 0 |
      NET | network | ipi 4961710 | ipo 5687386 | ipfrw 0 | deliv 4962e3 |
      NET | eth0 ---- | pcki 4397953 | pcko 5122937 | si 28 Mbps | so 52 Mbps |
      NET | lo ---- | pcki 564448 | pcko 564448 | si 5227 Kbps | so 5227 Kbps |

      PID MINFLT MAJFLT VSTEXT VSIZE RSIZE VGROW RGROW MEM CMD
      21836 1405 5 132K 18.3G 18.0G 4112K 4692K 58% memcached
      21800 3124e3 12 1876K 8.2G 5.4G 5.0G 4.8G 17% beam.smp
      21835 399 2 423K 412.2M 6264K 0K 36K 0% moxi
      25706 608 0 148K 17720K 5440K 0K 0K 0% atop
      1253 0 0 139K 12268K 2380K 0K 0K 0% udevd
      1255 0 0 139K 12268K 2372K 0K 0K 0% udevd
      9447 0 0 216K 78704K 1816K 0K 0K 0% pickup
      976 553 0 845K 11560K 1732K 0K 0K 0% xe-daemon
      1210 0 0 141K 78624K 1724K 0K 0K 0% master
      1219 0 0 286K 78876K 1688K 0K 0K 0% qmgr
      21827 214 0 845K 103.6M 1328K 0K 4K 0% sh
      955 2 1 307K 249.1M 1288K 0K 12K 0% rsyslogd
      1232 56 0 52K 114.4M 1288K 0K 0K 0% crond
      21831 0 0 53K 38968K 1132K 0K 0K 0% ssl_esock
      370 0 0 139K 11084K 1032K 0K 0K 0% udevd
      1 0 0 131K 19216K 980K 0K 0K 0% init
      1130 0 0 503K 64024K 968K 0K 0K 0% sshd
      939 0 0 160K 27632K 768K 0K 0K 0% auditd
      887 0 0 543K 9112K 756K 0K 0K 0% dhclient
      1246 0 0 15K 4068K 616K 0K 0K 0% agetty
      21846 0 0 2K 6260K 576K 0K 0K 0% sigar_port
      1249 0 0 11K 4056K 572K 0K 0K 0% mingetty

      netstat shows the other services still listening, but not the web acceptor on port 8091:

      [root@pine-11803 couchbase]# netstat -t -l
      Active Internet connections (only servers)
      Proto Recv-Q Send-Q Local Address Foreign Address State
      tcp 0 0 *:21100 *:* LISTEN
      tcp 0 0 *:epmd *:* LISTEN
      tcp 0 0 *:ssh *:* LISTEN
      tcp 0 0 localhost:smtp *:* LISTEN
      tcp 0 0 *:8092 *:* LISTEN
      tcp 0 0 localhost:44318 *:* LISTEN
      tcp 0 0 *:ssh *:* LISTEN
      tcp 0 0 localhost:smtp *:* LISTEN
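
The same check can be scripted so that it can run repeatedly during a system test. This is a rough sketch, assuming the node is reachable over TCP from wherever the script runs; the helper name `is_listening` is hypothetical, and the port numbers are the ones that appear in this report.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8091 is the web acceptor port; 8092 and 21100 are ports seen in netstat.
    for port in (8091, 8092, 21100):
        state = "listening" if is_listening("127.0.0.1", port) else "not listening"
        print(f"port {port}: {state}")
```

A successful connect only shows that the kernel accepted the connection; it does not prove that the service behind the port is still processing requests.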

          People

            vmx Volker Mische
            tommie Tommie McAfee (Inactive)