Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.0-dp5
    • Fix Version/s: None
    • Component/s: library
    • Security Level: Public
    • Labels:
      None
    • Environment:
      CentOS 5.5, Apache 2.2, mod_php, PHP5.3, Couchbase 2.0dp5

      Description

      (See discussion @ https://groups.google.com/forum/#!topic/couchbase/z5PWM5lS7gU)

      I've been getting a "Too many files open" warning while accessing pages in my site. I tracked it down to this process using LSOF

      httpd 25992 httpduser 98u unix 0xffff81006cbcf4c0 0t0 6019192 socket
      httpd 25992 httpduser 99u unix 0xffff810074b160c0 0t0 6019193 socket
      httpd 25992 httpduser 100r 0000 0,11 0 6063613 eventpoll
      httpd 25992 httpduser 101u unix 0xffff8100701f66c0 0t0 6063614 socket
      httpd 25992 httpduser 102u unix 0xffff8100701f7a00 0t0 6063615 socket
      httpd 25992 httpduser 103r 0000 0,11 0 6064737 eventpoll
      httpd 25992 httpduser 104u unix 0xffff8100666eb6c0 0t0 6064738 socket
      httpd 25992 httpduser 105u unix 0xffff8100666eb980 0t0 6064739 socket
      httpd 25992 httpduser 108r 0000 0,11 0 6064793 eventpoll

      Everytime I refresh a page, two new sockets and one new eventpoll shows up. These socket/eventpoll's only show up on pages that make use of Coucbase – in fact, in the email thread I narrowed it down to a single command page, so I'm pretty sure it's CB here. These FD's stick around until the apache thread is restarted (due to MaxRequestsPerChild limit reached) or an httpd restart. Eventually, the number of FD's opened causes things to go very poorly for the server.

      Utilizing persistent connections lessened the problem, but the number never went down/static, just increased more slowly.

      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        chuyskywalker Jeff Minard created issue -
        Hide
        mnunberg Mark Nunberg added a comment - - edited

        I can confirm this is happening even on a CLI script. Run this script (php 5.4, debian wheezy), and lsof -p on the php process.

        This seems to be an issue in libcouchbase itself and not in the php code.

        Upon deeper analysis, the leaking fds seem to come from some odd socketpair call that libevent makes for no apparently good reason.

        It would seem libcouchbase never properly destroys the 'event_base' object, and the socket pairs keep persisting until the application exits.

        (gdb) bt
        #0 socketpair () at ../sysdeps/unix/syscall-template.S:82
        #1 0x00007ffff3568181 in evsig_init () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
        #2 0x00007ffff35677c0 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
        #3 0x00007ffff3556645 in event_base_new_with_config () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
        #4 0x00007ffff3556765 in event_base_new () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
        #5 0x00007ffff378b605 in lcb_create_libevent_io_opts ()
        from /sources/libcouchbase/inst/lib/libcouchbase_libevent.so.2
        #6 0x00007ffff3bb9dcc in lcb_create_io_ops () from /sources/libcouchbase/inst/lib/libcouchbase.so.2
        #7 0x00007ffff3dd463d in php_couchbase_create_impl (ht=<optimized out>, return_value=0x7ffff7fbf078,
        this_ptr=0x7ffff7fbd2e8, oo=1,
        return_value_used=<error reading variable: Unhandled dwarf expression opcode 0xfa>,
        return_value_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>)
        at /sources/lcb2.0-repo/php/couchbase.c:1708
        #8 0x00007ffff3ff2edc in xdebug_execute_internal (current_execute_data=0x7ffff7f86060, return_value_used=0)
        at /srv/debian_developer/xdebug/xdebug-2.2.1/build-php5/xdebug.c:1483
        #9 0x000000000074688e in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7f86060)
        at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:644
        #10 0x0000000000700297 in execute (op_array=0x7ffff7fbdbc8) at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:410
        #11 0x00007ffff3ff2a81 in xdebug_execute (op_array=0x7ffff7fbdbc8)
        at /srv/debian_developer/xdebug/xdebug-2.2.1/build-php5/xdebug.c:1391
        #12 0x00000000006a00de in zend_execute_scripts (type=8, retval=0x7ffff7fbdb90, file_count=3)
        at /tmp/buildd/php5-5.4.4/Zend/zend.c:1279
        #13 0x000000000063f6b3 in php_execute_script (primary_file=0x7fffffff9f50)
        at /tmp/buildd/php5-5.4.4/main/main.c:2473
        #14 0x0000000000749003 in do_cli (argc=0, argv=0x7fffffffe40c) at /tmp/buildd/php5-5.4.4/sapi/cli/php_cli.c:988
        #15 0x00000000004310ca in main (argc=32767, argv=0xdb9210) at /tmp/buildd/php5-5.4.4/sapi/cli/php_cli.c:1361

        Show
        mnunberg Mark Nunberg added a comment - - edited I can confirm this is happening even on a CLI script. Run this script (php 5.4, debian wheezy), and lsof -p on the php process. This seems to be an issue in libcouchbase itself and not in the php code. Upon deeper analysis, the leaking fds seem to come from some odd socketpair call that libevent makes for no apparently good reason. It would seem libcouchbase never properly destroys the 'event_base' object, and the socket pairs keep persisting until the application exits. (gdb) bt #0 socketpair () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007ffff3568181 in evsig_init () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #2 0x00007ffff35677c0 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #3 0x00007ffff3556645 in event_base_new_with_config () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #4 0x00007ffff3556765 in event_base_new () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #5 0x00007ffff378b605 in lcb_create_libevent_io_opts () from /sources/libcouchbase/inst/lib/libcouchbase_libevent.so.2 #6 0x00007ffff3bb9dcc in lcb_create_io_ops () from /sources/libcouchbase/inst/lib/libcouchbase.so.2 #7 0x00007ffff3dd463d in php_couchbase_create_impl (ht=<optimized out>, return_value=0x7ffff7fbf078, this_ptr=0x7ffff7fbd2e8, oo=1, return_value_used=<error reading variable: Unhandled dwarf expression opcode 0xfa>, return_value_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>) at /sources/lcb2.0-repo/php/couchbase.c:1708 #8 0x00007ffff3ff2edc in xdebug_execute_internal (current_execute_data=0x7ffff7f86060, return_value_used=0) at /srv/debian_developer/xdebug/xdebug-2.2.1/build-php5/xdebug.c:1483 #9 0x000000000074688e in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7f86060) at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:644 #10 0x0000000000700297 in execute (op_array=0x7ffff7fbdbc8) at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:410 #11 0x00007ffff3ff2a81 in xdebug_execute (op_array=0x7ffff7fbdbc8) at /srv/debian_developer/xdebug/xdebug-2.2.1/build-php5/xdebug.c:1391 #12 0x00000000006a00de in zend_execute_scripts (type=8, retval=0x7ffff7fbdb90, file_count=3) at /tmp/buildd/php5-5.4.4/Zend/zend.c:1279 #13 0x000000000063f6b3 in php_execute_script (primary_file=0x7fffffff9f50) at /tmp/buildd/php5-5.4.4/main/main.c:2473 #14 0x0000000000749003 in do_cli (argc=0, argv=0x7fffffffe40c) at /tmp/buildd/php5-5.4.4/sapi/cli/php_cli.c:988 #15 0x00000000004310ca in main (argc=32767, argv=0xdb9210) at /tmp/buildd/php5-5.4.4/sapi/cli/php_cli.c:1361
        Hide
        mnunberg Mark Nunberg added a comment - - edited

        I was right about the problem, but not about the real cause.

        The real cause is that the php extension creates the iops structure manually, and does not manually clean it up. Therefore the explicit creation of the iops structure causes a memory leak as well as a fd leak.

        Ideally our extension code should never be dealing with the iops structure directly, which will be done in subsequent refactors.

        This was not the case in the 1.0.x versions of libcouchbase where lcb would just assume the iops structure needed to be destroyed and freed it.

        Show
        mnunberg Mark Nunberg added a comment - - edited I was right about the problem, but not about the real cause. The real cause is that the php extension creates the iops structure manually, and does not manually clean it up. Therefore the explicit creation of the iops structure causes a memory leak as well as a fd leak. Ideally our extension code should never be dealing with the iops structure directly, which will be done in subsequent refactors. This was not the case in the 1.0.x versions of libcouchbase where lcb would just assume the iops structure needed to be destroyed and freed it.
        Hide
        mnunberg Mark Nunberg added a comment -

        http://review.couchbase.org/22255

        You can apply this patch manually if you'd like as a temporary workaround.

        Show
        mnunberg Mark Nunberg added a comment - http://review.couchbase.org/22255 You can apply this patch manually if you'd like as a temporary workaround.
        mnunberg Mark Nunberg made changes -
        Field Original Value New Value
        Assignee Matt Ingenthron [ ingenthr ] Mark Nunberg [ mnunberg ]
        mnunberg Mark Nunberg made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Hide
        mnunberg Mark Nunberg added a comment -

        For the record, in output like this:

        httpd 25992 httpduser 105u unix 0xffff8100666eb980 0t0 6064739 socket

        The fifth column indicates the underlying fd type. In this case, it's a unix socket.

        If it were a TCP socket it'd say "IPv4" (or "IPv6") - and that gave me the hint, as libcouchbase on its own does not utilize unix domain sockets.

        Show
        mnunberg Mark Nunberg added a comment - For the record, in output like this: httpd 25992 httpduser 105u unix 0xffff8100666eb980 0t0 6064739 socket The fifth column indicates the underlying fd type. In this case, it's a unix socket. If it were a TCP socket it'd say "IPv4" (or "IPv6") - and that gave me the hint, as libcouchbase on its own does not utilize unix domain sockets.
        Hide
        ingenthr Matt Ingenthron added a comment -

        Does a similar fix need to be made in 1.0.x? I peeked at the code briefly, but enough has changed that I couldn't quite tell.

        Show
        ingenthr Matt Ingenthron added a comment - Does a similar fix need to be made in 1.0.x? I peeked at the code briefly, but enough has changed that I couldn't quite tell.
        Hide
        mnunberg Mark Nunberg added a comment -

        No, this issue is specific to the lcb 2.0 iops refactor

        Show
        mnunberg Mark Nunberg added a comment - No, this issue is specific to the lcb 2.0 iops refactor
        Hide
        ingenthr Matt Ingenthron added a comment -

        Good news, thanks!

        Show
        ingenthr Matt Ingenthron added a comment - Good news, thanks!
        Hide
        mnunberg Mark Nunberg added a comment -

        I'm marking this as resolved/fixed. Reopen if not working.

        Show
        mnunberg Mark Nunberg added a comment - I'm marking this as resolved/fixed. Reopen if not working.
        mnunberg Mark Nunberg made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        brett19 Brett Lawson made changes -
        Workflow jira [ 21536 ] Couchbase SDK Workflow [ 44349 ]

          People

          • Assignee:
            mnunberg Mark Nunberg
            Reporter:
            chuyskywalker Jeff Minard
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes