  1. Couchbase Server
  2. MB-7602

Restarting couchbase-server when a node is out of disk space leaves two beam.smp processes running because the old one refuses to shut down (happens in the key-value + XDCR use case on the source cluster)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 2.1.0
    • Affects Version/s: 2.0
    • Component/s: ns_server
    • Security Level: Public

    Description

      Reproducing this issue on a 2.0 cluster where XDCR is actively replicating data from a 2-node 2.0 cluster to another cluster.

      8395 couchbas 20 0 1085m 609m 6288 S 77 15.4 320:54.16 beam.smp
      8433 couchbas 20 0 357m 196m 2928 S 25 5.0 49:11.12 memcached

      While memcached and beam.smp were both using CPU and disk, I ran the following dd command to fill up the disk space:
      dd if=/dev/zero of=~/10g.img bs=1000000 count=18000
      After the disk is full, attempt to stop couchbase-server.
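
The "disk is full" condition in the step above can also be checked from a script rather than by reading df output. This is a minimal sketch, not part of Couchbase: the helper names `free_bytes` and `is_out_of_space` are hypothetical, and it only assumes Python's standard `shutil.disk_usage`, whose `free` field on Unix reports the space available to unprivileged users.

```python
import shutil


def free_bytes(path="/"):
    """Return the bytes available to unprivileged users on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def is_out_of_space(path="/", min_free=0):
    """Return True when the filesystem holding `path` has `min_free` bytes or fewer available."""
    return free_bytes(path) <= min_free


if __name__ == "__main__":
    # "/" is used here only so the sketch runs anywhere; point it at the
    # Couchbase data or logs directory on a real node.
    print("free bytes:", free_bytes("/"))
    print("out of space:", is_out_of_space("/"))
```

Running this before and after the dd step shows the available space dropping; `is_out_of_space` returns True once nothing is left for unprivileged users.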

      When the node runs out of disk space (0 bytes available according to df -h), the node goes to pending but beam.smp is still running:
      8395 couchbas 20 0 832m 484m 6288 S 10 12.2 321:31.58 beam.smp

      farshid@ubuntu-1004-x64-307:/opt/couchbase/var/lib/couchbase/data$ sudo /etc/init.d/couchbase-server restart
      NOTE: shutdown failed
      {badrpc,
       {'EXIT',
        {{badmatch,
          {error,
           {file_error,"/opt/couchbase/var/lib/couchbase/logs/info",enospc}}},
         [{'ale_logger-ns_server',info,4},
          {ns_bootstrap,stop,0},
          {rpc,'-handle_call_call/6-fun-0-',5}]}}}
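
The `enospc` atom in the error above is the POSIX ENOSPC ("no space left on device") error, raised here while the stop path tried to write to the info log. As an illustration of the general failure mode only, and not of ns_server's code or of the eventual fix, the sketch below shows a log write that recognizes ENOSPC and reports it instead of letting it abort the caller. The names `is_disk_full_error` and `append_log_line` are hypothetical.

```python
import errno


def is_disk_full_error(exc):
    """Return True if `exc` is an OSError carrying ENOSPC (no space left on device)."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC


def append_log_line(path, line):
    """Append `line` to the file at `path`.

    Returns True on success and False if the disk is full; any other
    OSError is re-raised unchanged.
    """
    try:
        # The write may only fail when the buffer is flushed on close,
        # which still happens inside this try block.
        with open(path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        return True
    except OSError as exc:
        if is_disk_full_error(exc):
            return False
        raise
```

A shutdown routine that logs through a helper like this could continue stopping the process even when the log line cannot be written.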

      • Stopped couchbase-server
      • Started couchbase-server

      Upon restarting couchbase-server we end up with two beam.smp and two memcached processes, but after a couple of seconds the new beam.smp crashes and the old instance keeps running.
      Farshid Ghods added a comment - 25/Jan/13 11:44 AM - edited
      Swap: 1477624k total, 0k used, 1477624k free, 2488744k cached

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      17399 couchbas 20 0 162m 57m 1568 S 97 1.5 0:03.23 memcached
      8395 couchbas 20 0 773m 416m 6288 S 21 10.5 321:48.19 beam.smp

      As you can see, two beam.smp processes run side by side for a while:

      1000 8395 9.3 10.4 775648 425548 ? Sl Jan23 322:08 /opt/couchbase/lib/erlang/erts-5.8.5/bin/beam.smp -S 16:16 -sbt u -P 327680 -K true -- -root /opt/couchbase/lib/erlang -progname erl -- -home /home/couchbase -- -smp enable -setcookie nocookie -kernel inet_dist_listen_min 21100 inet_dist_listen_max 21299 error_logger false -sasl sasl_error_logger false -noshell -noinput -noshell -noinput -run ns_bootstrap -- -couch_ini /opt/couchbase/etc/couchdb/default.ini /opt/couchbase/etc/couchdb/default.d/capi.ini /opt/couchbase/etc/couchdb/default.d/geocouch.ini /opt/couchbase/etc/couchdb/local.ini -ns_server config_path "/opt/couchbase/etc/couchbase/static_config" -ns_server pidfile "/opt/couchbase/var/lib/couchbase/couchbase-server.pid" -ns_server nodefile "/opt/couchbase/var/lib/couchbase/couchbase-server.node" -ns_server cookiefile "/opt/couchbase/var/lib/couchbase/couchbase-server.cookie"
      root 17487 0.1 0.2 78540 10064 pts/2 Sl+ 09:43 0:00 /opt/couchbase/lib/erlang/erts-5.8.5/bin/beam.smp -- -root /opt/couchbase/lib/erlang -progname erl -- -home /home/farshid -- -name executioner@executioner -noshell -hidden -setcookie qmepghfxkbeatnkq -eval ns_bootstrap:remote_stop('ns_1@10.1.3.108')
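
In the listing above, the second beam.smp is started with `-name executioner@executioner` and evaluates `ns_bootstrap:remote_stop(...)`, while the first one runs `ns_bootstrap` itself. A rough sketch for telling the two apart in captured ps output follows. It assumes `ps aux`-style columns (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the function name `beam_processes` and the role labels "server" and "stop-helper" are made up for illustration.

```python
import os


def beam_processes(ps_output):
    """Return a list of (pid, role) for every line whose executable is beam.smp.

    `role` is "stop-helper" when the command line calls
    ns_bootstrap:remote_stop, otherwise "server". Both labels are
    illustrative names, not Couchbase terminology.
    """
    found = []
    for line in ps_output.splitlines():
        # Split into the 10 fixed columns plus the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header, blank, or malformed line
        pid, command = int(fields[1]), fields[10]
        executable = command.split()[0]
        if os.path.basename(executable) != "beam.smp":
            continue
        if "ns_bootstrap:remote_stop" in command:
            role = "stop-helper"
        else:
            role = "server"
        found.append((pid, role))
    return found
```

Feeding the two lines above to `beam_processes` would label PID 8395 as the server and PID 17487 as the stop helper, which makes it easier to see which process is the one that failed to exit.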

      I didn't run collect info since ns_server was not running and there was not enough disk space, so I just zipped whatever is under the logs folder.

          People

            andreibaranouski Andrei Baranouski
            farshid Farshid Ghods (Inactive)