  1. Couchbase Server
  2. MB-7748

[system test] Online upgrade: added 2.0.1 node goes down during swap rebalance

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • 2.0.1
    • 2.0.1
    • ns_server
    • Security Level: Public
    • unix

    Description

      Environment:

      • Both source and destination clusters are on 2.0.0 GA
      • 2-node cluster at source with 2 buckets, one doc and 3 views for each doc
      • 2-node cluster at destination with 2 buckets

      Load 1M items into both buckets.
      Do an online upgrade of the source cluster using swap rebalance:
      add node ubu-2509 with build 2.0.1-152 to the source cluster and remove one 2.0.0 node.
      Rebalance failed.
      Node ubu-2509 went down because the operating system's OOM killer killed beam.smp.

      Messages from dmesg on node ubu-2509:

      [2509410.019520] lowmem_reserve[]: 0 0 0 0
      [2509410.019524] Node 0 DMA: 1*4kB 0*8kB 2*16kB 3*32kB 2*64kB 2*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15876kB
      [2509410.019535] Node 0 DMA32: 1017*4kB 216*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 9892kB
      [2509410.019545] Node 0 Normal: 486*4kB 4*8kB 3*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2024kB
      [2509410.019556] 10742 total pagecache pages
      [2509410.019558] 717 pages in swap cache
      [2509410.019560] Swap cache stats: add 121992, delete 121275, find 11066/15116
      [2509410.019563] Free swap = 6085400kB
      [2509410.019564] Total swap = 6430712kB
      [2509410.041429] 1048560 pages RAM
      [2509410.041432] 33577 pages reserved
      [2509410.041447] 24046 pages shared
      [2509410.041449] 997455 pages non-shared
      [2509410.041454] Out of memory: kill process 19376 (beam.smp) score 9026 or a child
      [2509410.045160] Killed process 19376 (beam.smp)
      [2524506.554641] process `sysctl' is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
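
To make the dmesg excerpt above easier to read, here is a minimal Python sketch that checks the per-zone free-memory sums and converts the page and swap counters into kB. The helper names (`free_kb`, `oom_victims`) are made up for this illustration, and the 4 kB page size is an assumption (the usual x86 default); it is not stated in the log.

```python
import re

PAGE_KB = 4  # ASSUMPTION: 4 kB pages (typical x86 default), not stated in the log


def free_kb(line):
    """Sum the 'count*sizekB' terms of a free-list line.

    Returns (computed_total_kb, total_kb_reported_by_the_kernel).
    """
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    computed = sum(int(count) * int(size) for count, size in terms)
    reported = int(re.search(r"=\s*(\d+)kB", line).group(1))
    return computed, reported


def oom_victims(log):
    """Return (pid, process_name) for every 'Killed process' line in the log."""
    return [(int(pid), name)
            for pid, name in re.findall(r"Killed process (\d+) \(([^)]+)\)", log)]


# Lines copied from the dmesg excerpt in this report.
zones = [
    "Node 0 DMA: 1*4kB 0*8kB 2*16kB 3*32kB 2*64kB 2*128kB 0*256kB 0*512kB "
    "1*1024kB 1*2048kB 3*4096kB = 15876kB",
    "Node 0 DMA32: 1017*4kB 216*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 1*4096kB = 9892kB",
    "Node 0 Normal: 486*4kB 4*8kB 3*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 0*4096kB = 2024kB",
]

total_free_kb = sum(free_kb(z)[0] for z in zones)   # free memory across zones
total_ram_kb = 1048560 * PAGE_KB                    # "1048560 pages RAM"
swap_used_kb = 6430712 - 6085400                    # Total swap - Free swap

print("free kB:", total_free_kb)
print("RAM kB:", total_ram_kb)
print("swap used kB:", swap_used_kb)
print(oom_victims("[2509410.045160] Killed process 19376 (beam.smp)"))
```

Under the 4 kB page assumption, the node has roughly 4 GB of RAM with under 30 MB free at the time of the kill, while most of the swap is still unused. This is only a reading of the numbers in the log, not a root-cause analysis.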

      collect_info from all nodes of the source cluster:
      https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_1/201302/3nodes-online-upgrade-src-os-kill-beam-201-node.tgz

          People

            ketaki Ketaki Gangal (Inactive)
            thuan Thuan Nguyen