  1. Couchbase Server
  2. MB-40354

[CBM] Segfault in cbbackupmgr if there is an error before snapshot persisted

    Details

    • Triage:
      Untriaged
    • Operating System:
      Centos 64-bit
    • Story Points:
      1
    • Is this a Regression?:
      Yes

      Description

      Install Couchbase Server 6.6.0-7865 on 2 CentOS 7.6 servers.
      Create the default bucket and load data.
      While running a backup using cbbackupmgr, kill all beam.smp processes and restart Couchbase Server.
      cbbackupmgr throws a segfault.

      2020-07-08T16:27:50.469-07:00 (Gocbcore) Failed to dispatch DCP buffer ack: write tcp 172.23.106.210:59492->172.23.121.224:11210: write: broken pipe
      2020-07-08T16:27:50.469-07:00 (DCP) (default) (vb 392) Stream closed because all items were streamed, last sequence number: 65
      2020-07-08T16:27:50.472-07:00 (DCP) (default) (vb 391) Stream closed because all items were streamed, last sequence number: 69
      2020-07-08T16:27:50.472-07:00 (Gocbcore) memdClient read failure: EOF
      2020-07-08T16:27:50.472-07:00 (DCP) (default) (vb 394) Creating DCP stream with start seqno 0, end seqno 29, vbuuid 24073527553445, snap start seqno 0, snap end seqno 0
      2020-07-08T16:27:50.472-07:00 (Gocbcore) Pipeline client `172.23.121.224:11210/0xc0000d8320` failed to shut down errored client socket (close tcp 172.23.106.210:59492->172.23.121.224:11210: use of closed network connection)
      2020-07-08T16:27:50.472-07:00 WARN: (DCP) (default) (vb 394) Received error 'request canceled' on stream -- couchbase.(*DCPAsyncWorker).openStream() at dcp_async_worker.go:170
      2020-07-08T16:27:50.475-07:00 WARN: (DCP) (default) (vb 393) Stream closed due to unexpected error 'EOF | {"bucket":"default","last_dispatched_to":"172.23.121.224:11210","last_dispatched_from":"172.23.106.210:59492","last_connection_id":"c0aef5ee32ab2452/802fe4ffd9b93b0d"}' -- couchbase.(*DCPAsyncWorker).End() at dcp_async_worker.go:439
      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xa54b86]
       
      goroutine 3103 [running]:
      github.com/couchbase/backup/storage.(*SnapshotFile).Close(0x0, 0xc0000125a0, 0xc0000125a0)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/backup/storage/metafiles.go:252 +0x26
      github.com/couchbase/backup/storage.(*VBucketBackupWriter).closeVBucket(0xc0001fe000, 0xc000330189, 0x1, 0xc000335f90)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/backup/storage/vbucket_backup_writer.go:517 +0x79
      github.com/couchbase/backup/storage.(*VBucketBackupWriter).CloseVBuckets.func1(0xc0002cbc00, 0xc000012540, 0xc0001fe000, 0xc0002fcf00)
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/backup/storage/vbucket_backup_writer.go:163 +0x9c
      created by github.com/couchbase/backup/storage.(*VBucketBackupWriter).CloseVBuckets
      	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/backup/storage/vbucket_backup_writer.go:159 +0x161
      [root@localhost ~]# 
      
      

      I will find the last stable build. This test is similar to the test that failed in ticket MB-39632.
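
      The trace above shows (*SnapshotFile).Close being called with a nil receiver (first argument 0x0) from closeVBucket, which matches a snapshot file that was never created because the stream failed first. The following is a minimal sketch of guarding that call, not the actual cbbackupmgr code: the struct fields, the vbucket type, and the function bodies are hypothetical stand-ins, and only the names SnapshotFile, Close, and closeVBucket come from the trace.

```go
package main

import (
	"fmt"
	"os"
)

// SnapshotFile is a hypothetical stand-in for the type named in the trace.
type SnapshotFile struct {
	file *os.File
}

// Close reads a field of the receiver, so calling it on a nil
// *SnapshotFile panics with a nil pointer dereference.
func (s *SnapshotFile) Close() error {
	return s.file.Close()
}

// vbucket is a hypothetical holder for per-vBucket state. snapshot stays
// nil if an error occurs before the snapshot file is created.
type vbucket struct {
	snapshot *SnapshotFile
}

// closeVBucket closes the snapshot only if one was actually created.
func closeVBucket(vb *vbucket) error {
	if vb.snapshot == nil {
		return nil // nothing to close; the stream failed early
	}
	return vb.snapshot.Close()
}

func main() {
	// Simulate a vBucket whose stream failed before any snapshot existed.
	vb := &vbucket{}
	if err := closeVBucket(vb); err != nil {
		fmt.Println("close failed:", err)
		return
	}
	fmt.Println("closed cleanly without a snapshot")
}
```

      With the guard in place, teardown after an early stream error returns normally, so the original error can be reported instead of a panic.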


          Activity

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.0.0-2573 contains couchbase-cli commit fe403a4 with commit message:
          MB-40354 Look for the CV reports under the correct path

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-6.6.1-9044 contains backup commit 00aec51 with commit message:
          MB-40354 Check that snapshot is not nil before closing

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.0.0-3080 contains backup commit 00aec51 with commit message:
          MB-40354 Check that snapshot is not nil before closing

          arunkumar Arunkumar Senthilnathan added a comment -

          Verified in 6.6.1-9151 - panic not seen anymore - CBM exits gracefully with EOF error and on resume works fine:

          2020-10-31T14:28:06.241-07:00 (DCP) (default) (vb 375) Creating DCP stream {"uuid":222375565211617,"start_seqno":0,"end_seqno":538,"snap_start":0,"snap_end":0,"retries":0}
          2020-10-31T14:28:06.289-07:00 (DCP) (default) (vb 374) Stream closed because all items were streamed | {"uuid":155487839953017,"snap_start":532,"snap_end":532,"snap_complete":true,"last_seqno":532,"retries":0}
          2020-10-31T14:28:07.179-07:00 (Gocbcore) memdClient read failure: read tcp 10.112.194.101:49816->10.112.194.101:11210: read: connection reset by peer
          2020-10-31T14:28:07.179-07:00 WARN: (DCP) (default) (vb 375) Received unexpected error 'EOF | {"bucket":"default","last_dispatched_to":"10.112.194.101:11210","last_dispatched_from":"10.112.194.101:49816","last_connection_id":"2317e23b51efd63f/6957df53e67cdca4"}' on stream | {"uuid":222375565211617,"snap_start":0,"snap_end":0,"snap_complete":true,"last_seqno":0,"retries":-1} -- couchbase.(*DCPAsyncWorker).openStream() at dcp_async_worker.go:186
          2020-10-31T14:28:07.179-07:00 WARN: (DCP) (default) (vb 375) Received an unexpected error whilst streaming, beginning teardown: failed to open stream: failed to stream vBucket 375: client received unexpected error 'EOF | {"bucket":"default","last_dispatched_to":"10.112.194.101:11210","last_dispatched_from":"10.112.194.101:49816","last_connection_id":"2317e23b51efd63f/6957df53e67cdca4"}' -- couchbase.(*DCPAsyncWorker).handleDCPError() at dcp_async_worker.go:550
          2020-10-31T14:28:07.403-07:00 (Gocbcore) memdClient read failure: read tcp 10.112.194.101:49820->10.112.194.101:11210: read: connection reset by peer
          2020-10-31T14:28:07.403-07:00 (Gocbcore) Failed to close authentication client (close tcp 10.112.194.101:49820->10.112.194.101:11210: use of closed network connection)
          2020-10-31T14:28:07.403-07:00 (Stats) Stopping stat collection
          2020-10-31T14:28:07.404-07:00 (Cmd) Error backing up cluster: failed to execute cluster operations: failed to execute bucket operations: failed to transfer bucket data for bucket 'default': failed to transfer key value data: failed to transfer key value data: failed to open stream: failed to stream vBucket 375: client received unexpected error 'EOF | {"bucket":"default","last_dispatched_to":"10.112.194.101:11210","last_dispatched_from":"10.112.194.101:49816","last_connection_id":"2317e23b51efd63f/6957df53e67cdca4"}'
          2020-10-31T14:28:07.404-07:00 (Cmd) Backed up bucket "default" failed
          2020-10-31T14:28:07.404-07:00 (Cmd) Mutations backed up: 183108, Mutations failed to backup: 0
          2020-10-31T14:28:07.404-07:00 (Cmd) Deletions backed up: 0, Deletions failed to backup: 0
          2020-10-31T14:28:07.404-07:00 (Cmd) Skipped due to purge number or conflict resolution: Mutations: 0 Deletions: 0
          2020-10-31T14:28:25.169-07:00 (Cmd) cbbackupmgr version 6.6.1-9151 Hostname: node1-mad-hatter-testing-centos7.vagrants OS: linux Version: 3.10.0-1127.19.1.el7.x86_64 x86_64 Arch: amd64 vCPU: 1 Memory: 3880736
          2020-10-31T14:28:25.169-07:00 (Cmd) backup -a /tmp/entbackup/ -r backup -c 10.112.194.101 -u <ud>Administrator</ud> -p ***** --resume
          2020-10-31T14:28:25.169-07:00 (Cmd) mounted archive with id: 5530511f-f7b1-4a68-ab12-a5b97eb057b0
          2020-10-31T14:28:25.171-07:00 (Rest) GET http://10.112.194.101:8091/pools 200
          2020-10-31T14:28:25.180-07:00 (Rest) GET http://10.112.194.101:8091/pools/default/buckets 200
          2020-10-31T14:28:25.184-07:00 (Rest) GET http://10.112.194.101:8091/pools/default/buckets/default 200
          

          pvarley Patrick Varley added a comment -

          Description for release notes:

          There was a rare case where cbbackupmgr backup would crash instead of exiting gracefully and reporting the error. This could only happen at the start of a backup if the connection to the Data Service was lost. This has been fixed in 6.6.1.

          Workaround:
          Rerun the backup with the --purge option.


            People

            Assignee:
            carlos.gonzalez Carlos Gonzalez Betancort
            Reporter:
            thuan Thuan Nguyen

                Gerrit Reviews

                There are no open Gerrit changes