Couchbase Server / MB-29952

CLONE (Backport MB-29764) - Indexer crashes with goroutine stack exceeds 1000000000-byte limit

Details

    • Triaged
    • Unknown
    • Storage-Sprint-End-Jun-29-2018, Storage-Sprint-End-Jul-13-2018, Storage-Sprint-End-Jul-27-2018

    Description

      The indexer process fails with "runtime: goroutine stack exceeds 1000000000-byte limit" followed by "fatal error: stack overflow", which causes the indexer to crash.

      This is the stack trace:

      StorageMgr::handleCreateSnapshot Added New Snapshot Index: 2954511192241179090 PartitionId: 0 SliceId: 0 Crc64: 3092221115143794419 (SnapshotInfo: count:10889206 committed:false) SnapCreateDur 62.255µs SnapOpenDur 1.078635ms
      runtime: goroutine stack exceeds 1000000000-byte limit
      fatal error: stack overflow
      runtime stack:
      runtime.throw(0xe730bc, 0xe)
      /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:566 +0x95 fp=0x7f317bffeb88 sp=0x7f317bffeb68
      runtime.newstack()
      /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/stack.go:1061 +0x416 fp=0x7f317bffed08 sp=0x7f317bffeb88
      runtime.morestack()
      /home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:366 +0x7f fp=0x7f317bffed10 sp=0x7f317bffed08
      goroutine 10783 [running]:
      github.com/couchbase/plasma.(*item).getPtrKeyItem(0xc4a6b64003, 0x0)
      /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/item.go:100 fp=0xc5894cc2b8 sp=0xc5894cc2b0
      github.com/couchbase/plasma.(*item).Size(0xc4a6b64003, 0x0)
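The 1000000000-byte figure in the message matches the Go runtime's default maximum goroutine stack size on 64-bit systems. Exceeding it is a fatal runtime error, so recover() cannot intercept it. The truncated trace above does not show why the stack grew. The following is only a minimal sketch, not plasma code: the node type, the sizeBounded helper, and the maxDepth value are all hypothetical. It illustrates how a size computation that follows pointers can bound its depth, so that a corrupted, self-referencing structure yields an error instead of an unrecoverable overflow.

```go
package main

import (
	"errors"
	"fmt"
)

// maxDepth is an arbitrary illustrative bound, not a value taken from plasma.
const maxDepth = 1024

var errTooDeep = errors.New("pointer chain exceeds maxDepth; data may be corrupted")

// node is a hypothetical item that may point at another item.
type node struct {
	size int
	next *node
}

// sizeBounded sums the sizes along the chain and refuses to recurse
// deeper than maxDepth levels.
func sizeBounded(n *node, depth int) (int, error) {
	if n == nil {
		return 0, nil
	}
	if depth >= maxDepth {
		return 0, errTooDeep
	}
	rest, err := sizeBounded(n.next, depth+1)
	if err != nil {
		return 0, err
	}
	return n.size + rest, nil
}

func main() {
	ok := &node{size: 8, next: &node{size: 16}}
	fmt.Println(sizeBounded(ok, 0))

	loop := &node{size: 8}
	loop.next = loop // simulate a corrupted, self-referencing item
	fmt.Println(sizeBounded(loop, 0))
}
```

Returning an error keeps the failure local to the caller, whereas a runtime stack overflow terminates the whole process.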

       

          Activity

            tai.tran Tai Tran (Inactive) created issue -
            tai.tran Tai Tran (Inactive) made changes -
            Field Original Value New Value
            Link This issue blocks CBSE-5247 [ CBSE-5247 ]
            tai.tran Tai Tran (Inactive) made changes -
            Fix Version/s vulcan [ 14610 ]
            tai.tran Tai Tran (Inactive) made changes -
            Link This issue relates to MB-29764 [ MB-29764 ]

            tai.tran Tai Tran (Inactive) added a comment - Separate ticket for the 5.1.2 backport.
            tai.tran Tai Tran (Inactive) made changes -
            Sprint Storage-Sprint-End-Jun-29-2018 [ 591 ]
            tai.tran Tai Tran (Inactive) made changes -
            Rank Ranked higher
            tai.tran Tai Tran (Inactive) added a comment - - edited

            The planned activities for this issue are:

            1. Backport the changes for MB-29800 and MB-29939 to 5.1.2, then run the system test to ensure that the problems from MB-29800 and MB-29939 no longer occur (the theory is that the changes in MB-29800 and MB-29939 remedy or prevent the memory corruption reported in this ticket).
            2. Create unit tests that cause a page merge (CBSS-74), i.e. independently try to reproduce the problem when running without the fixes for MB-29800 and MB-29939, then run with those fixes to verify that the problem does not occur.
            tai.tran Tai Tran (Inactive) made changes -
            Epic Link CBSS-85 [ 86760 ]
            tai.tran Tai Tran (Inactive) made changes -
            Due Date 04/Jun/18 06/Jul/18
            sundar Sundar Sridharan (Inactive) made changes -
            Link This issue blocks MB-29966 [ MB-29966 ]
            wayne Wayne Siu made changes -
            Summary CLONE (backport) - Indexer crashes with goroutine stack exceeds 1000000000-byte limit CLONE (Backport MB-29764) - Indexer crashes with goroutine stack exceeds 1000000000-byte limit
            tai.tran Tai Tran (Inactive) made changes -
            Sprint Storage-Sprint-End-Jun-29-2018 [ 591 ] Storage-Sprint-End-Jun-29-2018, Storage-Sprint-End-Jul-13-2018 [ 591, 602 ]
            tai.tran Tai Tran (Inactive) made changes -
            Sprint Storage-Sprint-End-Jun-29-2018, Storage-Sprint-End-Jul-13-2018 [ 591, 602 ] Storage-Sprint-End-Jun-29-2018, Storage-Sprint-End-Jul-13-2018, Storage-Sprint-End-Jul-27-2018 [ 591, 602, 603 ]
            sundar Sundar Sridharan (Inactive) made changes -
            Assignee Srinath Duvuru [ srinath.duvuru ] Sundar Sridharan [ sundar ]
            tai.tran Tai Tran (Inactive) made changes -
            Due Date 06/Jul/18 20/Jul/18
            tai.tran Tai Tran (Inactive) made changes -
            Rank Ranked lower
            sundar Sundar Sridharan (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            build-team Couchbase Build Team added a comment - Build couchbase-server-5.1.2-5901 contains plasma commit 80f3368 with commit message: MB-29952 [BP] item: Add corruption check for item data
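The commit message only states that a corruption check was added for item data; the change itself is not shown in this ticket. As a rough sketch of the general idea, assuming a hypothetical length-prefixed layout (a 2-byte little-endian length followed by the key bytes) that is not plasma's real item format, a decoder can validate the header against the buffer before trusting it. The decodeItem function and the errCorrupt value are illustrative names only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerLen is the size of the hypothetical length prefix in bytes.
const headerLen = 2

var errCorrupt = errors.New("item data corrupted: length header inconsistent with buffer")

// decodeItem returns the key stored in buf, or errCorrupt if the length
// header claims more bytes than the buffer actually holds.
func decodeItem(buf []byte) ([]byte, error) {
	if len(buf) < headerLen {
		return nil, errCorrupt
	}
	n := int(binary.LittleEndian.Uint16(buf[:headerLen]))
	if n > len(buf)-headerLen {
		return nil, errCorrupt
	}
	return buf[headerLen : headerLen+n], nil
}

func main() {
	good := []byte{3, 0, 'a', 'b', 'c'}
	key, err := decodeItem(good)
	fmt.Printf("%q %v\n", key, err)

	bad := []byte{0xff, 0xff, 'a'} // header claims 65535 bytes, only 1 present
	_, err = decodeItem(bad)
	fmt.Println(err)
}
```

Checking the header before slicing turns a corrupted record into an explicit error at the point of decoding, rather than letting an invalid length propagate into later computations.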
            sundar Sundar Sridharan (Inactive) made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]

            mihir.kamdar Mihir Kamdar (Inactive) added a comment - Verified on 5.1.2-6026. The GSI component system test ran for 3+ days without showing any issues, and functional testing has not shown any regressions. System testing job: https://issues.couchbase.com/browse/MB-31096
            mihir.kamdar Mihir Kamdar (Inactive) made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

            People

              sundar Sundar Sridharan (Inactive)
              krishna.doddi Krishna Doddi