  1. Couchbase Server
  2. MB-56477

[System test upgrade] :- "GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck" errors seen during online upgrade using swap rebalance


Details

    • Untriaged
    • Centos 64-bit
    • 0
    • Unknown

    Description

      Steps to repro
      1. Run neo longevity on 7.1.4 for 4 days

      ./sequoia -client 172.23.104.27:2375 -provider file:centos_pine.yml -test tests/integration/neo/test_neo.yml -scope tests/integration/neo/scope_neo_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.1.4-3601 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. Start an online upgrade to 7.2.0-5309 using swap rebalance.

      172.23.96.148 2:10:00 AM 15 Apr, 2023

      Starting rebalance, KeepNodes = ['ns_1@172.23.104.137','ns_1@172.23.104.155',
      'ns_1@172.23.104.5','ns_1@172.23.104.67',
      'ns_1@172.23.104.69','ns_1@172.23.104.70',
      'ns_1@172.23.105.107','ns_1@172.23.105.168',
      'ns_1@172.23.106.100','ns_1@172.23.106.188',
      'ns_1@172.23.107.131','ns_1@172.23.107.95',
      'ns_1@172.23.108.103','ns_1@172.23.120.107',
      'ns_1@172.23.120.245','ns_1@172.23.121.117',
      'ns_1@172.23.121.86','ns_1@172.23.123.28',
      'ns_1@172.23.96.148','ns_1@172.23.96.252',
      'ns_1@172.23.97.119','ns_1@172.23.97.121',
      'ns_1@172.23.97.122','ns_1@172.23.97.239',
      'ns_1@172.23.99.20','ns_1@172.23.99.21'], EjectNodes = ['ns_1@172.23.104.157',
      'ns_1@172.23.99.25',
      'ns_1@172.23.96.253',
      'ns_1@172.23.105.111',
      'ns_1@172.23.96.192',
      'ns_1@172.23.99.11'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = ae692fdfbeb6b28551dd864c8c8d2ef4
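
      For triage, it can help to pull the node lists out of a rebalance message like the one above, for instance to cross-check them against the nodes that later logged errors. The following is only a sketch: the function name `parse_rebalance` and the abbreviated `SAMPLE` message are illustrative assumptions, and the regular expressions assume the message layout shown in this report.

```python
import re

# Extracts the IP part of entries such as 'ns_1@172.23.104.137'.
NS_NODE_RE = re.compile(r"'ns_1@([\d.]+)'")


def parse_rebalance(msg):
    """Return (keep_nodes, eject_nodes) as lists of IP strings.

    Assumes the 'KeepNodes = [...]' / 'EjectNodes = [...]' layout
    seen in the rebalance message quoted in this report.
    """
    keep = re.search(r"KeepNodes = \[(.*?)\]", msg, re.S)
    eject = re.search(r"EjectNodes = \[(.*?)\]", msg, re.S)
    keep_nodes = NS_NODE_RE.findall(keep.group(1)) if keep else []
    eject_nodes = NS_NODE_RE.findall(eject.group(1)) if eject else []
    return keep_nodes, eject_nodes


# Abbreviated message: only a few of the nodes from the report are kept here.
SAMPLE = """Starting rebalance, KeepNodes = ['ns_1@172.23.104.137','ns_1@172.23.104.155',
'ns_1@172.23.99.20','ns_1@172.23.99.21'], EjectNodes = ['ns_1@172.23.104.157',
'ns_1@172.23.99.25'], Failed over and being ejected nodes = []; no delta recovery nodes"""

keep_nodes, eject_nodes = parse_rebalance(SAMPLE)

# Hypothetical cross-check: nodes that logged the error and are also being ejected.
errored = {"172.23.99.21", "172.23.99.25"}
ejected_with_errors = sorted(errored & set(eject_nodes))
```

      Whether an overlap between ejected nodes and erroring nodes is significant is not established by this report; the snippet only makes the comparison easy to repeat.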
      

      Saw error messages such as "Unable to respond to caller given type VBMasterCheck opaque 3674406912 timed out chan 0xc0c9443048" on all the KV nodes.

      172.23.104.5 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:28:11.792-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2644246528 timed out chan 0xc09871c650
       
      172.23.106.100 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:45:25.255-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 3666083840 timed out chan 0xc042b2ae20
       
       
      172.23.108.103 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:46:06.095-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 448724992 timed out chan 0xc1d5ee5a80
       
      172.23.121.117 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:45:03.633-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 789315584 timed out chan 0xc033b3fb98
       
      172.23.96.148 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2023-04-16T04:44:54.774-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2167275520 timed out chan 0xc09b926488
       
      172.23.97.119 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2023-04-16T04:45:22.053-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 3866099712 timed out chan 0xc01a33b410
       
      172.23.97.121 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2023-04-16T04:45:06.683-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 3619225600 timed out chan 0xc09617e8c0
       
      172.23.97.122 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:45:38.251-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 1847853056 timed out chan 0xc00add9288
       
      172.23.99.20 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2023-04-16T04:45:38.390-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2450653184 timed out chan 0xc07aa36dc8
       
      172.23.99.21 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:26:20.205-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2109145088 timed out chan 0xc087f2afd8
       
      172.23.99.25 : xdcr
      /opt/couchbase/var/lib/couchbase/logs/goxdcr.log:2023-04-16T05:06:13.709-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 4019126272 timed out chan 0xc03030d650
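
      The per-node excerpts above follow a regular shape (a "<node> : xdcr" header, then grep-style "<file>:<timestamp> ERRO ..." lines), so they can be tabulated with a short script. This is a minimal sketch, not part of any Couchbase tooling: `parse_report`, the regular expressions, and the two-node `SAMPLE` are assumptions based only on the lines quoted in this report.

```python
import re
from collections import Counter
from datetime import datetime

# Header lines such as "172.23.104.5 : xdcr".
HEADER_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+) : xdcr$")

# Grep-style log lines: "<file>:<timestamp> ERRO GOXDCR.P2PManager: ...".
ERROR_RE = re.compile(
    r"^(?P<path>\S+?):"
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}) "
    r"ERRO GOXDCR\.P2PManager: Unable to respond to caller given type "
    r"(?P<req_type>\w+) opaque (?P<opaque>\d+) timed out chan (?P<chan>0x[0-9a-f]+)$"
)


def parse_report(text):
    """Return one dict per matching error line, tagged with its node."""
    events = []
    node = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            node = header.group(1)
            continue
        m = ERROR_RE.match(line)
        if m:
            events.append({
                "node": node,
                "file": m["path"],
                "time": datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f%z"),
                "type": m["req_type"],
                "opaque": int(m["opaque"]),
                "chan": m["chan"],
            })
    return events


# Two entries copied from the excerpts in this report.
SAMPLE = """
172.23.104.5 : xdcr
/opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:28:11.792-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2644246528 timed out chan 0xc09871c650

172.23.99.21 : xdcr
/opt/couchbase/var/lib/couchbase/logs/goxdcr.log.1.gz:2023-04-16T04:26:20.205-07:00 ERRO GOXDCR.P2PManager: Unable to respond to caller given type VBMasterCheck opaque 2109145088 timed out chan 0xc087f2afd8
"""

events = parse_report(SAMPLE)
per_node = Counter(e["node"] for e in events)   # error count per node
first_seen = min(e["time"] for e in events)     # earliest occurrence
```

      Run against the full goxdcr logs from cbcollect_info rather than single sample lines, the same parsing would give the error count per node and the time window in which the timeouts occurred.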
      

      cbcollect_info attached.

          People

            sumukh.bhat Sumukh Bhat
            Balakumaran.Gopal Balakumaran Gopal