  1. Couchbase Server
  2. MB-37123

Data mismatch due to COUCHBASE_ECANTGETPORT and CAS mismatch with n1ql

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 6.5.0
    • Fix Version/s: 6.5.0
    • Component/s: qe
    • Triage: Untriaged
    • Is this a Regression?: No

    Description

      Build: 6.5.0-4908. The CAS mismatch has been happening for a couple of weeks, but COUCHBASE_ECANTGETPORT started showing up only this week. COUCHBASE_ECANTGETPORT was not observed on 6.5.0-4874.

      ./testrunner -i /tmp/testexec.6433.ini -p get-cbcollect-info=True,GROUP=n1ql_op_with_timers -t eventing.eventing_rebalance.EventingRebalance.test_erl_crash_on_kv_and_eventing_node_during_eventing_rebalance,doc-per-day=20,dataset=default,nodes_init=5,services_init=kv-kv-eventing-eventing-index:n1ql,groups=simple,reset_services=True,handler_code=n1ql_op_with_timers,replicas=1,GROUP=n1ql_op_with_timers
       
       
      Exception: Bucket operations from handler code took lot of time to complete or didn't go through. Current : 6261 Expected : 0  dcp_backlog : 0  TIMERS_IN_PAST : 0 lcb_exceptions : {'172.23.105.238': {u'47': 64047, u'12009': 254663}, '172.23.105.24': {u'47': 5420, u'12009': 75680}, '172.23.105.242': {u'12009': 356} 
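
      To see how the failures are distributed, the per-node lcb_exceptions counts in the exception above can be summed per error code. The following is a minimal Python sketch, not part of testrunner: the helper name totals_by_code is hypothetical, and the dictionary is copied by hand from the exception text (with the closing brace that the log line omits).

```python
from collections import Counter

# Per-node counts copied from the exception text above.
# Outer keys are node IPs, inner keys are error codes as strings.
lcb_exceptions = {
    "172.23.105.238": {"47": 64047, "12009": 254663},
    "172.23.105.24": {"47": 5420, "12009": 75680},
    "172.23.105.242": {"12009": 356},
}


def totals_by_code(per_node):
    """Sum the per-node counts for each error code (hypothetical helper)."""
    totals = Counter()
    for counts in per_node.values():
        totals.update(counts)  # Counter.update adds counts instead of replacing them
    return dict(totals)


totals = totals_by_code(lcb_exceptions)
for code, count in sorted(totals.items(), key=lambda kv: int(kv[0])):
    print(code, count)
```

      With the numbers reported here, code 12009 accounts for most of the errors and appears on all three nodes, while code 47 appears on two of them.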

        Activity

          Vikas Chaudhary created issue -
          Vikas Chaudhary made changes -
          Field: Description
          Original Value: build: 6.5.0-4908, it was happening since couple of weeks
          New Value: build: 6.5.0-4908, CAS missmatch was happening since couple of weeks but
          (The {noformat} block with the testrunner command and the exception text is identical in both values and matches the Description above.)
          Vikas Chaudhary made changes -
          Field: Description
          Original Value: build: 6.5.0-4908, CAS missmatch was happening since couple of weeks but
          New Value: build: 6.5.0-4908, CAS missmatch was happening since couple of weeks but COUCHBASE_ECANTGETPORT is showing up this week only. COUCHBASE_ECANTGETPORT was not observed on 6.5.0-4874
          (The {noformat} block with the testrunner command and the exception text is unchanged.)
          Jeelan Poola made changes -
          Field: Assignee
          Original Value: Jeelan Poola [ jeelan.poola ]
          New Value: Gautham Banasandra [ gautham.banasandra ]
          Gautham Banasandra (Inactive) made changes -
          Field: Summary
          Original Value: Data missmatch due to COUCHBASE_ECANTGETPORT and CAS mismatch with n1ql
          New Value: Data mismatch due to COUCHBASE_ECANTGETPORT and CAS mismatch with n1ql
          Gautham Banasandra (Inactive) made changes -
          Field: Description
          Original Value: build: 6.5.0-4908, CAS missmatch was happening since couple of weeks but COUCHBASE_ECANTGETPORT is showing up this week only. COUCHBASE_ECANTGETPORT was not observed on 6.5.0-4874
          New Value: build: 6.5.0-4908, CAS mismatch was happening since couple of weeks but COUCHBASE_ECANTGETPORT is showing up this week only. COUCHBASE_ECANTGETPORT was not observed on 6.5.0-4874
          (The {noformat} block with the testrunner command and the exception text is unchanged.)
          Jeelan Poola added a comment -

          I suspect an environment issue. Please retest in the same environment with build 4874, and then repeat the test with 4908/4910.

          Jeelan Poola made changes -
          Field: Assignee
          Original Value: Gautham Banasandra [ gautham.banasandra ]
          New Value: Vikas Chaudhary [ vikas.chaudhary ]

          Vikas Chaudhary added a comment -

          As discussed with Gautham Banasandra and Jeelan Poola, this is not an environment issue. Starting with 6.5.0-4908, we add both the lcb error and the n1ql error to the stats. Previously, if we got an n1ql error, we incremented the count only for those. Hence these errors suddenly started showing up.

          Gautham Banasandra (Inactive) added a comment -

          Vikas Chaudhary, Jeelan Poola: I just ran this test on a dev cluster and it is passing: http://qa.sc.couchbase.com/job/dev_testbed_blr2/31/

          Jeelan Poola made changes -
          Field: Component/s
          New Value: qe [ 12728 ]
          Field: Component/s
          Original Value: eventing [ 14026 ]
          Jeelan Poola added a comment -

          Vikas Chaudhary I changed the component to qe for the time being. Please change it back if you still see a product issue. Thanks.

          Vikas Chaudhary added a comment -

          COUCHBASE_ECANTGETPORT is not seen on 6.5.0-4917.

          Vikas Chaudhary made changes -
          Field: Resolution
          New Value: Cannot Reproduce [ 5 ]
          Field: Status
          Original Value: Open [ 1 ]
          New Value: Closed [ 6 ]

          People

            Assignee: Vikas Chaudhary
            Reporter: Vikas Chaudhary
            Votes: 0
            Watchers: 4
