Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Duplicate
Affects Version/s: 6.0.0
Fix Version/s: 6.0.0 build 1567
Triage: Untriaged
Is this a Regression?: No
Description
Title
Original was: Eventing API's detect wrong version after upgrade when upgraded using online upgrade using failover
Updated to reflect that the issue is seen in the /pools/default listing itself and so is not specific to Eventing.
Script to Repro
./testrunner -i /tmp/upgrade3.ini -p get-cbcollect-info=True -t eventing.eventing_upgrade.EventingUpgrade.test_online_upgrade_with_failover_rebalance_with_eventing,nodes_init=4,dataset=default,groups=simple,skip_cleanup=True,initial_version=5.5.0-2958,doc-per-day=2,upgrade_version=6.0.0-1567
Steps to Repro
- Create a 4-node kv-eventing-index-n1ql cluster on 5.5.0-2958
- Deploy a bucket op function
- Add 4 Alice (6.0.0-1567) kv-eventing-index-n1ql nodes
- Fail over all the old Vulcan (5.5.0) nodes and rebalance them out
- Deploy a timer function using the API below; the call fails with "Function requires 6.0 but cluster is at 5.5"
Request
2018-08-28 02:30:11,061 - root - ERROR - POST http://172.23.104.91:8091/_p/event/setApplication/?name=test_import_function_2 body:
{
  "depcfg": {
    "buckets": [
      {
        "alias": "dst_bucket",
        "bucket_name": "dst_bucket1"
      }
    ],
    "source_bucket": "src_bucket",
    "metadata_bucket": "metadata"
  },
  "appcode": "function OnUpdate(doc,meta) {\n var expiry = new Date();\n expiry.setSeconds(expiry.getSeconds() + 5);\n\n var context = {docID : meta.id};\n createTimer(NDtimerCallback, expiry, meta.id, context);\n}\nfunction NDtimerCallback(context) {\n dst_bucket[context.docID] = 'from NDtimerCallback';\n}",
  "id": 0,
  "settings": {
    "enable_recursive_mutation": false,
    "app_log_max_files": 10,
    "curl_timeout": 500,
    "skip_timer_threshold": 86400,
    "dcp_stream_boundary": "everything",
    "use_memory_manager": true,
    "persist_interval": 5000,
    "sock_batch_size": 100,
    "dcp_num_connections": 1,
    "enable_snapshot_smr": false,
    "log_level": "TRACE",
    "min_page_items": 50,
    "fuzz_offset": 0,
    "max_delta_chain_len": 200,
    "xattr_doc_timer_entry_prune_threshold": 100,
    "worker_feedback_queue_cap": 10000,
    "tick_duration": 60000,
    "deadline_timeout": 3,
    "app_log_max_size": 10485760,
    "max_page_items": 400,
    "worker_count": 3,
    "lss_read_ahead_size": 1048576,
    "deployment_status": true,
    "lss_cleaner_threshold": 30,
    "description": "",
    "dcp_gen_chan_size": 10000,
    "lss_cleaner_max_threshold": 70,
    "feedback_batch_size": 100,
    "auto_swapper": true,
    "worker_queue_cap": 100000,
    "cpp_worker_thread_count": 2,
    "cron_timers_per_doc": 1000,
    "feedback_read_buffer_size": 65536,
    "execution_timeout": 1,
    "processing_status": true,
    "cleanup_timers": false,
    "timer_processing_tick_interval": 500,
    "breakpad_on": true,
    "lcb_inst_capacity": 5,
    "vb_ownership_giveup_routine_count": 3,
    "data_chan_size": 10000,
    "vb_ownership_takeover_routine_count": 3,
    "checkpoint_interval": 10000
  },
  "appname": "test_import_function_2"
}
Response
headers: {
  'Content-type': 'application/json',
  'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n'
}
error: 406, reason: unknown
{
  "name": "ERR_CLUSTER_VERSION",
  "code": 42,
  "description": "This function syntax is unsupported on current cluster version",
  "attributes": null,
  "runtime_info": {
    "code": 42,
    "info": "Function requires 6.0 but cluster is at 5.5"
  }
}
However, the entire cluster is already on 6.0.0 before we make this call. This doesn't seem to happen from the UI, only through this API, which we use extensively in automation.
At the same time, the same API works fine when we upgrade through swap rebalance or regular rebalance.
Logs attached.
Automation log: http://qa.sc.couchbase.com/job/test_bala_upgrade_new1/56/consoleText
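For reference, a minimal sketch of the kind of check that could be run before the deploy, assuming the standard shape of the /pools/default response, the Administrator credentials used in the request above, and that the requests library is available (the helper name is hypothetical):
{noformat}
import requests

def dump_cluster_compat(host="172.23.104.91", user="Administrator", password="password"):
    # Hypothetical helper: fetch /pools/default and print what ns_server reports
    # for every node. clusterCompatibility is assumed to be encoded as
    # major * 0x10000 + minor, so 6.0 -> 393216 and 5.5 -> 327685.
    resp = requests.get("http://%s:8091/pools/default" % host, auth=(user, password))
    resp.raise_for_status()
    for node in resp.json().get("nodes", []):
        compat = node.get("clusterCompatibility", 0)
        print("%s version=%s clusterCompatibility=%d (%d.%d)"
              % (node.get("hostname"), node.get("version"),
                 compat, compat >> 16, compat & 0xFFFF))
{noformat}
If the per-node clusterCompatibility printed here is still at the 5.5 value after the upgrade, the stale number is coming from the /pools/default listing itself rather than from Eventing.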
Attachments
Issue Links
- relates to MB-22002: Metadata store that is immediately consistent (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Summary | Eventing API's detect wrong version after upgrade when upgrade through online upgrade using failover | Eventing API's detect wrong version after upgrade when upgraded using online upgrade using failover |
Assignee | Jeelan Poola [ jeelan.poola ] | Sriram Melkote [ siri ] |
Labels | | functional-test |
Due Date | | 26/Sep/18 |
Due Date | 26/Sep/18 | 05/Sep/18 |
Assignee | Sriram Melkote [ siri ] | Balakumaran Gopal [ balakumaran.gopal ] |
Component/s | test-execution [ 10231 ] | |
Component/s | eventing [ 14026 ] |
Attachment | consoleText-2018-09-04040534 [ 58178 ] |
Component/s | test-execution [ 10231 ] |
Is this a Regression? | Unknown [ 10452 ] | No [ 10451 ] |
Component/s | eventing [ 14026 ] |
Component/s | eventing [ 14026 ] |
Summary | Eventing API's detect wrong version after upgrade when upgraded using online upgrade using failover | After cluster upgrade, clusterCompatibility still reports old number |
Summary | After cluster upgrade, clusterCompatibility still reports old number | After cluster upgrade, clusterCompatibility still at old value |
Description | (original description, as above without the Title note) | (current description, with the Title note added) |
Component/s | eventing [ 14026 ] |
Component/s | eventing [ 14026 ] |
Component/s | ns_server [ 10019 ] |
Resolution | | Duplicate [ 3 ] |
Status | Open [ 1 ] | Resolved [ 5 ] |
Status | Resolved [ 5 ] | Closed [ 6 ] |
Bala, I checked our code: we fetch /pools/default and examine the node versions each time we deploy. So we should eliminate the possibility that /pools/default itself was behind the actual cluster status (which would then be an ns_server issue). As we discussed, could you please log the output of /pools/default before doing the deployment? Thank you so much for making this change; I'm asking because the issue looks timing related.
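For example, a hypothetical helper along these lines (requests assumed available; names and credentials are placeholders) could be called from the test right before the setApplication request so the raw ns_server view lands in the automation log:
{noformat}
import json
import logging

import requests

log = logging.getLogger("rest")

def log_pools_default(host, user="Administrator", password="password"):
    # Hypothetical helper: dump the raw /pools/default response into the test log
    # right before the deploy, so we can see whether ns_server itself is still
    # reporting the pre-upgrade compatibility at that moment.
    url = "http://%s:8091/pools/default" % host
    resp = requests.get(url, auth=(user, password))
    log.info("GET %s -> %s", url, resp.status_code)
    log.info(json.dumps(resp.json(), indent=2, sort_keys=True))

# e.g. log_pools_default("172.23.104.91") immediately before POSTing to
# /_p/event/setApplication/?name=... in the upgrade test.
{noformat}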