Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-44931

Eventing function with feed boundary (from now) sometimes deploy from everything

    XMLWordPrintable

Details

    • Untriaged
    • 1
    • Unknown

    Description

      Occasionally I see an Eventing function with a "deploy from now" function actually deploy form everything.  This issue is hard to reproduce but seems to happen about 1 out of every 20 or 30 undeploy/re-deploy or pause/resumes. When I am importing and working on a lot of functions.

      I had an Eventing source bucket with 10M docs and was working with an Eventing function that is set/fixed to have a feed boundary "from now" yet occasionally when I resume that function "curl_loc2" it processes all 10 Million docs when it should do nothing.

      I triggered this issue again on 3/11/2021 a bit prior to 5pm PST (I had just two function one undeployed and the other I was pausing and resuming)

      The time range when this happened was sometime between 2021-03-11T16:34:58.349-08:00 to 2021-03-11T16:40:23.393-08:00

      I took a CB collect when this happened 

      ls -ltr cbcollect_MB-44931_deploying_from_wrong_feed_boundary.zip
      -rw-r--r-- 1 linuxbrew linuxbrew 55111489 Mar 11 16:46 cbcollect_deploying_from_wrong_feed_boundary.zip

      This has been uploaded via

      curl --upload-file cbcollect_MB-44931_deploying_from_wrong_feed_boundary.zip https://s3.amazonaws.com/cb-engineering/
       
      

       The couchbase-server was a 7.0 source build created on Mar 11, 2021 14:00 PST running on debian 10 (I have seen this issue for a week or so now)

      Linux couch01 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

      The odd thing is that I only have two (2) functions, I have 3,073 items in my metadata Eventing scratchpad (I should only have 2048 (with one active deployed function "curl_loc2" and one undeployed function "basicCurlGet_auth_bearer_mockserver"). Looking at the bucket 

      CREATE PRIMARY INDEX midx ON metadata

      SELECT count
      FROM metadata
      WHERE assigned_worker LIKE "worker_curl_loc2%"

      1024

      SELECT count
      FROM metadata
      WHERE assigned_worker LIKE "worker_basicCurlGet_auth_bearer_mockserver%"

      0

       

      SELECT COUNT
      FROM metadata
      WHERE ( assigned_worker NOT LIKE "worker_basicCurlGet%" )
      AND ( assigned_worker NOT LIKE "worker_curl_loc2%" )

      2048
      So what are these other 2048 items I seem to have 1024 with keys like

      "id": "eventing::2266648328::curl_loc2::vb::991"

      {
       "assigned_worker": "",
       "bootstrap_stream_req_done": false,
       "current_vb_owner": "",
       "dcp_stream_status": "stopped",
       "dcp_stream_requested": false,
       "last_checkpoint_time": "2021-03-11 16:34:10.61623231 -0800 PST m=+244.472199355",
       "last_doc_timer_feedback_seqno": 0,
       "last_processed_seq_no": 53056,
       "node_uuid": "",
       "node_requested_vb_stream": "",
       "node_uuid_requested_vb_stream": "",
       "ownership_history": [
       {
       "assigned_worker": "worker_curl_loc2_34",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "bootstrap",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:33:01.923556352 -0800 PST m=+175.779523414"
       },
       {
       "assigned_worker": "worker_curl_loc2_34",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "stream_requested",
       "seq_no": 53056,
       "timestamp": "2021-03-11 16:33:01.938840315 -0800 PST m=+175.794807336"
       },
       {
       "assigned_worker": "worker_curl_loc2_34",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "running",
       "seq_no": 53056,
       "timestamp": "2021-03-11 16:33:02.935518042 -0800 PST m=+176.791485082"
       },
       {
       "assigned_worker": "worker_curl_loc2_34",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "stopped",
       "seq_no": 53056,
       "timestamp": "2021-03-11 16:34:10.615778418 -0800 PST m=+244.471745452"
       }
       ],
       "previous_assigned_worker": "worker_curl_loc2_34",
       "previous_node_uuid": "36a90a493d8526d635cc5a111a316c48",
       "previous_vb_owner": "127.0.0.1:8096",
       "vb_id": 991,
       "vb_uuid": 132371710274914,
       "worker_requested_vb_stream": "",
       "manifest_id": "0",
       "currently_processed_doc_id_timer": "2021-03-12T00:33:01Z",
       "last_cleaned_up_doc_id_timer_event": "2021-03-12T00:33:01Z",
       "last_doc_id_timer_sent_to_worker": "2021-03-12T00:33:01Z",
       "last_processed_doc_id_timer_event": "2021-03-12T00:33:01Z",
       "next_doc_id_timer_to_process": "2021-03-12T00:33:02Z",
       "currently_processed_cron_timer": "2021-03-12T00:33:02Z",
       "last_processed_cron_timer_event": "",
       "next_cron_timer_to_process": "2021-03-12T00:33:02Z",
       "version": "evt-7.0.0-0000-ee"
      }

      And I have unexpected cruft of 1024 documents with keys like the following: 

      "id": "eventing::3624865054::zzz::vb::1000"

      FYI I did have a function zzz at one time but I deleted that and refreshed the browser I only have the function "curl_loc2" deployed and one undeployed function "basicCurlGet_auth_bearer_mockserver" - nothing else.

      {
       "assigned_worker": "worker_zzz_23",
       "bootstrap_stream_req_done": true,
       "current_vb_owner": "127.0.0.1:8096",
       "dcp_stream_status": "running",
       "dcp_stream_requested": false,
       "last_checkpoint_time": "2021-03-11 16:29:19.732910148 -0800 PST m=+78.377885359",
       "last_doc_timer_feedback_seqno": 0,
       "last_processed_seq_no": 53101,
       "node_uuid": "36a90a493d8526d635cc5a111a316c48",
       "node_requested_vb_stream": "",
       "node_uuid_requested_vb_stream": "",
       "ownership_history": [
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "bootstrap",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:16:16.311672471 -0800 PST m=+6479.557751949"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "stream_requested",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:16:16.326739434 -0800 PST m=+6479.572818896"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "running",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:16:17.335642099 -0800 PST m=+6480.581721535"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.31426233 -0800 PST m=+7144.560341796"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.315333343 -0800 PST m=+7144.561412826"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.316378507 -0800 PST m=+7144.562458002"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.317401001 -0800 PST m=+7144.563480524"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.318378674 -0800 PST m=+7144.564458192"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.319371167 -0800 PST m=+7144.565450657"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.320185194 -0800 PST m=+7144.566264718"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.321020453 -0800 PST m=+7144.567100008"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.321882216 -0800 PST m=+7144.567961683"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.322935644 -0800 PST m=+7144.569015146"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.324032428 -0800 PST m=+7144.570111958"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.325231755 -0800 PST m=+7144.571311253"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.326297044 -0800 PST m=+7144.572376574"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.327542117 -0800 PST m=+7144.573621592"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.328411111 -0800 PST m=+7144.574490617"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.329478293 -0800 PST m=+7144.575557805"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.330677267 -0800 PST m=+7144.576756726"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.331860633 -0800 PST m=+7144.577940079"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "metadata_recreated",
       "seq_no": 0,
       "timestamp": "2021-03-11 16:27:21.332928961 -0800 PST m=+7144.579008453"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "stream_requested",
       "seq_no": 53101,
       "timestamp": "2021-03-11 16:28:14.716363288 -0800 PST m=+13.361338490"
       },
       {
       "assigned_worker": "worker_zzz_23",
       "current_vb_owner": "127.0.0.1:8096",
       "operation": "running",
       "seq_no": 53101,
       "timestamp": "2021-03-11 16:28:15.712662048 -0800 PST m=+14.357637261"
       }
       ],
       "previous_assigned_worker": "worker_zzz_23",
       "previous_node_uuid": "36a90a493d8526d635cc5a111a316c48",
       "previous_vb_owner": "127.0.0.1:8096",
       "vb_id": 1000,
       "vb_uuid": 224369975207543,
       "worker_requested_vb_stream": "",
       "manifest_id": "0",
       "currently_processed_doc_id_timer": "2021-03-12T00:28:14Z",
       "last_cleaned_up_doc_id_timer_event": "2021-03-12T00:28:14Z",
       "last_doc_id_timer_sent_to_worker": "2021-03-12T00:28:14Z",
       "last_processed_doc_id_timer_event": "2021-03-12T00:27:21Z",
       "next_doc_id_timer_to_process": "2021-03-12T00:28:15Z",
       "currently_processed_cron_timer": "2021-03-12T00:28:15Z",
       "last_processed_cron_timer_event": "",
       "next_cron_timer_to_process": "2021-03-12T00:28:15Z",
       "version": "evt-7.0.0-0000-ee"
      }

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              jon.strabala Jon Strabala
              jon.strabala Jon Strabala
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty