Couchbase Server / MB-6462

Race between set_vbucket_state commands and the TAP replication establishment

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • 2.0
    • 2.0-beta
    • couchbase-bucket, ns_server
    • Security Level: Public
    • None
    • centos 64 bits with build-1645

    Description

      1. Set up a three-node cluster.
      2. Create the default bucket.

      After that, I observed a race between the set_vbucket_state commands and the establishment of TAP replication connections on 10.1.3.235:

      ...
      [ns_server:info,2012-08-27T17:25:09.479,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 15 state to active
      [ns_server:info,2012-08-27T17:25:09.480,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 14 state to active
      [ns_server:info,2012-08-27T17:25:09.481,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 13 state to active
      [ns_server:info,2012-08-27T17:25:09.482,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 12 state to active
      [ns_server:info,2012-08-27T17:25:09.483,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 11 state to active
      [ns_server:info,2012-08-27T17:25:09.483,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 10 state to active
      [ns_server:info,2012-08-27T17:25:09.484,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 9 state to active
      [ns_server:info,2012-08-27T17:25:09.485,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 8 state to active
      [views:debug,2012-08-27T17:25:09.485,ns_1@10.1.3.235:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:452]Got set_vbucket event for default/773. Updated state: replica
      [ns_server:info,2012-08-27T17:25:09.485,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 7 state to active
      [ns_server:info,2012-08-27T17:25:09.486,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 6 state to active
      [ns_server:info,2012-08-27T17:25:09.486,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 5 state to active
      [ns_server:info,2012-08-27T17:25:09.487,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 4 state to active
      [ns_server:info,2012-08-27T17:25:09.487,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 3 state to active
      [ns_server:info,2012-08-27T17:25:09.488,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 2 state to active
      [ns_server:info,2012-08-27T17:25:09.493,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 1 state to active
      [ns_server:info,2012-08-27T17:25:09.494,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 0 state to active
      ...
      [ns_server:info,2012-08-27T17:25:09.541,ns_1@10.1.3.235:ns_port_memcached:ns_port_server:log:169]
      memcached<0.403.0>: Tue Aug 28 00:28:26.811685 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 0 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811723 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 1 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811738 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 2 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811749 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 3 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811758 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 4 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811768 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 5 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811781 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 6 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811791 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 7 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811802 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 8 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811813 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 9 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811825 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 10 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811964 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 11 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811980 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 12 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811991 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 13 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.812001 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 14 not found for TAP cursor. Skip it...
      ...

      --> These ep-engine warnings indicate that ep-engine received the TAP replication connection requests before it had received the set_vbucket_state commands from ns_server and created the corresponding vbucket instances in its in-memory hash table. As a consequence, this race condition breaks TAP replication.
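
      To make the ordering problem concrete, here is a minimal Python sketch of the race as described above. It is a toy model only, not ep-engine or ns_server code: the `Producer` class, `open_tap_stream`, and `open_tap_stream_when_ready` are hypothetical names invented for illustration, and the guard shown is one possible way to enforce the ordering, not necessarily how this issue was fixed.

```python
class Producer:
    """Toy stand-in for the node serving TAP streams (hypothetical model)."""

    def __init__(self):
        # vbucket id -> state; a vbucket exists only after its state is set.
        self.vbuckets = {}

    def set_vbucket_state(self, vbid, state):
        # Creates the vbucket on first use and records its state.
        self.vbuckets[vbid] = state

    def open_tap_stream(self, requested):
        """Return (attached, skipped) vbucket ids for a new stream.

        Vbuckets that do not exist yet are skipped, mirroring the
        "VBucket N not found for TAP cursor. Skip it..." warnings.
        """
        attached = [vb for vb in requested if vb in self.vbuckets]
        skipped = [vb for vb in requested if vb not in self.vbuckets]
        return attached, skipped


def open_tap_stream_when_ready(producer, requested):
    """Assumed guard: refuse to open the stream until all vbuckets exist."""
    missing = [vb for vb in requested if vb not in producer.vbuckets]
    if missing:
        raise RuntimeError(f"vbuckets not created yet: {missing}")
    return producer.open_tap_stream(requested)


vbuckets = list(range(16))

# Racy order: the stream is requested before any set_vbucket_state arrives,
# so every vbucket is skipped and stays skipped for this stream.
racy = Producer()
attached, skipped = racy.open_tap_stream(vbuckets)
for vb in vbuckets:
    racy.set_vbucket_state(vb, "active")
print("racy order    attached:", attached, "skipped:", skipped)

# Intended order: states are set first, then the stream is opened.
ordered = Producer()
for vb in vbuckets:
    ordered.set_vbucket_state(vb, "active")
attached_ok, skipped_ok = open_tap_stream_when_ready(ordered, vbuckets)
print("intended order attached:", attached_ok, "skipped:", skipped_ok)
```

      In the racy order the stream ends up with no vbuckets attached even though all of them become active immediately afterwards, which is the behavior the warnings above point to.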

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            chiyoung Chiyoung Seo (Inactive)