Couchbase Server / MB-6462

Race between set_vbucket_state commands and the TAP replication establishment

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • 2.0
    • 2.0-beta
    • couchbase-bucket, ns_server
    • Security Level: Public
    • None
    • centos 64 bits with build-1645

    Description

      1. Set up a three-node cluster
      2. Create the default bucket

      After that, I saw a race between the set_vbucket_state commands and TAP replication connection establishment on 10.1.3.235:

      ...
      [ns_server:info,2012-08-27T17:25:09.479,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 15 state to active
      [ns_server:info,2012-08-27T17:25:09.480,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 14 state to active
      [ns_server:info,2012-08-27T17:25:09.481,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 13 state to active
      [ns_server:info,2012-08-27T17:25:09.482,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 12 state to active
      [ns_server:info,2012-08-27T17:25:09.483,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 11 state to active
      [ns_server:info,2012-08-27T17:25:09.483,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 10 state to active
      [ns_server:info,2012-08-27T17:25:09.484,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 9 state to active
      [ns_server:info,2012-08-27T17:25:09.485,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 8 state to active
      [views:debug,2012-08-27T17:25:09.485,ns_1@10.1.3.235:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:452]Got set_vbucket event for default/773. Updated state: replica
      [ns_server:info,2012-08-27T17:25:09.485,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 7 state to active
      [ns_server:info,2012-08-27T17:25:09.486,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 6 state to active
      [ns_server:info,2012-08-27T17:25:09.486,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 5 state to active
      [ns_server:info,2012-08-27T17:25:09.487,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 4 state to active
      [ns_server:info,2012-08-27T17:25:09.487,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 3 state to active
      [ns_server:info,2012-08-27T17:25:09.488,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 2 state to active
      [ns_server:info,2012-08-27T17:25:09.493,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 1 state to active
      [ns_server:info,2012-08-27T17:25:09.494,ns_1@10.1.3.235:<0.32135.4>:ns_memcached:do_handle_call:485]Changed vbucket 0 state to active
      ...
      [ns_server:info,2012-08-27T17:25:09.541,ns_1@10.1.3.235:ns_port_memcached:ns_port_server:log:169]
      memcached<0.403.0>: Tue Aug 28 00:28:26.811685 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 0 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811723 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 1 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811738 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 2 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811749 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 3 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811758 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 4 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811768 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 5 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811781 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 6 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811791 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 7 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811802 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 8 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811813 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 9 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811825 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 10 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811964 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 11 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811980 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 12 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.811991 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 13 not found for TAP cursor. Skip it...
      memcached<0.403.0>: Tue Aug 28 00:28:26.812001 3: TAP (Producer) eq_tapq:replication_ns_1@10.1.3.236 - VBucket 14 not found for TAP cursor. Skip it...
      ...

      --> These ep-engine warnings indicate that ep-engine received the TAP replication connection requests before it had received the set_vbucket_state commands from ns_server and created the corresponding vbucket instances in its in-memory hash table. As a result, the TAP producer skips those vbuckets, which breaks TAP replication.
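
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not ep-engine or ns_server code: the `Producer` class, its `open_tap_stream` method, and the `simulate` helper are hypothetical names invented for illustration. It shows that when the TAP stream request arrives before the vbuckets have been created, every requested vbucket is skipped, whereas the opposite order registers all of them.

```python
class Producer:
    """Toy stand-in for the node that owns the vbuckets (hypothetical, not ep-engine)."""

    def __init__(self):
        # vbucket id -> state; an entry exists only after set_vbucket_state.
        self.vbuckets = {}

    def set_vbucket_state(self, vbid, state):
        # Creates the vbucket on first use, mirroring the behaviour in the report.
        self.vbuckets[vbid] = state

    def open_tap_stream(self, requested):
        """Register cursors for vbuckets that already exist; skip the rest."""
        registered, skipped = [], []
        for vbid in requested:
            if vbid in self.vbuckets:
                registered.append(vbid)
            else:
                skipped.append(vbid)  # "VBucket N not found for TAP cursor"
        return registered, skipped


def simulate(tap_first, vbucket_count=16):
    """Run both operations in the chosen order and return (registered, skipped)."""
    node = Producer()
    wanted = list(range(vbucket_count))
    if tap_first:
        # Racy order: the replication request wins.
        result = node.open_tap_stream(wanted)
        for vbid in wanted:
            node.set_vbucket_state(vbid, "active")
    else:
        # Safe order: states are set before replication is established.
        for vbid in wanted:
            node.set_vbucket_state(vbid, "active")
        result = node.open_tap_stream(wanted)
    return result


if __name__ == "__main__":
    print("TAP first:   ", simulate(tap_first=True))
    print("states first:", simulate(tap_first=False))
```

In this model the skipped vbuckets are never revisited once the stream is open, which is why the order of the two operations matters; whether the actual fix enforced ordering or retried the cursor registration is not stated in this report.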

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            chiyoung Chiyoung Seo (Inactive)