  1. Couchbase Server
  2. MB-10967

recvd snapshot twice during failure scenario

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 3.0
    • 3.0
    • couchbase-bucket
    • Security Level: Public
    • Untriaged
    • Unknown
    • June 30 - July 18

    Description

      I wrote a script to perform the failure scenario described here: https://github.com/couchbaselabs/cbupr/blob/master/failure-scenarios.md

      The gist of it: https://gist.github.com/tahmmee/11301024 (also attached)

      At the end, the script attempts to stream 10k items.
      However, I observe that a snapshot is sent and then the mutations are repeated, starting again at by_seqno 1.

      Here is the stream response ->

      {'status': 0, 'body': '', 'opcode': 80}
      {'status': 0, 'opcode': 83, 'failover_log': [(68675663887800, 1222), (55655994587054, 0)]}
      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key17', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 1, 'rev_seqno': 1}

      ...

      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key9976', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 1224, 'rev_seqno': 1}
      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key9982', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 1225, 'rev_seqno': 1}

      ...

      {'vbucket': 0, 'opcode': 86}
      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key17', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 1, 'rev_seqno': 1}
      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key21', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 2, 'rev_seqno': 1}
      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key24', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 3, 'rev_seqno': 1}

      ...
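The repeat shown above can be detected mechanically. The following is a minimal sketch, not the attached script: `find_seqno_regressions` is a hypothetical helper, and it assumes responses are dicts shaped like the ones above, with opcode 87 marking a mutation as in that output.

```python
MUTATION_OPCODE = 87  # assumption: opcode carried by mutation responses in the output above


def find_seqno_regressions(responses):
    """Return (highest_seqno_seen, response) pairs where by_seqno fails to increase.

    Non-mutation responses (stream request, snapshot marker, ...) are skipped.
    """
    regressions = []
    highest = {}  # vbucket id -> highest by_seqno seen so far
    for resp in responses:
        if resp.get('opcode') != MUTATION_OPCODE:
            continue
        vb = resp['vbucket']
        seqno = resp['by_seqno']
        prev = highest.get(vb)
        if prev is not None and seqno <= prev:
            # Keep the old high-water mark so every repeated mutation is reported.
            regressions.append((prev, resp))
        else:
            highest[vb] = seqno
    return regressions
```

Fed the tail of the stream above (mutations at 1224 and 1225, the opcode 86 response, then mutations at 1, 2, 3), this would report the three repeated mutations, each paired with the high-water mark 1225.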

      The last seqno sent is 1225, and the logs show that this is where backfill completed:

      [ns_server:info,2014-04-25T15:53:54.574,babysitter_of_n_1@127.0.0.1:<0.85.0>:ns_port_server:log:169]memcached<0.85.0>: Fri Apr 25 15:53:54.372982 EDT 3: (default) Scheduling backfill for vb 0 (0 to 1226)
      memcached<0.85.0>: Fri Apr 25 15:53:54.373173 EDT 3: (default) UPR (Producer) eq_uprq:failuerscenario1 - Stream created for vbucket 0
      memcached<0.85.0>: Fri Apr 25 15:53:54.377299 EDT 3: (default) UPR (Producer) eq_uprq:failuerscenario1 - Backfill complete for vb 0, last seqno read: 1225
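To compare the backfill end point with the last seqno received, the log lines can be scanned for the "Backfill complete" message. This is a rough sketch that assumes the message format is exactly as in the log excerpt above; `backfill_end_seqnos` is a hypothetical helper name.

```python
import re

# Assumption: message wording matches the memcached log excerpt above.
BACKFILL_DONE = re.compile(r'Backfill complete for vb (\d+), last seqno read: (\d+)')


def backfill_end_seqnos(log_lines):
    """Map vbucket id -> last seqno read by backfill, as reported in the log."""
    result = {}
    for line in log_lines:
        match = BACKFILL_DONE.search(line)
        if match:
            result[int(match.group(1))] = int(match.group(2))
    return result
```

For the excerpt above this yields 1225 for vb 0, which matches the last by_seqno seen before the mutations start repeating.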

      I've attached the script with its dependencies; it can be unpacked and the issue reproduced in cluster_run:
      ./cluster_run -n4
      python uprfailurescenario.py
      ...
      AssertionError: ERROR: Out of order response on vbucket 0:

      {'value': 'val', 'lock_time': 0, 'opcode': 87, 'expiration': 0, 'key': 'key5', 'flags': 0, 'nru': 2, 'vbucket': 0, 'by_seqno': 1, 'rev_seqno': 1}

          People

            tommie Tommie McAfee (Inactive)