Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-6714

ep-engine needs around 1 seconds to mark empty and closed checkpoint as persisted

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • None
    • None
    • couchbase-bucket
    • Security Level: Public
    • None

    Description

      I'm pretty sure this was not a case up until very recently. Suspect this is caused by recent XDCR changes where we enabled closing of empty checkpoints. (BTW there's no need IMHO)

      Steps to reproduce:

      • grab everything latest and enable consistent views
      • rebalance 1-2 or 1->4
      • observe how slow it is
      • in logs of ns_server you would see a lot of this:

      [rebalance:info,2012-09-25T6:54:32.463,n_0@192.168.1.2:<0.18358.0>:janitor_agent:get_replication_persistence_checkpoint_id:470]default: Doing get_replication_persistence_checkpoint_id call for vbucket 127 on n_0@192.168.1.2
      [ns_server:debug,2012-09-25T6:54:32.463,n_0@192.168.1.2:'janitor_agent-default':janitor_agent:handle_call:661]After creating new checkpoint here's what we have:

      {2,3,4}

      [ns_server:debug,2012-09-25T6:54:32.464,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.464,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.475,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.475,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.486,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.486,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.497,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.497,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.508,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.508,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.519,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.519,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.550,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.550,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.581,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.581,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.612,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.612,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.643,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.643,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.674,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.674,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.775,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.775,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.876,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.876,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.977,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:32.977,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.078,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.078,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.179,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.179,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.280,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.280,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.381,n_2@192.168.1.2:<0.12504.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.381,n_3@192.168.1.2:<0.12508.0>:janitor_agent:do_wait_checkpoint_persisted:746]Waiting persisted checkpoint 3 on 127, now 2
      [ns_server:debug,2012-09-25T6:54:33.482,n_3@192.168.1.2:'janitor_agent-default':janitor_agent:handle_info:677]Got done message from subprocess: <0.12508.0> (ok)

      P.S.: given that it'll greatly slow down testing I'd raise this to blocker, but not sure people would understand me so set this to just critical.

      Attachments

        For Gerrit Dashboard: MB-6714
        # Subject Branch Project Status CR V

        Activity

          People

            chiyoung Chiyoung Seo (Inactive)
            alkondratenko Aleksey Kondratenko (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                PagerDuty