Couchbase Server / MB-6232

ep-engine needs 1.5 minutes to create 1k vbuckets. Seems too slow (faster with barrier=0)

    Details

    • Triage:
      Triaged
    • Is this a Regression?:
      Yes

      Description

      SUBJ.

      Just start a single-node dev cluster and try to load the beer sample bucket (it'll fail, perhaps for a different reason).

      You can observe in logs:

      We're doing set_vbucket_state requests from the janitor (the ns_memcached messages), and these are the first messages from mccouch about the just-created vbuckets (the capi_set_view_manager and mc_connection messages). Note: I've added extra log messages at the beginning and end of sync_notify to see if it's couchdb's or ns_server's fault. From the timestamps it's clear that it's not.

      [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
      [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 740 state to active
      [views:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1023. Updated state: active
      [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 739 state to active
      [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
      [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 738 state to active
      [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 737 state to active

      And the last messages are at:

      [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
      [views:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1. Updated state: active
      [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
      [ns_server:debug,2012-08-15T9:49:14.501,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
      [views:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/0. Updated state: active
      [ns_server:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done

      I've just retried after remounting with barrier=0 and it was blazing fast. What this mount option changes is that short fsyncs become nearly instant, because they merely send data to the disk's buffer without waiting until the data actually hits the platter.

      So that's evidence of some excessive use of fsyncs somewhere during vbucket db file creation, and that's clearly on the ep-engine/couchstore side.
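A back-of-envelope check of the log timestamps above is consistent with this (an editor's sketch; the 7200 RPM latency figure at the end is a typical value, not a measurement from this ticket):

```python
# Elapsed time between the first and last set_vbucket events in the log
# excerpts above (9:47:40.309 to 9:49:14.502), spread over ~1024 vbuckets.
start = 47 * 60 + 40.309          # seconds past 9:00
end = 49 * 60 + 14.502
elapsed = end - start             # ~94.2 s total
per_vbucket_ms = elapsed / 1024 * 1000
print(round(elapsed, 1), round(per_vbucket_ms, 1))  # 94.2 92.0
```

At roughly 92 ms per vbucket, each vbucket file creation costs several rotations of a 7200 RPM disk (~8 ms each), i.e. on the order of a handful of synchronous fsyncs per file rather than any CPU-bound work.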

      Setting this to blocker as it'll affect people trying to create buckets on real hardware: hard disks and any modern Linux distro (where barriers are on by default).

      1. debug.1.gz (269 kB, attached by Tommie McAfee)


          Activity

          alkondratenko Aleksey Kondratenko (Inactive) created issue -
          alkondratenko Aleksey Kondratenko (Inactive) made changes -
          ingenthr Matt Ingenthron added a comment -

          Note that on MB-6305 this caused quite a bit of trouble. Currently, one cannot reliably go through a delete/create bucket cycle using public interfaces. Sometime between build 1495 and build 2.0.0c 709, the delete/recreate cycle started taking multiple minutes.

          In automated testing some SDKs used to use flush_all(). At the recommendation of engineering, we moved to delete/create. Now using delete/create is intractable. This has broken our testing again, unfortunately.

          The larger concern is that our users will probably see this pretty quickly too. What we were doing here isn't all that unusual.
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Matt, I think you can try the workaround of:

          mount / -o remount,nobarrier

          And if that doesn't help in your case, then there's something else that makes it slower than needed.
          farshid Farshid Ghods (Inactive) made changes -
          Priority Blocker [ 1 ] Critical [ 2 ]
          peter peter made changes -
          Fix Version/s .next [ 10342 ]
          Priority Critical [ 2 ] Major [ 3 ]
          dipti Dipti Borkar made changes -
          Assignee Damien Katz [ damien ] Peter Wansch [ peter ]
          Fix Version/s 2.0 [ 10114 ]
          Fix Version/s .next [ 10342 ]
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          BTW, I recall now having a similar problem pre-DP4. I fixed it by creating all vbuckets in parallel. But that was of course when mccouch was at its full power.
          ingenthr Matt Ingenthron added a comment -

          @alk, I presume that should still be in here, so the issue is further down it would seem.

          thanks for the additional info

          tommie Tommie McAfee added a comment -

          Just tried to reproduce this with 2.0 build 1689, and loading the beer sample times out:

          [ns_server:error,2012-09-06T10:49:42.436,ns_1@127.0.0.1:<0.752.0>:menelaus_web:handle_post_sample_buckets:506]Loading sample buckets timed out

          The same error is shown in the UI. If I click to retry, it says the data is already loaded; however, if I inspect the db there are no documents in the bucket. Not sure if this is a similar issue... attaching diag.
          tommie Tommie McAfee made changes -
          Attachment debug.1.gz [ 14791 ]
          tommie Tommie McAfee made changes -
          Attachment debug.1.gz [ 14792 ]
          tommie Tommie McAfee made changes -
          Attachment debug.1.gz [ 14791 ]
          farshid Farshid Ghods (Inactive) added a comment -

          More info here: http://www.couchbase.com/issues/browse/MB-6305
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Matt, when I referenced mccouch, that was the old mccouch that was our persistence path. That is not the case anymore. So the same issue is still hitting us. We're seemingly doing (potentially a few) fsyncs per vbucket, and we're doing all of that sequentially.

          Tommie, the docloader thing is most likely a distinct issue. I think we already have a bug for that.
          farshid Farshid Ghods (Inactive) made changes -
          Labels 2.0-beta-release-notes
          mikew Mike Wiederhold added a comment -

          Alk,

          I investigated this issue further and found that the latency to send a vbucket state transition to mccouch was 100ms. Interestingly, a state transition to dead was an order of magnitude quicker, so I think there might be something going on on the mccouch side here.
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Mike, looking at the timestamps in the logs, I'm pretty sure there are no delays on the mccouch side.
          peter peter added a comment -

          Mike, can you continue driving the analysis of this?
          peter peter made changes -
          Assignee Peter Wansch [ peter ] Mike Wiederhold [ mikew ]
          chiyoung Chiyoung Seo made changes -
          Sprint Status Current Sprint
          mikew Mike Wiederhold added a comment -

          The fix for this bug will require changes on both the ep-engine and mccouch sides, and requires modifying how we store vbucket state data and how we notify mccouch about vbucket state changes. The changes will not be small and will be risky for the 2.0 release. I recommend this change be moved to 2.0.1.
          chiyoung Chiyoung Seo added a comment -

          For more details: we currently store a vbucket's state in its vbucket database as a local doc, which means one fsync for every state change. I was trying to replace those individual local docs with a global vbucket state doc stored in the "master" meta database, but realized that this requires changes to the interactions between ep-engine and erlang, and also affects our 2.0 backup and restore tools at this time.

          This is a design defect and limitation, and the changes are too risky for 2.0 at this time.
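Not couchstore's actual code, but a minimal Python sketch of the cost model described here, counting flushes instead of timing them: one local doc per vbucket file means one fsync per vbucket, while a consolidated state doc needs a single fsync. The file names and payloads are made up for illustration.

```python
import os
import tempfile

fsync_calls = 0

def fsync_counted(fd):
    """os.fsync wrapper that counts how many flushes we issue."""
    global fsync_calls
    fsync_calls += 1
    os.fsync(fd)

def per_vbucket_states(dirname, n):
    # Current scheme: a vbucket_state local doc in every vbucket file,
    # so every state change pays its own fsync.
    for vb in range(n):
        with open(os.path.join(dirname, "%d.couch.1" % vb), "wb") as f:
            f.write(b'{"state": "active"}')
            fsync_counted(f.fileno())

def consolidated_states(dirname, n):
    # Proposed scheme: all vbucket states in one doc -> a single fsync.
    with open(os.path.join(dirname, "master.couch.1"), "wb") as f:
        for vb in range(n):
            f.write(b"%d:active;" % vb)
        fsync_counted(f.fileno())

with tempfile.TemporaryDirectory() as d:
    per_vbucket_states(d, 1024)
    syncs_per_vbucket = fsync_calls      # one fsync per vbucket
    fsync_calls = 0
    consolidated_states(d, 1024)
    syncs_consolidated = fsync_calls     # one fsync total

print(syncs_per_vbucket, syncs_consolidated)  # 1024 1
```

With barriers on and a spinning disk, each of those 1024 fsyncs is a real platter wait, which is exactly why the consolidated variant is attractive.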

          chiyoung Chiyoung Seo made changes -
          Fix Version/s .next [ 10342 ]
          Fix Version/s 2.0 [ 10114 ]
          chiyoung Chiyoung Seo made changes -
          Sprint Status Current Sprint
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Can we consider not using fsync for initial vbucket creation?
          dipti Dipti Borkar made changes -
          Priority Major [ 3 ] Blocker [ 1 ]
          avsej Sergey Avseyev made changes -
          Link This issue is duplicated by MB-7160 [ MB-7160 ]
          mikew Mike Wiederhold added a comment -

          Also see MB-6234 which is a duplicate of this issue.

          mikew Mike Wiederhold made changes -
          Fix Version/s Backlog [ 10090 ]
          Fix Version/s .next [ 10342 ]
          mikew Mike Wiederhold made changes -
          Planned Start (set to new fixed version's start date)
          Planned End (set to new fixed version's start date)
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          MB-6234 is not necessarily a dup. This bug is about metadata ops.

          In particular, why does ep-engine even need tons of fsyncs to create brand-new vbuckets?

          Why can't ep-engine just drop those ops on the floor when a flush arrives during that 1.5 minutes?

          I think at least those actions are possible, and perhaps more, and SDK folks will finally have a working flush.
          dipti Dipti Borkar made changes -
          Fix Version/s 2.0.2 [ 10418 ]
          Fix Version/s Backlog [ 10090 ]
          dipti Dipti Borkar made changes -
          Planned Start (set to new fixed version's start date)
          Planned End (set to new fixed version's start date)
          dipti Dipti Borkar added a comment -

          We need to better understand this issue in ep_engine and couchstore and come up with options to fix it in the 2.0.2/2.1 timeframe.
          dipti Dipti Borkar made changes -
          Affects Version/s 2.0 [ 10114 ]
          Affects Version/s 2.0.1 [ 10399 ]
          Affects Version/s 2.0-beta [ 10113 ]
          ingenthr Matt Ingenthron added a comment -

          Yay!

          mikew Mike Wiederhold made changes -
          Fix Version/s 2.1 [ 10414 ]
          Fix Version/s 2.0.2 [ 10418 ]
          mikew Mike Wiederhold added a comment -

          There are some other issues still in 2.0.2 that will likely help improve this one, but I am moving this task out to 2.1.
          frank Frank Weigel made changes -
          Labels 2.0-beta-release-notes 2.0-beta-release-notes customer
          maria Maria McDuff (Inactive) added a comment -

          Created a CBQE ticket for QE to add to the functional automation suite: http://www.couchbase.com/issues/browse/CBQE-1192
          dipti Dipti Borkar made changes -
          Labels 2.0-beta-release-notes customer PM-PRIORITIZED customer
          maria Maria McDuff (Inactive) made changes -
          Fix Version/s 2.2.0 [ 10620 ]
          Fix Version/s .major-release [ 10414 ]
          Affects Version/s 2.1.0 [ 10418 ]
          ingenthr Matt Ingenthron added a comment -

          I removed the note in the summary about the barrier/disk sync because we have some evidence that in some environments, even ramdisk does not make the problem go away.

          ingenthr Matt Ingenthron made changes -
          Summary ep-engine needs 1.5 minutes to create 1k vbuckets. Seems too slow (but gets fast with barrier=0) ep-engine needs 1.5 minutes to create 1k vbuckets.
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Matt, unfortunately you're not quite correct. 1.5 minutes for bucket creation is only possible on a spinning disk with barriers on. Customer SM issues are an order of magnitude smaller.
          ingenthr Matt Ingenthron added a comment -

          That's a fair criticism, but it still wasn't correct to say it "gets fast". Well, it may be if we are talking strictly about ep-engine. I've changed it back.
          ingenthr Matt Ingenthron made changes -
          Summary ep-engine needs 1.5 minutes to create 1k vbuckets. ep-engine needs 1.5 minutes to create 1k vbuckets. Seems too slow (faster with barrier=0)
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          "fast" is never specific enough. To my human feelings, a few seconds to create 1k vbuckets without barriers is fast. For automated tests that do flush (and that requires more than just creating 1k vbuckets), it's not fast.

          IMHO a bit too many of our bugs, admittedly including this one, are not specific enough. Most importantly, a bunch of tickets just read like "XXX is too slow".
          ingenthr Matt Ingenthron added a comment -

          Much agreed. The request from users here would be sub-second, mostly due to the flush-testing use case. I can't speak for the practicality of this if 1024 individual IOs are needed and we have a disk that does only 120 IOPS, but that's what the users desire. I'm certain it can be refactored to not be 1024 synchronous IOs.
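The arithmetic here can be made concrete (a sketch using only the figures quoted in this comment; the 120 IOPS number is the commenter's assumption, not a measurement):

```python
# Lower bound on bucket creation time if every vbucket needs its own
# synchronous IO: 1024 IOs on a disk doing ~120 IOPS.
ios, iops = 1024, 120
floor_seconds = ios / iops
print(round(floor_seconds, 1))  # prints 8.5
```

So even with perfectly sequential IOs, sub-second bucket creation is physically impossible on such a disk unless the per-vbucket synchronous IOs are eliminated, not merely reordered.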

          chiyoung Chiyoung Seo made changes -
          Sprint CB Bucket - 22/Jul/13 [ 28 ]
          chiyoung Chiyoung Seo made changes -
          Rank Ranked higher
          chiyoung Chiyoung Seo added a comment -

          Points: 3

          chiyoung Chiyoung Seo added a comment -

          The ep-engine team will discuss the possible approaches with the other teams to resolve this issue.

          chiyoung Chiyoung Seo added a comment -

          Let's discuss this issue offline to see how we can tackle it.

          chiyoung Chiyoung Seo made changes -
          Assignee Mike Wiederhold [ mikew ] Sundar Sridharan [ sundar ]
          chiyoung Chiyoung Seo added a comment -

          The engine team discussed this issue, but as we mentioned earlier in this ticket, maintaining a vbucket_state local doc per vbucket database file is one of the major reasons for this bucket-creation slowness with the barrier on.

          We plan to consolidate all the vbucket state local docs into a single local doc and manage it in the "master.couch" database file. Then it won't require 1024 sync IOs. However, this fix affects not only the engine side, but also backup/restore and the compactor. Note that the DB compactor doesn't compact the "master.couch" file.

          We plan to address it in the next major or minor release.
          chiyoung Chiyoung Seo made changes -
          Fix Version/s .next [ 10342 ]
          Fix Version/s 2.2.0 [ 10620 ]
          dipti Dipti Borkar made changes -
          Fix Version/s .major-release [ 10414 ]
          Fix Version/s .next [ 10342 ]
          FilipeManana Filipe Manana (Inactive) added a comment -

          I would suggest a dedicated database file for keeping those states. Using the master database will require some coordination between ep-engine and ns_server/couchdb (to prevent concurrent btree updates), leading to all sorts of issues and complexity. A dedicated file shouldn't add any significant overhead.
          chiyoung Chiyoung Seo made changes -
          Sprint CB Bucket - 22/Jul/13 [ 28 ]
          chiyoung Chiyoung Seo made changes -
          Rank Ranked lower
          sundar Sundar Sridharan added a comment -

          Just a quick note/update that we don't seem to be seeing this issue anymore with the current master branch. Bucket creation on a single node does not take more than a minute.
          I think Chiyoung and Filipe's suggestion of a separate file for storing the vbucket snapshots would be very useful.
          Reducing the priority from Blocker to Minor.
          Thanks
          sundar Sundar Sridharan made changes -
          Priority Blocker [ 1 ] Minor [ 4 ]
          dipti Dipti Borkar added a comment -

          It would be good for someone else to validate this.

          ingenthr Matt Ingenthron added a comment -

          Sundar: why reduce the priority? The priority should remain the same until it's fixed.

          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          I'm willing to bet a few hundred that it's not fixed, and that on a durable spinning-disk setup it still needs at least tens of seconds to create a bucket.

          Whether that's a blocker or minor is hard to say, given that UPR will decouple views and XDCR from disk persistence, so the folks mainly affected by this slowness today (i.e. those running tests involving lots of bucket creations) will not hit it as hard as before.

          ingenthr Matt Ingenthron added a comment -

          Whether rubles or dollars, I'm not taking that bet.

          Regarding the UPR/XDCR decoupling, if that's the resolution that delivers what users need, that's fine. I don't expect it would fix things, though, so the issues where this came up in the first place would still be affected.

          ingenthr Matt Ingenthron made changes -
          Priority Minor [ 4 ] Blocker [ 1 ]
          sundar Sundar Sridharan added a comment -

          I just created a bucket on a single-node Couchbase Server without any replicas on 10.5.2.36 - the bucket became ready in 8 seconds.
          The machine appears to have a spinning disk:
          cat /sys/block/sda/queue/rotational
          1
          I repeated the test on the 10.5.2.34 and 10.5.2.33 machines and the bucket creation times were all well under 10 seconds.
          Please let me know if you still see this issue, along with the machine name and login credentials, and I'll take it from there.
          thanks

          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Just tried on my box. Took about 50 seconds with barriers enabled.

          In order to see the slowness you need a box where fsync actually waits for data to hit the disk. Our VMs are not "honest" about fsync, and we also have a few boxes with battery-backed cache where fsync can legitimately be quick.

          Get yourself a Linux or Windows laptop with an HDD and out of the box you'll observe the slowness.

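          Aleksey's ~50 seconds lines up with simple fsync arithmetic. A back-of-the-envelope sketch (the IOPS figure and the fsyncs-per-vbucket count are assumptions for illustration, not measurements from this ticket):

          ```python
          # Assumed: an honest fsync on a single HDD completes roughly 100 times
          # per second (one write per platter revolution region, see the dd test
          # below in this thread).
          IOPS = 100
          VBUCKETS = 1024
          FSYNCS_PER_VBUCKET = 5  # assumed: file create + header + state updates

          total_fsyncs = VBUCKETS * FSYNCS_PER_VBUCKET  # 5120
          seconds = total_fsyncs / IOPS                 # 51.2 seconds
          ```

          A handful of synchronous writes per vbucket is enough to land bucket creation in the tens-of-seconds range on spinning media, which matches both the 50-second observation and the 1.5 minutes in the original report.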
          ingenthr Matt Ingenthron added a comment -

          You can figure out if something is buffering for you pretty easily. A real disk would do somewhere from 80-150 IOPS. With dd and the oflag argument, you can ask it to sync after each write. If the result is in that range, then it's proper spinning rust. If it's not, something is buffering for you.

          For example:
          [ingenthr@centosa ~]$ time dd if=/dev/zero of=testfile bs=4096 count=1200 oflag=dsync
          1200+0 records in
          1200+0 records out
          4915200 bytes (4.9 MB) copied, 11.8134 seconds, 416 kB/s

          real 0m11.833s
          user 0m0.004s
          sys 0m0.057s
          [ingenthr@centosa ~]$ time dd if=/dev/zero of=testfile bs=4096 count=1200
          1200+0 records in
          1200+0 records out
          4915200 bytes (4.9 MB) copied, 0.01124 seconds, 437 MB/s

          real 0m0.014s
          user 0m0.000s
          sys 0m0.013s

          Note the first one did about 100 IOPS with dsync on, and the latter one did roughly 1000x better, reflecting the buffering you might get from a VM hypervisor that lies, etc.

          (this is a CentOS instance on KVM with two 5900 RPM "green" disks under it, on ZFS on SmartOS)

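          The same check can be done without dd. A rough Python analogue of the oflag=dsync test (a sketch, assuming a POSIX system where os.O_DSYNC is available; the block size matches dd's 4096, the count is scaled down to keep it quick):

          ```python
          import os
          import tempfile
          import time

          COUNT = 50              # scaled down from dd's count=1200
          BLOCK = b"\x00" * 4096  # same 4 KiB block size as the dd example

          fd, path = tempfile.mkstemp()
          os.close(fd)

          # O_DSYNC makes each write() wait for the data to reach stable storage.
          sync_fd = os.open(path, os.O_WRONLY | os.O_DSYNC)
          start = time.perf_counter()
          for _ in range(COUNT):
              os.write(sync_fd, BLOCK)
          elapsed = time.perf_counter() - start
          os.close(sync_fd)
          os.remove(path)

          iops = COUNT / elapsed if elapsed > 0 else float("inf")
          ```

          On honest spinning rust, iops should land in the 80-150 range Matt describes; hundreds or thousands mean something (a hypervisor, a battery-backed controller cache) is absorbing the syncs.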
          chiyoung Chiyoung Seo made changes -
          Rank Ranked lower
          anil Anil Kumar made changes -
          Link This issue is duplicated by MB-7160 [ MB-7160 ]
          chiyoung Chiyoung Seo made changes -
          Rank Ranked higher
          chiyoung Chiyoung Seo made changes -
          Rank Ranked lower
          anil Anil Kumar made changes -
          Sprint Couchbase Server [ 94 ]
          sundar Sundar Sridharan made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          sundar Sundar Sridharan made changes -
          Comment [ Thank you Matt, the fix below is expected to ameliorate vbucket creation time.
          http://review.couchbase.org/#/c/32894/ ]
          maria Maria McDuff (Inactive) made changes -
          Triage Untriaged [ 10351 ]
          maria Maria McDuff (Inactive) made changes -
          Triage Untriaged [ 10351 ] Triaged [ 10350 ]
          maria Maria McDuff (Inactive) made changes -
          Priority Blocker [ 1 ] Critical [ 2 ]
          pavelpaulau Pavel Paulau made changes -
          Labels PM-PRIORITIZED customer performance
          pavelpaulau Pavel Paulau made changes -
          Labels performance PM-PRIORITIZED customer
          pavelpaulau Pavel Paulau made changes -
          Labels PM-PRIORITIZED customer PM-PRIORITIZED customer performance
          sundar Sundar Sridharan made changes -
          Fix Version/s Backlog [ 10090 ]
          Fix Version/s 3.0 [ 10414 ]
          Is this a Regression? Yes [ 10450 ]
          alkondratenko Aleksey Kondratenko (Inactive) added a comment -

          Saw this marked as a regression, which is probably a mistake. This has always been the case since the switch to couch*.

          pavelpaulau Pavel Paulau added a comment -

          I guess every old ticket gets marked as "Regression" when you update it.
          Somebody recently added this new field to the MB project in JIRA... and it's probably not configured properly.

          chiyoung Chiyoung Seo made changes -
          Rank Ranked lower
          maria Maria McDuff (Inactive) made changes -
          Fix Version/s bug-backlog [ 11600 ]
          Fix Version/s techdebt-backlog [ 10090 ]
          Triage Triaged [ 10350 ] Untriaged [ 10351 ]
          bterkuile Benjamin ter Kuile added a comment -

          Leaving out a flush command in a test suite changes the time from 2 minutes and 43 seconds to just 9 seconds. Flushing is important for isolated unit tests.

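          One common workaround for Benjamin's situation (a general testing pattern, not something proposed in this ticket) is to skip the flush entirely and isolate tests with per-run key prefixes instead. A minimal sketch, where a plain dict stands in for the real bucket client:

          ```python
          import uuid

          bucket = {}  # stand-in for a real Couchbase bucket/client

          class TestNamespace:
              """Gives each test its own key prefix, so leftover keys from
              earlier tests never collide and no flush is needed."""

              def __init__(self):
                  self.prefix = uuid.uuid4().hex + "::"

              def set(self, key, value):
                  bucket[self.prefix + key] = value

              def get(self, key):
                  return bucket[self.prefix + key]

          # Two "tests" using the same logical key stay fully isolated:
          t1 = TestNamespace()
          t2 = TestNamespace()
          t1.set("user", "alice")
          t2.set("user", "bob")
          ```

          The trade-off is that stale keys accumulate until they expire or the bucket is eventually flushed, but the per-test cost drops to zero.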
          ingenthr Matt Ingenthron made changes -
          Labels PM-PRIORITIZED customer performance PM-PRIORITIZED customer devX performance
          perry Perry Krug made changes -
          Labels PM-PRIORITIZED customer devX performance PM-PRIORITIZED customer devX performance supportability
          sundar Sundar Sridharan added a comment -

          As per Pavel's performance testing, this issue is only seen on spindle disks without any RAID setup. That setup is typically for single-developer use; production environments usually have some form of RAID or SSD, where we do not see this issue. Hence, after consultation with PM, the decision is not to support the single-HDD case, since optimizing for it can lower performance in the more typical RAID/SSD case.
          thanks

          sundar Sundar Sridharan made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Triage Untriaged [ 10351 ] Triaged [ 10350 ]
          ingenthr Matt Ingenthron added a comment -

          Sundar: Could you re-assign this to the PM who made this decision? I'd like to discuss it. This is a significant issue for developer experience. It's caused countless support requests and confusion. It affects multi-spindle and SSD deployments as well, so that's a false comparison.

          ingenthr Matt Ingenthron made changes -
          Resolution Won't Fix [ 2 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          chiyoung Chiyoung Seo added a comment -

          There was miscommunication among the teams. We identified the root cause and still need to fix this issue, but not for 3.0, as the fix requires changing the way vbucket_state docs are stored and the way individual vbucket database files are created during bucket instantiation.

          ingenthr Matt Ingenthron added a comment -

          Chiyoung: who is the appropriate owner for this?

          chiyoung Chiyoung Seo added a comment -

          Sundar still owns this ticket. I already communicated with him regarding how we can resolve this issue.

          alkondratenko Aleksey Kondratenko (Inactive) made changes -
          Link This issue is duplicated by MB-12383 [ MB-12383 ]
          wayne Wayne Siu made changes -
          Affects Version/s 3.0 [ 10414 ]
          Affects Version/s 2.5.1 [ 11330 ]
          wayne Wayne Siu made changes -
          Fix Version/s sherlock [ 12148 ]
          Fix Version/s bug-backlog [ 11600 ]
          wayne Wayne Siu added a comment -

          Triaged in Maintenance Meeting on Dec 16, 2014.
          Updating the affected and fixed versions.

          chiyoung Chiyoung Seo added a comment -

          I don't think this can be addressed in a Sherlock release. When any long-standing issue needs to be moved to the next major release, please communicate and discuss it with the corresponding component engineers.

          chiyoung Chiyoung Seo made changes -
          Fix Version/s techdebt-backlog [ 10090 ]
          Fix Version/s sherlock [ 12148 ]
          chiyoung Chiyoung Seo made changes -
          Rank Ranked lower
          ingenthr Matt Ingenthron made changes -
          Link This issue is triggering MB-17884 [ MB-17884 ]

            People

            • Assignee:
              sundar Sundar Sridharan
              Reporter:
              alkondratenko Aleksey Kondratenko (Inactive)
            • Votes:
              0
              Watchers:
              12


                Gerrit Reviews

                There are no open Gerrit changes