Details
-
Bug
-
Resolution: Fixed
-
Major
-
3.0
-
Security Level: Public
-
None
Description
Attaching the email from Mike.
I have been running the unit tests on the master branch lately and they are becoming really slow. I timed one of the runs earlier today and found that running all of the engine_tests takes 10 minutes, which in my opinion begins to make running these tests a burden. As a result, I began investigating why things are so slow, and I found one major issue that points to a larger problem.
When I started my investigation, I quickly found that it takes 1.5 seconds to start ep-engine. Since we have 240 tests in the master branch, this means that 6 of the 10 minutes it takes to run the tests are consumed by engine startup. Upon further investigation, the main issue was in the warmup task, where I found that on my machine it takes 300ms to run the getItemEstimate() function in couchkvstore. This isn't so bad on its own, but the problem is that we call it num_shard (4 by default) times, and we do this serially, which means 1.2 seconds are spent calling this function. On top of this, getItemEstimate() actually iterates over all of the vbuckets, so we really only need to call it once, and by calling it just once we cut about 4 minutes from the engine_test time (0.9 seconds saved per startup across 240 tests is roughly 3.6 minutes).
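The arithmetic in the paragraph above can be checked with a small sketch. The figures are the ones quoted in the email; the `StartupCost` struct and its member names are hypothetical and exist only for this illustration, not in ep-engine.

```cpp
#include <cassert>

// Hypothetical helper holding the figures quoted in the email.
// Nothing here is part of ep-engine; it only reproduces the arithmetic.
struct StartupCost {
    int numTests = 240;    // tests in the master branch
    int startupMs = 1500;  // time to start ep-engine once
    int estimateMs = 300;  // one getItemEstimate() call on Mike's machine
    int numShards = 4;     // default num_shard

    // Total time the suite spends in engine startup, in milliseconds.
    long totalStartupMs() const {
        return static_cast<long>(numTests) * startupMs;
    }

    // Time one startup spends in the serial getItemEstimate() calls.
    int estimateMsPerStartup() const {
        return estimateMs * numShards;
    }

    // Time saved across the suite if the function is called once
    // per startup instead of once per shard.
    long savedMs() const {
        assert(numShards >= 1);
        return static_cast<long>(numTests) * estimateMs * (numShards - 1);
    }
};
```

With the default values, `totalStartupMs()` gives 360000 ms (6 minutes), `estimateMsPerStartup()` gives 1200 ms, and `savedMs()` gives 216000 ms, which is the 3.6 minutes the email rounds to 4.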
The bigger issue here isn't that we call this function num_shard times, but that couchKVStore has no idea which vbuckets it operates on (due to multiple shards), and as a result some of its functions will inspect or fetch data from vbuckets that they should not be operating on. In the above example we got the item estimate for all 1024 vbuckets 4 times, when really each shard should have checked only its own 256 vbuckets. An even better improvement would be, since most test cases use only 1 vbucket, to get the item estimate for just that one vbucket.
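One way to picture the fix is to give each shard an explicit list of the vbuckets it owns and to estimate items only over that list. The sketch below is a minimal illustration under stated assumptions: the modulo assignment of vbuckets to shards, the function names `vbucketsForShard` and `itemEstimateForShard`, and the `countItems` callback are all hypothetical and are not taken from the ep-engine source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Assumption: vbucket `vb` belongs to shard `vb % numShards`.
// Returns the vbucket ids owned by one shard.
std::vector<uint16_t> vbucketsForShard(uint16_t shardId,
                                       uint16_t numShards,
                                       uint16_t numVBuckets) {
    assert(numShards > 0 && shardId < numShards);
    std::vector<uint16_t> owned;
    // Use a wider loop counter so vb += numShards cannot wrap around.
    for (uint32_t vb = shardId; vb < numVBuckets; vb += numShards) {
        owned.push_back(static_cast<uint16_t>(vb));
    }
    return owned;
}

// Sums the item counts of only the vbuckets the shard owns.
// `countItems` is a hypothetical stand-in for the per-vbucket lookup.
size_t itemEstimateForShard(
        uint16_t shardId,
        uint16_t numShards,
        uint16_t numVBuckets,
        const std::function<size_t(uint16_t)>& countItems) {
    size_t total = 0;
    for (uint16_t vb : vbucketsForShard(shardId, numShards, numVBuckets)) {
        total += countItems(vb);
    }
    return total;
}
```

With 1024 vbuckets and 4 shards, each shard owns 256 vbuckets, so running the estimate on all four shards performs 1024 per-vbucket lookups in total rather than 4096.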