  1. Couchbase Server
  2. MB-1857

up to 20 ms response time

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 1.7.0
    • 1.6.0 beta4
    • moxi
    • None
    • Operating System: All
      Platform: All

    Description

      I am running a "real life" test on the standalone machines on a single box, basically trying to mimic the MYB environment.
      Starting at 16K requests/second, the request rate dropped to ~6K ops/sec within a few minutes. Memory usage is very low (47M out of 25G), and everything seems quite stable (see hourly and minute stats below).
      The MYB report below says they see "130% CPU on an 8-core machine, mostly in system time".
      Here are the stats from my memcachetest client:

      [root@Config156VM0 ~]# /opt/memcachetest/bin/memcachetest.old -h 10.2.1.12 -s123456 -i 10000 -t 8 -c5000000
      Get operations:
      #of ops.   min    max     avg      max90th   max95th
      3350005    0 ns   20 ms   941 us   1347 us   1458 us

      Average with 8 threads:
      Avg set: 941 us (1649995) min: 218 us (7) max: 20 ms (2)
      Avg get: 941 us (3350005) min: 204 us (2) max: 20 ms (2)
      Usr: 489.993509
      Sys: 87.231738
      Tot: 652.375604
      Server time:
      Usr: 93.000000
      Sys: 124.000000

      95% of the requests took ~14ms, which matches the performance drop at MYB: a 20ms increase in their application response time.
      Moxi CPU was around 50% and membase around 30% during that load.
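
As a rough cross-check of the figures above, the sketch below estimates the throughput ceiling of a closed-loop client from its thread count and average latency. It assumes each of the 8 memcachetest threads keeps exactly one request in flight at a time, which is an assumption about the client and not something stated in this report. The function name is hypothetical.

```python
def closed_loop_ops_per_sec(threads: int, avg_latency_us: float) -> float:
    """Throughput ceiling if each thread keeps exactly one request in flight."""
    return threads * 1_000_000 / avg_latency_us


# Figures copied from the memcachetest output above.
get_ops, set_ops = 3_350_005, 1_649_995
total_ops = get_ops + set_ops  # should equal the -c5000000 argument

# Assumption: synchronous threads, so throughput ~= threads / average latency.
estimate = closed_loop_ops_per_sec(threads=8, avg_latency_us=941)
print(total_ops, round(estimate))
```

Under that assumption, a 941 us average with 8 threads corresponds to roughly 8.5K ops/sec. This is only a sanity check on how latency and request rate relate, not a diagnosis.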

      I am trying to run a test directly against port 11210 to see what moxi's impact on this is.

      From: Perry Krug perry@northscale.com
      Sent: Monday, August 16, 2010 2:27 PM
      To: Staff - Development
      Subject: MYB Problems

      Team, we went into a production test at MyYearbook this morning and didn't fare quite so well. It wasn't a total failure, but we're no longer in production.

      The existing environment was:
      - ~400 web servers
      - 3 memcached servers, with about 2mb of data doing 54k ops/sec (across all 3)

      They replaced one of the memcached servers with one Membase server (beta 2) and let it run for a bit.

      We got up to 18k ops/sec (on par with the 2 other servers in the group) but they saw about a 25-30% increase in latency for this application's workload. We don't have numbers about our latency specifically, but the overall effect on their site was clear. They also noticed that Moxi took 130% CPU on an 8-core machine, mostly in system time (I'm pretty sure this is related to MB-1714). We don't really have the option to run client-side Moxi yet (400 client machines to set up), but it shouldn't really make a difference until we've got more than one machine in the cluster.

      They are very interested in getting this to work but need some analysis and resolution from us in making it perform better. One thing that they asked about was whether we do or can use Unix domain sockets for the communication between Moxi and its local Membase process to drastically reduce the amount of overhead involved in that communication. Same thing for a client-side Moxi...
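
To make the Unix domain socket question concrete, here is a minimal sketch of two local endpoints talking over an `AF_UNIX` stream socket in place of TCP loopback. It is a generic echo example using Python's standard library, not Moxi or Membase code, and it does not imply that either process supports this transport. The socket path and function names are hypothetical.

```python
import os
import socket
import tempfile
import threading


def _echo_once(server: socket.socket) -> None:
    """Accept one connection and echo back everything received until EOF."""
    conn, _ = server.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
        conn.sendall(b"".join(chunks))


def roundtrip_over_unix_socket(payload: bytes) -> bytes:
    """Send payload to a local echo server over AF_UNIX and return the reply."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "echo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)  # listening before the client connects avoids a race
            worker = threading.Thread(target=_echo_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal end of request
                reply = b""
                while True:
                    data = client.recv(4096)
                    if not data:
                        break
                    reply += data
            worker.join()
    return reply


print(roundtrip_over_unix_socket(b"get some_key\r\n"))
```

The only change from a TCP version is the address family and the address itself (a filesystem path in place of a host and port). Whether this actually reduces the system-time overhead seen here would need to be measured.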

      Let me know what I can go back to them with; they are an anchor account and this could be a great win if we can handle the load.

      The other thing (and less for the engineering team itself) is the desire to have expected benchmarks, i.e., for a given machine configuration, how many operations/sec our software should be able to handle.
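
One way such a benchmark figure could be produced is sketched below: time a fixed number of operations, then report operations/sec alongside average and percentile latencies. The function names are hypothetical, the output keys only mirror the column names in the memcachetest output above, and the nearest-rank percentile method is an assumption, not necessarily what memcachetest does.

```python
import math
import time
from typing import Callable, Dict, List


def nearest_rank(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_us: List[float], elapsed_s: float) -> Dict[str, float]:
    """Reduce per-operation latencies (microseconds) to summary figures."""
    ordered = sorted(latencies_us)
    return {
        "ops_per_sec": len(ordered) / elapsed_s,
        "avg_us": sum(ordered) / len(ordered),
        "max90th_us": nearest_rank(ordered, 90),
        "max95th_us": nearest_rank(ordered, 95),
        "max_us": ordered[-1],
    }


def run_benchmark(op: Callable[[], None], count: int) -> Dict[str, float]:
    """Call op() count times from a single thread and summarize the timings."""
    latencies = []
    start = time.perf_counter()
    for _ in range(count):
        t0 = time.perf_counter()
        op()
        latencies.append((time.perf_counter() - t0) * 1_000_000)
    elapsed = time.perf_counter() - start
    return summarize(latencies, elapsed)
```

In use, `op` would wrap a single get or set against the server under test, and the run would be repeated per machine configuration to build the expected-benchmark table.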

      Perry

          People

            Unassigned
            sharon.barr@northscale.com