  1. Couchbase Server
  2. MB-32061

Increased memory fragmentation and overhead


Details

    • Untriaged
    • Centos 64-bit
    • Yes

    Description

      An spdlog migration mistake caused us to allocate a buffer far bigger than we need (2GB).

      Original Description
      ====================================================================

      We have a few memory-oriented tests such as memory fragmentation, max memcached RSS, and memory overhead. Most of them show degradation in Mad-Hatter: http://showfast.sc.couchbase.com/#/timeline/Linux/kv/memory/all

      I haven't yet been able to identify the root cause or the offending code change for this issue.

      The issue was introduced somewhere between 6.5.0-1398 (good build) and 6.5.0-1405. We can't pinpoint the exact build as we didn't have any centos enterprise builds in between. And, for some reason, this issue reproduces neither in CE nor on Windows.

       

      Logs from "memory fragmentation" test on 1405.  

      https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-ares-8930/172.23.133.11.zip

       

      Also comparison between 1398 and 1405:

      http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=ares_650-1405_load_and_append_a3f4&label=6.5.0-1405&snapshot=ares_650-1398_load_and_append_b045&label=6.5.0-1398


      Attachments

        Issue Links


          Activity

            drigby Dave Rigby added a comment -

            Thanks Alex.

            Changelog between 1398 & 1405 shows a number of changes; including a bunch of forward merges from alice: http://172.23.123.43:8000/getchangelog?product=couchbase-server&fromb=6.5.0-1398&tob=6.5.0-1405

            CHANGELOG for kv_engine
             
             * Commit: 81226d895ba8b167e080bbe5648a852c20b0fd98 in build: 6.5.0-1405
               Merge MB-31481 from branch 'alice'
               
               See http://review.couchbase.org/100241 for commit details
               Merge also updates test to work in master branch
               
               Change-Id: I7101e376b4609cb5e4f7e176760e8ccdb92515f3
               
             * Commit: 66985a89b3f93c0dc981f5f43fe35055f99f84d0 in build: 6.5.0-1405
               MB-31481: Don't skip STREAM_END msg after cursor drop during backfill
               
               For DCP producers needing backfill, there's an edge-case in handling
               cursor dropping:
               
               - When a backfill is re-scheduled due to a slow stream trigger (cursor
                 dropping), if the second backfill isn't necessary (as the required
                 items are still in the checkpoint manager) then the second backfill
                 is skipped and the stream transitions to in-memory.
               - If the in-memory stream then doesn't need to fetch any more data (as
                 no more mutations have occurred), then the stream advances directly
                 to endStream - see ActiveStream::transitionState():
               
                     case StreamState::InMemory:
                         // Check if the producer has sent up till the last requested
                         // sequence number already, if not - move checkpoint items into
                         // the ready queue.
                         if (lastSentSeqno.load() >= end_seqno_) {
                             // Stream transitioning to DEAD state
                             endStream(END_STREAM_OK);         /// <--- HERE
                             notifyStreamReady();
                         }
               
               However, there is a bug where this END_STREAM message is never
               transmitted to the DCP client, resulting in the stream hanging.
               
               The problem is that the stream-end message gets stuck in the readyQ,
               as the stream is never notified as ready to send it. This is due to
               the expectation of Stream::next() dealing with queued items being
               broken - if next() returns null then the itemsReady flag is cleared,
               as it assumes a null return means no more items are available.
               However, in this situation there /are/ items ready, but not until
               transitionState() is called.
               
               Following from this, the stream is never notified so doesn't
               send a closure message, which means that DCP consumers can hang
               waiting for this closure message.
               
               This patch triggers nextQueuedItem inside the pendingBackfill case to
               ensure that the end stream message is notified and executed.  The
               patch also includes a test to follow this problematic sequence.
               
               Change-Id: I28885cca1e853b72cc886f756000190b903d1d53
               Reviewed-on: http://review.couchbase.org/100241
               Tested-by: Build Bot <build@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Well-Formed: Build Bot <build@couchbase.com>
               
             * Commit: 5a6eed4e0ea7153b4225bddef893d719d13c87b8 in build: 6.5.0-1401
               MB-30041: Migrate audit and event descriptor to nlohmann::json
               
               Migrate the auditd eventdescriptor file to nlohmann::json
               and make the necessary changes to audit parsing to do so.
               
               Change-Id: I5847553adfd24abddbe65d0562fb2c2127bd059a
               Reviewed-on: http://review.couchbase.org/99887
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Build Bot <build@couchbase.com>
               
             * Commit: 851d6b182d9feaffc5fc2c5082b694faa867637e in build: 6.5.0-1400
               unmerged-commits.py: Add alice branch
               
               Change-Id: Ie5d1c953886a63d728b0db1c9b249c2339e6c7b1
               Reviewed-on: http://review.couchbase.org/100350
               Tested-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Build Bot <build@couchbase.com>
               Reviewed-by: Daniel Owen <owend@couchbase.com>
               
             * Commit: 2657a2bc8e515c58e09a11ca149eadb9c3dd745a in build: 6.5.0-1400
               Merge remote-tracking branch 'couchbase/alice'
               
               * couchbase/alice:
                 MB-31495: Fix bug in getRandomKey
                 MB-31175: Ephemeral HTTombstonePurger underflows age check
                 MB-31327: Ephemeral backfill does not respect purge seqno.
                 MB-31141: Don't reject snappy|raw DCP deletes
                 MB-31141: Account for nmeta in deleteWithMeta
               
               Change-Id: I15638b53f28908581011a852798494c4e76c54d2
               
             * Commit: 0238430fdd69a2e78a064ea7a683d3981468968f in build: 6.5.0-1399
               MB-30798: Update spdlog to version 1.1
               
               As we need to make changes to spdlog we should first update
               our version of it to 1.1. Make the necessary changes to
               our code to use the new spdlog api.
               
               Use the new buffer/size style string returned by fmtlib
               where we perform custom formatting.
               
               Change-Id: I616628d50008758e21a54003cf676c1eb3109cc9
               Reviewed-on: http://review.couchbase.org/100224
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Build Bot <build@couchbase.com>
               
             * Commit: 9444e64e9ada06b9fd6c1aac365a49d084e64465 in build: 6.5.0-1400
               MB-31495: Fix bug in getRandomKey
               
               This patch fixes a bug where getRandomKey would loop indefinitely if
               start == 0.
               
               The bug is that we check for curr == start before checking if
               curr == max.  See the original code below.
               
               start = 0;
               curr = start;
               VBucketPtr vb = getVBucket(Vbid(curr++));
               while (!vb || vb->getState() != vbucket_state_active) {
                   if (curr == start) {
                       return GetValue(NULL, ENGINE_KEY_ENOENT);
                   }
                   if (curr == max) {
                       curr = 0;
                   }
               
                   vb = getVBucket(Vbid(curr++));
               }
               
               Assume the bucket is empty. The first time we enter the while loop
               curr == 1.  curr != start and curr != max and so we will call
               getVBucket and increment curr to 2.  We repeat until curr == 1024.
               On going round the while loop again curr != start, but curr == max
               and so set curr = 0.
               
               We then call getVBucket and increment curr to 1.  On going round the
               while loop again curr != start (as it is 1), and hence we loop
               indefinitely.
               
               Change-Id: I0de040627c481fb53e0faeddd34f44f31055c241
               Reviewed-on: http://review.couchbase.org/100280
               Well-Formed: Build Bot <build@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
               Tested-by: Build Bot <build@couchbase.com>
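               The corrected scan order (check curr == max and wrap *before*
               checking curr == start) can be sketched as below. This is a
               simplified stand-in using a plain vector of "active" flags with a
               hypothetical function name, not the actual kv_engine fix:
               
               ```cpp
               #include <cstddef>
               #include <optional>
               #include <vector>
               
               // Scan all vbuckets starting from index 0, wrapping at max, and
               // return the first active one; returns nullopt after one full
               // pass with no active vbucket. (Hypothetical simplified sketch.)
               std::optional<std::size_t> findActiveVBucket(
                       const std::vector<bool>& active) {
                   if (active.empty()) {
                       return std::nullopt;
                   }
                   const std::size_t max = active.size();
                   const std::size_t start = 0;
                   std::size_t curr = start;
                   // Probe the first vbucket before entering the loop,
                   // mirroring the original structure.
                   bool found = active[curr++];
                   while (!found) {
                       if (curr == max) {
                           curr = 0; // wrap *before* comparing against start
                       }
                       if (curr == start) {
                           return std::nullopt; // full pass done; none active
                       }
                       found = active[curr++];
                   }
                   return curr - 1;
               }
               ```
               
               With the checks in this order, a scan starting at 0 wraps to 0 and
               immediately hits the curr == start termination check, so the loop
               cannot spin indefinitely on an empty bucket.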
               
             * Commit: 69082c1564d91be98467d63f57ec4045b90b1bd8 in build: 6.5.0-1400
               MB-31175: Ephemeral HTTombstonePurger underflows age check
               
               There exists a bug in the HTTombstonePurger such that we can cause
               an integer underflow when determining if a deleted item should be
               marked stale (for removal). This occurs because the task records and
               uses a start time when it is created. When the task runs slowly due
               to heavy write locking on the seqList an item can be deleted seemingly
               in the future to the HTTombstonePurger. This causes an integer
               underflow when we perform the "now - osv->getDeletedTime() >= purgeAge"
               check and the subsequent removal of the tombstone before the purgeAge.
               
               Fix this bug by checking that now >= the deleted time of an osv.
               
               Change-Id: Ibffdb88820d5335dfb7b75c2cc250fc766847cf7
               Reviewed-on: http://review.couchbase.org/99744
               Well-Formed: Build Bot <build@couchbase.com>
               Reviewed-by: David Haikney <david.haikney@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Dave Rigby <daver@couchbase.com>
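               The underflow described above is easy to demonstrate in isolation.
               The following is a minimal sketch with hypothetical function
               names and simplified types, not the real purger code:
               
               ```cpp
               #include <cstdint>
               
               // Buggy check: if the purger's snapshotted "now" is older than an
               // item's deletion time, the unsigned subtraction wraps to a huge
               // value and the age check wrongly passes.
               bool buggyIsPurgeable(uint32_t now,
                                     uint32_t deletedTime,
                                     uint32_t purgeAge) {
                   return now - deletedTime >= purgeAge; // underflows if now < deletedTime
               }
               
               // Fix as described in the commit: only compare ages when now is
               // not earlier than the deletion time.
               bool fixedIsPurgeable(uint32_t now,
                                     uint32_t deletedTime,
                                     uint32_t purgeAge) {
                   return now >= deletedTime && now - deletedTime >= purgeAge;
               }
               ```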
               
             * Commit: 90eadc1556511a7f35118d45890070bd53df67b1 in build: 6.5.0-1400
               MB-31327: Ephemeral backfill does not respect purge seqno.
               
               There exists a bug in Ephemeral backfill_memory such that we can
               start a backfill for a startSeqno > purgeSeqno of a given vBucket.
               This leads to data inconsistencies as we may not replicate a delete.
               
               Fix the bug by checking that the purgeSeqno < startSeqno of a
               backfill when startSeqno != 1.
               
               Only fix for buffered memory backfill as the non-buffered backfill
               is not used and will be deleted in master.
               
               Change-Id: Ia91d6d1e82fcfb7ea2d6463c76a531ea0d8aa182
               Reviewed-on: http://review.couchbase.org/99713
               Well-Formed: Build Bot <build@couchbase.com>
               Reviewed-by: David Haikney <david.haikney@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Build Bot <build@couchbase.com>
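               The guard described in this commit (a non-initial backfill must
               start above the purge seqno) can be sketched as follows; the
               function name and signature are hypothetical:
               
               ```cpp
               #include <cstdint>
               
               // Returns true when a memory backfill's start point is valid
               // per the fix described above: a backfill from seqno 1 resends
               // everything, otherwise the purge seqno must be below the start
               // seqno or purged deletes could be silently missed.
               bool isValidBackfillStart(uint64_t startSeqno, uint64_t purgeSeqno) {
                   if (startSeqno == 1) {
                       return true;
                   }
                   return purgeSeqno < startSeqno;
               }
               ```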
               
             * Commit: 0540f14ebf4cb7a0f1de71c717b4fb0141547b02 in build: 6.5.0-1400
               MB-31141: Merge couchbase/vulcan to couchbase/alice
               
               * couchbase/vulcan:
                 MB-31141: Don't reject snappy|raw DCP deletes
                 MB-31141: Account for nmeta in deleteWithMeta
               
               Change-Id: Iaaa6e99ff0c868f8fe8874abd7673022bb9c4e87
               
             * Commit: 84af6315742fadb27d0d76473ff1391104222073 in build: 6.5.0-1400
               MB-31141: Don't reject snappy|raw DCP deletes
               
               A related bug means it is possible for 5.x to create
               deletes with a non-zero raw value. When 5.5.x backfills such
               an item for transmission to another 5.5.x node (and snappy
               is enabled), the delete gets sent with a snappy datatype
               and rejected by the target node, which triggers a rebalance
               failure.
               
               Change-Id: Ib91453f96882ef4e01ee0e2a3e5e51ed0786a325
               Reviewed-on: http://review.couchbase.org/99414
               Well-Formed: Build Bot <build@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
               Tested-by: Dave Rigby <daver@couchbase.com>
               
             * Commit: bb60376f9ad3adba5741c6fcffcd87a551c2c61a in build: 6.5.0-1400
               MB-31141: Account for nmeta in deleteWithMeta
               
               When calculating the item size in deleteWithMeta we must
               ensure nmeta is removed from the calculated size.
               
               Tests added for both set and del withMeta
               
               Change-Id: Iab31591c521e9556d436905be0177da773afc058
               Reviewed-on: http://review.couchbase.org/99152
               Tested-by: Build Bot <build@couchbase.com>
               Well-Formed: Build Bot <build@couchbase.com>
               Reviewed-by: Dave Rigby <daver@couchbase.com>
            

            drigby Dave Rigby added a comment -

            Simplified view of the set of changes between 1398 & 1405:

            ‡ git hist c9048515837b867dbc5e1fae5e06dc68ec1eebb1..81226d895ba8b167e080bbe5648a852c20b0fd98
            *   81226d895 2018-10-08 | Merge MB-31481 from branch 'alice' (HEAD) [Christopher Farman] 
            |\  
            | * 66985a89b 2018-10-08 | MB-31481: Don't skip STREAM_END msg after cursor drop during backfill [Christopher Farman] 
            * | 5a6eed4e0 2018-10-08 | MB-30041: Migrate audit and event descriptor to nlohmann::json [Ben Huddleston] 
            * | 851d6b182 2018-10-08 | unmerged-commits.py: Add alice branch [Dave Rigby] 
            * |   2657a2bc8 2018-10-08 | Merge remote-tracking branch 'couchbase/alice' [Dave Rigby] 
            |\ \  
            | |/  
            | * 9444e64e9 2018-10-04 | MB-31495: Fix bug in getRandomKey [Daniel Owen] 
            | * 69082c156 2018-09-19 | MB-31175: Ephemeral HTTombstonePurger underflows age check [Ben Huddleston] 
            | * 90eadc155 2018-09-19 | MB-31327: Ephemeral backfill does not respect purge seqno. [Ben Huddleston] 
            | * 0540f14eb 2018-09-11 | MB-31141: Merge couchbase/vulcan to couchbase/alice [Jim Walker] 
            | * 84af63157 2018-09-11 | MB-31141: Don't reject snappy|raw DCP deletes (tag: v5.5.2) [Jim Walker] 
            | * bb60376f9 2018-09-10 | MB-31141: Account for nmeta in deleteWithMeta [Jim Walker] 
            * 0238430fd 2018-10-08 | MB-30798: Update spdlog to version 1.1 [Ben Huddleston] 
            

            drigby Dave Rigby added a comment - - edited

            The diffstat between the good & bad SHAs, excluding tests, is:

            ‡ git diff --patience --stat c9048515837b867dbc5e1fae5e06dc68ec1eebb1..81226d895ba8b167e080bbe5648a852c20b0fd98 -- . ':!*/tests/*' 
             auditd/src/audit.cc                          | 124 ++++++++++++---------------
             auditd/src/audit.h                           |   6 +-
             auditd/src/auditfile.cc                      |   2 +-
             auditd/src/eventdescriptor.cc                |  97 ++++++++-------------
             auditd/src/eventdescriptor.h                 |  20 +----
             daemon/mcbp_validators.cc                    |   6 +-
             engines/ep/scripts/unmerged-commits.py       |   2 +
             engines/ep/src/bucket_logger.cc              |   4 +-
             engines/ep/src/bucket_logger.h               |   4 +-
             engines/ep/src/dcp/active_stream.cc          |   5 ++
             engines/ep/src/dcp/backfill_memory.cc        |  15 ++++
             engines/ep/src/dcp/consumer.cc               |  14 ++-
             engines/ep/src/ep_engine.cc                  |   9 +-
             engines/ep/src/ephemeral_tombstone_purger.cc |   8 +-
             engines/ep/src/kv_bucket.cc                  |  10 +--
             engines/ep/src/kv_bucket.h                   |  11 +++
             engines/ep/src/stored-value.cc               |   1 +
             logger/custom_rotating_file_sink.cc          |  52 +++++++----
             logger/custom_rotating_file_sink.h           |  15 ++--
             logger/spdlogger.cc                          |  63 ++++++++------
             tests/mcbp/mcbp_test.cc                      |   6 +-
             21 files changed, 255 insertions(+), 219 deletions(-)
            

            To print full diff:

            git diff --patience c9048515837b867dbc5e1fae5e06dc68ec1eebb1..81226d895ba8b167e080bbe5648a852c20b0fd98 -- . ':!*/tests/*'
            

            drigby Dave Rigby added a comment -

            The issue was introduced somewhere between 6.5.0-1398 (good build) and 6.5.0-1405. We can't pinpoint the exact build as we didn't have any centos enterprise builds in between. And, for some reason, this issue reproduces neither in CE nor on Windows.

            If it's not showing up in CE, then that would imply something EE-related. The only EE component changed above is the audit daemon.

            drigby Dave Rigby added a comment -

            Hi Ben, could you take a look at this please? As per the description this degradation is only seen with EE builds; and given the files which have changed, that implies an issue with the auditd changes you made.

            If you can't spot anything from code inspection, I would suggest creating a couple of toy builds with intermediate kv_engine SHAs to binary-chop to where it first occurred.


            ben.huddleston Ben Huddleston added a comment -

            Managed to build a few different versions and test to work out where exactly this fragmentation is coming from. Currently I believe that we SHOULD be seeing this in CE, as the issue was introduced BEFORE the audit json parsing migration.

            All numbers reported for 1 node, 1 bucket.

            Build number   memcached system reported memory (MB)   Notes
            1398           ~240                                    last known good build
            1399           ~560                                    upgrade of spdlog to version 1.1
            1400           ~575.7                                  various bug fixes
            1401           ~572                                    migration of audit json parsing

            This appears to be related to our upgrade of spdlog to version 1.1. Build diff from 1398 to 1399:
            http://172.23.123.43:8000/getchangelog?product=couchbase-server&fromb=6.5.0-1398&tob=6.5.0-1399

            First worry was that spdlog was doing something absolutely ludicrous with memory so I created some more buckets (we create another spdlog instance per bucket).

            With 5 buckets.

            Build number memcached system reported memory (MB) Difference to 1 bucket(MB)
            1398 ~275 35
            • last known good build
            1399 ~593 33
            • upgrade of spdlog to version 1.1

            Will investigate further.

            ben.huddleston Ben Huddleston added a comment - Managed to build a few different versions and test to work out where exactly this fragmentation is coming from. Currently I believe that we SHOULD be seeing this in CE as this was introduced BEFORE the audit json parsing migration. All numbers reported for 1 node, 1 bucket. Build number memcached system reported memory (MB) 1398 ~240 last known good build 1399 ~560 upgrade of spdlog to version 1.1 1400 ~575.7 various bug fixes 1401 ~572 migration of audit json parsing This appears to be related to our upgrade of spdlog to version 1.1. Build diff from 1398 to 1399: http://172.23.123.43:8000/getchangelog?product=couchbase-server&fromb=6.5.0-1398&tob=6.5.0-1399 First worry was that spdlog was doing something absolutely ludicrous with memory so I created some more buckets (we create another spdlog instance per bucket). With 5 buckets. Build number memcached system reported memory (MB) Difference to 1 bucket(MB) 1398 ~275 35 last known good build 1399 ~593 33 upgrade of spdlog to version 1.1 Will investigate further.
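            A quick sanity check on the multi-bucket numbers above (hypothetical arithmetic only, using the values from the tables): the per-bucket growth is nearly identical on the good and bad builds, which is what rules out a per-instance leak.

            ```cpp
            #include <cstdio>

            int main() {
                // Memory growth going from 1 bucket to 5 buckets (MB, from the tables above)
                double delta1398 = 275.0 - 240.0; // last known good build
                double delta1399 = 593.0 - 560.0; // spdlog 1.1 build
                // Similar deltas on both builds => the ~320MB jump is a one-off
                // allocation, not something multiplied per bucket/logger instance.
                printf("delta 1398: %.0f MB\n", delta1398);
                printf("delta 1399: %.0f MB\n", delta1399);
                return 0;
            }
            ```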
            ben.huddleston Ben Huddleston added a comment - - edited

            The increased memory fragmentation is due to the migration from spdlog version 0.14.0 to version 1.1.0.

            Between the two versions, asynchronous logging was refactored to use a shared thread pool and a blocking queue created for a specific number of items (defaulting to 8192). Previously, the buffer used for asynchronous logging was specified by size in bytes (at the time, 2MB was deemed appropriate). Not realising that this change had been made, when I migrated spdlog from 0.14.0 to 1.1.0 I left the buffer size parameter we supplied as 2048 * 1024. This led to a queue of over 2 million items rather than a 2MB buffer. The item type is defined in thread_pool.h as async_msg.

            After making some tweaks to thread_pool.h (commenting out function calls we don't need, fixing some using declarations), I used the following command to dump the layout and size of async_msg:

            clang++ -Xclang -fdump-record-layouts -I ../../ -std=c++11 thread_pool.h
            

            *** Dumping AST Record Layout
                     0 | struct spdlog::details::async_msg
                     0 |   enum spdlog::details::async_msg_type msg_type
                     4 |   level::level_enum level
                     8 |   class std::__1::chrono::time_point<class std::__1::chrono::system_clock, class std::__1::chrono::duration<long long, class std::__1::ratio<1, 1000000> > > time
                     8 |     class std::__1::chrono::duration<long long, class std::__1::ratio<1, 1000000> > __d_
                     8 |       std::__1::chrono::duration<long long, class std::__1::ratio<1, 1000000> >::rep __rep_
                    16 |   size_t thread_id
                    24 |   class fmt::v5::basic_memory_buffer<char, 176, class std::__1::allocator<char> > raw
                    24 |     class std::__1::allocator<char> (base) (empty)
                    24 |     class fmt::v5::internal::basic_buffer<char> (primary base)
                    24 |       (basic_buffer vtable pointer)
                    32 |       char * ptr_
                    40 |       std::size_t size_
                    48 |       std::size_t capacity_
                    56 |     char [176] store_
                   232 |   size_t msg_id
                   240 |   class std::__1::shared_ptr<class spdlog::async_logger> worker_ptr
                   240 |     std::__1::shared_ptr<class spdlog::async_logger>::element_type * __ptr_
                   248 |     class std::__1::__shared_weak_count * __cntrl_
                       | [sizeof=256, dsize=256, align=8,
                       |  nvsize=256, nvalign=8]
            

            So each item is 256 bytes, which gives us a buffer 256 times larger than the one we had before.

            It should be noted that in version 1.1.0 we require an extra thread (the one doing the logging in the background), and the test also takes roughly twice as long (~1 minute vs ~30s). That still gives us a throughput of 166,667 items/s with version 1.1.0, so the speed decrease isn't a worry.

            If we drop the buffer size down to the default of 8192 items (2MB) then memory usage drops to about 3.5MB. Will raise a change to do so. We should also modify the unit tests to use default parameters where possible.
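            The arithmetic behind the fix can be checked with a short standalone program (a sketch using the item size quoted above and assuming the default queue length of 8192 entries; it does not depend on spdlog itself):

            ```cpp
            #include <cstdio>
            #include <cstddef>

            int main() {
                // sizeof(spdlog::details::async_msg) as reported by the
                // record-layout dump above
                const size_t itemSize = 256;

                // Value mistakenly passed as an item COUNT during the migration
                // (it was intended to be a 2MB byte size)
                const size_t buggyItems = 2048 * 1024;
                // spdlog's default queue length, which the fix reverts to
                const size_t fixedItems = 8192;

                printf("buggy queue: %zu MB\n", buggyItems * itemSize / (1024 * 1024));
                printf("fixed queue: %zu KB\n", fixedItems * itemSize / 1024);
                return 0;
            }
            ```

            That is roughly half a gigabyte of queue per async logger versus the intended 2MB, which lines up with the ~320MB jump in memcached RSS observed between builds 1398 and 1399.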


            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.5.0-1691 contains kv_engine commit 54d4e73 with commit message:
            MB-32061: Lower spdlog buffer size

            ben.huddleston Ben Huddleston added a comment -

            Hi Alex, my change to lower the spdlog buffer size is present in the latest Mad-Hatter build. Could you please rerun the test?

            I'd expect to see a lower memcached RSS.
            oleksandr.gyryk Alex Gyryk (Inactive) added a comment - - edited

            Tests are still running, but I can see that memory is looking good now.

            I'll leave the ticket open until all tests are done.


            oleksandr.gyryk Alex Gyryk (Inactive) added a comment -

            I'm leaving this ticket open and assigning it back to Ben Huddleston, as it looks like this change causes about a 3-4% regression in KV throughput.
            See https://issues.couchbase.com/browse/MB-32388
            and https://issues.couchbase.com/browse/MB-32389
            drigby Dave Rigby added a comment -

            Resolving this one as the memory issue has been addressed, and performance issues are being tracked via MB-32388 & MB-32389.

            wayne Wayne Siu added a comment -

            The issue has been addressed (see attached).


            People

              wayne Wayne Siu
              oleksandr.gyryk Alex Gyryk (Inactive)
              Votes: 0
              Watchers: 5