  1. Couchbase Server
  2. MB-38140

cbexport export throughput drop in CC build 7.0.0-1165 (or earlier)


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Won't Fix
    • Cheshire-Cat
    • None
    • tools
    • Triaged
    • Unknown

    Description

      Patrick Varley

      Starting with Cheshire-Cat build 7.0.0-1165, cbexport seems to have a drop in export throughput.

      Can someone from the tools team take a look at it?

       


          Activity

            sharath.sulochana Sharath Sulochana (Inactive) created issue -
            owend Daniel Owen made changes -
            Field Original Value New Value
            Assignee Patrick Varley [ pvarley ] James Lee [ james.lee ]
            james.lee James Lee added a comment -

            I've had an initial look at this and have been unable to reproduce the slower throughput locally. In my local testing with 10,000,000 1KB documents, master is faster than mad-hatter. I have a hunch that the dataset we are generating is not valid JSON, so master is spending longer printing log lines informing us that each and every document is invalid (I've taken a dive into the perfrunner code to check, but I'm not 100% sure yet). It would be helpful if we had the backup logs for our successful runs; however, we don't keep them.

            james.lee James Lee made changes -
            Link This issue relates to CBPS-750 [ CBPS-750 ]
            james.lee James Lee added a comment -

            After looking into this further and speaking to Patrick Varley, this drop in throughput makes sense: we now have to parse each document to insert the scope/collection into the document JSON. This is similar to the drop in performance we saw for cbimport when we added the '--include-key' flag. That said, there might be ways to improve the throughput, such as:
            1) Changing the concurrency model to handle the JSON marshaling/un-marshaling in a separate goroutine that doesn't block.
            2) Adopting a "pay for what you use" model, e.g. we avoid the performance drop for users that don't use collections.
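
The two ideas above could be sketched roughly as follows. This is an illustration only, not the cbexport implementation: the `doc` type, the `annotate` and `annotateAll` functions, and the `_scope`/`_collection` field names are all hypothetical. Documents are handed to a small pool of worker goroutines so the JSON work does not block the producer, and a document is only parsed when a scope or collection field was actually requested.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// doc is a hypothetical stand-in for one document read during an export.
type doc struct {
	Scope, Collection string
	Body              []byte
}

// annotate adds the scope/collection names to the document JSON.
// "Pay for what you use": if neither field name is set, the body is
// returned untouched and is never parsed.
func annotate(d doc, scopeField, collectionField string) ([]byte, error) {
	if scopeField == "" && collectionField == "" {
		return d.Body, nil
	}
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(d.Body, &fields); err != nil {
		return nil, err
	}
	if fields == nil {
		return nil, errors.New("document is not a JSON object")
	}
	for name, value := range map[string]string{scopeField: d.Scope, collectionField: d.Collection} {
		if name == "" {
			continue
		}
		encoded, err := json.Marshal(value)
		if err != nil {
			return nil, err
		}
		fields[name] = encoded
	}
	return json.Marshal(fields)
}

// annotateAll runs annotate on a fixed number of worker goroutines so the
// goroutine feeding `in` is not blocked by JSON handling. Output order is
// not preserved.
func annotateAll(in <-chan doc, workers int, scopeField, collectionField string) <-chan []byte {
	out := make(chan []byte)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for d := range in {
				b, err := annotate(d, scopeField, collectionField)
				if err != nil {
					continue // a real tool would log or count the failure
				}
				out <- b
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	in := make(chan doc)
	go func() {
		defer close(in)
		for i := 0; i < 3; i++ {
			in <- doc{Scope: "s1", Collection: "c1", Body: []byte(fmt.Sprintf(`{"id":%d}`, i))}
		}
	}()
	for b := range annotateAll(in, 2, "_scope", "_collection") {
		fmt.Println(string(b))
	}
}
```

Whether this helps in practice would depend on where the export actually spends its time, so it would need to be measured against the perfrunner workload before drawing any conclusions.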

            james.lee James Lee added a comment -

            Marking as 'Not a Bug' and closing. The drop in performance makes sense: the collection-aware cbexport is slightly more expensive because it needs to encode scope/collection names into the exported JSON, meaning we handle the JSON more than we did previously.

            james.lee James Lee made changes -
            Resolution Not a Bug [ 10200 ]
            Status Open [ 1 ] Resolved [ 5 ]
            james.lee James Lee made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            pvarley Patrick Varley made changes -
            Assignee James Lee [ james.lee ] Patrick Varley [ pvarley ]
            Resolution Not a Bug [ 10200 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            pvarley Patrick Varley made changes -
            Fix Version/s Cheshire-Cat [ 15915 ]
            pvarley Patrick Varley made changes -
            Triage Untriaged [ 10351 ] Triaged [ 10350 ]
            Resolution Won't Fix [ 2 ]
            Status Reopened [ 4 ] Closed [ 6 ]
            owend Daniel Owen made changes -
            Link This issue relates to MB-41899 [ MB-41899 ]

            People

              pvarley Patrick Varley
              sharath.sulochana Sharath Sulochana (Inactive)