  1. Couchbase Server
  2. MB-51409

EMPFAIL on 20MB x 30 upserts

Details

    • Untriaged
    • 1
    • Unknown
    • KV March-22

    Description

      This is a recent regression: upserting docs through the REST API starts failing after some number of docs have been upserted. In this example we upsert 30 docs and see the failure after a dozen or so successful upserts. When we move on to the next doc it works, but then we start seeing the same failure on all keys after that:

      n_0:

      172.18.0.3 - couchbase [10/Mar/2022:10:02:44 -0800] "POST /pools/default/buckets/testBucket/docs/key-0 HTTP/1.1" 200 2 - "Apache-HttpClient/4.5.13 (Java/11)" 4283
      ...
      172.18.0.3 - couchbase [10/Mar/2022:10:03:22 -0800] "POST /pools/default/buckets/testBucket/docs/key-12 HTTP/1.1" 200 2 - "Apache-HttpClient/4.5.13 (Java/11)" 2724
      172.18.0.3 - - [10/Mar/2022:10:03:25 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 1865
      172.18.0.3 - - [10/Mar/2022:10:03:30 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2718
      172.18.0.3 - - [10/Mar/2022:10:03:34 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2330
      172.18.0.3 - - [10/Mar/2022:10:03:39 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2501
      172.18.0.3 - - [10/Mar/2022:10:03:44 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2762
      172.18.0.3 - - [10/Mar/2022:10:03:48 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2478
      172.18.0.3 - - [10/Mar/2022:10:03:53 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2602
      172.18.0.3 - - [10/Mar/2022:10:03:58 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2767
      172.18.0.3 - - [10/Mar/2022:10:04:03 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2614
      172.18.0.3 - - [10/Mar/2022:10:04:08 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2403
      172.18.0.3 - couchbase [10/Mar/2022:10:04:10 -0800] "POST /pools/default/buckets/testBucket/docs/key-14 HTTP/1.1" 200 2 - "Apache-HttpClient/4.5.13 (Java/11)" 2315
      172.18.0.3 - - [10/Mar/2022:10:04:13 -0800] "POST /pools/default/buckets/testBucket/docs/key-15 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2373
      ...
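
To make the pattern in the access log easier to see, the lines can be tallied per key. The following is a minimal sketch, not part of the original report: the regular expression and the `tally_by_key` helper are assumptions for illustration, and the sample lines are copied from the excerpt above.

```python
import re

# Captures the document key and the HTTP status from one access-log line.
LINE_RE = re.compile(
    r'"POST \S*/docs/(?P<key>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def tally_by_key(lines):
    """Return {key: [status, ...]} in first-seen order."""
    result = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip "..." and anything that is not a POST to /docs/
        statuses = result.setdefault(match.group("key"), [])
        statuses.append(int(match.group("status")))
    return result


# A few lines taken verbatim from the n_0 excerpt above.
sample = [
    '172.18.0.3 - couchbase [10/Mar/2022:10:03:22 -0800] "POST /pools/default/buckets/testBucket/docs/key-12 HTTP/1.1" 200 2 - "Apache-HttpClient/4.5.13 (Java/11)" 2724',
    '172.18.0.3 - - [10/Mar/2022:10:03:25 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 1865',
    '172.18.0.3 - - [10/Mar/2022:10:03:30 -0800] "POST /pools/default/buckets/testBucket/docs/key-13 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2718',
    '...',
    '172.18.0.3 - couchbase [10/Mar/2022:10:04:10 -0800] "POST /pools/default/buckets/testBucket/docs/key-14 HTTP/1.1" 200 2 - "Apache-HttpClient/4.5.13 (Java/11)" 2315',
    '172.18.0.3 - - [10/Mar/2022:10:04:13 -0800] "POST /pools/default/buckets/testBucket/docs/key-15 HTTP/1.1" 500 44 - "Apache-HttpClient/4.5.13 (Java/11)" 2373',
]

for key, statuses in tally_by_key(sample).items():
    print(key, statuses)
```

Run against the full log, a tally like this would show how many retries each key saw before the client moved on.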
      

      [ns_server:error,2022-03-10T10:22:47.348-08:00,n_0@172.18.0.3:<0.24422.2>:menelaus_util:reply_server_error_before_close:210]Server error during processing: ["web request failed",
                                       {path,
                                        "/pools/default/buckets/testBucket/docs/key-21"},
                                       {method,'POST'},
                                       {type,error},
                                       {what,
                                        {case_clause,
                                         {badrpc,
                                          {'EXIT',
                                           {function_clause,
                                            [{capi_crud,handle_mutation_rv,
                                              [{mc_header,1,134,0,0,0,0,0,undefined},
                                               {mc_entry,undefined,undefined,0,0,0,
                                                undefined,0}],
                                              [{file,"src/capi_crud.erl"},
                                               {line,28}]},
                                             {capi_crud,set,6,[]}]}}}}},
                                       {trace,
                                        [{menelaus_web_crud,handle_post,4,
                                          [{file,"src/menelaus_web_crud.erl"},
                                           {line,334}]},
                                         {request_tracker,request,2,
                                          [{file,"src/request_tracker.erl"},
                                           {line,40}]},
                                         {menelaus_util,handle_request,2,
                                          [{file,"src/menelaus_util.erl"},
                                           {line,221}]},
                                         {mochiweb_http,headers,6,
                                          [{file,
                                            "/home/couchbase/jenkins/workspace/cbas-cbcluster-stress-oraclejdk11/couchdb/src/mochiweb/mochiweb_http.erl"},
                                           {line,153}]},
                                         {proc_lib,init_p_do_apply,3,
                                          [{file,"proc_lib.erl"},{line,226}]}]}]
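
A note on reading the trace: the `function_clause` suggests that `capi_crud:handle_mutation_rv` had no clause matching the `mc_header` it received. Assuming the third element of that tuple (134) is the memcached response status (the record definition is not shown in this ticket, so the field order is an assumption), 134 is 0x86, which the memcached binary protocol defines as a temporary failure. The sketch below decodes the value; `STATUS_NAMES` is a deliberately small, illustrative subset and `describe_status` is a hypothetical helper.

```python
# Illustrative subset of memcached binary-protocol response status codes.
STATUS_NAMES = {
    0x00: "Success",
    0x01: "Key not found",
    0x02: "Key exists",
    0x03: "Value too large",
    0x04: "Invalid arguments",
    0x82: "Out of memory",
    0x86: "Temporary failure",
}


def describe_status(status):
    """Format a numeric status as hex plus its name, if it is in the subset."""
    name = STATUS_NAMES.get(status, "unknown (not in this subset)")
    return f"0x{status:02x} {name}"


# 134 is the value assumed to be the status field in the trace above.
print(describe_status(134))
```

If that reading is right, the 500 responses come from an unhandled status in the REST layer rather than from the write itself being rejected outright.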
      
      

      Attachments

        1. cbcollect_info_n_0.zip
          6.08 MB
        2. cbcollect_info_n_1.zip
          4.57 MB
        3. cbcollect_info_n_2.zip
          5.22 MB

          Activity

            drigby Dave Rigby added a comment -

            As per comments on MB-51408, this does appear to be an issue triggered by changes to tlm to disable what should have been assert-only code; however, the disabled code incorrectly had side effects.

            Paolo is addressing the issue via MB-51408, so closing this as a duplicate.
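
For readers unfamiliar with this class of bug, here is a minimal sketch of how disabling assert-only code can change behavior when the checked expression has a side effect. It is a generic illustration only: the actual code involved is not shown in this ticket, and `Counter`, `debug_check`, and `record_event` are hypothetical names.

```python
class Counter:
    """Tiny piece of state that the 'check' accidentally mutates."""

    def __init__(self):
        self.value = 0

    def bump(self):
        self.value += 1
        return True


def debug_check(condition_fn, checks_enabled):
    """Stand-in for an assert macro: evaluates the condition only when enabled."""
    if checks_enabled:
        assert condition_fn()


def record_event(counter, checks_enabled):
    # Bug pattern: the state change lives inside the debug-only check,
    # so it silently disappears when checks are turned off.
    debug_check(counter.bump, checks_enabled)
    return counter.value


print(record_event(Counter(), checks_enabled=True))
print(record_event(Counter(), checks_enabled=False))
```

With checks enabled the counter advances; with checks disabled it does not, even though no "real" logic was meant to change.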

            michael.blow Michael Blow added a comment -

            >> Michael Blow Do you have any specific build numbers when this issue started to occur?

            Dave Rigby, just to close the loop: I do not have any specific build numbers, as we no longer have any 7.1 Jenkins runs from before the regression started. We are not able to keep a large number of runs due to disk limitations placed on us by the build team.

            We would only be able to provide SHAs in any event, as these are all manifest-based (i.e. not installer) tests.

            drigby Dave Rigby added a comment -

            Michael Blow Thanks for confirming. SHAs would also work, but given that the CV job logs have wrapped, it's somewhat moot now.

            drigby Dave Rigby added a comment -

            FYI this should be fixed in 7.1.0-2485.

            michael.blow Michael Blow added a comment -

            Verified that on recent Neo manifests, the regression observed by Analytics is gone.


            People

              michael.blow Michael Blow
              michael.blow Michael Blow
