Couchbase Server / MB-51118

ddoc creation failures are seen on nonroot couchbase cluster

Details

    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Description

      Script to Repro

      ./testrunner -i /tmp/win10_bucket_ops_ini_non_root.ini -p bucket_storage=couchstore,get-cbcollect-info=True,nodes_init=4 -t failover.failovertests.FailoverTests.test_failover_stop_server,replicas=1,graceful=False,num_failed_nodes=1,numViews=1,withViewsOps=True,createIndexesDuringFailover=True,items=10000,failoverMaster=True,bucket_storage=couchstore
      

      Noticed ddoc creation failures on nonroot couchbase cluster. Looks similar to MB-48360.

      {"error":"{bad_return_value,\n    {undef,\n        [{snappy,compress,\n             [[1,\n               [<<1,80,0,0,27>>,\n                <<\"_design/dev_dev_ddoc1\">>,\n                <<0,0,0,0,0,1,0,0,0,95,0,0,0,0,4,150,0,0,0,0,0,1,0,7,118,\n                  141,24>>]]],\n             []},\n         {couch_compress,compress,1,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_compress.erl\"},\n              {line,20}]},\n         {couch_btree,'-write_node/3-lc$^0/1-0-',5,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_btree.erl\"},\n              {line,514}]},\n         {couch_btree,write_node,3,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_btree.erl\"},\n              {line,525}]},\n         {couch_btree,modify_node,8,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_btree.erl\"},\n              {line,418}]},\n         {couch_btree,query_modify_raw,2,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_btree.erl\"},\n              {line,261}]},\n         {couch_db_updater,update_docs_int,4,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_db_updater.erl\"},\n              {line,601}]},\n         {couch_db_updater,handle_info,2,\n             [{file,\n                  \"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/couchdb/couch_db_updater.erl\"},\n              {line,317}]}]}}","reason":"{gen_server,call,\n            ['capi_ddoc_manager-travel-sample',\n             {interactive_update,{doc,<<\"_design/dev_dev_ddoc1\">>,\n                                      {0,<<>>},\n          
                            <<\"{\\\"views\\\":{\\\"default_view1\\\":{\\\"map\\\":\\\"function (doc, meta) {\\\\n  emit(meta.id, null);\\\\n}\\\"}}}\">>,\n                                      0,false,[]}},\n             infinity]}"}
      

      cbcollect_info attached.

      Attachments

        1. consoleText_MB-51118 (1.84 MB)
        2. consoleText_MB-51118.txt (122 kB)
        3. ddoc_creation_api.png (410 kB)
        4. ddoc_creation_failure.png (470 kB)
        5. Screenshot 2022-02-24 at 1.29.06 PM.png (559 kB)

        Activity

          Balakumaran Gopal, I'm unable to reproduce this on my local setup with a non-root installation.

          However, I do observe that the rpath is $ORIGIN/../lib:/opt/couchbase/lib, which doesn't match up: we will first search for dependencies under /home/nonroot/opt/couchbase/lib/couchdb/erlang/lib/, whereas the dependencies are actually at /home/nonroot/opt/couchbase/lib.

          The rpath is set correctly for mapreduce_nif though: $ORIGIN/../../../../../:/opt/couchbase/lib
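One way to check the rpath recorded in a shared object is to read its dynamic section with `readelf -d` and pick out the RPATH/RUNPATH row. The sketch below is only an illustration: the helper names are hypothetical, and `SAMPLE` is a hand-written row in the format `readelf` prints (not output captured from the affected machine), carrying the rpath value quoted above.

```python
import re
import subprocess

# Matches the RPATH / RUNPATH rows printed by `readelf -d <file>`.
_RPATH_ROW = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def rpath_entries(readelf_dynamic_output):
    """Return the colon-separated rpath/runpath entries found in the text."""
    entries = []
    for line in readelf_dynamic_output.splitlines():
        match = _RPATH_ROW.search(line)
        if match:
            entries.extend(p for p in match.group(2).split(":") if p)
    return entries


def read_dynamic_section(shared_object_path):
    """Run `readelf -d` on a file and return its text output (requires binutils)."""
    result = subprocess.run(
        ["readelf", "-d", shared_object_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hand-written sample row (an assumption, not a captured log).
SAMPLE = " 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../lib:/opt/couchbase/lib]"
print(rpath_entries(SAMPLE))
```

On a real install, `rpath_entries(read_dynamic_section(path_to_nif))` would return the entries for that file.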

          Can you please try the same non-root install with the following toy build, which has the correct rpath set, and check whether the problem still occurs? http://latestbuilds.service.couchbase.com/builds/latestbuilds/couchbase-server/toybuilds/14466/

          Thanks,

          abhishek.jindal Abhishek Jindal added a comment

          For more context :

          On this server, we have:

          • Couchbase Server 6.6.3 installed at the root paths (standard installation at /opt/couchbase). This service is not running, though.
          • Couchbase Server 7.1.x installed at a non-root path (/home/nonroot) without sudo, which is what is being run and tested here.

          The logs show that the compression library NIF fails to load because some of its dependencies cannot be located:

           =========================WARNING REPORT=========================
           The on_load function for module snappy returned:
           {error,{load_failed,"Failed to load NIF library: '/home/nonroot/opt/couchbase/lib/couchdb/erlang/lib/snappy-1.0.4/priv/snappy_nif.so: undefined symbol: _ZN6snappy4Sink22AppendAndTakeOwnershipEPcmPFvPvPKcmES2_'"}}
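The undefined symbol in the warning is a C++ mangled name. As a rough aid for reading it, the sketch below decodes only the length-prefixed components of a simple `_ZN...E` nested name; it is not a full demangler, and the function name is hypothetical.

```python
def nested_name_parts(mangled):
    """Decode the length-prefixed components of a simple `_ZN...E` nested name."""
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a simple nested mangled name")
    parts = []
    i = len(prefix)
    # Each component is a decimal length followed by that many characters;
    # the list of components ends at the first 'E'.
    while i < len(mangled) and mangled[i] != "E":
        j = i
        while j < len(mangled) and mangled[j].isdigit():
            j += 1
        if j == i:
            raise ValueError("unsupported mangling construct")
        length = int(mangled[i:j])
        parts.append(mangled[j:j + length])
        i = j + length
    return parts


SYMBOL = "_ZN6snappy4Sink22AppendAndTakeOwnershipEPcmPFvPvPKcmES2_"
print("::".join(nested_name_parts(SYMBOL)))  # prints snappy::Sink::AppendAndTakeOwnership
```

The decoded name shows that the missing symbol is a member of the `snappy::Sink` class, so it is the snappy library dependency that the NIF could not resolve.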
          

          Looking at how the NIFs are built and at their rpaths, I can see one difference.

          We build and use the following 3 NIFs for couchdb:

          1. snappy_nif.so
          2. couch_view_parser_nif.so
          3. mapreduce_nif.so

          These shared objects are located under <installation path>/opt/couchbase/lib/couchdb/erlang/lib, and their dependencies are supposed to be found at /home/nonroot/opt/couchbase/lib, which relative to the NIFs' path is $ORIGIN/../../../../../, or at /opt/couchbase/lib (only valid for a root install).

          However, the rpath of snappy_nif.so is $ORIGIN/../lib, which resolves to an incorrect path: /home/nonroot/opt/couchbase/lib/couchdb/erlang/lib/snappy-1.0.4/lib

          This is in contrast to the rpath of mapreduce_nif.so, which is indeed correct: $ORIGIN/../../../../.. (five levels up). This change to the mapreduce NIF rpath was made around 4 years ago, in the 5.5.0 timeframe: https://review.couchbase.org/c/couchdb/+/94316/
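The path arithmetic above can be checked with a small sketch that substitutes `$ORIGIN` with the directory containing the shared object and normalizes each rpath entry. The function name is hypothetical, the normalization is purely lexical (symlinks are ignored), and the second call assumes the five-level prefix used for mapreduce_nif.so is applied to snappy_nif.so's location.

```python
import posixpath


def expand_rpath(rpath, object_path):
    """Expand $ORIGIN in each rpath entry relative to the object's directory."""
    origin = posixpath.dirname(object_path)
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
        if entry
    ]


# Location of the snappy NIF on the non-root install, taken from the warning report.
NIF = "/home/nonroot/opt/couchbase/lib/couchdb/erlang/lib/snappy-1.0.4/priv/snappy_nif.so"

# rpath observed on snappy_nif.so before the fix.
print(expand_rpath("$ORIGIN/../lib:/opt/couchbase/lib", NIF))

# Five-level rpath, as used for mapreduce_nif.so.
print(expand_rpath("$ORIGIN/../../../../../:/opt/couchbase/lib", NIF))
```

With the first rpath, the only entry inside the non-root install points at snappy-1.0.4/lib, leaving /opt/couchbase/lib as the remaining search location. With the five-level rpath, the first entry is /home/nonroot/opt/couchbase/lib.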

          However, for some reason the same change wasn't made for the other NIFs. Examples:

          1. https://github.com/couchbase/couchdb/blob/8a75fd2faa89f95158de1776354ceccf3e762753/src/snappy/CMakeLists.txt
          2. https://github.com/couchbase/couchdb/blob/8a75fd2faa89f95158de1776354ceccf3e762753/src/couchdb/priv/CMakeLists.txt#L30-L34

          I copied the change made in the mapreduce CMake file over to the two files above: https://review.couchbase.org/c/couchdb/+/171299. This resolves the dependency issues, as QE confirmed above.

          Chris Hillery, I believe you made the rpath change mentioned above via https://review.couchbase.org/c/couchdb/+/94316/ . It would be helpful if you could review the analysis above and confirm whether the rpath needs to change for the other two shared objects. In particular, is https://review.couchbase.org/c/couchdb/+/171299 sufficient to resolve the dependency issues for both root and non-root installs?

          abhishek.jindal Abhishek Jindal added a comment

          Abhishek Jindal, yes, I believe your diagnosis is correct. I am a little confused about how the current CMakeLists for mapreduce ends up with the RPATH set to $ORIGIN/../../../../..:/opt/couchbase/lib; I don't know where the /opt/couchbase/lib part is coming from. But that's a relatively minor issue.

          I think your proposed change looks correct, and I've put a +2 on it. Thanks for tracking it down.

          ceej Chris Hillery added a comment

          Build couchbase-server-7.1.0-2407 contains couchdb commit fff78b5 with commit message:
          MB-51118 : nifs dependency lookup

          build-team Couchbase Build Team added a comment

          Validated this on 7.1.0-2407. See consoleText_MB-51118 for logs.

          Balakumaran.Gopal Balakumaran Gopal added a comment

          People

            Balakumaran.Gopal Balakumaran Gopal
            Balakumaran.Gopal Balakumaran Gopal