MB-7592: [system test] view compaction crashes often

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.0.1
    • Component/s: test-execution
    • Security Level: Public
    • Labels:
    • Environment: Windows 2008 R2 64bit

      Description

      Environment:

      • 9 Windows 2008 R2 64-bit servers.
      • Each server has 4 CPUs, 8 GB RAM and an SSD disk.
      • The cluster has 2 buckets, a default bucket and a sasl bucket, with consistent views enabled.
      • Load 26 million items into the default bucket and 16 million items into the sasl bucket. Key sizes range from 128 to 512 bytes.
      • Each bucket has one design doc, with 2 views per design doc.
      • Rebalance out 2 nodes, 10.3.121.173 and 10.3.121.243.

      Starting rebalance, KeepNodes = ['ns_1@10.3.3.181','ns_1@10.3.121.47',
      'ns_1@10.3.3.214','ns_1@10.3.3.182',
      'ns_1@10.3.3.180','ns_1@10.3.121.171',
      'ns_1@10.3.121.169'], EjectNodes = ['ns_1@10.3.121.173',
      'ns_1@10.3.121.243'] ns_orchestrator004 ns_1@10.3.121.169 23:26:03 - Tue Jan 22, 2013

      [couchdb:error,2013-01-23T5:02:44.204,ns_1@10.3.121.169:<0.10514.1>:couch_log:error:42]Set view `default`, replica group `_design/d1`, compactor process <0.24289.57> died with unexpected reason: {{badmatch,
      {error,
      {file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
      [{couch_set_view_compactor, merge_files, 3},
      {couch_set_view_compactor, apply_log, 4},
      {couch_set_view_compactor, maybe_retry_compact, 5}]}
      [ns_server:warn,2013-01-23T5:02:44.204,ns_1@10.3.121.169:<0.12760.55>:compaction_daemon:do_chain_compactors:660]Compactor for view `default/_design/d1/replica` (pid [{type,view},
      {important,true},
      {name, <<"default/_design/d1/replica">>},
      {fa,
      {#Fun<compaction_daemon.16.119703424>,
      [<<"default">>,
      <<"_design/d1">>,
      replica,
      {config,
      {30, 18446744073709551616},
      {30, 18446744073709551616},
      undefined,false,
      false,
      {daemon_config,30,
      131072}},
      false,
      {[{type,bucket}]}]}}]) terminated unexpectedly: {{badmatch,
      {error,
      {file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
      [{couch_set_view_compactor, merge_files, 3},
      {couch_set_view_compactor, apply_log, 4},
      {couch_set_view_compactor, maybe_retry_compact, 5}]}
      [error_logger:error,2013-01-23T5:02:44.204,ns_1@10.3.121.169:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
      =========================CRASH REPORT=========================
      crasher:
      initial call: compaction_daemon:spawn_view_index_compactor/6-fun-0/0
      pid: <0.29697.56>
      registered_name: []
      exception exit: {{badmatch,
      {error,
      {file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
      [{couch_set_view_compactor,merge_files,3},
      {couch_set_view_compactor,apply_log,4},
      {couch_set_view_compactor,maybe_retry_compact,5}]}
      in function compaction_daemon:do_spawn_view_index_compactor/5
      in call from compaction_daemon:'spawn_view_index_compactor/6-fun-0'/7
      ancestors: [<0.12760.55>,<0.16968.54>,<0.16931.54>,compaction_daemon,
      <0.152.1>,ns_server_sup,ns_server_cluster_sup,<0.67.0>]
      messages: []
      links: [<0.12760.55>]
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 4181
      stack_size: 24
      reductions: 2873
      neighbours:

      [error_logger:error,2013-01-23T5:02:44.219,ns_1@10.3.121.169:error_logger<0.6.0>:ale_error_logger_handler:log_msg:76]Error in process <0.24289.57> on node 'ns_1@10.3.121.169' with exit value: badmatch,{error,{file_error,"c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact",enoent},[{couch_set_view_compactor,merge_files,3},{couch_set_view_compactor...


      [ns_server:warn,2013-01-23T5:02:44.219,ns_1@10.3.121.169:<0.16968.54>:compaction_daemon:do_chain_compactors:665]Compactor for view `default/_design/d1` (pid [{type,view},
      {name,<<"default/_design/d1">>},
      {important,false},
      {fa,
      {#Fun<compaction_daemon.20.123074146>,
      [<<"default">>,
      <<"_design/d1">>,
      {config,
      {30,18446744073709551616},
      {30,18446744073709551616},
      undefined,false,false,
      {daemon_config,30,131072}},
      false,
      {[{type,bucket}]}]}}]) terminated unexpectedly (ignoring this): {{badmatch,
      {error,
      {file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
      [{couch_set_view_compactor, merge_files, 3},
      {couch_set_view_compactor, apply_log, 4},
      {couch_set_view_compactor, maybe_retry_compact, 5}]}
      [error_logger:error,2013-01-23T5:02:44.219,ns_1@10.3.121.169:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
      =========================CRASH REPORT=========================
      crasher:
      initial call: compaction_daemon:spawn_view_compactor/5-fun-1/0
      pid: <0.12760.55>
      registered_name: []
      exception exit: {{badmatch,
      {error,
      {file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
      [{couch_set_view_compactor,merge_files,3},
      {couch_set_view_compactor,apply_log,4},
      {couch_set_view_compactor,maybe_retry_compact,5}]}
      in function compaction_daemon:do_chain_compactors/2
      ancestors: [<0.16968.54>,<0.16931.54>,compaction_daemon,<0.152.1>,
      ns_server_sup,ns_server_cluster_sup,<0.67.0>]
      messages: []
      links: [<0.16968.54>]
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 1597
      stack_size: 24
      reductions: 4965
      neighbours:

      The crash happens during rebalance. The rebalance failed around 8:30 AM on Jan 23, 2013.

      Link to the manifest file of this build: http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.1-140-rel.setup.exe.manifest.xml

      Link to the collect info of all nodes: https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_1/201301/9nodes-col-201-140-rebalance-failed-buckets-shutdown-20130123-14-11-14.tgz
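
      For triaging logs like the ones above, a small script can count how often each missing file shows up. The following is only a minimal sketch in Python: the function name `summarize_file_errors` and the regular expression are assumptions made for illustration, not part of Couchbase or its tooling. It matches the `{file_error, "<path>", <reason>}` tuples exactly as they are printed in this report.

```python
import re
from collections import Counter

# Matches the Erlang tuple {file_error, "<path>", <reason>} as printed in the
# logs of this report (with or without spaces after the commas).
FILE_ERROR_RE = re.compile(r'\{file_error,\s*"([^"]+)",\s*([a-z_]+)\}')


def summarize_file_errors(log_text):
    """Count (path, reason) pairs for every file_error tuple in log_text.

    Hypothetical helper for illustration only.
    """
    return Counter(FILE_ERROR_RE.findall(log_text))


# Two excerpts copied from the log above, one per formatting variant.
sample = '''
{error,
{file_error, "c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact", enoent}}},
exit value: badmatch,{error,{file_error,"c:/view/@indexes/default/tmp_40ae216d3de3826de065e665ac0f52dd_replica/d8004c67163a37e21f8c542b9ffa9c05.compact",enoent},
'''

for (path, reason), count in summarize_file_errors(sample).items():
    print(count, reason, path)
```

      Running it over a full log (for example, one extracted from the collect info archive) would show whether the `enoent` errors are concentrated on a single tmp directory or spread across several.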


        Activity

        thuan Thuan Nguyen added a comment -

        Another crash at 7:40

        [error_logger:error,2013-01-23T7:40:55.719,ns_1@10.3.121.169:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
        =========================CRASH REPORT=========================
        crasher:
        initial call: compaction_daemon:spawn_view_index_compactor/6-fun-0/0
        pid: <0.15309.72>
        registered_name: []
        exception exit: {updater_died,
        {updater_error,
        badmatch,{error,enoent,
        [{couch_set_view_updater_helper,update_btree,5},
        {couch_set_view_updater, '-update_btrees/1-fun-0-',7},
        {lists,mapfoldl,3},
        {lists,mapfoldl,3},
        {couch_set_view_updater,update_btrees,1},
        {couch_set_view_updater, '-spawn_updater_worker/2-fun-2-',6}]}}}
        in function compaction_daemon:do_spawn_view_index_compactor/5
        in call from compaction_daemon:'spawn_view_index_compactor/6-fun-0'/7
        ancestors: [<0.5372.71>,<0.23398.69>,<0.23392.69>,compaction_daemon,
        <0.152.1>,ns_server_sup,ns_server_cluster_sup,<0.67.0>]
        messages: []
        links: [<0.5372.71>]
        dictionary: []
        trap_exit: true
        status: running
        heap_size: 4181
        stack_size: 24
        reductions: 2830
        neighbours:

        [rebalance:info,2013-01-23T7:40:56.079,ns_1@10.3.121.169:<0.23824.73>:janitor_agent:set_vbucket_state:419]Doing vbucket 42 state change: {'ns_1@10.3.3.214',replica,undefined,undefined}
        [error_logger:error,2013-01-23T7:40:56.079,ns_1@10.3.121.169:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
        =========================CRASH REPORT=========================
        crasher:
        initial call: compaction_daemon:spawn_view_compactor/5-fun-1/0
        pid: <0.5372.71>
        registered_name: []
        exception exit: {updater_died,
        {updater_error,
        badmatch,{error,enoent,
        [{couch_set_view_updater_helper,update_btree,5},
        {couch_set_view_updater, '-update_btrees/1-fun-0-',7},
        {lists,mapfoldl,3},
        {lists,mapfoldl,3},
        {couch_set_view_updater,update_btrees,1},
        {couch_set_view_updater, '-spawn_updater_worker/2-fun-2-',6}]}}}
        in function compaction_daemon:do_chain_compactors/2
        ancestors: [<0.23398.69>,<0.23392.69>,compaction_daemon,<0.152.1>,
        ns_server_sup,ns_server_cluster_sup,<0.67.0>]
        messages: []
        links: [<0.23398.69>]
        dictionary: []
        trap_exit: true
        status: running
        heap_size: 4181
        stack_size: 24
        reductions: 4857
        neighbours:

        farshid Farshid Ghods (Inactive) added a comment -

        Tony,

        On which node did compaction crash?
        Does compaction crash continuously on this node?
        What is the index size versus disk size for that index now?

        FilipeManana Filipe Manana (Inactive) added a comment -

        Let's be precise here. What crashes is the view compactor, not the compaction daemon (scheduler), which is a different animal in a different component.

        I have a fix here: http://review.couchbase.org/#/c/24178/

        But I can't merge it because the latest 2.0.1 is unstable, at least for me; see CBQE-993.

        thuan Thuan Nguyen added a comment -

        Integrated in github-couchdb-preview #557 (See http://qa.hq.northscale.net/job/github-couchdb-preview/557/)
        MB-7592 Race condition free management of tmp files (Revision 8565f54d4b8f94a63afdc79175cc6a7da9800510)

        Result = SUCCESS
        Filipe David Borba Manana :
        Files :

        • src/couch_set_view/src/couch_set_view_util.erl
        • src/couch_set_view/src/couch_set_view_group.erl
        • src/couch_set_view/src/couch_set_view_compactor.erl
        • src/couch_set_view/src/couch_set_view_updater.erl
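
        The patch itself is not reproduced in this issue, so the following is only a generic illustration of one way to avoid this class of race, written in Python rather than Erlang. It assumes the `enoent` errors come from one process removing tmp files that another process is still using; that is an assumption, not something the issue states. All names (`new_private_tmp_dir`, `cleanup_private_tmp_dir`, the `compactor` and `updater` owners) are hypothetical. The idea is that each owner gets its own atomically created tmp directory and only ever deletes that directory.

```python
import os
import shutil
import tempfile


def new_private_tmp_dir(index_dir, owner):
    """Atomically create a tmp dir that only `owner` uses and deletes.

    tempfile.mkdtemp guarantees a fresh, uniquely named directory, so two
    owners never end up sharing (and later deleting) the same one.
    """
    os.makedirs(index_dir, exist_ok=True)
    return tempfile.mkdtemp(prefix=f"tmp_{owner}_", dir=index_dir)


def cleanup_private_tmp_dir(tmp_dir):
    """Remove only the caller's own tmp dir; a missing dir is not an error."""
    shutil.rmtree(tmp_dir, ignore_errors=True)


# Stand-in for an index directory; created in the system temp location.
index_dir = tempfile.mkdtemp()
compactor_tmp = new_private_tmp_dir(index_dir, "compactor")
updater_tmp = new_private_tmp_dir(index_dir, "updater")

# The compactor writes an intermediate file into its own directory.
compact_file = os.path.join(compactor_tmp, "sorted.compact")
with open(compact_file, "w") as f:
    f.write("sorted batch")

# The updater finishes and cleans up; the compactor's file is untouched.
cleanup_private_tmp_dir(updater_tmp)
print(os.path.exists(compact_file))

cleanup_private_tmp_dir(compactor_tmp)
shutil.rmtree(index_dir)
```

        With a single shared tmp directory, the updater's cleanup step could have removed `sorted.compact` while the compactor was still merging it. Separate directories make that cleanup safe regardless of timing.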
        FilipeManana Filipe Manana (Inactive) added a comment -

        Merged.

        Tony, can you test the next build including this patch (preferably on Windows again)?

        Thanks

        farshid Farshid Ghods (Inactive) added a comment -

        Tony,
        please verify this with build 144+


          People

          • Assignee: Thuan Nguyen
          • Reporter: Thuan Nguyen