Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-48601

LEFT JOIN on collections returns wrong results

    XMLWordPrintable

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 7.0.0, Neo, 7.0.2, 7.0.1
    • Neo, 7.0.2
    • query
    • Untriaged
    • 1
    • Yes

    Description

      The following LEFT JOIN on collections returns 0 results.

      DROP INDEX ix20 ON default.s1.c1;
      DROP INDEX ix21 ON default.s1.c2;
      UPSERT INTO default.s1.c1 VALUES("c1k01",{"_scope":"test", "_type":"achievement", "id":"k01", "hash":10});
      UPSERT INTO default.s1.c1 VALUES("c1k02",{"_scope":"test", "_type":"achievement", "id":"k02", "hash":20});
      UPSERT INTO default.s1.c1 VALUES("c1k03",{"_scope":"test", "_type":"achievement_order", "id":"k03", "hash":30});
      CREATE INDEX ix20 ON default.s1.c1(`_type`,`_scope`);
      CREATE INDEX ix21 ON default.s1.c2(`status`,`_type`,`hash`,`_scope`,`id`);
      SELECT col_data.id AS _k, dex as c
      FROM default.s1.c1 col_data
      LEFT JOIN  default.s1.c2 as  dex
      ON dex.id = col_data.id
         AND dex.status = 'progress'
         AND dex.`hash` != col_data.`hash`
         AND dex._scope=col_data._scope
         AND dex._type= col_data._type
      WHERE col_data._scope='test' AND col_data._type IN ['achievement', 'achievement_order'];
      

      Instead of collection on bucket returns right results.

      DROP INDEX ix20 ON default;
      DROP INDEX ix21 ON default;
      UPSERT INTO default VALUES("c1k01",{"_scope":"test", "_type":"achievement", "id":"k01", "hash":10});
      UPSERT INTO default VALUES("c1k02",{"_scope":"test", "_type":"achievement", "id":"k02", "hash":20});
      UPSERT INTO default VALUES("c1k03",{"_scope":"test", "_type":"achievement_order", "id":"k03", "hash":30});
      CREATE INDEX ix20 ON default(`_type`,`_scope`);
      CREATE INDEX ix21 ON default(`status`,`_type`,`hash`,`_scope`,`id`);
      SELECT col_data.id AS _k, dex as c
      FROM default col_data
      LEFT JOIN  default as  dex
      ON dex.id = col_data.id
         AND dex.status = 'progress'
         AND dex.`hash` != col_data.`hash`
         AND dex._scope=col_data._scope
         AND dex._type= col_data._type
      WHERE col_data._scope='test' AND col_data._type IN ['achievement', 'achievement_order'];
      

      Collection:

      	                    {
      	                        "#operator": "NestedLoopJoin",
      	                        "#stats": {
      	                            "#itemsIn": 3,
      	                            "#phaseSwitches": 25,
      	                            "execTime": "33.854µs",
      	                            "kernTime": "1.700277ms"
      	                        },
      	                        "alias": "dex",
      	                        "filter": "(not ((`dex`.`hash`) = (`col_data`.`hash`)))",
      	                        "on_clause": "((((((`dex`.`id`) = (`col_data`.`id`)) and ((`dex`.`status`) = \"progress\")) and (not ((`dex`.`hash`) = (`col_data`.`hash`)))) and ((`dex`.`_scope`) = (`col_data`.`_scope`))) and ((`dex`.`_type`) = (`col_data`.`_type`)))",
      	                        "outer": true,
      

      Bucket

      {
      	                        "#operator": "NestedLoopJoin",
      	                        "#stats": {
      	                            "#itemsIn": 3,
      	                            "#itemsOut": 3,
      	                            "#phaseSwitches": 31,
      	                            "execTime": "11.183µs",
      	                            "kernTime": "1.647968ms"
      	                        },
      	                        "alias": "dex",
      	                        "on_clause": "((((((`dex`.`id`) = (`col_data`.`id`)) and ((`dex`.`status`) = \"progress\")) and (not ((`dex`.`hash`) = (`col_data`.`hash`)))) and ((`dex`.`_scope`) = (`col_data`.`_scope`))) and ((`dex`.`_type`) = (`col_data`.`_type`)))",
      	                        "optimizer_estimates": {
      	                            "cardinality": 0.00005925925925925928,
      	                            "cost": 24.034868485281375,
      	                            "fr_cost": 24.034868485281375,
      	                            "size": 2
      	                        },
      	                        "outer": true,
      

      Why there is extra filter on "filter": "(not ((`dex`.`hash`) = (`col_data`.`hash`)))" on JOIN for collection for not bucket? Is this impacting?

      Same issue with HASH Join too.

      https://forums.couchbase.com/t/left-join-works-as-inner-join-condition/31810

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          Sitaram.Vemulapalli Sitaram Vemulapalli added a comment - - edited

          https://github.com/couchbase/query/blob/master/execution/join_nl.go#L164
          The filter causing https://github.com/couchbase/query/blob/master/execution/join_nl.go#L164.
          If the filter is left side it has been applied before, if on right side it should have applied on scan. Not On JOIN condition. Need to investigate why it is derived?

          Sitaram.Vemulapalli Sitaram Vemulapalli added a comment - - edited https://github.com/couchbase/query/blob/master/execution/join_nl.go#L164 The filter causing https://github.com/couchbase/query/blob/master/execution/join_nl.go#L164 . If the filter is left side it has been applied before, if on right side it should have applied on scan. Not On JOIN condition. Need to investigate why it is derived?

          Build couchbase-server-7.1.0-1360 contains query commit b2f0b4d with commit message:
          MB-48601 Handle transformed filter when checking for postjoin filter

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1360 contains query commit b2f0b4d with commit message: MB-48601 Handle transformed filter when checking for postjoin filter

          Build couchbase-server-7.0.2-6694 contains query commit a85fba1 with commit message:
          MB-48601 Handle transformed filter when checking for postjoin filter

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.2-6694 contains query commit a85fba1 with commit message: MB-48601 Handle transformed filter when checking for postjoin filter

          Verified on 7.0.2-6694 and 7.1.0-1360

          cbq> SELECT col_data.id AS _k, dex as c
             > FROM default.s1.c1 col_data
             > LEFT JOIN  default.s1.c2 as  dex
             > ON dex.id = col_data.id
             >    AND dex.status = 'progress'
             >    AND dex.`hash` != col_data.`hash`
             >    AND dex._scope=col_data._scope
             >    AND dex._type= col_data._type
             > WHERE col_data._scope='test' AND col_data._type IN ['achievement', 'achievement_order'];
          {
              "requestID": "3cf42c97-a382-403d-bffd-a0e4c97fe8e9",
              "signature": {
                  "_k": "json",
                  "c": "json"
              },
              "results": [
              {
                  "_k": "k01"
              },
              {
                  "_k": "k02"
              },
              {
                  "_k": "k03"
              }
              ],
              "status": "success",
              "metrics": {
                  "elapsedTime": "26.788886ms",
                  "executionTime": "26.623614ms",
                  "resultCount": 3,
                  "resultSize": 81,
                  "serviceLoad": 6
              }
          }
           

          pierre.regazzoni Pierre Regazzoni added a comment - Verified on 7.0.2-6694 and 7.1.0-1360 cbq> SELECT col_data.id AS _k, dex as c > FROM default .s1.c1 col_data > LEFT JOIN default .s1.c2 as dex > ON dex.id = col_data.id > AND dex.status = 'progress' > AND dex.`hash` != col_data.`hash` > AND dex._scope=col_data._scope > AND dex._type= col_data._type > WHERE col_data._scope= 'test' AND col_data._type IN [ 'achievement' , 'achievement_order' ]; { "requestID" : "3cf42c97-a382-403d-bffd-a0e4c97fe8e9" , "signature" : { "_k" : "json" , "c" : "json" }, "results" : [ { "_k" : "k01" }, { "_k" : "k02" }, { "_k" : "k03" } ], "status" : "success" , "metrics" : { "elapsedTime" : "26.788886ms" , "executionTime" : "26.623614ms" , "resultCount" : 3 , "resultSize" : 81 , "serviceLoad" : 6 } }

          People

            pierre.regazzoni Pierre Regazzoni
            Sitaram.Vemulapalli Sitaram Vemulapalli
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty