Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-24284

[N1QL] count(distinct field) gives incorrect/different results with secondary index vs primary index

    XMLWordPrintable

Details

    • Untriaged
    • Yes

    Description

      Test which fails:
      ./testrunner -i query2.ini -p doc-per-day=6 -t tuqquery.tuq_2i_index.QueriesIndexTests.test_count_distinct

      Query executed:
      select count(distinct VMs[1].os) from default where VMs[1].os='windows' and tasks_points.task1>1

      Explain plan of above query:

      {
          "plan": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "IndexCountDistinctScan2",
                "covers": [
                  "cover ((((`default`.`VMs`)[1]).`os`))",
                  "cover (((`default`.`tasks_points`).`task1`))",
                  "cover ((meta(`default`).`id`))"
                ],
                "index": "idx2",
                "index_id": "fda272eda4be601",
                "keyspace": "default",
                "namespace": "default",
                "spans": [
                  {
                    "exact": true,
                    "range": [
                      {
                        "high": "\"windows\"",
                        "inclusion": 3,
                        "low": "\"windows\""
                      },
                      {
                        "inclusion": 0,
                        "low": "1"
                      }
                    ]
                  }
                ],
                "using": "gsi"
              },
              {
                "#operator": "IndexCountProject",
                "result_terms": [
                  {
                    "expr": "count(distinct (((`default`.`VMs`)[1]).`os`))"
                  }
                ]
              }
            ]
          },
      
      

      Indexes present:
      CREATE INDEX `idx` ON `default`(`job_title`) WHERE (`email` like "%@%.%")
      CREATE INDEX `idx2` ON `default`(((`VMs`[1]).`os`),(`tasks_points`.`task1`))
      CREATE INDEX `idx3` ON `default`(((`VMs`[1]).`os`),`name`) WHERE ((`tasks_points`.`task1`) = 1)

      Result for query with secondary index:
      [

      { "$1": 5 }

      ]

      Result for query with primary index:
      [

      { "$1": 1 }

      ]

      Distinct query without count gives same results for primary and secondary index:

      select (distinct VMs[1].os) from default where VMs[1].os='windows' and tasks_points.task1>1;
      gives 10080 results.

      Please let me know if you need the dataset.

      Uploading dataset tar file.

      I am still checking if its a regression.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            prathibha Prathibha Bisarahalli (Inactive)
            Prerna.Manaktala Prerna Manaktala (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty