Couchbase Server / MB-48563

[IPV6] Advisor not returning session advise with cluster using IPV6 address

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.0.2
    • Fix Version/s: Neo, 7.0.2
    • Component/s: query
    • Environment:
      7.0.2-6883
    • Triage:
      Untriaged
    • Story Points:
      1
    • Is this a Regression?:
      No

      Description

      To reproduce:

      • Set up a single node using an IPv6 address.
      • Load travel-sample.
      • Connect via cbq and run the following:
        • SELECT ADVISOR({'action':'start', 'duration':'40m'});
        • SELECT airportname FROM `travel-sample` WHERE type = "airport" AND lower(city) = "lyon" AND country = "France";
        • SELECT airportname FROM `travel-sample` WHERE type = "airport" AND lower(city) = "grenoble" AND country = "France";
        • select advisor({'action':'list'});
        • select advisor({'action':'stop', 'session':'18a0a2d5-9750-4245-8415-447bff9c1322'}); (replace with your session ID)
        • select advisor({'action':'list'}); or select advisor({'action':'get', 'session':'18a0a2d5-9750-4245-8415-447bff9c1322'});

      The advise comes back empty:

      cbq> SELECT ADVISOR({'action':'start', 'duration':'40m'});
      {
          "requestID": "aff6c04f-e145-4f98-b154-a6726ac9fc86",
          "signature": {
              "$1": "object"
          },
          "results": [
          {
              "$1": {
                  "session": "18a0a2d5-9750-4245-8415-447bff9c1322"
              }
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "4.227506ms",
              "executionTime": "4.04726ms",
              "resultCount": 1,
              "resultSize": 95,
              "serviceLoad": 6
          }
      }
      cbq> SELECT airportname FROM `travel-sample` WHERE type = "airport" AND lower(city) = "lyon" AND country = "France";
      {
          "requestID": "913a0ace-bd10-49b0-ba47-fb0980848529",
          "signature": {
              "airportname": "json"
          },
          "results": [
          {
              "airportname": "Bron"
          },
          {
              "airportname": "Lyon Part-Dieu Railway"
          },
          {
              "airportname": "Saint Exupery"
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "156.141559ms",
              "executionTime": "155.886014ms",
              "resultCount": 3,
              "resultSize": 138,
              "serviceLoad": 6
          }
      }
      cbq> SELECT airportname FROM `travel-sample` WHERE type = "airport" AND lower(city) = "grenoble" AND country = "France";
      {
          "requestID": "1b34c6aa-2c71-4fb9-a4ce-caf0e7a1e14f",
          "signature": {
              "airportname": "json"
          },
          "results": [
          {
              "airportname": "Saint Geoirs"
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "153.643882ms",
              "executionTime": "153.451189ms",
              "resultCount": 1,
              "resultSize": 45,
              "serviceLoad": 6
          }
      }
      cbq> select advisor({'action':'list'});
      {
          "requestID": "0d71eb7f-6f2a-4327-b5a5-d360859a3b49",
          "signature": {
              "$1": "object"
          },
          "results": [
          {
              "$1": [
                  {
                      "tasks_cache": {
                          "class": "advisor",
                          "delay": "40m0s",
                          "id": "6cf01d10-e690-5074-943c-eda1342aea5b",
                          "name": "18a0a2d5-9750-4245-8415-447bff9c1322",
                          "node": "fd63:6f75:6368:2078:b4fb:edff:fe2c:9451:8091",
                          "state": "scheduled",
                          "subClass": "analyze",
                          "submitTime": "2021-09-22 12:40:15.162733883 -0700 PDT m=+973.007142208"
                      }
                  },
                  {
                      "tasks_cache": {
                          "class": "advisor",
                          "delay": "40m0s",
                          "id": "6cf01d10-e690-5074-943c-eda1342aea5b",
                          "name": "18a0a2d5-9750-4245-8415-447bff9c1322",
                          "node": "[fd63:6f75:6368:2078:b4fb:edff:fe2c:9451]:8091",
                          "state": "scheduled",
                          "subClass": "analyze",
                          "submitTime": "2021-09-22 12:40:15.162733883 -0700 PDT m=+973.007142208"
                      }
                  }
              ]
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "8.306142ms",
              "executionTime": "8.11363ms",
              "resultCount": 1,
              "resultSize": 1126,
              "serviceLoad": 6
          }
      }
      cbq> select advisor({'action':'stop', 'session':'18a0a2d5-9750-4245-8415-447bff9c1322'});
      {
          "requestID": "de362223-ce2e-47ba-8fd6-e6c71484a7d2",
          "signature": {
              "$1": "object"
          },
          "results": [
          {
              "$1": []
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "2.063466559s",
              "executionTime": "2.063197491s",
              "resultCount": 1,
              "resultSize": 24,
              "serviceLoad": 6
          }
      }
      cbq> select advisor({'action':'list'});
      {
          "requestID": "8a070f82-9bbc-4c35-a09b-3aa9ccb2f737",
          "signature": {
              "$1": "object"
          },
          "results": [
          {
              "$1": []
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "6.310308ms",
              "executionTime": "6.079502ms",
              "resultCount": 1,
              "resultSize": 24,
              "serviceLoad": 6
          }
      }
       

      Note that when I set up the cluster with a hostname and use IPv6-only settings, it works as expected.

      It seems we had this issue on 7.0 as well: http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.0.0-5302/jenkins_logs/test_suite_executor/362320/consoleText.txt
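
In the first list output above, the same task id is reported twice, once under a bare `host:port` node string and once under the bracketed form. A minimal Python sketch of how a test could flag that symptom follows. The function name `duplicated_tasks` is hypothetical, and the rows are trimmed to the two fields taken from the output above.

```python
from collections import defaultdict


def duplicated_tasks(rows):
    """Map task id -> node strings, for ids reported under more than one node."""
    nodes_by_id = defaultdict(list)
    for row in rows:
        task = row["tasks_cache"]
        nodes_by_id[task["id"]].append(task["node"])
    return {tid: nodes for tid, nodes in nodes_by_id.items() if len(nodes) > 1}


# Rows trimmed from the advisor 'list' output in the description.
rows = [
    {"tasks_cache": {"id": "6cf01d10-e690-5074-943c-eda1342aea5b",
                     "node": "fd63:6f75:6368:2078:b4fb:edff:fe2c:9451:8091"}},
    {"tasks_cache": {"id": "6cf01d10-e690-5074-943c-eda1342aea5b",
                     "node": "[fd63:6f75:6368:2078:b4fb:edff:fe2c:9451]:8091"}},
]

dupes = duplicated_tasks(rows)
```

A non-empty result means one task was returned under two spellings of the same node.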

       


            Activity

            kamini.jagtiani Kamini Jagtiani added a comment -

            Since Isha Kandaswamy is out, Donald Haggart can you take a look?

            Donald.haggart Donald Haggart added a comment - - edited

            In my initial testing using 7.0.2-6683, if you issue the stop once, a subsequent list still returns results. If you issue the stop a second time, the results are lost:

             

            cbq> select advisor({'action':'stop', 'session':'e39c0787-e4cd-4f5e-b0ba-345a1415e2db'});
            {
                "requestID": "f8dcb43b-32d6-4ef6-b806-0de8050672c2",
                "signature": {
                    "$1": "object"
                },
                "results": [
                {
                    "$1": []
                }
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "106.744658ms",
                    "executionTime": "106.607137ms",
                    "resultCount": 1,
                    "resultSize": 24,
                    "serviceLoad": 3
                }
            }
            cbq> select advisor({'action':'list'});
            {
                "requestID": "aaaff717-c91d-449e-bb6e-f6f057d86ee9",
                "signature": {
                    "$1": "object"
                },
                "results": [
                {
                    "$1": [
                        {
                            "tasks_cache": {
                                "class": "advisor",
                                "delay": "40m0s",
                                "id": "5cb3bd21-3006-5554-a9bb-e272dde3dcec",
                                "name": "e39c0787-e4cd-4f5e-b0ba-345a1415e2db",
             
            ...
            cbq> select advisor({'action':'stop', 'session':'e39c0787-e4cd-4f5e-b0ba-345a1415e2db'});
            {
                "requestID": "66452603-3121-4885-bc9d-401c4d526c2f",
                "signature": {
                    "$1": "object"
                },
                "results": [
                {
                    "$1": []
                }
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "74.794112ms",
                    "executionTime": "74.657144ms",
                    "resultCount": 1,
                    "resultSize": 24,
                    "serviceLoad": 3
                }
            }
            cbq> select advisor({'action':'list'});
            {
                "requestID": "3fb32226-0fad-45c5-a1a1-9e2280fc7521",
                "signature": {
                    "$1": "object"
                },
                "results": [
                {
                    "$1": []
                }
            ...

             

            Same behaviour with a cluster configured as "IPv4", "IPv6" or "IPv6-only".

            Same on 7.1.0 too.

            Donald.haggart Donald Haggart added a comment -

            OK, so the real problem is that the cluster is created with its hostname set to the text of the IPv6 address. If it were the default "::1" IPv6 address, a hostname wouldn't be set.

            Specifically:

             

            $ curl -su Administrator:password http://localhost:8091/pools/default/nodeServices
            {
              "rev": 51,
              "nodesExt": [
                {
            ...
                  "thisNode": true,
                  "hostname": "fd52:a81c:df85:1:e06b:94c5:ef5:a72a"
                }
              ],

            For us, this means the look-up is for a name that merely looks like an IPv6 address. Because the name contains ':' characters, clustering_cb assumes it is an IPv6 address and wraps it in '[]'.

             

            WhoAmI() returns the name string untouched (since there is a hostname entry containing this string in nodeServices), so it has no brackets.

            When we then access the tasks_cache (among others in the system keyspace), we can't eliminate the local node because these two values don't match.

            So we return rows twice.
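
            The mismatch described above can be illustrated with a small Python sketch. This is not the query service's code (which is written in Go); the helpers `strip_brackets`, `same_host` and `remote_nodes` are hypothetical names, and the only inputs are the hostname from the nodeServices output and its bracketed form.

```python
import ipaddress


def strip_brackets(host: str) -> str:
    """Return host without the surrounding '[' and ']' of an IPv6 literal."""
    if host.startswith("[") and host.endswith("]"):
        return host[1:-1]
    return host


def same_host(a: str, b: str) -> bool:
    """Compare two host names, treating bracketed and bare IPv6 literals as equal."""
    a, b = strip_brackets(a), strip_brackets(b)
    try:
        # Compare as addresses so equivalent spellings of one address match.
        return ipaddress.ip_address(a) == ipaddress.ip_address(b)
    except ValueError:
        # Not both IP literals: fall back to a case-insensitive name comparison.
        return a.lower() == b.lower()


def remote_nodes(nodes, who_am_i, equal):
    """Return the nodes that are not the local node according to equal()."""
    return [n for n in nodes if not equal(n, who_am_i)]


who_am_i = "fd52:a81c:df85:1:e06b:94c5:ef5:a72a"      # hostname as stored in nodeServices
topology = ["[fd52:a81c:df85:1:e06b:94c5:ef5:a72a]"]  # the same node after '[]' was added

# Plain string comparison fails to eliminate the local node, so it is visited again.
naive = remote_nodes(topology, who_am_i, lambda x, y: x == y)
# Normalised comparison recognises the local node and leaves no remote nodes.
fixed = remote_nodes(topology, who_am_i, same_host)
```

            With the naive comparison the local node is also treated as a remote node, which matches the duplicated rows seen in the list output.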

            This also explains the second issue: deleting once from the tasks cache leaves a cancelled task behind, and deleting a second time removes it. The stop, which is driven by a select, ends up calling delete on the key twice and so removes it.
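
            A toy Python model of that two-step delete, under the assumption that the first delete only marks an entry as cancelled and the second removes it. The class `TasksCache` and the function `stop_session` are hypothetical and only mimic the behaviour described in this comment.

```python
class TasksCache:
    """Toy cache: the first delete cancels an entry, the second removes it."""

    def __init__(self):
        self._entries = {}

    def add(self, key, state="scheduled"):
        self._entries[key] = state

    def delete(self, key):
        state = self._entries.get(key)
        if state is None:
            return  # nothing to delete
        if state == "cancelled":
            del self._entries[key]  # second delete removes the entry
        else:
            self._entries[key] = "cancelled"  # first delete only cancels it

    def entries(self):
        return dict(self._entries)


def stop_session(cache, session, node_names):
    """Issue one delete per node name the scan believes it has to visit."""
    for _ in node_names:
        cache.delete(session)


session = "e39c0787-e4cd-4f5e-b0ba-345a1415e2db"

# Local node counted once: the cancelled task (and its results) survives.
once = TasksCache()
once.add(session)
stop_session(once, session, ["host"])

# Local node counted twice (bare and bracketed name): the task is removed.
twice = TasksCache()
twice.add(session)
stop_session(twice, session, ["host", "[host]"])
```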

            I'm preparing a fix to address this and MB-48576 that I'll have ready tomorrow.

             

            Sitaram.Vemulapalli Sitaram Vemulapalli added a comment - - edited

            Check out https://github.com/couchbase/go-couchbase/blob/master/port_map.go#L98

            to see how an IPv6 hostname is resolved there. The same may be needed in the cluster code.
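
            As a general illustration of bracket handling for IPv6 literals in `host:port` strings, here is a minimal Python sketch. It does not reproduce what port_map.go does; `join_host_port` and `split_host_port` are hypothetical helpers that follow the usual convention of bracketing an IPv6 literal before appending the port.

```python
def join_host_port(host: str, port: int) -> str:
    """Build 'host:port', bracketing the host when it contains ':' (IPv6 literal)."""
    if ":" in host and not host.startswith("["):
        return f"[{host}]:{port}"
    return f"{host}:{port}"


def split_host_port(hostport: str):
    """Split 'host:port' or '[ipv6]:port' into (host, port), dropping any brackets."""
    if hostport.startswith("["):
        end = hostport.index("]")  # raises ValueError if the bracket is unclosed
        rest = hostport[end + 1:]
        if not rest.startswith(":"):
            raise ValueError(f"missing port in {hostport!r}")
        return hostport[1:end], int(rest[1:])
    host, sep, port = hostport.rpartition(":")
    if not sep:
        raise ValueError(f"missing port in {hostport!r}")
    return host, int(port)


addr = "fd63:6f75:6368:2078:b4fb:edff:fe2c:9451"
joined = join_host_port(addr, 8091)
host, port = split_host_port(joined)
```

            Splitting and re-joining through one pair of helpers keeps a single canonical spelling for each node, so later comparisons do not depend on whether brackets were added.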

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.1.0-1351 contains query commit ea6b2ca with commit message:
            MB-48563 Better handle IPv6 literal hostnames

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.0.2-6691 contains query commit eb13fc1 with commit message:
            MB-48563 Better handle IPv6 literal hostnames

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.0.2-6692 contains query commit dbc93ce with commit message:
            MB-48563. Fix build failures

            pierre.regazzoni Pierre Regazzoni added a comment -

            Verified on 7.0.2-6692 and 7.1.0-1351


              People

              Assignee:
              pierre.regazzoni Pierre Regazzoni
              Reporter:
              pierre.regazzoni Pierre Regazzoni