Couchbase Server / MB-48030

[BP to 7.0.2] [TLS] Strict mode, see errors in query log for 9101 port for indexer client


Details

    • Untriaged
    • 1
    • Unknown

    Description

      I see these errors in the query log:

       

      t_02, keyspace: users
      2021-08-18T23:02:18.624-07:00 [Info] GsiClient::UpdateUsecjson: using collatejson as data format between indexer and GsiClient
      2021-08-18T23:02:18.624-07:00 [Info] GSIC[default/travel-sample-tenant_agent_02-bookings-1629352938623032499] started ...
      2021-08-18T23:02:18.624-07:00 [Info] Recieve security change notification. encryption=true
      2021-08-18T23:02:18.625-07:00 [Info] Certificate refreshed successfully
      _time=2021-08-18T23:02:18.625-07:00 _level=INFO _msg=n1fty: NewFTSIndexer2, server: http://127.0.0.1:8091, namespace: default, bucket: travel-sample, scope: tenant_agent_02, keyspace: bookings
      2021-08-18T23:02:21.778-07:00 [Error] transport error between 172.23.99.49:46598->172.23.99.49:9101: write tcp 172.23.99.49:46598->172.23.99.49:9101: write: broken pipe
      2021-08-18T23:02:21.778-07:00 [Error] [GsiScanClient:"172.23.99.49:9101"] retriever request transport failed `write tcp 172.23.99.49:46598->172.23.99.49:9101: write: broken pipe`
      2021-08-18T23:02:21.778-07:00 [Info] [Queryport-connpool:172.23.99.49:9101] closing unhealthy connection "172.23.99.49:46598"
      2021-08-18T23:02:21.778-07:00 [Warn] scan failed: requestId retriever queryport 172.23.99.49:9101 inst 13536734719740518051 partition [0]
      2021-08-18T23:02:21.778-07:00 [Warn] Scan failed with error for index 9705858241300905373.  Trying scan again with replica, reqId:retriever :  write tcp 172.23.99.49:46598->172.23.99.49:9101: write: broken pipe from [172.23.99.49:9101] ...
      2021-08-18T23:02:21.778-07:00 [Error] PickRandom: Fail to find indexer for all index partitions. Num partition 1.  Partition with instances 0
      2021-08-18T23:02:21.778-07:00 [Warn] Fail to find indexers to satisfy query request.  Trying scan again for index 9705858241300905373, reqId:retriever :  write tcp 172.23.99.49:46598->172.23.99.49:9101: write: broken pipe from [172.23.99.49:9101] ...
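
      For anyone triaging a larger query log, the failing lines above can be pulled out with a small filter. This is only a sketch: it assumes the `<timestamp> [Level] <message>` layout seen in the excerpt, and `find_port_failures` and `LINE_RE` are hypothetical names, not part of any Couchbase tooling.

```python
import re

# Assumed layout of the indexer-client lines quoted above:
#   <timestamp> [Level] <message>
LINE_RE = re.compile(r"^(?P<ts>\S+) \[(?P<level>\w+)\] (?P<msg>.*)$")


def find_port_failures(lines, port=9101, levels=("Error", "Warn")):
    """Return (timestamp, level, message) for Error/Warn lines that mention :port."""
    needle = re.compile(r":%d\b" % port)
    hits = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        # Skip lines in other formats (e.g. _time=... _level=INFO) and other levels.
        if not m or m.group("level") not in levels:
            continue
        if needle.search(m.group("msg")):
            hits.append((m.group("ts"), m.group("level"), m.group("msg")))
    return hits


# Three lines copied from the excerpt above.
sample = [
    "2021-08-18T23:02:18.625-07:00 [Info] Certificate refreshed successfully",
    "2021-08-18T23:02:21.778-07:00 [Error] transport error between 172.23.99.49:46598->172.23.99.49:9101: write tcp 172.23.99.49:46598->172.23.99.49:9101: write: broken pipe",
    "2021-08-18T23:02:21.778-07:00 [Info] [Queryport-connpool:172.23.99.49:9101] closing unhealthy connection \"172.23.99.49:46598\"",
]
for ts, level, msg in find_port_failures(sample):
    print(ts, level, msg[:40])
```

      Only Error and Warn lines are kept by default, so the Info line about closing the unhealthy connection is skipped even though it mentions the same port.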
      

      Steps to recreate 

      Two-node cluster with default deployment:

      • enable N2N encryption: /opt/couchbase/bin/couchbase-cli node-to-node-encryption -c http://localhost:8091 -u Administrator -p password --enable
      • set TLS strict mode: /opt/couchbase/bin/couchbase-cli setting-security -c http://localhost:8091 -u Administrator -p password --set --cluster-encryption-level strict
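
      The two steps above could be scripted for repeated test runs. The following is a minimal sketch, assuming the same CLI path, endpoint, and credentials as in the steps; `strict_tls_commands` is a hypothetical helper that only builds the argument lists and does not execute anything.

```python
import shlex

CLI = "/opt/couchbase/bin/couchbase-cli"  # path as used in the steps above


def strict_tls_commands(cluster="http://localhost:8091",
                        user="Administrator", password="password"):
    """Build the two couchbase-cli invocations as argv lists (nothing is executed)."""
    auth = ["-c", cluster, "-u", user, "-p", password]
    return [
        # Step 1: enable node-to-node encryption.
        [CLI, "node-to-node-encryption", *auth, "--enable"],
        # Step 2: set the cluster encryption level to strict.
        [CLI, "setting-security", *auth, "--set",
         "--cluster-encryption-level", "strict"],
    ]


for argv in strict_tls_commands():
    print(shlex.join(argv))
```

      On a real node, each list could be passed to `subprocess.run(argv, check=True)` in order; the dry-run print is kept here so the sketch has no side effects.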

      Attachments

        1. node_49.tar.gz
          35.96 MB
        2. node_50.tar.gz
          7.83 MB


          Activity

            OK, we will retest and report back on this issue today. Thanks, Amit Kulkarni and Sitaram Vemulapalli. Pierre Regazzoni, FYA.

            mihir.kamdar Mihir Kamdar (Inactive) added a comment
            amit.kulkarni Amit Kulkarni added a comment - - edited

            Hi Pierre Regazzoni,

            If this gets reproduced, please keep the setup intact and let me know. We want more information than what cbcollect can collect.

            Thanks.


            I have set up a cluster in TLS strict mode with build 7.0.2-6590 and can now run a transaction successfully. I am not sure whether this is the same issue as the one reported by Isha.

            # /opt/couchbase/bin/cbq -u Administrator -p password -e https://localhost:18091 --no-ssl-verify
             
             
             Disabling SSL verification means that cbq will be vulnerable to man-in-the-middle attacks.
             
             
             Connected to : https://localhost:18091/. Type Ctrl-D or \QUIT to exit.
             
             
             Path to history file for the shell : /root/.cbq_history 
            cbq> begin work;
            {
                "requestID": "f2945381-efcb-4499-9e42-b1c7919e343d",
                "signature": "json",
                "results": [
                {
                    "txid": "d679bb19-8088-46ab-9b3e-e655add117f3"
                }
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "1.127179ms",
                    "executionTime": "979.157µs",
                    "resultCount": 1,
                    "resultSize": 62,
                    "serviceLoad": 6,
                    "transactionElapsedTime": "153.263µs",
                    "transactionRemainingTime": "1m59.999812928s"
                }
            }
            cbq> select city,airportname from `travel-sample`.inventory.airport where lower(city) = 'lyon';
            {
                "requestID": "25e50763-757c-443c-9c2f-2f98582dadd8",
                "signature": {
                    "airportname": "json",
                    "city": "json"
                },
                "results": [
                {
                    "airportname": "Saint Exupery",
                    "city": "Lyon"
                },
                {
                    "airportname": "Bron",
                    "city": "Lyon"
                },
                {
                    "airportname": "Lyon Part-Dieu Railway",
                    "city": "Lyon"
                }
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "4.780097192s",
                    "executionTime": "4.779856278s",
                    "resultCount": 3,
                    "resultSize": 210,
                    "serviceLoad": 1,
                    "transactionElapsedTime": "8.199090368s",
                    "transactionRemainingTime": "1m51.800882456s"
                }
            }
            cbq> commit;
            {
                "requestID": "94330d88-03eb-47ce-9c11-0c9f25af5d35",
                "signature": "json",
                "results": [
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "876.589µs",
                    "executionTime": "719.31µs",
                    "resultCount": 0,
                    "resultSize": 0,
                    "serviceLoad": 1,
                    "transactionElapsedTime": "13.850990982s"
                }
            } 

            pierre.regazzoni Pierre Regazzoni added a comment

            Build couchbase-server-7.0.2-6649 contains indexing commit 7f79312 with commit message:
            MB-48030 Retry shutdown with local kvaddrs incase of node rename

            build-team Couchbase Build Team added a comment
            hemant.rajput Hemant Rajput added a comment - - edited

            Validated on 7.0.2-6650

            Steps to reproduce 

            a. Create a 1-node cluster with kv+n1ql+index services. Use 127.0.0.1 as hostname when initialising the cluster

            b. Create a collection test.test_scope_1.test_collection_1 and add 10 documents to the collection

            c. Create index on the collection. Scan the index

            d. Add a new node with kv+n1ql services to the cluster

            e. Enable node-to-node encryption and set TLS strict mode

             

            ./couchbase-cli setting-autofailover -c localhost:8091 -u Administrator -p password --enable-auto-failover=0
            ./couchbase-cli node-to-node-encryption -c localhost:8091 -u Administrator -p password --enable
            ./couchbase-cli setting-security -c https://localhost:18091 -u Administrator -p password --set --cluster-encryption-level strict --no-ssl-verify

            f. Add 10000 more documents to the bucket and initiate a session-consistent scan (scan_consistency=request_plus). Before the fix, the scan would time out.

            hemantrajput@LFC testrunner % curl -v https://172.23.136.160:18093/query/service -d 'scan_consistency=request_plus&statement=select join_yr from test.test_scope_1.test_collection_1 where join_yr is not null' -k -u Administrator:password
            *   Trying 172.23.136.160...
            * TCP_NODELAY set
            * Connected to 172.23.136.160 (172.23.136.160) port 18093 (#0)
            * ALPN, offering h2
            * ALPN, offering http/1.1
            * successfully set certificate verify locations:
            *   CAfile: /etc/ssl/cert.pem
              CApath: none
            * TLSv1.2 (OUT), TLS handshake, Client hello (1):
            * TLSv1.2 (IN), TLS handshake, Server hello (2):
            * TLSv1.2 (IN), TLS handshake, Certificate (11):
            * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
            * TLSv1.2 (IN), TLS handshake, Server finished (14):
            * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
            * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
            * TLSv1.2 (OUT), TLS handshake, Finished (20):
            * TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
            * TLSv1.2 (IN), TLS handshake, Finished (20):
            * SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
            * ALPN, server accepted to use h2
            * Server certificate:
            *  subject: CN=Couchbase Server Node (172.23.136.160)
            *  start date: Jan  1 00:00:00 2013 GMT
            *  expire date: Dec 31 23:59:59 2049 GMT
            *  issuer: CN=Couchbase Server 9352843c
            *  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
            * Using HTTP2, server supports multi-use
            * Connection state changed (HTTP/2 confirmed)
            * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
            * Server auth using Basic with user 'Administrator'
            * Using Stream ID: 1 (easy handle 0x7fc9c600d600)
            > POST /query/service HTTP/2
            > Host: 172.23.136.160:18093
            > Authorization: Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==
            > User-Agent: curl/7.64.1
            > Accept: */*
            > Content-Length: 121
            > Content-Type: application/x-www-form-urlencoded
            * Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
            * We are completely uploaded and fine
            < HTTP/2 200 
            < content-type: application/json; version=7.0.1-N1QL
            < content-length: 432
            < date: Wed, 08 Sep 2021 06:25:50 GMT
            {
            "requestID": "6d25eeff-c4b0-49e4-b5e5-15d72e061529",
            "signature": {"join_yr":"json"},
            "results": [
            {"join_yr":2010},
            {"join_yr":2010},
            {"join_yr":2010},
            {"join_yr":2010},
            {"join_yr":2011},
            {"join_yr":2011},
            {"join_yr":2011},
            {"join_yr":2011},
            {"join_yr":2011},
            {"join_yr":2011}
            ],
            "status": "success",
            "metrics": {"elapsedTime": "20.496672ms","executionTime": "20.413537ms","resultCount": 10,"resultSize": 160,"serviceLoad": 1}
            }
            * Connection #0 to host 172.23.136.160 left intact
            * Closing connection 0
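
            The same request_plus scan could also be issued from a test script instead of curl. This is a rough sketch using only the Python standard library; `build_request_plus_query` and `insecure_context` are hypothetical helper names, and the host, port, statement, and credentials are the ones from the curl call above.

```python
import base64
import ssl
import urllib.parse
import urllib.request


def build_request_plus_query(host, statement, user, password, port=18093):
    """Build (but do not send) a POST to /query/service with request_plus consistency."""
    body = urllib.parse.urlencode({
        "scan_consistency": "request_plus",
        "statement": statement,
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(
        f"https://{host}:{port}/query/service", data=body, method="POST")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


def insecure_context():
    """TLS context that skips verification, mirroring curl -k. Test clusters only."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


req = build_request_plus_query(
    "172.23.136.160",
    "select join_yr from test.test_scope_1.test_collection_1 where join_yr is not null",
    "Administrator", "password")
print(req.get_method(), req.full_url)
# To send against a live test cluster:
#   urllib.request.urlopen(req, context=insecure_context())
```

            Building the request separately from sending it keeps the sketch runnable without a cluster. A caller checking for the pre-fix behaviour would presumably add a timeout to `urlopen` and treat expiry as the failure case.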

             


            People

              hemant.rajput Hemant Rajput
              isha Isha Kandaswamy