Couchbase Server MB-45700: [TXN] Transaction fetch error on cluster with node-to-node encryption

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version: Cheshire-Cat
    • Fix Version: 7.0.0
    • Component: query
    • Environment: 2 node cluster: kv:n1ql:index:fts-kv:n1ql:index
      setup with node-to-node-encryption
      Enterprise Edition 7.0.0 build 4907

    Description

      A simple UPDATE inside a transaction fails with a fetch error, as follows:

      cbq> start transaction;
      {
          "requestID": "79c20840-dd51-4ee6-9a52-29bf19b05f9c",
          "signature": "json",
          "results": [
          {
              "txid": "6283ff75-f247-422b-898a-cc1e3661015a"
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "3.337586ms",
              "executionTime": "3.122005ms",
              "resultCount": 1,
              "resultSize": 62,
              "serviceLoad": 6,
              "transactionElapsedTime": "61.767µs",
              "transactionRemainingTime": "1m59.99992038s"
          }
      }
      cbq> update `travel-sample` set city = 'lyon' where city = 'Lyon';
      {
          "requestID": "6dcb271d-4c7c-4b07-888b-0e173c5cc587",
          "signature": null,
          "results": [
          ],
          "errors": [
              {
                  "cause": {
                      "cause": {
                          "-": {
                              "InnerError": {
                                  "InnerError": {},
                                  "Message": "unambiguous timeout"
                              }
                          },
                          "i": "0x0",
                          "s": "LookupIn",
                          "t": 2500284
                      },
                      "raise": "failed",
                      "retry": true,
                      "rollback": true
                  },
                  "code": 17017,
                  "msg": "Transaction fetch error"
              }
          ],
          "status": "errors",
          "metrics": {
              "elapsedTime": "2.51043955s",
              "executionTime": "2.510160114s",
              "resultCount": 0,
              "resultSize": 0,
              "serviceLoad": 1,
              "transactionElapsedTime": "6.655637331s",
              "transactionRemainingTime": "1m53.344262311s",
              "errorCount": 1
          }
      } 
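
When triaging a response like the one above, the useful detail sits several levels down inside the nested "cause" object. The following is a minimal sketch of pulling it out, not part of the original report. The helper names `find_messages` and `summarize` are hypothetical, and the field names are taken only from the error object shown above; their wider semantics are not assumed.

```python
import json

# Copy of the error entry from the failing UPDATE above.
ERROR_JSON = """
{
    "cause": {
        "cause": {
            "-": {
                "InnerError": {
                    "InnerError": {},
                    "Message": "unambiguous timeout"
                }
            },
            "i": "0x0",
            "s": "LookupIn",
            "t": 2500284
        },
        "raise": "failed",
        "retry": true,
        "rollback": true
    },
    "code": 17017,
    "msg": "Transaction fetch error"
}
"""


def find_messages(node):
    """Depth-first search for every non-empty "Message" string in a nested error."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Message" and isinstance(value, str) and value:
                found.append(value)
            else:
                found.extend(find_messages(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_messages(item))
    return found


def summarize(error):
    """Return (code, msg, nested messages, value of the inner "s" field)."""
    cause = error.get("cause", {})
    inner_s = cause.get("cause", {}).get("s")  # "LookupIn" in the report above
    return error.get("code"), error.get("msg"), find_messages(cause), inner_s


if __name__ == "__main__":
    print(summarize(json.loads(ERROR_JSON)))
```

For this report the summary pairs code 17017 with the inner "unambiguous timeout" message and the `LookupIn` value, which points at the KV lookup rather than at the UPDATE statement itself.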

      The query log also shows:

      _time=2021-04-15T14:21:53.386-07:00 _level=WARN _msg=(GOCBCORE) CCCPPOLL: Failed to retrieve CCCP config. ambiguous timeout 

       

      For comparison, on the same cluster the same UPDATE statement succeeds outside a transaction:

      cbq> update `travel-sample` set city = 'lyon' where city = 'Lyon';
      {
          "requestID": "ef292b1c-c14e-4bde-83fd-830bcd894bb2",
          "signature": null,
          "results": [
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "26.376296ms",
              "executionTime": "26.081391ms",
              "resultCount": 0,
              "resultSize": 0,
              "serviceLoad": 6,
              "mutationCount": 6
          }
      } 

      On another cluster with the same configuration but without encryption, the transaction succeeds.

          Activity

            Sitaram Vemulapalli added a comment (edited)

            Pierre Regazzoni, can you try the same with a single-node cluster?

            The certificate is valid for 172.23.104.90, while 127.0.0.1 is the loopback address.

            Also, please add instructions on how to set up encryption.

            _time=2021-04-15T12:18:04.419-07:00 _level=INFO _msg=(GOCBCORE) SDK Version: gocbcore/v9.1.3 
            _time=2021-04-15T12:18:04.419-07:00 _level=INFO _msg=(GOCBCORE) Creating new agent: {MemdAddrs:[] HTTPAddrs:[<sd>127.0.0.1:18091<sd>] BucketName:<md>travel-sample<md> UserAgent:travel-sample UseTLS:true NetworkType: Auth:0x3b24698 TLSRootCAProvider:0x1f096f0 UseMutationTokens:false UseCompression:false UseDurations:false DisableDecompression:false UseOutOfOrderResponses:false DisableXErrors:false DisableJSONHello:false DisableSyncReplicationHello:false UseCollections:true CompressionMinSize:0 CompressionMinRatio:0 HTTPRedialPeriod:0s HTTPRetryDelay:0s CccpMaxWait:0s CccpPollPeriod:0s ConnectTimeout:10s KVConnectTimeout:7s KvPoolSize:8 MaxQueueSize:32768 HTTPMaxIdleConns:0 HTTPMaxIdleConnsPerHost:0 HTTPIdleConnectionTimeout:0s Tracer:<nil> NoRootTraceSpans:false DefaultRetryStrategy:0xc004a10f38 CircuitBreakerConfig:{Enabled:false VolumeThreshold:0 ErrorThresholdPercentage:0 SleepWindow:0s RollingWindow:0s CompletionCallback:<nil> CanaryTimeout:0s} UseZombieLogger:false ZombieLoggerInterval:0s ZombieLoggerSampleSize:0 AuthMechanisms:[]} 
            _time=2021-04-15T12:18:04.419-07:00 _level=INFO _msg=(GOCBCORE) CCCPPOLL: No nodes available to poll, return upstream 
            _time=2021-04-15T12:18:04.431-07:00 _level=WARN _msg=(GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.23.104.90, not 127.0.0.1 
            _time=2021-04-15T12:18:14.441-07:00 _level=WARN _msg=(GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.23.104.90, not 127.0.0.1 
            _time=2021-04-15T12:18:24.449-07:00 _level=WARN _msg=(GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.23.104.90, not 127.0.0.1 
            _time=2021-04-15T12:18:34.460-07:00 _level=WARN _msg=(GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.23.104.90, not 127.0.0.1 
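
The WARN lines above carry both the address that was dialed and the addresses the certificate covers. As a sketch for scanning a query log for this pattern (not something from the original thread), the snippet below extracts both values and flags a loopback dial. The helper names `parse_mismatch` and `is_loopback` are hypothetical, and the regular expression assumes the message wording stays exactly as in these log lines.

```python
import ipaddress
import re

# Matches the x509 hostname-mismatch text seen in the WARN lines above.
MISMATCH = re.compile(
    r"Get (?P<url>\S+): x509: certificate is valid for (?P<sans>.+?), not (?P<host>\S+)"
)


def parse_mismatch(line):
    """Return (dialed_host, [names on the certificate]) or None if no mismatch."""
    m = MISMATCH.search(line)
    if m is None:
        return None
    sans = [s.strip() for s in m.group("sans").split(",")]
    return m.group("host"), sans


def is_loopback(host):
    """True when host is a literal loopback IP such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:  # not an IP literal, for example a DNS name
        return False


# One of the log lines from the comment above.
LINE = (
    "_time=2021-04-15T12:18:04.431-07:00 _level=WARN _msg=(GOCBCORE) "
    "Failed to connect to host. "
    "Get https://127.0.0.1:18091/pools/default/bs/travel-sample: "
    "x509: certificate is valid for 172.23.104.90, not 127.0.0.1 "
)

if __name__ == "__main__":
    host, sans = parse_mismatch(LINE)
    print(f"dialed {host} (loopback={is_loopback(host)}), certificate names: {sans}")
```

A loopback dial combined with a certificate that lists only the node's configured IP is the combination reported in this ticket.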
             
            
            

            Pierre Regazzoni added a comment (edited)

            FYI, the logs have been updated, since I reconfigured the 2-node cluster with IP addresses to address the warning above.

            I also tried a single node but still see the same issue.

            To set up node-to-node encryption, see https://docs.couchbase.com/server/7.0/manage/manage-nodes/apply-node-to-node-encryption.html

            Pierre Regazzoni added a comment

            If a single node is configured with 127.0.0.1 as its IP, the transaction works:

            cbq> begin transaction;
            {
                "requestID": "d9bc7b0d-0c79-4653-952c-d7f104d0fd87",
                "signature": "json",
                "results": [
                {
                    "txid": "483e33ec-ee52-4d3a-917b-61665facad99"
                }
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "2.927446ms",
                    "executionTime": "2.703346ms",
                    "resultCount": 1,
                    "resultSize": 62,
                    "serviceLoad": 6,
                    "transactionElapsedTime": "69.419µs",
                    "transactionRemainingTime": "1m59.999912745s"
                }
            }
            cbq> update `travel-sample` set city = 'lyon' where city = 'Lyon';
            {
                "requestID": "6f7e8f8b-67b4-4039-ac28-d7825bb64017",
                "signature": null,
                "results": [
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "159.524463ms",
                    "executionTime": "159.288917ms",
                    "resultCount": 0,
                    "resultSize": 0,
                    "serviceLoad": 1,
                    "mutationCount": 6,
                    "transactionElapsedTime": "7.005713852s",
                    "transactionRemainingTime": "1m52.994261426s"
                }
            }
            cbq> commit;
            {
                "requestID": "20e68c02-b0fc-4717-ae39-b285c684bada",
                "signature": "json",
                "results": [
                ],
                "status": "success",
                "metrics": {
                    "elapsedTime": "24.174113ms",
                    "executionTime": "23.918694ms",
                    "resultCount": 0,
                    "resultSize": 0,
                    "serviceLoad": 1,
                    "transactionElapsedTime": "13.421495071s"
                }
            }

            Sitaram Vemulapalli added a comment (edited)

            Brett Lawson, if the cluster is a single node configured with 127.0.0.1 as its IP, the transaction works.
            A single node configured with its actual IP, or a multi-node cluster, fails to connect, because the connection is made to 127.0.0.1.

            _time=2021-04-15T12:18:14.441-07:00 _level=WARN _msg=(GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.23.104.90, not 127.0.0.1

            Is this an issue with gocbcore, or with the way the certificate is set up by ns_server?

            Mihir Kamdar (Inactive) added a comment

            Seeing this in the longevity system test as well. All N1QL transactions are failing due to this issue.

            Couchbase Build Team added a comment

            Build couchbase-server-7.0.0-4988 contains query commit 8ffce70 with commit message:
            MB-45700. Handle SSL host, custom port, alternative addresses.

            Pierre Regazzoni added a comment

            Verified with 7.0.0-4988:

            • node-to-node encryption with explicit IP
            • custom ports: "mgmt": 8099, "mgmtSSL": 18099

            With the custom port:

            couchba+  3969  3964  0 16:41 ?        00:00:15 /opt/couchbase/bin/cbq-engine --datastore=http://127.0.0.1:8099 --http=:8093 --configstore=http://127.0.0.1:8099 --enterprise=true --ipv4=required --ipv6=optional --https=:18093 --certfile=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem --keyfile=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem

            I still need to run with AWS and internal/external IPs before closing this issue.

            People

              Pierre Regazzoni
