Couchbase Server / MB-30700

Introduce timeout in all code paths using projector-memcached connection


Details

    Description

      There have been multiple occurrences of the projector/indexer waiting on a response from memcached that never came, e.g. MB-29982.

      It would be good to add a timeout and retry in every code path where the projector/indexer waits for a memcached response.

          Activity

            Build couchbase-server-6.5.0-2789 contains indexing commit 35d8b4b with commit message:
            MB-30700: Introduce timeout in all indexing to memcached receive paths

            build-team Couchbase Build Team added a comment

            Build couchbase-server-6.5.0-2957 contains indexing commit c21d011 with commit message:
            MB-30700: Introduce timeout in memcached Transmit codepaths

            build-team Couchbase Build Team added a comment

            Did not face any issues in the GSI system test with 6.5.0-4471.

            Also tried the following:

            1) Created a cluster with data and query on one node and the indexer on another node.
            2) Created a bucket named default and a primary index on it.
            3) Started adding documents with mutations.
            4) Issued SIGSTOP to the memcached process.

            In the projector log we see the following timeout messages, but no panics:

            2019-10-10T10:28:02.790-07:00 [Error] feed.DcpGetSeqnos(): read tcp 192.168.10.11:35070->192.168.10.11:11210: i/o timeout
            2019-10-10T10:40:07.916-07:00 [Error] feed.DcpGetSeqnos(): read tcp 192.168.10.11:35166->192.168.10.11:11210: i/o timeout
            2019-10-10T10:42:07.919-07:00 [Error] feed.DcpGetSeqnos(): read tcp 192.168.10.11:35166->192.168.10.11:11210: i/o timeout
            2019-10-10T10:44:17.972-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35236->192.168.10.11:11210: i/o timeout
            2019-10-10T10:48:17.993-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35308->192.168.10.11:11210: i/o timeout
            2019-10-10T10:52:18.026-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35376->192.168.10.11:11210: i/o timeout
            2019-10-10T10:56:18.050-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35446->192.168.10.11:11210: i/o timeout
            2019-10-10T11:00:18.076-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35516->192.168.10.11:11210: i/o timeout
            2019-10-10T11:04:18.097-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35586->192.168.10.11:11210: i/o timeout
            2019-10-10T11:07:47.432-07:00 [Error] DCPT[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-10817794244566739099/1] doReceive(): read tcp 192.168.10.11:35148->192.168.10.11:11210: i/o timeout
            2019-10-10T11:07:47.433-07:00 [Error] DCPT[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-10817794244566739099/2] doReceive(): read tcp 192.168.10.11:35150->192.168.10.11:11210: i/o timeout
            2019-10-10T11:07:47.435-07:00 [Error] DCPT[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-10817794244566739099/0] doReceive(): read tcp 192.168.10.11:35146->192.168.10.11:11210: i/o timeout
            2019-10-10T11:07:47.452-07:00 [Error] DCPT[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-10817794244566739099/3] doReceive(): read tcp 192.168.10.11:35152->192.168.10.11:11210: i/o timeout
            2019-10-10T11:08:18.118-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35656->192.168.10.11:11210: i/o timeout
            2019-10-10T11:11:48.488-07:00 [Error] FEED[<=>MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f(127.0.0.1:8091)] ##6 GetFailoverLogs("default"): read tcp 192.168.10.11:35716->192.168.10.11:11210: i/o timeout
            2019-10-10T11:12:18.209-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35726->192.168.10.11:11210: i/o timeout
            2019-10-10T11:15:51.635-07:00 [Error] DCP[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-1825762437275105685] ##8 DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-1825762437275105685/0 with err read tcp 192.168.10.11:35788->192.168.10.11:11210: i/o timeout
            2019-10-10T11:16:18.234-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35798->192.168.10.11:11210: i/o timeout
            2019-10-10T11:19:57.756-07:00 [Error] DCP[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-15805786367584753342] ##9 DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-15805786367584753342/0 with err read tcp 192.168.10.11:35862->192.168.10.11:11210: i/o timeout
            2019-10-10T11:20:18.264-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35870->192.168.10.11:11210: i/o timeout
            2019-10-10T11:24:03.864-07:00 [Error] DCP[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-13630678435101308338] ##a DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-13630678435101308338/0 with err read tcp 192.168.10.11:35934->192.168.10.11:11210: i/o timeout
            2019-10-10T11:24:18.293-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:35942->192.168.10.11:11210: i/o timeout
            2019-10-10T11:28:09.951-07:00 [Error] DCP[secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-3510394118682867578] ##b DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-default-MAINT_STREAM_TOPIC_412f8c808ada86959015956cf54c5c9f-3510394118682867578/0 with err read tcp 192.168.10.11:36008->192.168.10.11:11210: i/o timeout
            2019-10-10T11:28:18.312-07:00 [Error] StartDcpFeedOver(): read tcp 192.168.10.11:36014->192.168.10.11:11210: i/o timeout

            girish.benakappa Girish Benakappa added a comment

            People

              amit.kulkarni Amit Kulkarni
              amit.kulkarni Amit Kulkarni