Description
Scenario
-------------
- Unidirectional XDCR between two single-node clusters. Load one item onto vb449. One checkpoint is recorded as follows -
Next record on update of the same item -
{"commitopaque":[158975596682994,2],"start_time":"Thu, 08 May 2014 21:06:36 GMT","end_time":"Thu, 08 May 2014 22:51:37 GMT","failover_uuid":137909158430775,"failover_seq":0,"seqno":1,"upr_snapshot_seqno":1,"total_docs_checked":1,"total_docs_written":1,"total_data_replicated":10}
Observations
--------------------
1. The failover_uuid for the first checkpoint record is always 0. Should this not point to the local vb_uuid? Subsequent checkpoint records, however, contain the expected value.
2. Commitopaque shows that the high seqno on the remote end is 1, i.e. both source and destination acknowledge that 1 mutation has been replicated. Yet we see "total_docs_checked":0,"total_docs_written":0,"total_data_replicated":0. These values are always off by one mutation. Why is this so? For example:
{"commitopaque":[158975596682994,6],"start_time":"Thu, 08 May 2014 21:06:36 GMT","end_time":"Thu, 08 May 2014 22:56:25 GMT","failover_uuid":137909158430775,"failover_seq":0,"seqno":5,"upr_snapshot_seqno":5,"total_docs_checked":5,"total_docs_written":5,"total_data_replicated":50}
3. Why is "seqno":0? What exactly should this point to?
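The off-by-one pattern in observation 2 can be checked mechanically. The following is a minimal Python sketch, not part of XDCR itself. It parses the two checkpoint records quoted in this report and compares the second element of commitopaque against the document counters. Reading that element as the remote high seqno follows the interpretation in observation 2 and is an assumption; the helper name `summarize` is made up for illustration.

```python
import json

# Checkpoint records copied verbatim from this report.
RECORDS = [
    '{"commitopaque":[158975596682994,2],"start_time":"Thu, 08 May 2014 21:06:36 GMT","end_time":"Thu, 08 May 2014 22:51:37 GMT","failover_uuid":137909158430775,"failover_seq":0,"seqno":1,"upr_snapshot_seqno":1,"total_docs_checked":1,"total_docs_written":1,"total_data_replicated":10}',
    '{"commitopaque":[158975596682994,6],"start_time":"Thu, 08 May 2014 21:06:36 GMT","end_time":"Thu, 08 May 2014 22:56:25 GMT","failover_uuid":137909158430775,"failover_seq":0,"seqno":5,"upr_snapshot_seqno":5,"total_docs_checked":5,"total_docs_written":5,"total_data_replicated":50}',
]


def summarize(raw):
    """Return the fields relevant to the off-by-one question (hypothetical helper)."""
    rec = json.loads(raw)
    # Assumption: commitopaque is [remote vb uuid, remote high seqno].
    _remote_uuid, remote_seqno = rec["commitopaque"]
    return {
        "remote_seqno": remote_seqno,
        "seqno": rec["seqno"],
        "docs_written": rec["total_docs_written"],
        # Difference between what the remote end reports and what the
        # checkpoint counters claim was written.
        "gap": remote_seqno - rec["total_docs_written"],
    }


summaries = [summarize(r) for r in RECORDS]
for s in summaries:
    print(s)
```

For both records the gap comes out as 1, which matches the "always off by one mutation" behaviour described above.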
Logs (can possibly explain what we are seeing above?)
--------
xdcr.1-[xdcr:info,2014-05-08T15:17:40.578,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep:start_replication:937]Replication `<<"670a9d8fe1d2c38e369630abeb147862/default/default">>` is using:
xdcr.1- 4 worker processes
xdcr.1- a worker batch size of 500
xdcr.1- a worker batch size (KiB) 2048
xdcr.1- 20 HTTP connections
xdcr.1- a connection timeout of 180 seconds
xdcr.1- 2 retries per request
xdcr.1- socket options are: [
xdcr.1:[xdcr:info,2014-05-08T15:17:40.578,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep_ckpt:do_checkpoint_new:90]checkpointing for vb: 449 at 0 <==== Why are we checkpointing when there are no mutations?
xdcr.1-[xdcr:debug,2014-05-08T15:51:37.472,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep:handle_info:193]get start-replication token for vb 449 from throttle (pid: <0.8436.0>)
xdcr.1-[xdcr:info,2014-05-08T15:51:37.514,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep:start_replication:937]Replication `<<"670a9d8fe1d2c38e369630abeb147862/default/default">>` is using:
xdcr.1- 4 worker processes
xdcr.1- a worker batch size of 500
xdcr.1- a worker batch size (KiB) 2048
xdcr.1- 20 HTTP connections
xdcr.1- a connection timeout of 180 seconds
xdcr.1- 2 retries per request
xdcr.1- socket options are: [{keepalive,true},{nodelay,false}]
xdcr.1- source start sequence 1
xdcr.1:[xdcr:info,2014-05-08T15:51:37.514,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep_ckpt:do_checkpoint_new:90]checkpointing for vb: 449 at 1 <==== However here we checkpoint after we replicate
--
xdcr.1-[xdcr:debug,2014-05-08T15:54:07.524,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep:handle_info:193]get start-replication token for vb 449 from throttle (pid: <0.8436.0>)
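To line up checkpoint events with the records above, the checkpoint lines can be pulled out of the log excerpt with a small script. This is a rough Python sketch; the regular expression is written against the two `do_checkpoint_new` lines quoted in this report only, and the names `CKPT_RE` and `checkpoints` are made up for illustration. Other log formats may need a different pattern.

```python
import re

# The two checkpoint log lines quoted in this report (trailing annotations kept).
LOG_LINES = [
    "xdcr.1:[xdcr:info,2014-05-08T15:17:40.578,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep_ckpt:do_checkpoint_new:90]checkpointing for vb: 449 at 0 <==== Why are we checkpointing when there are no mutations?",
    "xdcr.1:[xdcr:info,2014-05-08T15:51:37.514,ns_1@127.0.0.1:<0.9272.0>:xdc_vbucket_rep_ckpt:do_checkpoint_new:90]checkpointing for vb: 449 at 1 <==== However here we checkpoint after we replicate",
]

# Captures the timestamp, vbucket id and checkpointed seqno.
CKPT_RE = re.compile(
    r"\[xdcr:\w+,(?P<ts>[^,]+),.*?do_checkpoint_new:\d+\]"
    r"checkpointing for vb: (?P<vb>\d+) at (?P<seqno>\d+)"
)


def checkpoints(lines):
    """Return (timestamp, vbucket, seqno) for every checkpoint line found."""
    found = []
    for line in lines:
        m = CKPT_RE.search(line)
        if m:
            found.append((m.group("ts"), int(m.group("vb")), int(m.group("seqno"))))
    return found


events = checkpoints(LOG_LINES)
for ts, vb, seqno in events:
    print(f"{ts} vb={vb} checkpointed at seqno {seqno}")
```

Run against the excerpt, this yields one checkpoint at seqno 0 (15:17:40.578) and one at seqno 1 (15:51:37.514), both for vb 449.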