  1. Couchbase Server
  2. MB-27429

[FTS] scorch index contains duplicates


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 5.5.0
    • Fix Version/s: Mad-Hatter
    • Component/s: fts
    • Triage:
      Untriaged
    • Is this a Regression?:
      Unknown

      Description

      While attempting to replicate a hang reported by QE, I tried creating several indexes.  Everything went fine.  Then I tried creating 2 more indexes (on the travel-sample bucket) at the same time.  Once both indexes had finished building, a problem was observed: both reported more docs in the index than exist in the source bucket.  (see screenshot)

        Attachments


          Activity

          mschoch Marty Schoch [X] (Inactive) added a comment -

          We have an index t2 (correct count) and t3 and t4 (incorrect counts).  To start with I compared the scorch doc counts on all the pindexes between t2 and t3.  Since pindex count is determined solely by the hashing, we expect these counts to match.  However, the '13aa53f3' pindex counts did not match:

           

          scorch info t2_46fd31cc0f1c31b6_13aa53f3.pindex/store
          doc count: 5256
          has 2 snapshot(s), root: 375
          scorch info t3_4eabe84022488ca2_13aa53f3.pindex/store
          doc count: 5824
          has 3 snapshot(s), root: 411
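
The comparison above can be sketched in a few lines of Python. This is only an illustration: the helper name `mismatched_pindexes` and the idea of keying the counts by pindex suffix are assumptions made here, and the two counts are the ones printed by `scorch info` above.

```python
def mismatched_pindexes(counts_a, counts_b):
    """Return {pindex_suffix: (count_a, count_b)} for every shared suffix whose counts differ."""
    shared = counts_a.keys() & counts_b.keys()
    return {
        suffix: (counts_a[suffix], counts_b[suffix])
        for suffix in sorted(shared)
        if counts_a[suffix] != counts_b[suffix]
    }


# Doc counts reported by `scorch info`, keyed by pindex suffix (hypothetical layout).
t2 = {"13aa53f3": 5256}
t3 = {"13aa53f3": 5824}

diff = mismatched_pindexes(t2, t3)
for suffix, (a, b) in diff.items():
    print(f"{suffix}: t2={a} t3={b} surplus={b - a}")
```

For this pindex the surplus in t3 is 568 documents.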
          

          So, now we can focus on understanding this discrepancy.  We start with the assumption that t3 contains some sort of duplicate documents since the overall count is too high, and there is no mechanism for cbft to make up documents.  So, let's look at the scorch file:

          scorch snapshots .
          snapshots:
          411 *** root ***
          410 
          402

          So, we see that the root bolt has 3 snapshots.  Let's look at the root snapshot:

          segments:
          0 - 0000000000e7.zap - {}
          1 - 000000000141.zap - {}
          2 - 000000000142.zap - {}
          3 - 000000000143.zap - {}
          4 - 000000000144.zap - {}
          5 - 000000000145.zap - {}
          6 - 000000000146.zap - {}
          7 - 000000000147.zap - {}
          8 - 000000000148.zap - {}
          9 - 000000000149.zap - {}
          10 - 00000000014a.zap - {}
          11 - 00000000014b.zap - {}
          12 - 00000000014c.zap - {}
          13 - 00000000014d.zap - {}
          14 - 00000000014e.zap - {}
          15 - 00000000014f.zap - {}
          16 - 000000000150.zap - {}
          17 - 000000000151.zap - {}
          18 - 000000000152.zap - {}
          19 - 000000000153.zap - {}
          20 - 000000000154.zap - {}
          21 - 000000000155.zap - {}
          22 - 000000000158.zap - {}
          23 - 000000000159.zap - {}
          24 - 00000000015a.zap - {}
          25 - 00000000015b.zap - {}
          26 - 00000000015c.zap - {}
          27 - 00000000015d.zap - {}
          28 - 00000000015e.zap - {}
          29 - 00000000015f.zap - <nil>
          30 - 000000000163.zap - {}
          31 - 000000000165.zap - {}
          32 - 000000000166.zap - {}
          33 - 000000000167.zap - {}
          34 - 000000000168.zap - {}
          35 - 000000000169.zap - {}
          36 - 00000000016a.zap - {}
          37 - 00000000016b.zap - {}
          asciified:
          411 0: 22 / 22 - 
          411 1: 28 / 28 - 
          411 2: 28 / 28 - 
          411 3: 28 / 28 - 
          411 4: 29 / 29 - 
          411 5: 29 / 29 - 
          411 6: 29 / 29 - 
          411 7: 29 / 29 - 
          411 8: 30 / 30 - 
          411 9: 29 / 29 - 
          411 10: 29 / 29 - 
          411 11: 29 / 29 - 
          411 12: 30 / 30 - 
          411 13: 30 / 30 - 
          411 14: 30 / 30 - 
          411 15: 30 / 30 - 
          411 16: 30 / 30 - 
          411 17: 31 / 31 - 
          411 18: 32 / 32 - 
          411 19: 31 / 31 - 
          411 20: 34 / 34 - 
          411 21: 34 / 34 - 
          411 22: 36 / 36 - 
          411 23: 36 / 36 - 
          411 24: 35 / 35 - 
          411 25: 36 / 36 - 
          411 26: 35 / 35 - 
          411 27: 37 / 37 - 
          411 28: 38 / 38 - 
          411 29: 38 / 38 - 
          411 30: 309 / 309 - ...
          411 31: 309 / 309 - ...
          411 32: 321 / 321 - ...
          411 33: 330 / 330 - ...
          411 34: 336 / 336 - ...
          411 35: 346 / 346 - ...
          411 36: 376 / 376 - ...
          411 37: 2555 / 2555 - .........................
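
As a sanity check, the per-segment numbers can be added up and compared against the doc count that `scorch info` reported. The sketch below assumes the first number on each "asciified" line is that segment's doc count; that reading is an interpretation of the output, not something the tool output states.

```python
# Per-segment counts copied from the "asciified" lines for snapshot 411 (segments 0..37).
segment_docs = [
    22, 28, 28, 28, 29, 29, 29, 29, 30, 29,
    29, 29, 30, 30, 30, 30, 30, 31, 32, 31,
    34, 34, 36, 36, 35, 36, 35, 37, 38, 38,
    309, 309, 321, 330, 336, 346, 376, 2555,
]

reported_doc_count = 5824  # `scorch info` on the t3 pindex
expected_doc_count = 5256  # `scorch info` on the matching t2 pindex

total = sum(segment_docs)
print(f"segments={len(segment_docs)} total={total}")
print(f"matches reported count: {total == reported_doc_count}")
print(f"surplus over t2: {total - expected_doc_count}")
```

Under that assumption the segments account for every document in the inflated count, which is consistent with the extra documents living in the segments themselves rather than in the count bookkeeping.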

          Next, we want to know more about the contents of these segments.  I extracted the filenames in this snapshot:

          0000000000e7.zap 
          000000000141.zap 
          000000000142.zap 
          000000000143.zap 
          000000000144.zap 
          000000000145.zap 
          000000000146.zap 
          000000000147.zap 
          000000000148.zap 
          000000000149.zap 
          00000000014a.zap 
          00000000014b.zap 
          00000000014c.zap 
          00000000014d.zap 
          00000000014e.zap 
          00000000014f.zap 
          000000000150.zap 
          000000000151.zap 
          000000000152.zap 
          000000000153.zap 
          000000000154.zap 
          000000000155.zap 
          000000000158.zap 
          000000000159.zap 
          00000000015a.zap 
          00000000015b.zap 
          00000000015c.zap 
          00000000015d.zap 
          00000000015e.zap 
          00000000015f.zap
          000000000163.zap 
          000000000165.zap 
          000000000166.zap 
          000000000167.zap 
          000000000168.zap 
          000000000169.zap 
          00000000016a.zap 
          00000000016b.zap

          I put this list in a file, /tmp/files.  Now we want to run a zap command to list the term dictionary for the _id field, once per file, so we use xargs to run the zap command on all 38 segments:

          cat /tmp/files | xargs -n 1 -J % zap dict % _id > /tmp/output

          The output contains a few lines we're not interested in, so we grep those out of the way, then sort the file and pass it to the uniq command to report any duplicate rows:

          grep -v "dictionary" /tmp/output| grep -v "vellum" |sort | uniq -d

          airline_1795 - 8764 (223c)
          airline_18529 - 8808 (2268)
          airport_1297 - 8852 (2294)
          airport_1307 - 8896 (22c0)
          airport_3527 - 8940 (22ec)
          airport_3723 - 8984 (2318)
          airport_3822 - 9028 (2344)
          airport_4026 - 9072 (2370)
          airport_4216 - 9116 (239c)
          airport_4338 - 9160 (23c8)
          airport_4386 - 9204 (23f4)
          airport_5959 - 9248 (2420)
          airport_7089 - 9292 (244c)
          airport_7119 - 9336 (2478)
          airport_7207 - 9380 (24a4)
          airport_8563 - 9424 (24d0)
          airport_9352 - 9468 (24fc)
          hotel_12238 - 9512 (2528)
          hotel_12243 - 9556 (2554)
          hotel_21652 - 9600 (2580)
          hotel_25594 - 9644 (25ac)
          hotel_25600 - 9688 (25d8)
          hotel_5847 - 9732 (2604)
          landmark_10134 - 9776 (2630)
          landmark_11780 - 9820 (265c)
          landmark_12343 - 9864 (2688)
          landmark_12355 - 9908 (26b4)
          landmark_15885 - 9952 (26e0)
          landmark_15961 - 9996 (270c)
          landmark_16121 - 10040 (2738)
          landmark_16204 - 10084 (2764)
          landmark_16269 - 10128 (2790)
          landmark_16394 - 10172 (27bc)
          landmark_16576 - 10216 (27e8)
          landmark_16653 - 10260 (2814)
          landmark_20946 - 10304 (2840)
          landmark_21744 - 10348 (286c)
          landmark_21752 - 10392 (2898)
          landmark_22201 - 10436 (28c4)
          landmark_22217 - 10480 (28f0)
          landmark_22565 - 10524 (291c)
          landmark_23548 - 10568 (2948)
          landmark_25530 - 10612 (2974)
          landmark_25615 - 10656 (29a0)
          landmark_25678 - 10700 (29cc)
          landmark_25785 - 10744 (29f8)
          landmark_25823 - 10788 (2a24)
          landmark_25835 - 10832 (2a50)
          landmark_25857 - 10876 (2a7c)
          landmark_25979 - 10920 (2aa8)
          landmark_26063 - 10964 (2ad4)
          landmark_26459 - 11008 (2b00)
          landmark_27765 - 11052 (2b2c)
          landmark_33174 - 11096 (2b58)
          landmark_33344 - 11140 (2b84)
          landmark_3383 - 11184 (2bb0)
          landmark_3561 - 11228 (2bdc)
          landmark_37126 - 11272 (2c08)
          landmark_37322 - 11316 (2c34)
          landmark_3850 - 11360 (2c60)
          landmark_40254 - 11404 (2c8c)
          landmark_4118 - 11448 (2cb8)
          landmark_5327 - 11492 (2ce4)
          landmark_5489 - 11536 (2d10)
          landmark_557 - 11580 (2d3c)
          landmark_7045 - 11624 (2d68)
          landmark_7737 - 11668 (2d94)
          landmark_8548 - 11712 (2dc0)
          landmark_97 - 11756 (2dec)
          route_10060 - 11800 (2e18)
          route_10209 - 11844 (2e44)
          route_10932 - 11888 (2e70)
          route_11020 - 11932 (2e9c)
          route_11054 - 11976 (2ec8)
          route_11608 - 12020 (2ef4)
          route_11726 - 12064 (2f20)
          route_11744 - 12108 (2f4c)
          route_11752 - 12152 (2f78)
          route_11798 - 12196 (2fa4)
          route_11899 - 12240 (2fd0)
          route_11909 - 12284 (2ffc)
          route_11964 - 12328 (3028)
          route_12387 - 12372 (3054)
          route_12429 - 12416 (3080)
          route_12789 - 12460 (30ac)
          route_12860 - 12504 (30d8)
          route_13028 - 12548 (3104)
          route_13164 - 12592 (3130)
          route_13241 - 12636 (315c)
          route_13533 - 12680 (3188)
          route_13616 - 12724 (31b4)
          route_1370 - 12768 (31e0)
          route_13984 - 12812 (320c)
          route_14127 - 12856 (3238)
          route_14131 - 12900 (3264)
          route_14145 - 12944 (3290)
          route_14220 - 12988 (32bc)
          route_14529 - 13032 (32e8)
          route_14552 - 13076 (3314)
          route_14637 - 13120 (3340)
          route_14643 - 13164 (336c)
          route_14655 - 13208 (3398)
          route_14689 - 13252 (33c4)
          route_14719 - 13296 (33f0)
          route_14818 - 13340 (341c)
          route_14875 - 13384 (3448)
          route_15171 - 13428 (3474)
          route_15254 - 13472 (34a0)
          route_15288 - 13516 (34cc)
          route_15526 - 13560 (34f8)
          route_17306 - 13604 (3524)
          route_17462 - 13648 (3550)
          route_17538 - 13692 (357c)
          route_20122 - 13736 (35a8)
          route_20207 - 13780 (35d4)
          route_20397 - 13824 (3600)
          route_20439 - 13868 (362c)
          route_20575 - 13912 (3658)
          route_20650 - 13956 (3684)
          route_20812 - 14000 (36b0)
          route_20866 - 14044 (36dc)
          route_20870 - 14088 (3708)
          route_20982 - 14132 (3734)
          route_21038 - 14176 (3760)
          route_21086 - 14220 (378c)
          route_21116 - 14264 (37b8)
          route_21162 - 14308 (37e4)
          route_21174 - 14352 (3810)
          route_21208 - 14396 (383c)
          route_21273 - 14440 (3868)
          route_21398 - 14484 (3894)
          route_21491 - 14528 (38c0)
          route_21501 - 14572 (38ec)
          route_21606 - 14616 (3918)
          route_21610 - 14660 (3944)
          route_21664 - 14704 (3970)
          route_21780 - 14748 (399c)
          route_21796 - 14792 (39c8)
          route_21826 - 14836 (39f4)
          route_23026 - 14880 (3a20)
          route_23788 - 14924 (3a4c)
          route_24197 - 14968 (3a78)
          route_24289 - 15012 (3aa4)
          route_24319 - 15056 (3ad0)
          route_24362 - 15100 (3afc)
          route_24410 - 15144 (3b28)
          route_24937 - 15188 (3b54)
          route_25033 - 15232 (3b80)
          route_25286 - 15276 (3bac)
          route_25316 - 15320 (3bd8)
          route_25464 - 15364 (3c04)
          route_25528 - 15408 (3c30)
          route_2623 - 15452 (3c5c)
          route_2641 - 15496 (3c88)
          route_2657 - 15540 (3cb4)
          route_27138 - 15584 (3ce0)
          route_27298 - 15628 (3d0c)
          route_27554 - 15672 (3d38)
          route_28062 - 15716 (3d64)
          route_28882 - 15760 (3d90)
          route_28912 - 15804 (3dbc)
          route_28969 - 15848 (3de8)
          route_29016 - 15892 (3e14)
          route_29186 - 15936 (3e40)
          route_30201 - 15980 (3e6c)
          route_30391 - 16024 (3e98)
          route_30498 - 16068 (3ec4)
          route_30508 - 16112 (3ef0)
          route_30789 - 16156 (3f1c)
          route_32478 - 16200 (3f48)
          route_32585 - 16244 (3f74)
          route_32611 - 16288 (3fa0)
          route_3617 - 16332 (3fcc)
          route_36473 - 16376 (3ff8)
          route_36529 - 16420 (4024)
          route_36638 - 16467 (4053)
          route_36643 - 16514 (4082)
          route_37690 - 16561 (40b1)
          route_38318 - 16608 (40e0)
          route_39323 - 16655 (410f)
          route_39451 - 16702 (413e)
          route_39655 - 16749 (416d)
          route_40687 - 16796 (419c)
          route_43121 - 16843 (41cb)
          route_43143 - 16890 (41fa)
          route_43155 - 16937 (4229)
          route_43536 - 16984 (4258)
          route_4513 - 17031 (4287)
          route_4567 - 17078 (42b6)
          route_4571 - 17125 (42e5)
          route_45796 - 17172 (4314)
          route_458 - 17219 (4343)
          route_4676 - 17266 (4372)
          route_46900 - 17313 (43a1)
          route_47956 - 17360 (43d0)
          route_48081 - 17407 (43ff)
          route_48107 - 17454 (442e)
          route_4874 - 17501 (445d)
          route_49338 - 17548 (448c)
          route_4938 - 17595 (44bb)
          route_49815 - 17642 (44ea)
          route_49985 - 17689 (4519)
          route_50626 - 17736 (4548)
          route_5170 - 17783 (4577)
          route_51765 - 17830 (45a6)
          route_51850 - 17877 (45d5)
          route_5238 - 17924 (4604)
          route_5255 - 17971 (4633)
          route_52988 - 18018 (4662)
          route_53171 - 18065 (4691)
          route_53239 - 18112 (46c0)
          route_53375 - 18159 (46ef)
          route_53407 - 18206 (471e)
          route_53597 - 18253 (474d)
          route_53603 - 18300 (477c)
          route_54388 - 18347 (47ab)
          route_54444 - 18394 (47da)
          route_54498 - 18441 (4809)
          route_54508 - 18488 (4838)
          route_54674 - 18535 (4867)
          route_54941 - 18582 (4896)
          route_55045 - 18629 (48c5)
          route_55099 - 18676 (48f4)
          route_55109 - 18723 (4923)
          route_55241 - 18770 (4952)
          route_5527 - 18817 (4981)
          route_55802 - 18864 (49b0)
          route_55879 - 18911 (49df)
          route_55992 - 18958 (4a0e)
          route_5602 - 19005 (4a3d)
          route_56060 - 19052 (4a6c)
          route_56264 - 19099 (4a9b)
          route_56328 - 19146 (4aca)
          route_56486 - 19193 (4af9)
          route_56516 - 19240 (4b28)
          route_56682 - 19287 (4b57)
          route_56712 - 19334 (4b86)
          route_56813 - 19381 (4bb5)
          route_56949 - 19428 (4be4)
          route_56983 - 19475 (4c13)
          route_57039 - 19522 (4c42)
          route_57087 - 19569 (4c71)
          route_57117 - 19616 (4ca0)
          route_57327 - 19663 (4ccf)
          route_57455 - 19710 (4cfe)
          route_57665 - 19757 (4d2d)
          route_57729 - 19804 (4d5c)
          route_57899 - 19851 (4d8b)
          route_57909 - 19898 (4dba)
          route_5792 - 19945 (4de9)
          route_5800 - 19992 (4e18)
          route_58647 - 20039 (4e47)
          route_58868 - 20086 (4e76)
          route_58946 - 20133 (4ea5)
          route_59042 - 20180 (4ed4)
          route_59088 - 20227 (4f03)
          route_59118 - 20274 (4f32)
          route_59209 - 20321 (4f61)
          route_59272 - 20368 (4f90)
          route_59399 - 20415 (4fbf)
          route_59490 - 20462 (4fee)
          route_59500 - 20509 (501d)
          route_59730 - 20556 (504c)
          route_59831 - 20603 (507b)
          route_5990 - 20650 (50aa)
          route_60016 - 20697 (50d9)
          route_60186 - 20744 (5108)
          route_6019 - 20791 (5137)
          route_60212 - 20838 (5166)
          route_60382 - 20885 (5195)
          route_6074 - 20932 (51c4)
          route_6189 - 20979 (51f3)
          route_62008 - 21026 (5222)
          route_62144 - 21073 (5251)
          route_62937 - 21120 (5280)
          route_6351 - 21167 (52af)
          route_63545 - 21214 (52de)
          route_63599 - 21261 (530d)
          route_63741 - 21308 (533c)
          route_63840 - 21355 (536b)
          route_64044 - 21402 (539a)
          route_6423 - 21449 (53c9)
          route_64274 - 21496 (53f8)
          route_64338 - 21543 (5427)
          route_64428 - 21590 (5456)
          route_64496 - 21637 (5485)
          route_64506 - 21684 (54b4)
          route_6696 - 21731 (54e3)
          route_6706 - 21778 (5512)
          route_6926 - 21825 (5541)
          route_6930 - 21872 (5570)
          route_6944 - 21919 (559f)
          route_8981 - 21966 (55ce)
          route_9085 - 22013 (55fd)
          route_9115 - 22060 (562c)
          route_9161 - 22107 (565b)
          route_9177 - 22154 (568a)
          route_9270 - 22201 (56b9)
          route_9492 - 22248 (56e8)
          route_9502 - 22295 (5717)
          route_9579 - 22342 (5746)
          route_9605 - 22389 (5775)
          route_9613 - 22436 (57a4)
          route_9667 - 22483 (57d3)
          route_9749 - 22530 (5802)
          route_9783 - 22577 (5831)
          route_9795 - 22624 (5860)
          route_9825 - 22671 (588f)
          route_9848 - 22718 (58be)

          This is the list of IDs that appear in more than one segment in the snapshot.  This should never happen, and it implies we have some sort of race condition.  Most likely it has to do with the introducer/merger racing and not compensating correctly.
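
A variant of the duplicate check that also records which segments hold each ID can be sketched in Python. It is a minimal sketch, not part of the zap tooling: the function names are made up here, the line format `term - value (hex)` is taken from the output above, and the third segment in the sample data is hypothetical.

```python
from collections import defaultdict


def parse_dict_line(line):
    """Return the term from a 'term - value (hex)' line, or None for other lines."""
    if "dictionary" in line or "vellum" in line:
        return None  # same lines the grep -v filters drop
    term, sep, _rest = line.strip().partition(" - ")
    return term if sep and term else None


def ids_in_multiple_segments(dict_output_by_segment):
    """Map each _id seen in more than one segment to the sorted segments holding it."""
    segments_by_id = defaultdict(set)
    for segment, output in dict_output_by_segment.items():
        for line in output.splitlines():
            term = parse_dict_line(line)
            if term is not None:
                segments_by_id[term].add(segment)
    return {t: sorted(s) for t, s in segments_by_id.items() if len(s) > 1}


outputs = {
    "000000000163.zap": "airline_1795 - 8764 (223c)\n",
    "000000000165.zap": "airline_1795 - 8764 (223c)\n",
    "hypothetical.zap": "example_only_1 - 1 (1)\n",  # made-up segment for contrast
}

dupes = ids_in_multiple_segments(outputs)
for doc_id, segments in dupes.items():
    print(doc_id, segments)
```

Because `uniq -d` compares whole lines, the shell pipeline only reports an ID when the value after it is identical too; keying on the ID alone would also catch an ID that shows up in two segments with different values.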

           

          mschoch Marty Schoch [X] (Inactive) added a comment -

          Manually checking these files confirms this:

          ➜ store git:(dbca22f) zap dict 000000000163.zap _id | grep airline_1795
          airline_1795 - 8764 (223c)
          ➜ store git:(dbca22f) zap dict 000000000165.zap _id | grep airline_1795
          airline_1795 - 8764 (223c)

          mschoch Marty Schoch [X] (Inactive) added a comment -

          Oddly, 163 and 165 contain the same number of docs, which makes it unlikely that the duplication arose from a merge (there are no deletions or mutations going on, so merging would presumably only happen to make a larger segment).

          ➜ store git:(dbca22f) zap footer 000000000163.zap 
          Length: 2974929
          CRC: 0xab619c7b
          Version: 1
          Chunk Factor: 1024
          Fields Idx: 2974381 (0x2d62ad)
          Stored Idx: 6273 (0x1881)
          Num Docs: 309
          ➜ store git:(dbca22f) zap footer 000000000165.zap 
          Length: 2983164
          CRC: 0x44f277ee
          Version: 1
          Chunk Factor: 1024
          Fields Idx: 2982616 (0x2d82d8)
          Stored Idx: 6273 (0x1881)
          Num Docs: 309

          Interestingly, the files are different lengths even though the doc count is the same and the stored fields index is at the same offset (implying that the stored field data took the same amount of space, since the index is written after the stored field data).
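
The footer numbers can be compared directly. The sketch below parses the two `zap footer` listings quoted above and computes the differences. It assumes `Fields Idx` and `Stored Idx` are byte offsets into the file, which the comment implies but does not state; `parse_footer` is a hypothetical helper.

```python
def parse_footer(text):
    """Parse `Key: value` lines, keeping the leading integer of each value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not value.strip():
            continue
        # int(..., 0) accepts both decimal ("309") and hex ("0xab619c7b").
        fields[key.strip()] = int(value.split()[0], 0)
    return fields


FOOTER_163 = """Length: 2974929
CRC: 0xab619c7b
Version: 1
Chunk Factor: 1024
Fields Idx: 2974381 (0x2d62ad)
Stored Idx: 6273 (0x1881)
Num Docs: 309
"""

FOOTER_165 = """Length: 2983164
CRC: 0x44f277ee
Version: 1
Chunk Factor: 1024
Fields Idx: 2982616 (0x2d82d8)
Stored Idx: 6273 (0x1881)
Num Docs: 309
"""

a = parse_footer(FOOTER_163)
b = parse_footer(FOOTER_165)

length_delta = b["Length"] - a["Length"]          # overall size difference
fields_delta = b["Fields Idx"] - a["Fields Idx"]  # shift of the fields index
tail_a = a["Length"] - a["Fields Idx"]            # bytes from fields index to EOF
tail_b = b["Length"] - b["Fields Idx"]

print(length_delta, fields_delta, tail_a, tail_b)  # 8235 8235 548 548
```

Under that assumption, both files have the same 548 bytes from the fields index to the end and the same stored index offset, so the whole 8235-byte difference sits between offset 6273 and the fields index.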

          mschoch Marty Schoch [X] (Inactive) added a comment -

          The fact that the files contain almost identical data but are not byte-for-byte identical suggests that one was written out by the persister and the other by the merger (they write data out in a different order).

          mschoch Marty Schoch [X] (Inactive) added a comment -

          ... though if that's the case, it might also point to some other bug in the merge implementation, since the file sizes are so different.

          steve Steve Yen added a comment -

          Had some chat with Marty Schoch [X] on adding unit tests around this, so some notes...

          ...for example, perhaps a unit test: persisting some data and then merging it with nothing ought to result in equivalent segments (sometimes in DCP we have snapshots with no mutations, just internal key/val changes for DCP metadata)
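
A rough sketch of the property such a test could assert, using a toy in-memory model rather than the real scorch or zap code. Here a segment is just a dict from `_id` to document, and `merge_segments` and `assert_no_duplicate_ids` are hypothetical stand-ins for the real merger and a snapshot invariant check.

```python
def merge_segments(segments, deleted=frozenset()):
    """Toy merge: combine segments (dicts of _id -> doc), dropping deleted ids."""
    merged = {}
    for segment in segments:
        for doc_id, doc in segment.items():
            if doc_id in deleted:
                continue
            if doc_id in merged:
                raise ValueError(f"duplicate _id across segments: {doc_id}")
            merged[doc_id] = doc
    return merged


def assert_no_duplicate_ids(snapshot):
    """Invariant: an _id may appear in at most one segment of a snapshot."""
    seen = set()
    for segment in snapshot:
        overlap = seen & segment.keys()
        assert not overlap, f"ids in more than one segment: {sorted(overlap)}"
        seen |= segment.keys()


# "Persist" a segment, then merge it with nothing (no other segments, no deletes).
persisted = {"airline_1795": {"type": "airline"}, "route_6351": {"type": "route"}}
merged = merge_segments([persisted])

# The merged segment should be equivalent to its only input.
assert merged == persisted

# A snapshot must hold one or the other, never both.
assert_no_duplicate_ids([merged])
try:
    assert_no_duplicate_ids([persisted, merged])
    both_allowed = True
except AssertionError:
    both_allowed = False
print(both_allowed)  # False
```

The last check mirrors the failure seen in this ticket: a snapshot that keeps both the original segment and its merged replacement reports twice as many docs as it should.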


            People

            • Assignee:
              steve Steve Yen
              Reporter:
              mschoch Marty Schoch [X] (Inactive)