  Couchbase Server / MB-25839

cbrecovery throws an exception when running recovery from failed-over nodes in XDCR


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 5.0.0
    • Fix Version/s: None
    • Component/s: tools
    • Environment: Centos 7.x 64-bit
    • Triage: Untriaged
    • Operating System: Centos 64-bit
    • Is this a Regression?: Unknown

    Description

      Install Couchbase Server 5.0.0-3506 on 8 Centos 7.x servers.
      Create cluster A with 3 nodes.
      Create cluster B with 3 nodes.
      Create a default bucket on both clusters.
      Create bidirectional replication between the two clusters.
      Load data into the default bucket on cluster A.
      Wait for replication to finish.
      Fail over 2 nodes on cluster B.
      Run cbrecovery to recover data from cluster A to cluster B:

      /opt/couchbase/bin/cbrecovery http://172.17.0.73:8091 http://172.17.0.76:8091 -b default -B default -u Administrator -p password -U Administrator -P password
      

      Got an error in pump_cb.py:

       /opt/couchbase/bin/cbrecovery http://172.17.0.73:8091 http://172.17.0.76:8091 -b default -B default -u Administrator -p password -U Administrator -P password
      , 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023]}]
      Traceback (most recent call last):
        File "/opt/couchbase/lib/python/cbrecovery", line 233, in <module>
          pump_transfer.exit_handler(Recovery().main(sys.argv))
        File "/opt/couchbase/lib/python/cbrecovery", line 96, in main
          err = pump_transfer.Transfer.main(self, temp_argv)
        File "/opt/couchbase/lib/python/pump_transfer.py", line 80, in main
          rv = pumpStation.run()
        File "/opt/couchbase/lib/python/pump.py", line 112, in run
          rv, source_map, sink_map = self.check_endpoints()
        File "/opt/couchbase/lib/python/pump.py", line 175, in check_endpoints
          rv, sink_map = self.sink_class.check(self.opts, self.sink_spec, source_map)
        File "/opt/couchbase/lib/python/pump_cb.py", line 175, in check
          error = CBSink.map_recovery_buckets(sink_map, sink_bucket_name, opts.vbucket_list)
        File "/opt/couchbase/lib/python/pump_cb.py", line 132, in map_recovery_buckets
          server_vb_map["vBucketMap"][vb][idx] = 0
      IndexError: list assignment index out of range
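
      For context on what that IndexError means: the vBucketServerMap that cbrecovery reads from the destination bucket contains a serverList and a vBucketMap with one row per vbucket, where each row lists server indices (active copy first, then replicas, with -1 for a missing copy). The snippet below is not the actual pump_cb.py code; it is a minimal sketch, with a made-up recovery plan and a simplified stand-in for map_recovery_buckets, of how the assignment on line 132 can index past the end of a row once 2 of the 3 sink nodes have been failed over.

      # Illustrative only: simplified stand-in for pump_cb.CBSink.map_recovery_buckets,
      # showing how server_vb_map["vBucketMap"][vb][idx] = 0 can raise IndexError.
      # Field names follow the standard vBucketServerMap layout; the data is made up.

      # Sink bucket map after failing over 2 of 3 nodes: one server left, and each
      # vBucketMap row only has room for [active, replica] (-1 = no replica).
      server_vb_map = {
          "serverList": ["172.17.0.76:11210"],
          "vBucketMap": [[0, -1] for _ in range(1024)],
      }

      def map_recovery_buckets(vb_map, recovery_plan):
          """Hypothetical defensive version of the assignment that crashed.

          recovery_plan is a list of (vbucket_id, server_index) pairs to point at
          the sink; the real tool derives these from opts.vbucket_list.
          """
          rows = vb_map["vBucketMap"]
          for vb, idx in recovery_plan:
              if vb >= len(rows) or idx >= len(rows[vb]):
                  return "vbucket %d / server index %d is outside the sink map" % (vb, idx)
              rows[vb][idx] = 0  # the unguarded form of this line is what raised IndexError
          return None

      # A server index of 2 cannot exist when each row is only [active, replica]:
      print(map_recovery_buckets(server_vb_map, [(565, 0), (566, 2)]))

      If that is indeed the failing path, bounds-checking the indices (or re-deriving the recovery vbucket list from the post-failover map) would turn the crash into a readable error message.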
      

      The test can be run automatically as below. Make sure the ini file is set up with two 3-node clusters.

      ./testrunner -i b/resources/8-nodes-template-2clusters-3nodes.ini -t cbRecoverytests.cbrecovery.restart_cbrecover_multiple_failover_swapout_reb_routine,items=300000,rdirection=bidirection,ctopology=chain,failover=destination,fail_count=2,add_count=2,max_verify=10000,when_step=create_bucket_when_recovery,extra_buckets=1
      

      I will check which build was the last one to pass this test in spock.
      This test passed in 4.6.3.
      I will upload logs soon.
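
      A quick way to see what the sink map looked like at that point is to pull the destination bucket's details over the REST API on port 8091; the response includes vBucketServerMap with serverList and vBucketMap. The helper below is a hypothetical triage snippet, not part of cbrecovery; only the endpoint and the JSON field names are standard, and the host, bucket, and credentials are taken from the cbrecovery command above.

      import base64
      import json
      import urllib.request

      def fetch_vbucket_map(host, bucket, user, password):
          """GET /pools/default/buckets/<bucket> and print the vBucketServerMap shape."""
          url = "http://%s:8091/pools/default/buckets/%s" % (host, bucket)
          token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
          req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
          info = json.load(urllib.request.urlopen(req))
          vbsm = info["vBucketServerMap"]
          print("serverList:", vbsm["serverList"])
          print("vBucketMap: %d rows, %d entries per row"
                % (len(vbsm["vBucketMap"]), len(vbsm["vBucketMap"][0])))
          return vbsm

      # Sink (destination) node, bucket, and credentials from the cbrecovery command above.
      fetch_vbucket_map("172.17.0.76", "default", "Administrator", "password")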

      Attachments

        1. 172.17.0.73-20170828-1203-diag.zip
          4.53 MB
        2. 172.17.0.74-20170828-1206-diag.zip
          4.04 MB
        3. 172.17.0.75-20170828-1208-diag.zip
          4.13 MB
        4. 172.17.0.76-20170828-1210-diag.zip
          4.34 MB
        5. 172.17.0.77-20170828-1212-diag.zip
          3.48 MB
        6. 172.17.0.78-20170828-1214-diag.zip
          3.61 MB
        7. 172.17.0.79-20170828-1216-diag.zip
          3.03 MB
        8. 172.17.0.80-20170828-1217-diag.zip
          3.28 MB
        9. console.rtf
          2.17 MB
        10. console2.rtf
          375 kB

        Activity

          mikew Mike Wiederhold [X] (Inactive) added a comment -

          This is not the same error that you originally posted this bug about. The original problem you mentioned was a crash, and you posted the stack trace. Now you're saying you can reproduce it, but the problem is not my vbucket errors. If you are not able to reproduce the crash issue, then this bug should be closed as "cannot reproduce" and you should open a different bug based on this issue.
          thuan Thuan Nguyen added a comment - edited

          Let me try the original build 3506 again; I ran the same test on build 3507 and it shows different errors.

          thuan Thuan Nguyen added a comment -

          I could not reproduce this issue on build 3506 when rerunning the tests.
          thuan Thuan Nguyen added a comment -

          Reopening to update the fix version.
          thuan Thuan Nguyen added a comment -

          Closing this ticket as I could not reproduce it in a later run.

          People

            Assignee: mikew Mike Wiederhold [X] (Inactive)
            Reporter: thuan Thuan Nguyen
            Votes: 0
            Watchers: 4


              Gerrit Reviews

                There are no open Gerrit changes
