Details
Bug
Resolution: Fixed
Major
6.6.0
Untriaged
1
No
Description
What's the issue?
By default, Rift uses the Castagnoli polynomial when calculating the CRC32 checksums for documents before writing them to disk. We chose this polynomial believing it would be the most performant; at that point we didn't have benchmarks for Rift's 'checksumAndWrite' function. This only appears to be the case for very small documents: once the document's value becomes slightly larger, the IEEE hardware implementation is significantly faster. See below for a comparison of the three available polynomials:
Castagnoli
BenchmarkChecksumAndWrite/checksumAndWrite-50b-16         16473608        71.5 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00kb-16       6084844         196 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-100.00kb-16       94528       12606 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00mb-16          5176      232232 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-5.00mb-16           964     1195856 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-10.00mb-16          492     2375209 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-20.00mb-16          243     4755485 ns/op

IEEE
BenchmarkChecksumAndWrite/checksumAndWrite-50b-16         11363775         104 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00kb-16       7448734         159 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-100.00kb-16      138271        8636 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00mb-16          7297      157291 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-5.00mb-16          1449      788362 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-10.00mb-16          733     1579441 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-20.00mb-16          369     3170521 ns/op

Koopman
BenchmarkChecksumAndWrite/checksumAndWrite-50b-16           271353        4274 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00kb-16        187891        6130 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-100.00kb-16        5550      205316 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-1.00mb-16           550     2144216 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-5.00mb-16           111    10664919 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-10.00mb-16           55    21359242 ns/op
BenchmarkChecksumAndWrite/checksumAndWrite-20.00mb-16           27    42893064 ns/op
We can clearly see that in the majority of cases (excluding 50 bytes) IEEE proves to be the most performant: roughly 1.2x faster than Castagnoli at 1 KB and roughly 1.5x faster from 100 KB upwards. When we consider that for a large backup we write all the document data + metadata through the 'checksumAndWrite' function, this could add up to a meaningful gain. We should also consider that Rift is, as of now, unreleased, meaning we can make this change without having to bump its version (and without introducing unnecessary complexity to dynamically determine which CRC table to use).