Details
-
Task
-
Resolution: Cannot Reproduce
-
Critical
-
Beryllium
-
Security Level: Public
-
None
-
2
Description
Test:
/// 13. TestSubquantizersValidation
/// Description
/// Test that the PQ’s subquantizers value is validated with dimensions correctly.
/// The invalid argument exception should be thrown when the vector index is created
/// with invalid subquantizers which are not a divisor of the dimensions or zero.
/// Steps
/// 1. Copy database words_db.
/// 2. Create a vector index named "words_index" in _default.words collection.
///     - expression: "vector"
///     - dimensions: 300
///     - centroids: 8
///     - PQ(subquantizers: 2, bits: 8)
/// 3. Check that the index is created without an error returned.
/// 4. Delete the "words_index".
/// 5. Repeat steps 2 to 4 by changing the subquantizers to
///    3, 4, 5, 6, 10, 12, 15, 20, 25, 30, 50, 60, 75, 100, 150, and 300.
/// 6. Repeat steps 2 to 4 by changing the subquantizers to 0 and 7.
/// 7. Check that an invalid argument exception is thrown.
func testSubquantizersValidation() throws {
    let collection = try db.collection(name: "words")!
    var config = VectorIndexConfiguration(expression: "vector", dimensions: 300, centroids: 8)
    config.encoding = .productQuantizer(subquantizers: 2, bits: 8)
    try collection.createIndex(withName: "words_index", config: config)

    let names = try collection.indexes()
    XCTAssert(names.contains("words_index"))

    // Step 5: Use valid subquantizer values
    for numberOfSubq in [3, 4, 5, 6, 10, 12, 15, 20, 25, 30, 50, 60, 75, 100, 150, 300] {
        try collection.deleteIndex(forName: "words_index")
        config.encoding = .productQuantizer(subquantizers: UInt32(numberOfSubq), bits: 8)
        try collection.createIndex(withName: "words_index", config: config)

        // Query:
        let sql = "select meta().id, word from _default.words where vector_match(words_index, $vector, 20)"
        let parameters = Parameters()
        parameters.setValue(dinnerVector, forName: "vector")

        let q = try self.db.createQuery(sql)
        q.parameters = parameters

        let explain = try q.explain() as NSString
        XCTAssertNotEqual(explain.range(of: "SCAN kv_.words:vector:words_index").location, NSNotFound)

        let rs = try q.execute()
        XCTAssertEqual(rs.allResults().count, 20)
        XCTAssert(checkIndexWasTrained())
    }

    // Step 7: Check if exception thrown for wrong subquantizers:
    for numberOfSubq in [0, 7] {
        try collection.deleteIndex(forName: "words_index")
        config.encoding = .productQuantizer(subquantizers: UInt32(numberOfSubq), bits: 8)
        expectExcepion(exception: .invalidArgumentException) {
            try! collection.createIndex(withName: "words_index", config: config)
        }
    }
}

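The validation rule that the test description states (subquantizers must be non-zero and must divide the dimensions evenly) can be sketched as a standalone check. This is only an illustration: `isValidSubquantizerCount` is a hypothetical helper, not part of the Couchbase Lite API, and the actual validation inside the library may differ.

```swift
// Hypothetical helper (not a Couchbase Lite API): mirrors the rule in the
// test description, i.e. subquantizers must be non-zero and must divide
// the number of dimensions evenly.
func isValidSubquantizerCount(_ subquantizers: UInt32, dimensions: UInt32) -> Bool {
    // The zero check comes first so the modulo is never evaluated with 0.
    return subquantizers > 0 && dimensions % subquantizers == 0
}

let dimensions: UInt32 = 300

// Values used in steps 2 and 5 (expected valid) and step 6 (expected invalid).
let expectedValid: [UInt32] = [2, 3, 4, 5, 6, 10, 12, 15, 20, 25, 30, 50, 60, 75, 100, 150, 300]
let expectedInvalid: [UInt32] = [0, 7]

let allValidPass = expectedValid.allSatisfy { isValidSubquantizerCount($0, dimensions: dimensions) }
let allInvalidFail = expectedInvalid.allSatisfy { !isValidSubquantizerCount($0, dimensions: dimensions) }
print("valid values accepted: \(allValidPass), invalid values rejected: \(allInvalidFail)")
```
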
Warning message:
WARNING clustering 300 points to 256 centroids: please provide at least 9984 training points |
However, the strange part is that the index was still trained despite that warning.
Need to check whether this PR changes the behavior:
https://github.com/couchbaselabs/mobile-vector-search/pull/40
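For reference, the numbers in the warning can be reproduced with simple arithmetic. The sketch below assumes the warning comes from a FAISS-style k-means step, where the default minimum is 39 training points per centroid, and that each PQ subquantizer trains 2^bits centroids; neither assumption is confirmed in this ticket, and the constant names are illustrative only.

```swift
// Assumption: each PQ subquantizer trains 2^bits centroids.
let bits = 8
let centroidsPerSubquantizer = 1 << bits

// Assumption: FAISS-style default of 39 training points per centroid.
let assumedMinPointsPerCentroid = 39
let recommendedTrainingPoints = centroidsPerSubquantizer * assumedMinPointsPerCentroid

// Number of points reported in the warning message.
let availablePoints = 300

print("centroids: \(centroidsPerSubquantizer)")            // 256
print("recommended points: \(recommendedTrainingPoints)")  // 9984
print("below recommendation: \(availablePoints < recommendedTrainingPoints)")
```

Under these assumptions, 300 points is far below the recommended count, which would explain why a warning is logged; whether training should still succeed in that case is the open question above.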