Description
I started a 2-node cluster, loaded beer-sample, created a default index named beer-search, and ran a query string query for "water". The explain output shows some unusual scoring:
Last login: Tue Mar 15 09:37:39 on ttys002
Martys-MacBook-Pro:@fts mschoch$ curl -XPOST -H "Content-Type: application/json" \
> http://localhost:9200/api/index/beer-search/query \
> -d '{
>   "indexName": "beer-search",
>   "size": 10,
>   "from": 0,
>   "explain": true,
>   "highlight": {},
>   "query": {
>     "boost": 1,
>     "query": "water"
>   },
>   "fields": [
>     "*"
>   ],
>   "ctl": {
>     "consistency": {
>       "level": "",
>       "vectors": {}
>     },
>     "timeout": 0
>   }
> }' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14622    0 14343  100   279   465k   9274 --:--:-- --:--:-- --:--:--  482k
{
  "status": {
    "total": 32,
    "failed": 0,
    "successful": 32
  },
  "request": {
    "query": {
      "query": "water",
      "boost": 1
    },
    "size": 10,
    "from": 0,
    "highlight": {
      "style": null,
      "fields": null
    },
    "fields": [
      "*"
    ],
    "facets": null,
    "explain": true
  },
  "hits": [
    {
      "index": "beer-search_6da6fa7260987da9_0ffd4517",
      "id": "appalachian_brewing_company-water_gap_wheat.json",
      "score": 6.111515760512589,
      "explanation": {
        "value": 6.111515760512589,
        "message": "sum of:",
        "children": [
          {
            "value": 6.111515760512589,
            "message": "product of:",
            "children": [
              {
                "value": 6.111515760512589,
                "message": "sum of:",
                "children": [
                  {
                    "value": 6.111515760512589,
                    "message": "product of:",
                    "children": [
                      {
                        "value": 6.111515760512589,
                        "message": "sum of:",
                        "children": [
                          {
                            "value": 6.111515760512589,
                            "message": "weight(_all:water^1.000000 in appalachian_brewing_company-water_gap_wheat.json), product of:",
                            "children": [
                              {
                                "value": -1,
                                "message": "queryWeight(_all:water^1.000000), product of:",
                                "children": [
                                  {
                                    "value": 1,
                                    "message": "boost"
                                  },
                                  {
                                    "value": -36.541550359822686,
                                    "message": "idf(docFreq=4793636741649763430, maxDocs=238)"
                                  },
                                  {
                                    "value": 0.027366107626881006,
                                    "message": "queryNorm"
                                  }
                                ]
                              },
                              {
                                "value": -6.111515760512589,
                                "message": "fieldWeight(_all:water in appalachian_brewing_company-water_gap_wheat.json), product of:",
                                "children": [
                                  {
                                    "value": 2,
                                    "message": "tf(termFreq(_all:water)=4"
                                  },
                                  {
                                    "value": 0.08362419903278351,
                                    "message": "fieldNorm(field=_all, doc=appalachian_brewing_company-water_gap_wheat.json)"
                                  },
                                  {
                                    "value": -36.541550359822686,
                                    "message": "idf(docFreq=4793636741649763430, maxDocs=238)"
                                  }
                                ]
                              }
                            ]
                          }
                        ]
                      },
                      {
                        "value": 1,
                        "message": "coord(1/1)"
                      }
                    ]
                  }
                ]
              },
              {
                "value": 1,
                "message": "coord(1/1)"
              }
            ]
          }
        ]
      },
      "locations": {
        "description": {
          "water": [
            {
              "pos": 43,
              "start": 239,
              "end": 244,
              "array_positions": null
            },
            {
              "pos": 50,
              "start": 285,
              "end": 290,
              "array_positions": null
            },
            {
              "pos": 73,
              "start": 441,
              "end": 446,
              "array_positions": null
            }
          ]
        },
        "name": {
          "water": [
            {
              "pos": 1,
              "start": 0,
              "end": 5,
              "array_positions": null
            }
          ]
        }
      }
    }
Questions:
1. This should be a simple single-term search. Why does the explanation contain a coordination factor, and why are there two of them?
2. Why is docFreq reported as the obviously wrong value 4793636741649763430 when maxDocs is only 238?
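For reference, the numbers in the explanation are internally consistent with the bad docFreq, which suggests the scoring arithmetic itself is fine and only its input is corrupt. The Python sketch below recomputes the score from the reported values. It assumes a classic TF-IDF form: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(termFreq), and queryNorm = 1 / sqrt(sum of squared term weights). These formulas are inferred from the values in the output, not taken from the source code, and the function names are hypothetical.

```python
import math
import re

# Values copied from the explain output in this report.
MAX_DOCS = 238
DOC_FREQ = 4793636741649763430
TERM_FREQ = 4
FIELD_NORM = 0.08362419903278351
BOOST = 1.0

# Matches explanation messages such as "idf(docFreq=12, maxDocs=238)".
IDF_MESSAGE = re.compile(r"idf\(docFreq=(\d+), maxDocs=(\d+)\)")


def idf(doc_freq: int, max_docs: int) -> float:
    """Assumed formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(doc_freq, max_docs, term_freq, field_norm, boost=1.0):
    """Recompute the pieces of a single-term score under the assumed formulas."""
    term_idf = idf(doc_freq, max_docs)
    # Single-term query: the sum of squared weights has only one term.
    query_norm = 1.0 / math.sqrt((boost * term_idf) ** 2)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(term_freq) * field_norm * term_idf
    return term_idf, query_norm, query_weight, field_weight, query_weight * field_weight


def doc_freq_is_plausible(message: str) -> bool:
    """A term cannot occur in more documents than the index holds."""
    match = IDF_MESSAGE.fullmatch(message)
    if match is None:
        raise ValueError(f"not an idf explanation message: {message!r}")
    doc_freq, max_docs = map(int, match.groups())
    return doc_freq <= max_docs


if __name__ == "__main__":
    names = ("idf", "queryNorm", "queryWeight", "fieldWeight", "score")
    values = explain_score(DOC_FREQ, MAX_DOCS, TERM_FREQ, FIELD_NORM, BOOST)
    for name, value in zip(names, values):
        print(f"{name:12s} {value!r}")
    message = f"idf(docFreq={DOC_FREQ}, maxDocs={MAX_DOCS})"
    print("docFreq plausible:", doc_freq_is_plausible(message))
```

Under these assumptions the idf comes out negative (about -36.54), so queryWeight and fieldWeight are both negative and their product is the positive 6.11 that was reported. The final score therefore looks plausible even though the statistics behind it are not, and a check like `doc_freq_is_plausible` is one way to catch that.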
Issue Links
- is duplicated by MB-18727 [FTS] dictionary counts incorrect, leads to incorrect scoring (Closed)