I have two fields and want to match terms only if they occur at the same position in both fields, i.e. a span_near query with zero slop. However, the span_near query matches documents WHEN IT SHOULDN'T, judging both by my own design and by what the term_vectors show.
Here is an example of a query that returns hits it shouldn't (I've abstracted the names to "field1/2" and "value1/2"):
{
  "query": {
    "span_near" : {
      "clauses" : [
        { "span_term" : { "text.field1" : "value1" } },
        { "field_masking_span": {
            "query": { "span_term" : { "text.field2" : "value2" } },
            "field": "text.field1"
          }
        }
      ],
      "slop" : 0,
      "in_order" : false
    }
  }
}
However, this doesn't work as expected and looks buggy: digging into one of the erroneous hits with term_vectors shows that the values are NOT at the same position!
They are next to each other: value1 is at position 165 in field1 and value2 is at position 164 in field2.
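To make the position check reproducible, here is a minimal Python sketch of how I compare the two terms. The helper names `token_positions` and `compare_positions` are my own, hypothetical ones, and the sketch assumes the usual `term_vectors` -> field -> `terms` -> term -> `tokens` layout of a term vectors response. The position lists are copied from the term vectors of this hit.

```python
def token_positions(response, field, term):
    """Collect the token positions of `term` in `field` from a term vectors response.

    Assumes the response layout term_vectors -> field -> terms -> term -> tokens.
    """
    tokens = response["term_vectors"][field]["terms"][term]["tokens"]
    return [token["position"] for token in tokens]


def compare_positions(positions_a, positions_b):
    """Return (same, adjacent): shared positions, and pairs exactly one position apart."""
    set_b = set(positions_b)
    same = [p for p in positions_a if p in set_b]
    adjacent = [(p, q) for p in positions_a for q in (p - 1, p + 1) if q in set_b]
    return same, adjacent


# Positions reported by the term vectors for the erroneous hit.
value1_positions = [165, 431]
value2_positions = [33, 68, 69, 163, 164, 227, 228, 261, 262,
                    334, 335, 382, 383, 445]

same, adjacent = compare_positions(value1_positions, value2_positions)
print("same position:", same)    # no shared positions
print("adjacent:", adjacent)     # value1 at 165 sits next to value2 at 164
```

For this hit the check finds no shared position and exactly one adjacent pair, (165, 164), which is the pair the query appears to be matching on.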
The term vectors returned by Elasticsearch for these values, requested with:
curl -X GET "localhost:9200/test/_doc/1/_termvectors" -H 'Content-Type: application/json' -d'
{
  "fields" : ["text.field1", "text.field2"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
'
are:
"value1": {
"doc_freq": 61,
"ttf": 87,
"term_freq": 2,
"tokens": [
{
"position": 165,
"start_offset": 954,
"end_offset": 962
},
{
"position": 431,
"start_offset": 2535,
"end_offset": 2543
}
]
},
"value2": {
"doc_freq": 3029,
"ttf": 72118,
"term_freq": 14,
"tokens": [
{
"position": 33,
"start_offset": 184,
"end_offset": 187
},
{
"position": 68,
"start_offset": 382,
"end_offset": 385
},
{
"position": 69,
"start_offset": 386,
"end_offset": 389
},
{
"position": 163,
"start_offset": 946,
"end_offset": 949
},
{
"position": 164,
"start_offset": 950,
"end_offset": 953
},
{
"position": 227,
"start_offset": 1354,
"end_offset": 1357
},
{
"position": 228,
"start_offset": 1358,
"end_offset": 1361
},
{
"position": 261,
"start_offset": 1522,
"end_offset": 1525
},
{
"position": 262,
"start_offset": 1526,
"end_offset": 1529
},
{
"position": 334,
"start_offset": 1958,
"end_offset": 1961
},
{
"position": 335,
"start_offset": 1962,
"end_offset": 1965
},
{
"position": 382,
"start_offset": 2257,
"end_offset": 2260
},
{
"position": 383,
"start_offset": 2261,
"end_offset": 2264
},
{
"position": 445,
"start_offset": 2629,
"end_offset": 2632
}
]
},