Hi everyone. I'm running into a strange issue when I use a simple bool query:
curl -X GET "localhost:9200/people/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        {"match": {"name": "andrew"}}
      ],
      "must": [
        {"match": {"birth_year": "2005"}}
      ]
    }
  },
  "sort": {"_score": {"order": "desc"}},
  "from": 0,
  "size": "15",
  "min_score": 5
}
' | python -m json.tool
In this case I get a null_pointer_exception. The weird thing is that with different options I may not get it; for example, with birth_year = 2019 the query succeeds. I thought this might be caused by wrong usage of the bool query, or by wrong settings, but I wasn't able to find out exactly why it happens. I worked around the issue by using a "filter" clause instead of "must": that way the query works and I don't get the error, but that is a workaround, not a solution.
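For reference, the working variant looks roughly like this (the same query, with the birth_year match moved from "must" into "filter"):

curl -X GET "localhost:9200/people/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        {"match": {"name": "andrew"}}
      ],
      "filter": [
        {"match": {"birth_year": "2005"}}
      ]
    }
  },
  "sort": {"_score": {"order": "desc"}},
  "from": 0,
  "size": "15",
  "min_score": 5
}
' | python -m json.tool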
The ES version I'm using is 7.1.0.
docker-compose.yml:
elasticsearch:
  image: elasticsearch:7.1.0
  restart: always
  ports:
    - "9200:9200"
    - "9300:9300"
  volumes:
    - ./esdata71:/usr/share/elasticsearch/data
  environment:
    - bootstrap.memory_lock=true
    - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    - discovery.type=single-node
  ulimits:
    memlock:
      soft: -1
      hard: -1
Index settings:
[
    "max_ngram_diff" => 10,
    "analysis" => [
        "tokenizer" => [
            "edge_trigrams_tokenizer" => [
                "type" => "edge_ngram",
                "min_gram" => 3,
                "max_gram" => 20,
                "token_chars" => ["letter", "whitespace", "punctuation", "symbol", "digit"]
            ],
            "trigrams_tokenizer" => [
                "type" => "ngram",
                "min_gram" => 3,
                "max_gram" => 10,
                "token_chars" => ["letter", "whitespace", "punctuation", "symbol", "digit"]
            ],
            "short_trigrams_tokenizer" => [
                "type" => "edge_ngram",
                "min_gram" => 1,
                "max_gram" => 3,
                "token_chars" => ["letter", "whitespace", "punctuation", "symbol", "digit"]
            ]
        ],
        "analyzer" => [
            "edge_trigrams" => [
                "type" => "custom",
                "tokenizer" => "edge_trigrams_tokenizer",
                "filter" => ["lowercase", "asciifolding"],
                "char_filter" => ["synonym"]
            ],
            "trigrams" => [
                "type" => "custom",
                "tokenizer" => "trigrams_tokenizer",
                "filter" => ["lowercase", "asciifolding"],
                "char_filter" => ["synonym"]
            ]
        ],
        "char_filter" => [
            "synonym" => [
                "type" => "mapping",
                "mappings" => [
                    "v.d => van de",
                    "v/d => van de",
                    "vd => van de",
                    "vh => van het",
                    "v/h => van het",
                    "v.h => van het",
                    "dl => de la",
                    "d/l => de la",
                    "d'la => de la",
                    "du => de le",
                    "de l' => de la"
                ]
            ]
        ]
    ]
]
Mapping settings:
"name" => [
'type' => 'text',
'analyzer' => 'edge_trigrams'
],
"last_name" => [
'type' => 'text',
'analyzer' => 'edge_trigrams'
],
"exact_name" => [
'type' => 'text',
'analyzer' => 'trigrams'
],
"birth_year" => [
'type' => 'text',
'analyzer' => 'standard'
],
"sex" => [
'type' => 'text',
'analyzer' => 'standard'
],
"active" => [
'type' => 'boolean'
]
The error I get:
{
"error": {
"caused_by": {
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"reason": null,
"type": "null_pointer_exception"
},
"failed_shards": [
{
"index": "people",
"node": "3beKgQxDRxe5V5Ln6ZwmUA",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"shard": 0
}
],
"grouped": true,
"phase": "query",
"reason": "all shards failed",
"root_cause": [
{
"reason": null,
"type": "null_pointer_exception"
}
],
"type": "search_phase_execution_exception"
},
"status": 500
}
Another weird thing: even with the same options (the same name and birth_year) but a different "min_score" in the query, in one case I get the error and in another I don't.
I assume this is related to score calculation, or to some kind of data manipulation inside the ES engine, so probably something is wrong in my index that triggers the exception only in specific cases; in other words, some documents in my index may contain something that should not be there.
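To look at the score side of things, the same query can be re-run with the standard "explain" flag and without "min_score", to see which scores actually get computed for the matching documents (just a sketch, nothing here is specific to my setup):

curl -X GET "localhost:9200/people/_search" -H 'Content-Type: application/json' -d'
{
  "explain": true,
  "query": {
    "bool": {
      "should": [
        {"match": {"name": "andrew"}}
      ],
      "must": [
        {"match": {"birth_year": "2005"}}
      ]
    }
  },
  "sort": {"_score": {"order": "desc"}},
  "from": 0,
  "size": "15"
}
' | python -m json.tool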