Hello everybody,
My goal is to run an aggregation over all words that contain a specific character. Example query:
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"match_phrase": {"word.ngrams": "sex"}}
          ]
        }
      }
    }
  },
  "aggs": {
    "clusters": {
      "terms": {
        "field": "word",
        "include": ".*?.*",
        "size": 100
      }
    }
  },
  "size": 0
}
It works fine if the include parameter contains a regex of the form ".*?" or "?.*", but when matching from both sides (".*?.*"), the request tends to time out (error: Gateway Time-out). I guess processing such a query uses too many resources.
Is there any other way to achieve this?
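To make the pattern in question concrete, here is a minimal Python sketch that builds the request body above for an arbitrary character. The helper names (contains_pattern, build_request) are hypothetical, and the list of reserved characters is an assumption based on the Elasticsearch regexp syntax documentation. It only constructs the JSON; it does not address the timeout itself.

```python
import json

# Characters treated as operators by Lucene regular expressions
# (assumed list, including the optional operators # @ & < > ~).
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def contains_pattern(char):
    """Return a regex that matches any term containing `char` anywhere."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # Escape the character if it would otherwise act as a regex operator.
    escaped = "\\" + char if char in RESERVED else char
    return ".*" + escaped + ".*"


def build_request(char, ngram_text, size=100):
    """Build the terms-aggregation request from the post for one character."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {"match_phrase": {"word.ngrams": ngram_text}}
                        ]
                    }
                }
            }
        },
        "aggs": {
            "clusters": {
                "terms": {
                    "field": "word",
                    "include": contains_pattern(char),
                    "size": size,
                }
            }
        },
        "size": 0,
    }


if __name__ == "__main__":
    print(json.dumps(build_request("x", "sex"), indent=2))
```

Escaping matters here because an unescaped character such as "?" or "." is an operator in this regex syntax, so the pattern would match far more terms than intended.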
I also tried using a regexp query inside a filter aggregation. It does return the total number of hits, but not the list of keywords that actually contain the searched character. Example below:
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"match_phrase": {"word.ngrams": "sex"}}
          ]
        }
      }
    }
  },
  "aggs": {
    "regex_query": {
      "filter": {
        "regexp": {
          "word": {
            "value": ".*?.*"
          }
        }
      }
    }
  },
  "size": 0
}
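A filter aggregation on its own only reports a document count, which matches what I see. One variation that might return the keywords is to nest a terms sub-aggregation under the filter. The Python sketch below builds such a request; the function name and the sub-aggregation name "matching_words" are hypothetical, and I have not verified that this avoids the timeout. It also assumes each document holds a single value in "word"; with multi-valued fields the buckets would include every value of the matching documents.

```python
import json


def build_filter_with_terms(pattern, ngram_text, size=100):
    """Build a filter aggregation with a nested terms sub-aggregation."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {"match_phrase": {"word.ngrams": ngram_text}}
                        ]
                    }
                }
            }
        },
        "aggs": {
            "regex_query": {
                # Same regexp filter as in the request above.
                "filter": {"regexp": {"word": {"value": pattern}}},
                # Sub-aggregation: bucket the documents that passed the filter.
                "aggs": {
                    "matching_words": {
                        "terms": {"field": "word", "size": size}
                    }
                },
            }
        },
        "size": 0,
    }


if __name__ == "__main__":
    print(json.dumps(build_filter_with_terms(".*x.*", "sex"), indent=2))
```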