Hi, all!
I have an index with a field "_token" which holds the relevant tokens
associated with a document. This field is configured as follows:
"_token" : {
  "type" : "multi_field",
  "fields" : {
    "_token" : {
      "type" : "string",
      "index" : "not_analyzed",
      ...
    },
    "folded" : {
      "type" : "string",
      "analyzer" : "folded",
      ...
    },
    "folded_edge_ngram" : {
      "type" : "string",
      "index_analyzer" : "folded_edge_ngram",
      "search_analyzer" : "folded",
      ...
    }
  }
}
The analyzers "folded" and "folded_edge_ngram" both apply ICU folding, and
the latter applies edge_ngram as well.
I'm trying to do a search using the following request:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_token.folded_edge_ngram": "bar"
          }
        }
      ]
    }
  },
  "facets" : {
    "tokens" : {
      "terms" : {
        "field" : "_token"
      }
    }
  }
}
The facet returns all tokens that begin with "bar" after ICU folding, such
as "Bär" or "bar". But it also returns the other tokens of the matching
documents (remember that a document can have more than one value in
"_token"), so I want to restrict the facet entries with something like:
"exclude": doesn't work, because it only supports a full term match
"regex": it works to an extent (match beginning, case insensitive), but it
doesn't do ICU folding
"scripts": OMG, how does this work?
So, my question is: Is there a way to restrict the facet entries to those
that match under the ICU folding analyzer? Or am I totally wrong and should
be using something else (more probable)?
P.S.: Afterwards, I also need the opposite. That is: search all documents
containing an (ICU-folded) word and facet on the other terms only
(this has to do with autocompletion).
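To make it clearer what I am after, here is a rough client-side sketch in
Python of the filtering I would like the facet to do for me. It is only an
approximation: fold() uses Unicode NFKD decomposition plus case folding as a
stand-in for real ICU folding, the function names are made up, and the
sample entries just assume the usual "term"/"count" shape of a terms facet
response.

```python
import unicodedata


def fold(text):
    """Rough stand-in for ICU folding (an assumption, not the ICU algorithm):
    decompose, drop combining marks (accents), then case-fold."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def filter_facet_terms(entries, prefix, invert=False):
    """Keep facet entries whose folded term starts with the folded prefix.

    With invert=True, keep only the entries that do NOT match (the P.S. case).
    """
    folded_prefix = fold(prefix)
    return [
        entry
        for entry in entries
        if fold(entry["term"]).startswith(folded_prefix) != invert
    ]


# Hypothetical facet entries, shaped like a terms facet response.
facet = [
    {"term": "Bär", "count": 3},
    {"term": "bar", "count": 2},
    {"term": "Foo", "count": 1},
]

matching = filter_facet_terms(facet, "bar")            # "Bär" and "bar"
others = filter_facet_terms(facet, "bar", invert=True)  # "Foo"
```

The obvious drawback is that the facet only returns its top entries, so I
would have to request a much larger facet size and throw most of it away,
which is why I would prefer to do this inside Elasticsearch.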
Thanks!