Hi!
I'm really enjoying all the possibilities brought about by the move from
facets to aggregations. However, I still can't figure out the relationship
between facets or buckets and analyzers. Is it not possible at all to get
the buckets out of an analyzed field?
Specifically, I need to get a list of the most common words, but I want to use
my stopword list to exclude those that do not matter to me.
I am using a stop word filter:
index.analysis.filter.fnstop:
    type: stop
    stopwords: ["my", "it", "the", "likes"]
And a custom analyzer:
index.analysis.analyzer.test:
    type: custom
    tokenizer: whitespace
    filter: lowercase, asciifolding, fnstop
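To make my expectation concrete, here is a rough Python approximation of what I think this analyzer chain should emit. It is only a sketch and not Elasticsearch code: the function names `analyze` and `ascii_fold` are made up for illustration, and asciifolding is approximated by stripping combining marks via `unicodedata`.

```python
import unicodedata

# Same stop words as in the fnstop filter above.
STOPWORDS = {"my", "it", "the", "likes"}


def ascii_fold(token):
    """Approximate asciifolding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Whitespace tokenizer -> lowercase -> asciifolding -> stop filter."""
    tokens = text.split()                       # split on whitespace only
    tokens = [t.lower() for t in tokens]        # lowercase filter
    tokens = [ascii_fold(t) for t in tokens]    # asciifolding (approximation)
    return [t for t in tokens if t not in STOPWORDS]  # stop filter


print(analyze("My dog likes the Café"))  # ['dog', 'cafe']
```

In this sketch a token such as "the," would pass through untouched, because splitting on whitespace leaves punctuation attached to the word.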
I then map my field with the custom analyzer:
...
"Clean_Message" : {"type" : "string", "analyzer" : "test"}
I then request the top 100 most common terms, using the search API:
{
  "query": { "bool": { "must": [ { "match_all": {} } ] } },
  "aggs": {
    "Message": {
      "terms": {
        "field": "Clean_Message",
        "size": 100,
        "order": { "_count": "desc" }
      }
    }
  }
}
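For reference, this is roughly the result I expect from that request, written as a small Python sketch. It assumes the aggregation counts, for each indexed term, the number of documents containing it; `top_terms` and the sample token lists are hypothetical and only illustrate the idea.

```python
from collections import Counter


def top_terms(docs_tokens, size=100):
    """Count how many documents contain each term, most common first."""
    counts = Counter()
    for tokens in docs_tokens:
        counts.update(set(tokens))  # each document counts a term once
    return counts.most_common(size)


# Token lists as they would look after analysis (stop words already removed).
docs = [
    ["dog", "cafe"],
    ["dog", "dog", "park"],
    ["dog", "cafe"],
]
print(top_terms(docs, size=2))  # [('dog', 3), ('cafe', 2)]
```

If the buckets were built from the analyzed tokens like this, none of my stop words could show up in the output.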
However, some of the words in my stop filter still appear in that list.
Is this by design? Are we not supposed to run facets or aggregations against
an analyzed field?
Is it possible to get the list of most common terms against an analyzed
field?
Thank you very much for your attention and for your work!
André Morais