Hello guys,
I'm trying to run a simple terms aggregation:
GET test3/_search
{
  "aggs": {
    "agg_name": {
      "terms": {
        "field": "words"
      }
    }
  },
  "size": 0
}
My goal is to get a doc_count for every word (token) within the words field, but the aggregation keeps returning the raw value of the whole words field as a single term.
For example:
"words" : "This is a sentence"
I'm expecting to get separate tokens like ["this", "is", "a", "sentence"] and count the occurrences of each token. Instead, for every words value I get a single bucket, ["this is a sentence"], with a doc_count of 1.
I have tried different analysers and tokenisers, but whichever combination I use the result is the same, so I'm really confused at the moment: it seems the tokenisers have no effect on the aggregation at all.
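For reference, this is how I've been checking the analyser itself with the _analyze API (assuming the template below has already been applied to the index, so that my_analyzer is defined on it):

GET test3/_analyze
{
  "analyzer": "my_analyzer",
  "text": "This is a sentence"
}

This does return individual tokens, so the analyser configuration itself seems to work; it just doesn't seem to influence the aggregation.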
This is my latest (current) index mapping configuration:
{
  "order": 0,
  "index_patterns": [
    "words-data-*"
  ],
  "settings": {
    "index": {
      "max_result_window": "200000",
      "refresh_interval": "-1",
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "filter": [
              "lowercase",
              "trim",
              "reverse"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      },
      "number_of_shards": "1",
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "keywords": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "words": {
          "ignore_above": 256,
          "store": true,
          "eager_global_ordinals": true,
          "type": "keyword",
          "fields": {
            "reverse": {
              "search_analyzer": "my_analyzer",
              "analyzer": "my_analyzer",
              "type": "text"
            }
          }
        }
      }
    }
  },
  "aliases": {}
}
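From what I've read, a terms aggregation buckets on the indexed values of the field, and for a keyword field that is the whole un-analysed string, so one thing I'm considering is a text subfield with fielddata enabled and aggregating on that subfield instead. A sketch of what I mean, not tested (the subfield name "tokens" is just something I made up):

"words": {
  "ignore_above": 256,
  "store": true,
  "type": "keyword",
  "fields": {
    "tokens": {
      "type": "text",
      "analyzer": "standard",
      "fielddata": true
    }
  }
}

and then:

GET test3/_search
{
  "aggs": {
    "agg_name": {
      "terms": {
        "field": "words.tokens"
      }
    }
  },
  "size": 0
}

Is that the right direction, or is there a better way?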
I was expecting lots of different tokens, but it looks like there are none. What I want is the output of a standard tokeniser for the words field when running aggregations.
By the way, any other suggestions for the mapping?