Hi there,
We have a problem with our ES terms aggregation query: it takes 10-12s to execute.
Here is our cluster information:
- we have 3 client nodes, 3 master nodes, 5 data nodes, and 1 ingest node
- each node has 20 cores (40 vCores) and 64GB of memory; we assigned 31GB to the heap
- the index has 5 shards and 1 replica
- the index size is around 340GB (primaries: 165GB) and the document count is around 934.4m
- the ES version is 6.5.4
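For reference, here is a rough per-shard breakdown of the figures above. This is only a quick sketch: it assumes that data and documents are spread evenly across the primary shards, that shard copies are balanced across the 5 data nodes, and that "934.4m" means 934.4 million documents. The constant names are just for illustration.

```python
# Figures copied from the cluster list above.
PRIMARY_SHARDS = 5
REPLICAS = 1
DATA_NODES = 5
PRIMARY_SIZE_GB = 165
TOTAL_DOCS = 934_400_000  # assumption: "934.4m" read as 934.4 million documents


def per_shard_summary():
    """Return rough per-shard numbers, assuming an even distribution."""
    shard_copies = PRIMARY_SHARDS * (1 + REPLICAS)  # primaries plus replicas
    return {
        "gb_per_primary_shard": PRIMARY_SIZE_GB / PRIMARY_SHARDS,
        "docs_per_primary_shard": TOTAL_DOCS // PRIMARY_SHARDS,
        "shard_copies_total": shard_copies,
        "shard_copies_per_data_node": shard_copies / DATA_NODES,
    }


summary = per_shard_summary()
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under those assumptions, each primary shard holds roughly 33GB and each data node hosts about two shard copies.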
The index holds tag information for documents. Here is the mapping:
{
  "mapping": {
    "_doc": {
      "_field_names": {
        "enabled": false
      },
      "properties": {
        "blogId": {
          "type": "keyword"
        },
        "tag": {
          "type": "keyword",
          "boost": 30,
          "eager_global_ordinals": true,
          "copy_to": [
            "tagNgram"
          ]
        },
        "tagNgram": {
          "type": "text",
          "analyzer": "ngram_analyzer",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
The data looks like this:
{
  "blogId": "00001",
  "tag": "APPLE"
},
{
  "blogId": "00001",
  "tag": "BANANA"
},
{
  "blogId": "00001",
  "tag": "ORANGE"
},
{
  "blogId": "00002",
  "tag": "APPLE"
},
{
  "blogId": "00003",
  "tag": "PEACH"
},
{
  "blogId": "00003",
  "tag": "BANANA"
}
Here is my query:
GET /tag_search_index/_doc/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "match": { "tagNgram": "A" }
      },
      "must_not": {
        "term": { "tag": "A" }
      }
    }
  },
  "aggs": {
    "most_popular": {
      "terms": {
        "field": "tag",
        "size": 10
      }
    },
    "count": {
      "cardinality": {
        "field": "tag"
      }
    }
  }
}
The response is:
{
  "took" : 12380,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 64946917,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "count" : {
      "value" : 7202919
    },
    "most_popular" : {
      "doc_count_error_upper_bound" : 46346,
      "sum_other_doc_count" : 61546148,
      "buckets" : [
        // ...
      ]
    }
  }
}
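To put the response in perspective, here is some quick arithmetic on the numbers it returned. It is only a sketch: it assumes "934.4m" means 934.4 million documents and that every matched document carries exactly one tag value (as in the sample data), so that the top-10 bucket total can be derived from sum_other_doc_count. Note also that the cardinality aggregation returns an approximate count. The variable names are just for illustration.

```python
# Values copied from the cluster information and the response above.
TOTAL_DOCS = 934_400_000     # assumption: "934.4m" read as 934.4 million documents
MATCHED_DOCS = 64_946_917    # hits.total
DISTINCT_TAGS = 7_202_919    # "count" cardinality value (approximate)
SUM_OTHER_DOCS = 61_546_148  # sum_other_doc_count
TOOK_MS = 12_380             # took

# Share of the whole index that the single-letter query matches.
matched_fraction = MATCHED_DOCS / TOTAL_DOCS

# Documents in the 10 returned buckets, assuming one tag per document.
top10_docs = MATCHED_DOCS - SUM_OTHER_DOCS

# Average number of matched documents per distinct tag.
docs_per_tag = MATCHED_DOCS / DISTINCT_TAGS

# Rough aggregation throughput over the whole request.
docs_per_ms = MATCHED_DOCS / TOOK_MS

print(f"matched fraction of index: {matched_fraction:.2%}")
print(f"docs in top 10 buckets:    {top10_docs}")
print(f"matched docs per tag:      {docs_per_tag:.2f}")
print(f"matched docs per ms:       {docs_per_ms:.0f}")
```

In other words, the single-letter query matches about 7% of the index and has to aggregate over roughly 7.2 million distinct tag values.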
Searching with more than two characters is much faster, but searching with a single letter takes about 10 seconds.
Is there any way to make it faster?
Thanks in advance.