Hi sir,
I'm new to Elasticsearch. I have more than 2.5 crore (25 million) documents in an Elasticsearch index on a single-node cluster. When I run an aggregation query it takes a long time and sometimes even times out. I think I'm doing something wrong here, please help me.
ES INFO:
nodes: 1,
primary-shards: 30,
replica-shards: 30,
total index size: 90 GB,
each shard holds about 3 GB of data,
SYSTEM INFO:
250 GB SSD, 28 GB RAM
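As a rough sanity check of the figures above, here is a minimal sketch of the arithmetic. It assumes one crore is 10 million, that documents are spread evenly across the primary shards, and that the 90 GB figure covers the 30 primaries (which matches 30 shards of about 3 GB each). The variable names are only illustrative.

```python
# Figures taken from the post; names are illustrative only.
CRORE = 10_000_000             # 1 crore = 10 million

total_docs = int(2.5 * CRORE)  # "more than 2.5 crores" is treated as a lower bound
primary_shards = 30
index_size_gb = 90
ram_gb = 28

# Assumes an even spread of documents and data across primary shards.
docs_per_shard = total_docs // primary_shards
gb_per_shard = index_size_gb / primary_shards

# How many times larger the index is than the machine's total RAM.
index_to_ram_ratio = round(index_size_gb / ram_gb, 1)

print(total_docs)          # 25000000
print(docs_per_shard)      # 833333
print(gb_per_shard)        # 3.0
print(index_to_ram_ratio)  # 3.2
```

So each primary shard holds roughly 830,000 documents, and the index is about three times the size of the machine's RAM.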
What type of aggregation are you running?
My sample query:
First I perform a simple text search, and then I run an aggregation to find which author name occurs most frequently in the returned documents.
{
  "query": {
    "simple_query_string": {
      "query": "search text"
    }
  },
  "aggs": {
    "most_frequent_authors": {
      "terms": {
        "field": "AuthorList.Name.keyword"
      }
    }
  }
}
My sample mapping:
{
  "Abstract": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  "ArticleTitle": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
  "AuthorList": {
    "properties": {
      "Name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
Is there anything in the logs, e.g. about long or frequent GC? How large a heap do you have configured? What do CPU and disk I/O look like while you are querying?
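If it helps with answering the heap and GC questions, here is a minimal sketch that summarises the output of the node stats API (`GET _nodes/stats/jvm`). The helper names `summarize_jvm` and `fetch_jvm_stats` are hypothetical, and the URL `http://localhost:9200` assumes an unsecured local node; adjust it for your setup.

```python
import json
import urllib.request


def summarize_jvm(stats):
    """Return one summary dict per node from a parsed `_nodes/stats/jvm` response."""
    rows = []
    for node in stats.get("nodes", {}).values():
        jvm = node["jvm"]
        mem = jvm["mem"]
        row = {
            "node": node.get("name", "unknown"),
            # Configured maximum heap, converted from bytes to GiB.
            "heap_max_gb": round(mem["heap_max_in_bytes"] / 1024 ** 3, 2),
            "heap_used_percent": mem["heap_used_percent"],
        }
        # One entry per garbage collector reported by the node (e.g. young, old).
        for name, gc in jvm["gc"]["collectors"].items():
            row[name + "_gc_count"] = gc["collection_count"]
            row[name + "_gc_seconds"] = gc["collection_time_in_millis"] / 1000
        rows.append(row)
    return rows


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats over HTTP. base_url is an assumption (local node, no auth)."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm", timeout=10) as resp:
        return json.load(resp)
```

Calling `summarize_jvm(fetch_jvm_stats())` before and after running the slow query shows how far the GC counts and times moved in between. A large jump in old-generation GC time would point towards heap pressure.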