Hello All,
I have a single index of about 40 TB holding roughly 6 billion documents.
My ES query fetches 100,000 unique values of uniqueId (using a terms aggregation) in a single request.
Currently, the index has 900 shards spread across 35 data nodes, and most of the query time is spent on the coordinating nodes.
How can I configure the Elasticsearch cluster for better performance?
Please suggest.
What type of data do you have in that index? What is the structure of the data? What does the query/aggregation look like? How many unique ids are there in total?
Which version of Elasticsearch are you using?
How have you determined this? What is the specification of the nodes? What exactly is the performance issue? What latency are you experiencing?
Is this the size of primary and replica shards? If so, how many replica shards do you have configured?
@Christian_Dahlqvist
Please find my replies below:
What type of data do you have in that index? What is the structure of the data?
Currently, we store all data (6.3 billion documents in total) in a single index. The documents are flat, with around 500 fields each, and different types of data are stored in the same index.
What does the query/aggregation look like?
We aggregate the total amount for each unique id.
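For readers following along, a request of the kind described might look roughly like the sketch below. It is an assumption, not the poster's actual query: uniqueId is the field named in the first post, while the amount field, the aggregation names by_unique_id and total_amount, and the use of a sum sub-aggregation are hypothetical stand-ins for "the total amount for each unique id".

```python
import json


def build_sum_per_id_query(id_field="uniqueId", amount_field="amount", size=100_000):
    """Build a search body: a terms aggregation with a sum sub-aggregation.

    The field names are placeholders; only uniqueId appears in the thread.
    """
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            "by_unique_id": {
                # One bucket per distinct id, at most `size` buckets overall.
                "terms": {"field": id_field, "size": size},
                "aggs": {
                    # Hypothetical: sum an amount field inside each bucket.
                    "total_amount": {"sum": {"field": amount_field}}
                },
            }
        },
    }


print(json.dumps(build_sum_per_id_query(), indent=2))
```

The printed JSON is what would be sent as the body of a _search request against the index.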
How many unique ids are there in total?
There are around 11M unique ids within a single type.
Which version of Elasticsearch are you using?
We are using ES 7.17.
How have you determined this?
We ran the query with the ES Profile API and found that the shards respond in less than 1 second. Resource usage on the data nodes is low, while usage on the coordinating node is high.
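As a rough sanity check on why the coordinating node could be the busy one, the figures quoted in this thread can be turned into a back-of-the-envelope estimate. This is only a sketch under stated assumptions: it assumes the terms aggregation leaves shard_size at its documented default (size * 1.5 + 10), treats 1 TB as 1000 GB, and assumes every shard returns a full set of buckets, so the last figure is an upper bound and not a measurement.

```python
# Figures quoted in this thread.
index_size_tb = 40
primary_shards = 900
data_nodes = 35
requested_buckets = 100_000  # "size" of the terms aggregation

# Assumption: shard_size is left at its default, size * 1.5 + 10.
default_shard_size = int(requested_buckets * 1.5 + 10)

# Average primary shard size, taking 1 TB = 1000 GB.
gb_per_shard = index_size_tb * 1000 / primary_shards

# Average number of shards each data node has to search.
shards_per_node = primary_shards / data_nodes

# Upper bound on the buckets the coordinating node may have to merge
# if every shard returns shard_size buckets.
max_buckets_to_merge = primary_shards * default_shard_size

print(f"about {gb_per_shard:.1f} GB per shard")
print(f"about {shards_per_node:.1f} shards per data node")
print(f"up to {max_buckets_to_merge:,} buckets to merge on the coordinating node")
```

Under these assumptions each shard can answer quickly while the coordinating node still has to merge the partial results of all 900 shards, which would be consistent with the profile described above.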
What is the specification of the nodes?
We are using c7g.8xlarge instances (64 GB total memory and 32 vCPU) for both the data nodes and the coordinating nodes.
What exactly is the performance issue? What latency are you experiencing?
Analyzing the profiling response shows that most of the time is spent on the coordinating node, and the overall response time from ES is high.
Is this the size of primary and replica shards? If so, how many replica shards do you have configured?
Currently we use primary shards only (i.e. no replica shards). Would adding replica shards also improve performance?
What does CPU usage look like on the different node types when you run a query? What size and type of storage do you have attached?
Elasticsearch is generally limited by disk I/O and not CPU, so I tend to use memory optimised instances for Elasticsearch clusters unless I am running a lot of CPU heavy processing, e.g. complex ingest pipelines. Have you run iostat -x on the data nodes when you are querying to verify that the storage is not a limiting factor (the coordinating node can only process data as fast as it comes off the data nodes after all)?
Regarding CPU usage: it is normal on the data nodes, but CPU and memory usage are high on the coordinating nodes.
Currently, we are using the st1 disk type.
Please find the output of iostat -x from one of the data nodes.
What is high and normal in terms of concrete numbers? How many cores are fully utilised?