Cluster status and performance


(The Bronx) #1

Hello, newbie here. I am trying to understand whether a yellow cluster status can affect performance.

I am storing logs in Elasticsearch with Logstash, and I really don't care about high availability. I mean, if the cluster dies and I lose all the logs, it's not a big deal, as long as this does not happen too frequently, hehe. It's on AWS, by the way.

So the question is: if I only have one data node (m4.large.elasticsearch), so the cluster status is always yellow, should I expect Elasticsearch to perform slower?

I have stored around 30 GB of data, 10 million documents (and growing), and Kibana takes a long time to show just the Discover page for the last 24h. The dashboard times out because every panel (even the simplest ones) takes seconds to query.
Copying the queries to Postman shows similar results: requests take seconds. Sometimes, if I send the same request again and again, it takes a few milliseconds. But most of the time it is over 1 or 2 seconds.

TL;DR: Is cluster status=yellow a problem in terms of search/index performance?


(Emmanuel Rouby) #2

Hello,

In fact, if you are doing a lot of indexing while searching/aggregating at the same time, queries may take more time.

If high availability is not a concern, you may consider configuring your index with 3-4 primary shards and 0 replicas.

The node will then not have to duplicate data and handle replicas when you make requests.
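
As a minimal sketch of the replica change, the request below targets the index settings endpoint and sets `number_of_replicas` to 0. The base URL and the `logstash-*` index pattern are assumptions (they are not given in this thread), and on an AWS-managed domain the request may also need to be signed or allowed by the access policy. Note that only the replica count can be changed on an existing index this way; the primary shard count is fixed when an index is created.

```python
import json
import urllib.request


def build_replica_request(base_url, index_pattern, replicas=0):
    """Build (but do not send) a PUT request that changes number_of_replicas."""
    url = f"{base_url.rstrip('/')}/{index_pattern}/_settings"
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request and return the decoded response body."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Assumed endpoint and index pattern; replace with your own values.
req = build_replica_request("http://localhost:9200", "logstash-*")
print(req.get_method(), req.full_url)
```

Calling `send(req)` would apply the change to the existing indices matched by the pattern. Indices created later would still get the default replica count unless an index template sets it.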

A yellow status does not mean poor performance, but you must have sufficient heap dedicated to Elasticsearch if you want a minimum level of performance.

(There is some documentation about this topic on the Elastic website.)

Hope this helps.


(The Bronx) #3

Yeah, I am indexing and searching at the same time... CPU usage is really high. Looks like 2 vCPUs are not enough, or maybe the indexing can be improved to consume less CPU; I have to investigate further.

I've seen in the docs that yellow just means "not replicated", but I've also seen questions like "what's your cluster status" when talking about performance issues. Just wanted to confirm that cluster status is not what I have to solve for this particular case.
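
To illustrate what the status is reporting, here is a small sketch that turns a parsed `GET /_cluster/health` response into a one-line explanation. The sample response is made up for illustration (a single data node whose replica shards cannot be assigned); only the field names come from the cluster health API.

```python
def explain_health(health):
    """Summarize a parsed GET /_cluster/health response in one sentence."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all primary and replica shards are assigned"
    if status == "yellow":
        return f"yellow: all primaries are assigned, {unassigned} replica shard(s) are not"
    return "red: at least one primary shard is unassigned"


# Hypothetical response from a single-node cluster with 1 replica configured.
sample_health = {
    "status": "yellow",
    "number_of_data_nodes": 1,
    "active_primary_shards": 5,
    "unassigned_shards": 5,
}
print(explain_health(sample_health))
```

With all primaries assigned, searches and indexing are served normally; the status only says that no second copy of the data exists.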

Regarding heap space, I don't know what Amazon is doing with that. All I know is that the instance has 8 GB of RAM and that JVM memory usage regularly moves between 25% and 75%.
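
One way to see the configured heap, sketched under the assumption that the domain exposes `GET /_nodes/stats/jvm`: the response reports `heap_used_percent` and `heap_max_in_bytes` per node. The sample below is invented for illustration, including the node name and the numbers.

```python
def heap_usage_by_node(stats):
    """Map node name -> (heap used %, max heap in GiB) from a parsed
    GET /_nodes/stats/jvm response."""
    usage = {}
    for node in stats["nodes"].values():
        mem = node["jvm"]["mem"]
        usage[node["name"]] = (
            mem["heap_used_percent"],
            mem["heap_max_in_bytes"] / 2**30,
        )
    return usage


# Hypothetical response fragment; the node id, name, and values are made up.
sample_stats = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "jvm": {"mem": {"heap_used_percent": 62, "heap_max_in_bytes": 4294967296}},
        }
    }
}
print(heap_usage_by_node(sample_stats))
```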

Thank you @Tetrapack

