High performance

cyberzlo · March 19, 2021, 1:56am

How can I increase the performance of ES? Is it possible to build a high-performance cluster, or does a cluster give me "only" data security / availability? The most important thing for me is not the data stored in ES, but the ability to run queries on the latest data.

Additional question: how can I measure the EPS (events per second) capacity of my ELK stack, or something like it? How "strong" is my cluster?

Currently I am using 2 x 12-thread CPUs, 64 GB RAM and an SSD drive.
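For the EPS question, one rough way to get a first number is to read an index's document count twice and divide the difference by the elapsed time. The sketch below only does that arithmetic; it assumes the counts come from something like the `count` field of a `GET /<index>/_count` response, and the names `CountSample` and `events_per_second` are made up for illustration. It ignores deletes and refresh delays, so treat the result as an estimate rather than a benchmark.

```python
from dataclasses import dataclass


@dataclass
class CountSample:
    """One reading of an index's document count at a point in time."""
    timestamp_s: float  # seconds, e.g. from time.time()
    doc_count: int      # assumed to come from the "count" field of a _count response


def events_per_second(first: CountSample, second: CountSample) -> float:
    """Average indexing rate between two samples, in events per second."""
    elapsed = second.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    return (second.doc_count - first.doc_count) / elapsed


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart.
    before = CountSample(timestamp_s=0.0, doc_count=1_000_000)
    after = CountSample(timestamp_s=60.0, doc_count=1_300_000)
    print(events_per_second(before, after))  # 5000.0
```

Sampling over a longer window smooths out bursts; a dedicated benchmarking tool will give more trustworthy numbers than this kind of spot check.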

dadoonet · March 19, 2021, 3:02am

How "slow" is it today?
What is the target number you must meet?

Do you have any numbers, queries and responses to share here?

cyberzlo · March 19, 2021, 9:39am

To be honest, I tested with a "worse" PC:
12 cores, 16 GB RAM
and with the same data I get "visibly" better performance with this one:
24 cores, 64 GB RAM.

So I am just trying to figure out what to do when I have a much higher EPS.

dadoonet · March 19, 2021, 10:20am

If you really have search issues when you get more and more queries, you can increase the read throughput by adding more replicas to your index. That means more nodes.

For example, if you have a 10-node cluster, you can set the number of replicas to 9.
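As a sketch of that arithmetic: one copy of each shard is the primary, so placing a copy on every node means a replica count of node count minus one. The function below builds the body for an update index settings request (`PUT /<index>/_settings`); the function name and the index name `my-index` are hypothetical, and nothing is sent to a cluster.

```python
import json


def replica_settings_for_full_copies(data_node_count: int) -> dict:
    """Settings body that puts a copy of every shard on every data node.

    One copy is the primary, so the replica count is node count minus one.
    """
    if data_node_count < 1:
        raise ValueError("a cluster needs at least one data node")
    return {"index": {"number_of_replicas": data_node_count - 1}}


if __name__ == "__main__":
    # Body for: PUT /my-index/_settings  ("my-index" is a hypothetical name)
    print(json.dumps(replica_settings_for_full_copies(10)))
```

More replicas cost disk space and indexing work on every node, so this trades write throughput for read throughput.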

Are you planning to run your service on your own machines or on the internet? If the latter, I'd just try cloud.elastic.co and see how fast it works for you.

Of course, there are many sources of optimization, like having the best mapping for your use case, writing queries in an efficient way... But I can't really tell you more here because I have no idea what you are doing.

May I suggest you look at the following resources about sizing? Although old, they are still very accurate.

And Using Rally to Get Your Elasticsearch Cluster Size Right | Elastic Videos

Finally, you can attend one of our trainings:

cyberzlo · March 19, 2021, 10:52pm

Thanks for all this information.

If I have a full E+L+K stack on a single server, can I set up a second E+L stack on a different server so that it works like this?

On the first server, L reports to the local E, and the (local) K is used to view dashboards etc.
On the second server, L reports to its local E, which is somehow (as a cluster?) connected to E on the first server and shares data with it, so that data from both E instances can be viewed in K on the first server?

Sorry for the entry-level questions, but I still do not fully understand how to deal with clusters.

dadoonet · March 19, 2021, 11:28pm

Ideally, run a single Elasticsearch instance per machine, with no other service running on it.
Use as many nodes as your use case requires.

All nodes form a cluster.

Run Kibana on another machine. It can talk to any node of the cluster.

But have a look at the links I shared. I think most of the answers are there.

What are you using Logstash for?

cyberzlo · March 20, 2021, 12:37am

I use Logstash to parse my own JSON logs and send them to ES.

If I have 2 x ES in a cluster, can each ES node hold dedicated indices generated by its local Logstash, for example index-1 on ES instance #1 and index-2 on ES instance #2, with both indices visible from both ES nodes once they are joined in one cluster?

dadoonet · March 20, 2021, 3:11am

Did you try to do that with an ingest pipeline instead?

You must have at least 3 nodes to avoid split-brain issues.

Then Elasticsearch will decide on which node the data is allocated, regardless of which node you are connected to.
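To illustrate the ingest pipeline suggestion, here is a minimal sketch of a pipeline definition that parses a JSON string with the `json` processor. It assumes the raw log line arrives in a field called `message` and writes the result under a hypothetical `parsed` field; the function name is also made up. The dictionary would be the body of a `PUT /_ingest/pipeline/<pipeline-id>` request; nothing is sent here.

```python
import json


def json_parsing_pipeline(source_field: str = "message") -> dict:
    """Ingest pipeline body that parses a JSON string field into an object."""
    return {
        "description": "Parse a JSON log line held in a string field",
        "processors": [
            # "parsed" is a hypothetical target field chosen for this example.
            {"json": {"field": source_field, "target_field": "parsed"}}
        ],
    }


if __name__ == "__main__":
    # Body for: PUT /_ingest/pipeline/<pipeline-id>
    print(json.dumps(json_parsing_pipeline(), indent=2))
```

Whether this can replace Logstash depends on what else the Logstash pipeline does beyond JSON parsing.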

cyberzlo · March 28, 2021, 10:10am

So is it better NOT to build a cluster if I have only 2 servers, and just use one ES node in my ELK stack?

dadoonet · March 28, 2021, 2:20pm

If you don't care about high availability and data integrity, you can use only one node, or two.

Quote from docs:

High availability (HA) clusters require at least three master-eligible nodes, at least two of which are not voting-only nodes. Such a cluster will be able to elect a master node even if one of the nodes fails.
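The reason three nodes is the threshold can be seen from plain majority arithmetic: a master election needs more than half of the voting nodes, so two nodes tolerate no failure while three tolerate one. The sketch below shows only that arithmetic. It is not Elasticsearch's actual voting-configuration logic, which is managed automatically by the cluster, and the function names are hypothetical.

```python
def majority(master_eligible_nodes: int) -> int:
    """Smallest number of votes that is more than half of the nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return master_eligible_nodes - majority(master_eligible_nodes)


if __name__ == "__main__":
    for nodes in (1, 2, 3, 5):
        print(nodes, majority(nodes), tolerated_failures(nodes))
```

Under this simple model, going from one node to two adds capacity but no fault tolerance for master election.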

DavidTurner · March 28, 2021, 2:56pm

You only need three nodes for the high availability side of this. All sizes of cluster will protect the integrity of your data, but might reject some requests if the cluster is not HA and a node fails.

dadoonet · March 28, 2021, 3:22pm

Thanks.

I meant that in the past, when the cluster was split (split brain) and your application sent data to one or the other node, you could end up with indices that did not contain the same data.

That might no longer be the case nowadays.

cyberzlo · March 28, 2021, 3:28pm

Thanks for the information!

But in terms of performance, is there any difference between these two setups:

server #1: full ELK stack
server #2: only Logstash, which reports to ES on server #1
or
server #1: full ELK stack
server #2: Logstash + ES, clustered with ES on server #1

I need Logstash as the software that takes care of my data and parses/transports it to ES.

Is one of these solutions better for performance? I think the second approach is better in theory, but in practice the difference may not be big. Additionally, this "problem of only two nodes" could be worse than the slightly lower performance of the first solution.

Please advise :), because currently I can have only 2 x ES. If I could have 3 x ES, I would build a cluster for sure :).

I totally understand the problem of high AVAILABILITY; it is the same with MySQL databases. But my biggest problem is PERFORMANCE :). If I lose data, it will not be the worst thing: my system is for real-time analysis, so old data will be gone and we will collect new data live. That is fine in a critical situation such as a server problem.

So my question is more about performance. Is it (much) better to build a cluster of 2 ES nodes than to have a single standalone ES node in this case?

dadoonet · March 29, 2021, 1:51pm

If you are looking for performance, you need to at least make sure that only Elasticsearch runs on the machine you have. Don't run any other service on the same machine.
You can start with a single node and check how good the performance is for your use case.

system · April 26, 2021, 1:52pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
