We have an Elasticsearch 1.5.2 cluster with the following node distribution:
Node1: Master+Data
Node2: Master+Data
Node3: Master+Data
Node4: Data
Node5: Data
Node6: Client (For Kibana)
Node7: Client
Node3, Node4, Node5 => On same physical node
Node6, Node7 => On same physical node
Our ES is very slow (takes a minute or sometimes gets stuck forever) and does not respond well when we retrieve data for the last 15 days or more.
We tried adding an extra client node (Node7), but it did not improve response times.
We need suggestions for improving performance. Would adding client nodes and dedicated master nodes help?
This sounds like a question that should be on the Elasticsearch discuss list. Did you post something there?
You could try running the kind of query Kibana issues directly against Elasticsearch (see the sketch further down) to see whether the performance problem is in Kibana or in Elasticsearch. On current versions of Kibana, on the Discover tab, if you click the ^ icon under the chart you'll find a "Statistics" button that shows you:
Query Duration
Request Duration
Hits
That might be useful information for debugging the performance issue.
But if the issue is Elasticsearch you should ask on that discuss list.
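If you want to take Kibana out of the picture entirely, something along these lines reproduces roughly what the Discover tab sends for a 15-day window: a date-range filter plus a date_histogram aggregation. This is only a sketch; the endpoint, index pattern and @timestamp field are placeholders you'd adjust to your setup, and the "took" value in the response corresponds to the Query Duration above.

```python
# Rough approximation of a Kibana Discover query against ES 1.x.
# ES_URL, INDEX and the @timestamp field are placeholders.
import json
import time
import requests  # assumes the requests library is available

ES_URL = "http://localhost:9200"   # hypothetical client-node address
INDEX = "logstash-*"               # hypothetical index pattern

query = {
    "size": 500,
    "query": {
        "filtered": {              # ES 1.x-style filtered query
            "filter": {
                "range": {"@timestamp": {"gte": "now-15d", "lte": "now"}}
            }
        }
    },
    "aggs": {
        "over_time": {
            "date_histogram": {"field": "@timestamp", "interval": "1h"}
        }
    },
}

start = time.time()
resp = requests.post("%s/%s/_search" % (ES_URL, INDEX), data=json.dumps(query))
elapsed = (time.time() - start) * 1000

body = resp.json()
# "took" is the time spent inside Elasticsearch (Kibana's Query Duration);
# the round trip also includes network and serialisation overhead.
print("ES took %s ms, round trip %.0f ms, hits %s"
      % (body.get("took"), elapsed, body["hits"]["total"]))
```

If "took" is already in the tens of seconds, the problem is on the Elasticsearch side rather than in Kibana.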
Oh, and if you can upgrade your Elasticsearch and Kibana, there are always performance improvements and awesome new features!
What kind of hardware is your cluster deployed on? Why do you have 3 out of 5 data nodes on a single server? What is the rationale behind having 2 client nodes on the same server? How much data do you have? How many indices/shards do you have?
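If you are not sure of the exact numbers, the _cat APIs will give them to you. A quick sketch (the endpoint is a placeholder):

```python
# Quick look at index/shard counts and sizes via the _cat APIs (available in 1.5.x).
import requests

ES_URL = "http://localhost:9200"  # hypothetical client-node address

print(requests.get("%s/_cat/indices?v" % ES_URL).text)     # per-index doc count and store size
print(requests.get("%s/_cat/shards?v" % ES_URL).text)      # shard-to-node allocation
print(requests.get("%s/_cat/allocation?v" % ES_URL).text)  # disk usage per node
```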
We have a limited number of physical servers (4), but they are powerful enough.
Type-1: 2 servers: Intel(R) Xeon(R)/x86_64, 24 CPUs x 24 cores, 500 GB RAM
Type-2: 2 servers: Intel(R) Xeon(R)/x86_64, 64 CPUs x 8 cores, 250 GB RAM
As the servers are powerful, we thought we would run multiple ES nodes on a single server.
The 3 nodes are running on one of the Type-1 servers above.
Same reason as above: the server is powerful. Only one client node was running, so we thought adding another would speed up Kibana. Please clarify whether this helps.
About 1000 indices. One index pattern holds a lot of data (about 8 GB per index per day), so we close indices in that pattern once they are older than 2 months (sketched below).
We don't have replicas for the large indices of that index pattern because we don't have enough disk space on those servers.
For the other indices we do have a replica. We use the default config of 5 shards per index.
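For anyone curious, closing old daily indices can be scripted roughly like this. It is only a sketch: the index prefix, the YYYY.MM.DD naming and the 60-day retention are assumptions, and Elasticsearch Curator handles this more robustly.

```python
# Sketch: close daily indices older than ~2 months, assuming names like
# "myindex-YYYY.MM.DD". Prefix, endpoint and retention are placeholders.
import datetime
import requests

ES_URL = "http://localhost:9200"   # hypothetical client-node address
PREFIX = "myindex-"                # hypothetical index prefix
CUTOFF = datetime.date.today() - datetime.timedelta(days=60)

# _cat/indices with h=index returns one index name per line
for name in requests.get("%s/_cat/indices?h=index" % ES_URL).text.split():
    if not name.startswith(PREFIX):
        continue
    try:
        day = datetime.datetime.strptime(name[len(PREFIX):], "%Y.%m.%d").date()
    except ValueError:
        continue
    if day < CUTOFF:
        # Closing keeps the data on disk but frees heap and file handles
        requests.post("%s/%s/_close" % (ES_URL, name))
        print("closed %s" % name)
```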
The 3 nodes you have on the single server are likely to generate roughly 3 times the I/O compared to the nodes that are hosted on their own servers, assuming the data is well distributed. Are you using dedicated disks for each of the 3 nodes that are hosted on the same server? What does I/O look like on this server when you are querying?
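Something along these lines, run on that server while a slow query executes, will show you the per-disk throughput. It is a minimal sketch using the psutil package; `iostat -x 5` gives the same picture without any code.

```python
# Sample per-disk read/write throughput every 5 seconds while a query runs.
# Requires psutil; run it on the server hosting the 3 data nodes.
import time
import psutil

INTERVAL = 5
prev = psutil.disk_io_counters(perdisk=True)
while True:
    time.sleep(INTERVAL)
    cur = psutil.disk_io_counters(perdisk=True)
    for disk, stats in cur.items():
        if disk not in prev:
            continue
        read_mb = (stats.read_bytes - prev[disk].read_bytes) / 1024.0 / 1024.0
        write_mb = (stats.write_bytes - prev[disk].write_bytes) / 1024.0 / 1024.0
        print("%s  read %.1f MB/s  write %.1f MB/s"
              % (disk, read_mb / INTERVAL, write_mb / INTERVAL))
    prev = cur
```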
@Christian_Dahlqvist
I also forgot to mention that Node6 and Node7 (clients) are on the same physical server, which is located in a different data center from the other physical servers.
Do you think that would affect the performance?
Not necessarily, but I do suggest looking at I/O stats during operation to see if this might be a limiting factor causing increased latencies.
Deploying across data centers is generally not recommended and will affect cluster performance. Even though client nodes do not hold data, which limits the amount of data transferred between data centers, client nodes do hold the cluster state and need to be updated whenever it changes. High latencies can slow down this process, which can be a problem.