TL;DR
How can I route all indices to one group of nodes, except for a few selected indices that should go to other, dedicated nodes?
Or... explain to me why I am going about this the wrong way.
We have been using ELK for a few months, mostly for log centralization, and we are very happy with it.
We recently added some metrics from various modules, and since then we have seen a significant increase in response times and search durations. Even navigation within Kibana is noticeably slower (pages take a bit longer to load).
It is still acceptable and our end users are not complaining yet, but we plan to add more and more logs and metrics (and even ML...) in the near future.
Therefore we want to make sure our infrastructure will handle the load as it increases.
Our guess is that the load our Metricbeat instances generate on the data nodes slows them down.
We would like to split our data nodes into 2 groups:
group 1 (default): all indices (including system indices) go on these nodes
group 2: only the indices that we choose (Metricbeat for the moment, and later others that are as intensive as Metricbeat)
For testing purposes I set up a small cluster with 4 nodes:
2 nodes with this attribute: node.attr.indexLoad: heavy
2 nodes without any attribute
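One way to confirm that the two tagged nodes actually picked up the attribute is the cat nodeattrs API. A minimal check, assuming the attribute name indexLoad from the list above:

```
# List custom attributes per node; the two tagged nodes should show indexLoad = heavy
GET _cat/nodeattrs?v&h=node,attr,value
```

Nodes without the attribute simply have no indexLoad row in the output.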
Then I applied this cluster setting:
PUT _cluster/settings
{ "persistent": { "cluster.routing.allocation.exclude.indexLoad": "heavy" } }
Then, in the Metricbeat index template, I added this setting (because I thought there might be an "override" mechanism at the index level that takes precedence over what is set at the cluster level):
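For illustration only, an index-level allocation filter of this kind typically looks like the sketch below. It is not necessarily the exact setting used here: the metricbeat-* pattern and the choice of include (rather than require) are assumptions. Note also that, as far as the Elasticsearch documentation describes it, cluster-level filters are applied in conjunction with index-level ones rather than being overridden by them, so a cluster-wide exclude on indexLoad would still keep these shards off the heavy nodes.

```
# Hypothetical: ask for Metricbeat shards to be placed on nodes tagged indexLoad=heavy
PUT metricbeat-*/_settings
{ "index.routing.allocation.include.indexLoad": "heavy" }

# Check which node each Metricbeat shard actually landed on
GET _cat/shards/metricbeat-*?v&h=index,shard,prirep,state,node

# If shards stay unassigned, ask the cluster why
# (without a body this explains the first unassigned shard it finds)
GET _cluster/allocation/explain
```

If the explain output names the filter decider, the cluster-level and index-level filters are probably in conflict with each other.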
I have already read a bit about the hot-warm-cold architecture, but my understanding is that it does not address the same problem (once again, I may be misunderstanding, so feel free to explain where I am wrong).
I read this page again before writing this answer: Hot-warm-cold architecture with Elasticsearch
My understanding is still that the hot-warm-cold architecture helps split indices across 3 sub-clusters of nodes:
hot nodes: read-write intensive
warm nodes: moderate reads (and writes?)
cold nodes: read-only? (or writes too?) with very little activity
I also understand that the hot, warm, or cold nodes are chosen based on the index life cycle (how old the index is).
But that is not exactly what I want. I want the system indices (Kibana, Elastic, etc.) and a few more small indices to always sit on nodes that carry very little load, so that they respond as fast as if there were no load at all on the whole cluster.
If I do that, there are still other indices on these nodes, so the small indices will still be affected by the load on the Metricbeat ones, and this is what I am trying to avoid.