Elasticsearch efficient architecture

(Jerome83136) #1

We are using ELK to analyze our website logs.

It allows us to see what happened / is happening on our website.

We have 2 dedicated Elasticsearch servers. Each one runs 2 data nodes, 2 masters and 2 clients.

All our logs are injected into Elasticsearch by a single Logstash agent. This agent runs on a machine that can access all the web servers' logs as flat files.

As soon as a new line appears, Logstash parses it, extracts fields, and sends it to Elasticsearch through the client nodes over HTTP (requests are load balanced between the 2 client nodes).
Since the data nodes are members of the same cluster, each new log indexed on node 1 gets replicated to node 2. We have 1 shard per node with one replica.

Kibana 4 also uses the 2 client nodes to access Elasticsearch.

I would like the client nodes to be able to:

  • "prefer" one data node for indexing
  • "prefer" the other data node for querying (Kibana 4 requests)
  • support a "fallback" mode when a node becomes unavailable, using the remaining node for both queries and indexing (degraded mode)

That way, Kibana users would not send queries to a node that is slower because it is busy indexing logs.

Is there a way to achieve that?

Thank you for your help
Best regards

(Christian Dahlqvist) #2

As you have 1 replica configured, each document needs to be indexed on both data nodes. Note that primary and replica shards do the same amount of work when indexing. There is therefore no need to prefer one node over another. Treat the cluster as a black box and let Elasticsearch route the requests as it sees fit.
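
To make the point about replicas concrete, here is a minimal Python sketch that counts index operations per data node. It is a toy model, not Elasticsearch's actual allocator or routing hash: the function names `allocate_shards` and `index_operations_per_node` are hypothetical, the round-robin placement and modulo routing are simplifications, and it assumes "1 shard per node" means 2 primary shards in total.

```python
from collections import Counter


def allocate_shards(nodes, primaries, replicas):
    """Place every copy of each shard on a distinct node (simple round-robin)."""
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to hold every copy on a distinct node")
    allocation = {}
    for shard in range(primaries):
        # Copy 0 is the primary; the remaining copies are replicas.
        allocation[shard] = [
            nodes[(shard + copy) % len(nodes)] for copy in range(replicas + 1)
        ]
    return allocation


def index_operations_per_node(allocation, doc_count):
    """Count index operations per node for doc_count documents."""
    ops = Counter()
    for doc_id in range(doc_count):
        # Stand-in for hash-based routing: spread documents evenly over shards.
        shard = doc_id % len(allocation)
        for node in allocation[shard]:
            # The primary and each replica both index the document.
            ops[node] += 1
    return ops


allocation = allocate_shards(["node1", "node2"], primaries=2, replicas=1)
ops = index_operations_per_node(allocation, doc_count=1000)
print(allocation)
print(dict(ops))
```

Under these assumptions both nodes end up with 1000 index operations for 1000 documents, so directing Logstash at one particular data node would not take indexing load off the other.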

Given that you only have 2 servers, why have you chosen to deploy three different node types when a single node per server that is both master-eligible and holds data would most likely suffice?
