How to keep Shards geographically bound

Hi,

I've got a two-node Elasticsearch 5.3 cluster, with each node in a different data centre, that I want to use to store logs for archival and search purposes. I would like to be able to search both nodes from one Kibana instance (hence the cluster), but not ship the logs between the data centres.

So far, I've been able to disable replicas, but haven't been able to figure out the right settings to stop my shards from being distributed across the cluster. I've been looking at the cluster.routing.allocation.require.* directives, but haven't had much luck.

Thanks,
Simon

We don't recommend running a single cluster across data centres; ES is latency sensitive.

Hmm. Sites are <10ms apart, according to ping, but I acknowledge that might be an issue.

Can you suggest an alternative architecture? Kibana doesn't seem to be able to query more than one Elasticsearch cluster (which, to be honest, I'm not expecting it to be able to), and I'd like to avoid having to shovel raw log files between sites if I can help it.

Have you looked at allocation awareness as opposed to routing?

You can set the nodes up as separate clusters and use a tribe node to query them. This allows you to keep the data locally, but adds complexity.

Cross cluster search would be better - https://www.elastic.co/blog/tribe-nodes-and-cross-cluster-search-the-future-of-federated-search-in-elasticsearch
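For reference, here is a rough sketch of what cross cluster search looks like from the client side. It only builds the request bodies and the index expression and doesn't talk to a cluster. The aliases (`dc1`, `dc2`), hostnames and index pattern are made up for illustration, and the `search.remote.<alias>.seeds` key is my understanding of the 5.x-era setting name, so check it against the docs for your version.

```python
import json


def remote_seed_settings(seeds_by_alias):
    """Body for PUT /_cluster/settings that registers remote clusters.

    Assumption: the 5.x-style `search.remote.<alias>.seeds` setting,
    where each seed is a transport address (host:port).
    """
    return {
        "persistent": {
            "search.remote.{}.seeds".format(alias): seeds
            for alias, seeds in seeds_by_alias.items()
        }
    }


def ccs_index_expression(aliases, pattern):
    """Index expression that targets `pattern` on every remote cluster alias."""
    return ",".join("{}:{}".format(alias, pattern) for alias in aliases)


if __name__ == "__main__":
    # Hypothetical hosts, one per data centre.
    seeds = {
        "dc1": ["es-dc1.example.com:9300"],
        "dc2": ["es-dc2.example.com:9300"],
    }
    print(json.dumps(remote_seed_settings(seeds), indent=2))
    print("GET /{}/_search".format(ccs_index_expression(sorted(seeds), "logs-*")))
```

Each site keeps its own cluster and its own data, and only the search requests and results cross the link between data centres.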

I've looked at allocation awareness (this: https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html ) and it seems to be about keeping replica shards outside of the 'awareness zone' that the primary shards reside in, rather than allocating all the primary shards for an index on the same cluster node where the data is ingested.

Cross-cluster search seemed to be just the thing, but Kibana doesn't support it yet :frowning:

So that leaves a tribe node, or just waiting for the next Kibana release.

You can't do that unless you manually route all the shards and then disable re-allocation.
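If you do want to try pinning within a single cluster, one alternative to rerouting each shard by hand is index-level allocation filtering. The sketch below assumes each node is tagged with a custom attribute in elasticsearch.yml (I'm calling it `site`, e.g. `node.attr.site: dc1`) and that each site writes to its own indices. The attribute name, site values and index naming scheme are all hypothetical. It only builds the request bodies and doesn't send anything.

```python
import json

# Assumption: elasticsearch.yml on each node sets a custom attribute,
# e.g. `node.attr.site: dc1` on one node and `node.attr.site: dc2` on the other.
SITE_ATTR = "site"


def site_index_body(site):
    """Body for PUT /<index> that keeps every shard of the index on one site."""
    return {
        "settings": {
            # No replicas, so nothing needs to be copied to the other site.
            "index.number_of_replicas": 0,
            # Shards may only be allocated to nodes whose attribute matches.
            "index.routing.allocation.require." + SITE_ATTR: site,
        }
    }


def index_name(site, day):
    """Hypothetical naming scheme: one daily index per site."""
    return "logs-{}-{}".format(site, day)


if __name__ == "__main__":
    for site in ("dc1", "dc2"):
        print("PUT /" + index_name(site, "2017.04.20"))
        print(json.dumps(site_index_body(site), indent=2))
```

This keeps the data where it was ingested, but it does nothing about the cross-site latency of running one cluster over two data centres.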

Given that's likely to still leave me with a cluster latency issue, I think the best way to achieve my goal is to stop trying to fight Elasticsearch, set up the two nodes as separate clusters for my data, and use Kibana and a tribe node to knit them together until Kibana supports cross-cluster search.

Thanks for your help, and happy forum birthday.

5.4 isn't far off :wink:

You could set 1 shard and 0 replicas per index (so you effectively have only 1 primary shard).
However, you lose parallel processing, and you can only store about 2 billion documents per index in this configuration.
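As a minimal sketch, the settings for that would look something like the body below (it is only built and printed, not sent). The document cap comes from Lucene's per-shard limit, which as far as I know is `Integer.MAX_VALUE - 128`. The helper names are just for illustration.

```python
import json

# Lucene's hard cap on documents in a single shard (Integer.MAX_VALUE - 128).
LUCENE_MAX_DOCS_PER_SHARD = 2**31 - 1 - 128


def single_shard_body():
    """Body for PUT /<index>: one primary shard, no replicas."""
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "number_of_replicas": 0,
            }
        }
    }


def max_docs(number_of_shards):
    """Upper bound on documents an index with this many primaries can hold."""
    return number_of_shards * LUCENE_MAX_DOCS_PER_SHARD


if __name__ == "__main__":
    print(json.dumps(single_shard_body(), indent=2))
    print("max docs with 1 shard:", max_docs(1))
```

Note that a single shard only guarantees the whole index sits on one node; it doesn't by itself control which node that is.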
