How to keep Shards geographically bound

Simon_Oxwell · May 2, 2017, 6:48am

Hi,

I've got an Elasticsearch 5.3 cluster that I essentially want to store logs in for archival and search purposes, made up of two nodes, each in a different data centre. I would like to be able to search both nodes from one Kibana instance (hence the cluster), but not ship the logs between the data centres.

So far, I've been able to disable replicas, but haven't been able to figure out the right settings to stop my shards from being distributed across the cluster. I've been looking at the cluster.routing.allocation.require.* directives, but haven't had much luck.

Thanks,
Simon

warkolm · May 2, 2017, 6:52am

We don't recommend that; ES is latency sensitive.

Simon_Oxwell · May 2, 2017, 7:07am

Hmm. Sites are <10ms apart, according to ping, but I acknowledge that might be an issue.

Can you suggest an alternative architecture? Kibana doesn't seem to be able to query more than one Elasticsearch cluster (which, to be honest, I'm not expecting it to be able to), and I'd like to avoid having to shovel raw log files between sites if I can help it.

warkolm · May 2, 2017, 7:55am

Have you looked at allocation awareness as opposed to routing?

Christian_Dahlqvist · May 2, 2017, 8:02am

You can set the nodes up as separate clusters and use a tribe node to query them. This allows you to keep the data locally, but adds complexity.

warkolm · May 2, 2017, 8:20am

Cross cluster search would be better - https://www.elastic.co/blog/tribe-nodes-and-cross-cluster-search-the-future-of-federated-search-in-elasticsearch
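For anyone reading later, here is a rough sketch of what the cross cluster search setup could look like. It only builds the JSON body you would send to `PUT _cluster/settings` and the index expression you would search. The alias `dc2`, the seed address, and the `logs-*` pattern are made-up placeholders. It assumes the 5.x-era setting name `search.remote.<alias>.seeds`; later releases renamed it to `cluster.remote.<alias>.seeds`, so check the docs for your version.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Build a cluster-settings body that registers one remote cluster.

    Assumption: the 5.x setting name `search.remote.<alias>.seeds`.
    """
    return {"persistent": {"search": {"remote": {alias: {"seeds": list(seeds)}}}}}


def remote_index(alias, index_pattern):
    """Cross cluster search addresses remote indices as `<alias>:<index>`."""
    return f"{alias}:{index_pattern}"


# Hypothetical alias and transport address for the second data centre.
body = remote_cluster_settings("dc2", ["dc2-es.example.internal:9300"])
print(json.dumps(body, indent=2))

# Search local and remote logs in one request by listing both expressions.
print(",".join(["logs-*", remote_index("dc2", "logs-*")]))
```

The seeds point at the transport port of the remote nodes, not the HTTP port.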

Simon_Oxwell · May 3, 2017, 4:47am

I've looked at allocation awareness (this: https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html ) and it seems to be about keeping replica shards outside of the 'awareness zone' that the primary shards reside in, rather than allocating all the primary shards for an index on the same cluster node where the data is ingested.

Simon_Oxwell · May 3, 2017, 4:50am

Cross-cluster search seemed to be just the thing, but Kibana doesn't support it yet.

So that leaves a tribe node, or just wait for the next Kibana release.

warkolm · May 3, 2017, 5:12am

You can't do that unless you manually route all the shards and then disable re-allocation.
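A related way to get a similar effect, without moving shards by hand through the reroute API, is index-level allocation filtering plus turning off rebalancing. This is a different mechanism from manual routing, so treat it as a sketch rather than what was suggested above. It assumes each node is tagged in `elasticsearch.yml` with a custom attribute, here a hypothetical `node.attr.datacenter: dc1` or `dc2`, and it only builds the request bodies.

```python
import json


def pinned_index_settings(attr, value):
    """Index settings that require shards to live on nodes where attr == value.

    Assumption: nodes carry a custom attribute such as
    `node.attr.datacenter: dc1` in elasticsearch.yml (hypothetical name).
    """
    return {
        "settings": {
            "index.number_of_replicas": 0,
            f"index.routing.allocation.require.{attr}": value,
        }
    }


def disable_rebalance_settings():
    """Cluster settings body that stops Elasticsearch rebalancing shards."""
    return {"persistent": {"cluster.routing.rebalance.enable": "none"}}


# Body for creating an index (or index template) for logs ingested in dc1.
print(json.dumps(pinned_index_settings("datacenter", "dc1"), indent=2))

# Body for PUT _cluster/settings.
print(json.dumps(disable_rebalance_settings(), indent=2))
```

Each site would write to its own indices (for example one template per site), so the `require` filter keeps those shards on the local node. It does not remove the cross-site cluster traffic for searches and cluster state.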

Simon_Oxwell · May 3, 2017, 6:04am

Given that's likely to still leave me with a cluster latency issue, I think the best way to achieve my goal is to stop trying to fight Elasticsearch, run the two nodes as separate clusters for my data, and use Kibana and a tribe node to knit them together until Kibana supports cross-cluster search.

Thanks for your help, and happy forum birthday.

warkolm · May 3, 2017, 6:05am

5.4 isn't far off

fortikeco · May 5, 2017, 3:13pm

You could set 1 shard and 0 replicas per index (so you effectively have only 1 primary shard).
However, you lose parallel processing, and you can only store about 2 billion documents per index in this configuration.
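As a rough illustration, the settings body for that and a quick sanity check on the document limit might look like this. The limit comes from Lucene using signed 32-bit document IDs per shard; the real cap is slightly below the value used here, and the daily volume is an invented number.

```python
import json

# Upper bound of a signed 32-bit integer; Lucene's per-shard document
# cap is slightly below this, so treat it as an approximation.
INT32_MAX = 2**31 - 1


def single_shard_settings():
    """Index settings for one primary shard and no replicas."""
    return {"settings": {"index": {"number_of_shards": 1, "number_of_replicas": 0}}}


def fits_in_one_shard(expected_docs):
    """True if the expected document count stays under the 32-bit bound."""
    return expected_docs < INT32_MAX


print(json.dumps(single_shard_settings(), indent=2))

# Hypothetical volume: 50 million log lines per day, one index per day.
print(fits_in_one_shard(50_000_000))
```

With time-based indices (one per day or week) the per-index count usually stays far below the limit.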

system · June 2, 2017, 3:17pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
