ES indexing strategy

cgswong · August 19, 2016, 12:18am

Hi all! I'm curious how others handle indexing, or what you would recommend. Specifically, we have several indexes split by document structure: syslog, audit logs (two indexes, since the document structures differ), Apache logs, JVM logs, Tomcat logs, app logs, etc.

So we've got about 9 indexes here. I'm thinking we could use fewer indexes and more types for some of the documents instead. Thoughts? How are you doing indexing?

Appreciate any feedback.

warkolm · August 19, 2016, 12:20am

Why are you thinking that?

cgswong · August 19, 2016, 2:40am

Well, each index uses resources, and a lot of our queries have to go across multiple indexes. We've seen that perform poorly even for a single user running such a query (across a few million documents). I do need to go back and review the docs and run some tests, but I was wondering what others are doing and would recommend, to gain further insight.

Thanks.

warkolm · August 19, 2016, 3:50am

Querying one index with 10 shards is the same as querying 10 indices with 1 shard each. It also costs the same to maintain those shards, irrespective of the amount of data in them (operational cost, not including application-level overhead like fielddata).
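To make the shard arithmetic concrete, here is a minimal sketch. It assumes a search touches one copy of every primary shard in the indices it targets, and it ignores replicas and data size. The index names and the `shards_searched` helper are hypothetical, for illustration only.

```python
def shards_searched(layout):
    """Sum the primary shard counts of the indices a query targets.

    `layout` maps a (hypothetical) index name to its primary shard count.
    """
    return sum(layout.values())


# One index with 10 shards versus 10 indices with 1 shard each.
one_big = {"logs": 10}
many_small = {f"logs-{i}": 1 for i in range(10)}

print(shards_searched(one_big))     # 10
print(shards_searched(many_small))  # 10
```

Under that assumption, both layouts fan a query out to the same number of shards.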

That said, you really want to keep things separate: it's better data hygiene, and you can still query across multiple indices. If you use a naming pattern like logstash-$type-$time, you can still use logstash-* as the index pattern in Kibana.

So, separate indices, reduce the shard count for them, and you should be good.
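As a rough illustration of the logstash-$type-$time naming pattern, the sketch below builds index names and shows which ones a wildcard would select. The daily `YYYY.MM.DD` suffix, the log type names, and the `index_name` and `matching` helpers are assumptions for the example. Python's `fnmatch` only approximates how Elasticsearch and Kibana resolve wildcards.

```python
from datetime import date
from fnmatch import fnmatch


def index_name(log_type, day, prefix="logstash"):
    """Build a name like logstash-syslog-2016.08.19 (assumed daily scheme)."""
    return f"{prefix}-{log_type}-{day:%Y.%m.%d}"


def matching(indices, pattern):
    """Return the index names that a wildcard pattern would select."""
    return [name for name in indices if fnmatch(name, pattern)]


day = date(2016, 8, 19)
indices = [index_name(t, day) for t in ("syslog", "apache", "tomcat")]

print(matching(indices, "logstash-*"))         # all three indices
print(matching(indices, "logstash-apache-*"))  # only the Apache index
```

Separate indices per log type stay individually addressable, while the broad pattern still covers all of them in a single query.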
