ES indexing strategy

Hi all! I'm curious how others are handling indexing, or what you would recommend. Specifically, we have several indexes split by document structure: syslog, audit logs (two of these, since their document structures differ), Apache logs, JVM logs, Tomcat logs, app logs, etc.

So we've got about 9 indexes here. I'm thinking maybe we could use fewer indexes and use more types for some of the documents instead. Thoughts? How are you doing indexing?

Appreciate any feedback.

Why are you thinking that?

Well, each index uses resources, and a lot of our queries have to go across multiple indexes. We've seen that perform poorly even with a single user running such a query (across a few million documents). I do need to go back and review the docs and run some tests, but I was wondering what others are doing or would recommend, and hoping to gain further insight.

Thanks.

Querying one index with 10 shards is the same as querying 10 indices with 1 shard each. Those shards also cost the same to maintain, irrespective of how much data they hold (that is the operational cost, not including application-level things like fielddata).
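To make the arithmetic concrete, here is a rough Python sketch that just counts primary shards for the two layouts. The index names and shard counts are made up for illustration, and it ignores replicas and anything data-dependent.

```python
def total_shards(layout):
    """Sum primary shards over a mapping of index name -> primary shard count."""
    return sum(layout.values())


# Hypothetical layouts: one index with 10 shards vs. 10 indices with 1 shard each.
one_big_index = {"logs": 10}
ten_small_indices = {f"logs-{i}": 1 for i in range(10)}

# Both layouts leave the cluster with the same number of shards to search and maintain.
print(total_shards(one_big_index), total_shards(ten_small_indices))
```

Either way a query fans out to the same number of shards, which is why merging indexes alone is unlikely to change much.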

That said, you really want to keep things separate. It's better data hygiene, and you can still query across multiple indices. If you name indices with a pattern like logstash-$type-$time, you can still use logstash-* as the index pattern in Kibana.

So, separate indices, reduce the shard count for them, and you should be good.
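As a rough sketch of that naming scheme in Python: the helper name, the log types, the date, and the daily YYYY.MM.DD suffix are all assumptions for illustration, and fnmatch only approximates how a trailing * wildcard selects indices. The settings dict uses the standard number_of_shards and number_of_replicas keys, with example values only.

```python
from datetime import date
from fnmatch import fnmatch


def index_name(log_type, day):
    """Build a logstash-$type-$time style name (hypothetical helper)."""
    return f"logstash-{log_type}-{day:%Y.%m.%d}"


# Illustrative log types and date, not taken from the thread.
day = date(2015, 6, 1)
names = [index_name(t, day) for t in ("syslog", "apache", "tomcat")]

# A broad pattern still covers every per-type index...
all_logs = [n for n in names if fnmatch(n, "logstash-*")]
# ...while a narrower pattern targets a single type.
only_apache = [n for n in names if fnmatch(n, "logstash-apache-*")]

# Example per-index settings with a reduced shard count (values are assumptions).
small_index_settings = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 1}
}

print(all_logs)
print(only_apache)
```

Keeping the type in the index name means each document structure gets its own mapping, while the wildcard keeps cross-type searches a one-pattern affair.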
