Should you split indexes by category?

I've been researching this for quite a while and asking people in the community, and I'm getting conflicting answers.

I have several types of servers, each pushing logs to its own Kafka topic:

  • mail
  • web
  • indexing
  • database

Currently I have all these topics pushing to daily Logstash indexes:

logstash.{YYYY-MM-DD}

We're doing about 10GB of data per day across 3 Elasticsearch nodes, which will be increasing to about 100GB per day soon. The intended use case is to let IT and developers look through logs easily.

Is it best practice to have all topics report to one index, or to create an index for each server type / Kafka topic?

Splitting indexes keeps the data more contained, but comes at a higher administrative cost (having to run multiple Curator jobs per day).
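For example, each index prefix would need its own retention rule. A sketch of what one might look like, assuming Curator's YAML action-file format (the prefix and the 30-day retention are placeholder values):

```yaml
# delete-web.yml — one delete_indices action per index prefix
actions:
  1:
    action: delete_indices
    description: "Expire web-* indexes older than 30 days"
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: web-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
```

With four topics, that's four files like this (or one action file with four numbered actions), versus a single rule for `logstash-*` today.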

Depends. Are the formats all similar? Do you have similar fields with conflicting mappings (see the elastic/elasticsearch-migration plugin on GitHub, which checks your data and cluster for issues like this)?

But I'd suggest it makes more sense to split them, because then you can better allocate resources to the datasets that need them, and also restrict access to them accordingly.
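And the split doesn't have to mean separate pipelines. A minimal Logstash sketch, assuming the kafka input's `decorate_events` option so the topic name lands in `[@metadata]` (hosts and topic names here are placeholders):

```
input {
  kafka {
    bootstrap_servers => "kafka:9092"
    topics            => ["mail", "web", "indexing", "database"]
    # makes [@metadata][kafka][topic] available for routing below
    decorate_events   => true
  }
}

output {
  elasticsearch {
    hosts => ["es-node1:9200"]
    # one daily index per Kafka topic, e.g. web-2017.06.15
    index => "%{[@metadata][kafka][topic]}-%{+YYYY.MM.dd}"
  }
}
```

One pipeline, and the index name is derived from the topic, so adding a new server type is just a new topic in the list.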

Who cares? It's automated :wink: