What is the recommendation for indexing and sharding in Elasticsearch in a complex use case?

Hi All Elasticsearch Gurus,

I am new to the Elastic Stack and still learning, so I would appreciate input and thoughts from the experts.

Assume an Elastic Stack environment in which Logstash parses and transforms data from various sources (web apps, mobile apps, switches, routers, OS logs, etc.) and ships it to an Elasticsearch cluster, where it is viewed in Kibana for monitoring and troubleshooting.

I plan to create indices (in the output section of the Logstash conf file) in the following manner for all input sources:

  1. webapp-index-yyyy.mm.dd
  2. mobileapp-index-yyyy.mm.dd
  3. syslog-index-yyyy.mm.dd

and similarly for other sources (i.e. one index per source per day).

I have some questions about the above use case:

Q1. Is this a good practice? What are the disadvantages or potential future problems as the data grows (since the number of indices, and with it the number of replica shards, will keep increasing)?
Q2. What is the recommendation for such a use case?
Q3. Is creating a single index per source a better approach (e.g. webapp-index instead of the daily webapp-index-yyyy.mm.dd)?
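
To make the concern in Q1 concrete, here is a rough back-of-the-envelope sketch of how the index and shard counts grow with an index per source per day. The function name, the 90-day retention, and the one-primary/one-replica settings are assumptions for illustration only, not values from a real cluster.

```python
def total_shards(sources, retention_days, primaries_per_index=1, replicas=1):
    """Estimate indices and shards kept in the cluster for daily per-source indices.

    All parameters are illustrative assumptions; adjust them to the real setup.
    """
    # One index per source per retained day.
    indices = sources * retention_days
    # Each index holds its primaries plus one full copy of them per replica.
    shards = indices * primaries_per_index * (1 + replicas)
    return indices, shards


if __name__ == "__main__":
    # Three sources (webapp, mobileapp, syslog) kept for an assumed 90 days.
    indices, shards = total_shards(sources=3, retention_days=90)
    print(f"{indices} indices, {shards} shards")  # 270 indices, 540 shards
```

Every extra source or extra day of retention adds to both numbers, regardless of how little data each daily index actually holds.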

Any pointers or suggestions are greatly appreciated.

Thank you in advance,

You should look at ILM - https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html

It's the best way forward!
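
As a minimal sketch of what that can look like, the snippet below builds the JSON body of an ILM policy that rolls an index over by age or primary shard size and deletes it after a retention period. The helper name `build_ilm_policy` and the thresholds (1d, 50gb, 30d) are assumptions chosen for illustration, and `max_primary_shard_size` is only available in recent Elasticsearch versions, so check the docs for the version you run.

```python
import json


def build_ilm_policy(max_age="1d", max_primary_shard_size="50gb", delete_after="30d"):
    """Build an ILM policy body; the threshold defaults are illustrative assumptions."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over to a new backing index when either condition is met.
                        "rollover": {
                            "max_age": max_age,
                            "max_primary_shard_size": max_primary_shard_size,
                        }
                    }
                },
                "delete": {
                    # Age is measured from rollover, not from index creation.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be sent with PUT _ilm/policy/<policy-name>.
    print(json.dumps(build_ilm_policy(), indent=2))
```

With rollover, the index boundary follows the data volume rather than the calendar, so a quiet source does not produce a tiny index every day.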

I think it depends on the data volume. For example, if I have a webapp index with 50 GB of logs per day, I will use webapp-yyyy.xx (one index per week).
And if I have a mobileapp index with 500 MB of logs per day, I will use mobileapp-yyyy.mm (one index per month).
I don't like using yyyy.mm.dd :smiley:
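
A small sketch of that idea: pick the index granularity from the daily volume and build the name accordingly. The 10 GB cutoff between weekly and monthly is a made-up threshold for illustration, and the weekly suffix here uses the ISO week number, which is one possible reading of "xx".

```python
from datetime import date

# Hypothetical cutoff: sources at or above this daily volume get weekly indices.
WEEKLY_THRESHOLD_GB = 10


def index_name(source, day, daily_gb):
    """Return a weekly index name for high-volume sources, monthly otherwise."""
    if daily_gb >= WEEKLY_THRESHOLD_GB:
        # ISO year and week number, zero-padded to two digits.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{source}-{iso_year}.{iso_week:02d}"
    # Low-volume sources share one index per calendar month.
    return f"{source}-{day:%Y.%m}"


if __name__ == "__main__":
    d = date(2024, 3, 15)
    print(index_name("webapp", d, daily_gb=50))      # webapp-2024.11
    print(index_name("mobileapp", d, daily_gb=0.5))  # mobileapp-2024.03
```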
