Indexes and time to keep information

cachito · January 20, 2017, 9:56am

Hi

I am storing a lot of logs from a cloud application (architecture logs, application logs and syslogs) in Elasticsearch.

My idea is to keep at least the last 30 days of logs in Elasticsearch so that we can analyze application performance with Kibana dashboards. I am currently using one index for all the data.

A few days ago, an engineer told me that if I keep that much data in a single index, performance will be poor because Elasticsearch, given 1TB of data in one index, will try to use 1TB of RAM to load that index. That sounds a little bit weird to me, so I came here to ask more experienced people.

Could somebody tell me what the best practice is? Should I split my Elasticsearch indexes by rolling them by date, or keep just one index for the whole month? Is it true that a single index like that will eat all the RAM?
Also, is it OK to store data in Elasticsearch for historical analysis, or should I export it to a big data DB?

Thanks in advance
J

JKhondhu · January 20, 2017, 10:14am

Hi,

Is the last 30 days of data in a single index, or do you have an index for every single day, spanning a total of 30 days?
What is your shard sizing per index?
How much RAM is allotted to the (physical/virtual) server?

cachito · January 23, 2017, 9:15am

Hi

Thanks for your answer.

Regarding the index, well, that's my question. Right now we have just one index for all the days, but I am asking which is the best practice (one index for the whole month or one index per day).

Regarding the sizing, I do not have that value at the moment.

We have 8GB of RAM on the server.

Thanks in advance.
J

JKhondhu · January 23, 2017, 9:55am

Typically, what queries are you making to your current indices? Are they the type of query that looks at a single day's (cloud application) log data, or a couple of days'?

Are you seeing decent request and response times currently?

How much disk space can you allot to Elasticsearch data, out of the overall server disk space?

Christian_Dahlqvist · January 23, 2017, 10:02am

Explicitly deleting documents from an index, e.g. using delete-by-query, can be quite expensive, and it is often a lot cheaper to manage retention by using time-based indices and simply deleting a whole index once all data in that index has exceeded the retention period. This can be automated using Curator.
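
To make the idea concrete, here is a minimal Python sketch of retention with daily indices. It only works out which index names have aged out of a 30-day window and does not talk to a cluster. The `logs-YYYY.MM.DD` naming pattern, the `logs-` prefix and both function names are assumptions made for illustration, not something taken from this thread or from Curator.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "logs-"    # hypothetical prefix for the daily indices
DATE_FORMAT = "%Y.%m.%d"  # assumed date suffix, e.g. logs-2017.01.23


def index_name_for(day: date) -> str:
    """Name of the daily index that documents from `day` would be written to."""
    return f"{INDEX_PREFIX}{day.strftime(DATE_FORMAT)}"


def expired_indices(index_names, today: date, retention_days: int = 30):
    """Return the daily indices whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue  # not one of our daily log indices (e.g. .kibana)
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # suffix is not a date, so leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    today = date(2017, 1, 23)
    names = [index_name_for(today - timedelta(days=n)) for n in (0, 30, 31, 45)]
    names.append(".kibana")
    for name in expired_indices(names, today):
        print("would delete:", name)
```

Each name returned can then be dropped as a whole index (for example with the delete index API, `DELETE /<index-name>`), which avoids deleting documents one query at a time. Curator automates this kind of selection and deletion on a schedule.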

