How does ES or Filebeat know -
- How much data to load and store? i.e. will only data from this month, or from the last 3 months, be loaded? By loaded I mean stored for searching, so that it is found when a text is searched in Kibana.
- How is the data retention policy set for ES? i.e. a month of data is stored and the rest is deleted.
Which line of code controls the above?
A data retention policy isn't something Elasticsearch has natively. You can use
tools like Curator to manage creating and deleting daily or weekly indices.
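To make that concrete: Curator works by running an "action file" against your time-based indices. A minimal sketch of one (Curator 4.x syntax; the `filebeat-` prefix and the 30-day window are assumptions, adjust them to your own index naming and retention needs):

```yaml
# action.yml — delete daily indices older than 30 days
actions:
  1:
    action: delete_indices
    description: Delete daily indices older than 30 days
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: filebeat-        # assumed index name prefix
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'  # assumed date pattern in the index name
      unit: days
      unit_count: 30
```

You would then run this on a schedule (e.g. from cron) with `curator --config config.yml action.yml`, where `config.yml` points at your cluster.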
By default, how much data is loaded into ES via FB? Is there a time-frame setting to configure this?
Elasticsearch will keep all the data you throw at it. It won't delete anything unless you tell it to. A tool like Curator will tell it to do just that.
In addition to what @nik9000 said... each index consists of N shards and M replicas. Each shard is equivalent to one Lucene index, and each Lucene index can hold roughly 2B documents. From there, you can do your own calculation of how many shards one index should have, depending on whether you want 1) one big index, 2) one index per month, and so on.
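The capacity arithmetic above can be sketched in a few lines (the 5-shard example is an assumption for illustration, not a recommendation):

```python
# Each shard is one Lucene index, and a Lucene index is hard-capped
# at 2**31 - 1 (about 2.1 billion) documents.
LUCENE_MAX_DOCS = 2**31 - 1

def index_capacity(primary_shards: int) -> int:
    """Rough upper bound on documents a single index can hold."""
    return primary_shards * LUCENE_MAX_DOCS

# e.g. an index with 5 primary shards:
print(index_capacity(5))  # 10737418235, i.e. roughly 10.7B documents
```

In practice you would size shards well below this hard limit for performance reasons; the number only bounds what is possible.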
I want to retain Apache access log data for one month. Can you help me get this done? I have been told to use Curator, but I don't know how.
Can you suggest a better approach, or any document/link related to Curator that could help in my case?
Any advice would be appreciated.
If you are looking to purge records, using REST API calls is the easiest option. I faced some issues while configuring Curator, and it's been a long while, so I don't remember the details.
For now, I suggest using REST API calls to purge the data.
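For reference, purging a whole time-based index via the REST API is a single HTTP DELETE on the index name. A minimal sketch in Python (the host and the `filebeat-2017.01.01` index name are assumptions; the snippet only builds the request, since actually sending it requires a running cluster):

```python
import urllib.request

ES_HOST = "http://localhost:9200"  # assumed local cluster

def build_delete_request(index: str) -> urllib.request.Request:
    """Build an HTTP DELETE for one index, e.g. a daily index
    that has aged past the retention window."""
    return urllib.request.Request(f"{ES_HOST}/{index}", method="DELETE")

req = build_delete_request("filebeat-2017.01.01")
print(req.get_method(), req.full_url)
# To actually send it against a live cluster:
#   urllib.request.urlopen(req)
```

Dropping a whole index this way is far cheaper than deleting individual documents, which is why daily or weekly indices plus a scheduled delete (whether via Curator or plain REST calls) is the usual retention pattern.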