Hey guys, I recently joined a project that uses Elasticsearch to store the JSON request/response payload and the transformation activities for each API call made by an application. We are using the default settings with the infra below.
Platform: Linux
RAM: 15 GB
Disk Space: 600 GB
App logging activities: 4,009,267 (expected to grow as more users start using the API)
ElasticSearch Index: 1
"number_of_shards":"5"
"number_of_replicas":"1"
Mapped fields: 17
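For reference, the index effectively looks like it was created with the request below (a minimal sketch; the index name `api-logs` is just a placeholder). Note that `number_of_replicas: 1` means every shard is stored twice, so it doubles the disk usage:

```
# Hypothetical recreation of the current index; "api-logs" is a placeholder name
curl -X PUT "localhost:9200/api-logs" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}'
```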
With these settings, we filled up our file system within a week, which caused data loss.
- What are the best settings at the Elasticsearch level, or more specifically at the index level, to handle such a heavy data volume?
- Data must be retained for 90 days. Even when I use the delete-by-query API to delete the 91st day of data and then run a force merge to reclaim the disk space, it either doesn't work or takes a very long time to process (see the first sketch below this list).
- If I understand correctly, deleting or emptying an entire index is much, much faster, so I am thinking of creating indices on a weekly basis so that purging old data is quick (second sketch below). Any thoughts on this?
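To be concrete about the second point, the purge I am describing looks roughly like this (a sketch; it assumes a date field named `timestamp`, which may differ in our mapping, and `api-logs` is again a placeholder index name):

```
# Delete everything older than 90 days; this only marks each doc as deleted,
# one by one, so it is slow and does not immediately free disk space
curl -X POST "localhost:9200/api-logs/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": { "timestamp": { "lt": "now-90d/d" } }
  }
}'

# Then try to reclaim the space held by the deleted docs
curl -X POST "localhost:9200/api-logs/_forcemerge?only_expunge_deletes=true"
```

Even after the force merge, space comes back slowly because the segments containing deleted docs have to be rewritten.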
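And this is the weekly-index approach I have in mind (a sketch; the `api-logs-*` naming and the 1-shard setting are assumptions, not what we run today; the legacy `_template` API shown here fits 6.x clusters, while newer versions would use `_index_template`):

```
# Template so every weekly index (e.g. api-logs-2024.w07) gets the same settings
curl -X PUT "localhost:9200/_template/api-logs" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["api-logs-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

# Writers index into the current week's index, e.g.:
#   POST /api-logs-2024.w07/_doc
# Purging a week that has aged past 90 days is then a near-instant metadata operation:
curl -X DELETE "localhost:9200/api-logs-2023.w45"
```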