We would like to handle one year of logs from various sources, ranging from 80^6 events per day down to 1000 per day, with Kibana on top to provide reporting and dashboards of activity across those sources.
We plan to have several input sources, so in the end we will have to deal with a collection of 365*5 indices.
All indices will be standardized as much as possible in terms of indexed fields.
As a POC we tried to index 6 months of logs from only one source, and hit "memory heap" and "too many files open" errors, along with some latency in Kibana searches.
As far as I understand, ES + Kibana is mostly used for short-term analysis, not really for long-term log analysis.
Is ES suitable for this kind of task, and will it support this kind of scaling?
And what would be the best architecture we can deploy to cover this kind of task?
Currently I am testing it on my desktop machine (16 GB RAM, 8 CPUs), on which I have set up a cluster with 4 nodes.
This is the initial setup for the toy study; to go further we will scale out over several VMs in order to set up a more robust cluster.
I am trying to push the toy case as far as possible and stress it to check its robustness.
How many shards do you have in the cluster and what is the average shard size? Each shard in Elasticsearch is a separate Lucene index and carries with it a certain amount of memory and file descriptor overhead. For logging use cases a reasonable shard size is often from a few GB to tens of GBs, although we generally recommend keeping it below 50GB as very large shards can have a negative impact on recovery. If your shards are quite small, it may make sense to have applications/streams share indices (assuming mappings allow this), reduce the number of shards for the indices or even go from daily indices to weekly or monthly. One of the benefits of using time-based indices is that you can change the number of shards for an index for the next period if volumes change.
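To make the shard overhead concrete, here is a rough back-of-the-envelope sketch in Python. It takes the 5 sources and 365 days from the question; the 5 primary shards and 1 replica per index are only an assumption (the actual settings were not stated), and the function names and the 30 GB target shard size are made up for illustration.

```python
import math


def total_shards(num_sources, days, days_per_index, primaries_per_index, replicas):
    """Count all shards (primaries + replicas) the cluster would hold."""
    # One index per source per period (day, week, month, ...).
    indices_per_source = math.ceil(days / days_per_index)
    shards_per_index = primaries_per_index * (1 + replicas)
    return num_sources * indices_per_source * shards_per_index


def primaries_for_volume(index_size_gb, target_shard_gb=30):
    """Primary shards needed to keep each shard near a target size.

    target_shard_gb=30 is an assumed value inside the "few GB to tens
    of GB" range; the reply suggests staying below 50 GB per shard.
    """
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    # Assumed settings: 5 primaries and 1 replica per index.
    for label, period in [("daily", 1), ("weekly", 7), ("monthly", 30)]:
        print(label, total_shards(5, 365, period, 5, 1))
    # Prints:
    # daily 18250
    # weekly 2650
    # monthly 650
```

Under those assumptions, daily indices for a year mean around 18,000 shards, which would go a long way towards explaining heap pressure and "too many files open" on a single desktop. Moving the low-volume sources to weekly or monthly indices, and sizing primaries from the real index size (for example `primaries_for_volume(2)` gives 1 for a 2 GB index), brings the count down sharply.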