I would like to know how much processing power ingest processing takes on a node.
We would like to preprocess our logs on their way into Elasticsearch, and I would like to create ingest pipelines with grok patterns.
The problem is that we have a lot of data: roughly 25M log lines per day go into Elasticsearch.
This basically boils down to the question of where you want to spend the CPU time: on each system before sending the logs, or centrally in Elasticsearch. How much CPU is used depends heavily on the regexes you are going to use, so it is hard to predict. On top of that, having an ingest pipeline adds a certain overhead compared to having none.
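For illustration, an ingest pipeline with a single grok processor might look like the following (the pipeline name, field, and pattern are made up for the example — you would substitute your own log format):

```
PUT _ingest/pipeline/my-logs-pipeline
{
  "description": "Parse example access-log lines",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IPORHOST:client_ip} %{WORD:http_method} %{URIPATHPARAM:url_path}"]
      }
    }
  ]
}
```

You can then test it against sample documents with the `_ingest/pipeline/my-logs-pipeline/_simulate` endpoint before pointing real traffic at it, which also gives you a feel for whether the pattern matches as intended.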
I'd start with a test to check whether you have enough CPU available for roughly 300 events per second (your 25M lines per day, on average and without peaks). The good thing is that Elasticsearch supports dedicated ingest nodes, so if you need more capacity you can either add more ingest nodes or add more CPUs to the existing ones. Scaling therefore becomes easy.
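If you want a very rough ballpark of the regex cost before setting up any nodes, you can time the pattern outside Elasticsearch. The sketch below uses Python's `re` module with a hypothetical log line and a pattern loosely equivalent to the grok expression above; Elasticsearch's grok engine (Joni) behaves differently, so treat the resulting rate only as an order-of-magnitude indicator, not a capacity plan:

```python
import re
import time

# Hypothetical sample line and pattern, roughly equivalent to
# "%{IP} %{WORD} %{URIPATHPARAM}". Elasticsearch's grok engine
# differs from Python's re, so this is only a ballpark figure.
LINE = "192.168.0.1 GET /api/v1/items?id=42"
PATTERN = re.compile(
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) (?P<method>\w+) (?P<path>\S+)"
)

def events_per_second(n=100_000):
    """Time n regex matches and return the achieved rate (events/s)."""
    start = time.perf_counter()
    for _ in range(n):
        PATTERN.match(LINE)
    elapsed = time.perf_counter() - start
    return n / elapsed

if __name__ == "__main__":
    rate = events_per_second()
    print(f"{rate:,.0f} events/second on this machine for this pattern")
```

If the single-threaded rate is already orders of magnitude above your 300 events/second average, the grok work itself is unlikely to be the bottleneck; peaks and the rest of the pipeline overhead are what you would then measure on a real ingest node.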
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.