I have searched Google and this forum, but I am still undecided.
I would like to build a system that collects logs from several servers and then analyzes them.
I thought about using Beats to collect the logs (and maybe Logstash to parse them, or similar), with Elasticsearch as a centralized store. Then I would use Spark to read from ES and process the data to build some machine learning models.
However, while searching I noticed that some people use Spark to preprocess the data before writing it into ES. Why don't they use Logstash for that? Is Spark better than Logstash?
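For context on what "preprocessing" means here: both Logstash (via its grok filter) and a Spark job typically do the same basic work of turning raw log lines into structured fields before indexing. A minimal sketch in plain Python, assuming a hypothetical Apache-combined-style log format, just to illustrate the kind of parsing either tool would perform:

```python
import re

# Pattern for a hypothetical Apache-combined-style access log line.
# Logstash would express this as a grok pattern; Spark would apply
# something equivalent in a map/UDF step before writing to ES.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line):
    """Parse one log line into a dict of named fields, or None on mismatch."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
print(parse_line(sample))
```

The practical difference is less about the parsing itself and more about where it runs: Logstash does this per-event as logs stream in, while a Spark job can do it in batch (and at larger scale) before or after the data lands in ES.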
Also, is there any way to identify the likely bottleneck in this kind of pipeline?