I have a lot of application logs to collect, and the data flows through filebeat => kafka => logstash => elasticsearch. Normally everything runs smoothly, but sometimes the application log volume suddenly increases, for example when the number of users becomes very large. Messages are then produced to Kafka much faster than Logstash consumes them. I want to fix that, but I don't know where to start.
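One way to put a number on "Kafka produces faster than Logstash consumes" is consumer lag: the gap between each partition's log-end offset and the offset the Logstash consumer group has committed. The sketch below assumes the offsets have already been collected (for example from the output of `kafka-consumer-groups.sh --describe`); the function name and the sample numbers are hypothetical.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return (per-partition lag, total lag) for one consumer group.

    Both arguments map partition id -> offset. A partition with no
    committed offset is treated as fully unconsumed (offset 0).
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag, sum(lag.values())


# Hypothetical offsets for a 3-partition topic.
end_offsets = {0: 120_000, 1: 118_500, 2: 121_300}
committed_offsets = {0: 100_000, 1: 118_500, 2: 90_300}

per_partition, total = consumer_lag(end_offsets, committed_offsets)
print(per_partition, total)
```

A lag that keeps growing during a traffic spike only shows that the pipeline after Kafka is slower than the producers. It does not by itself say whether Logstash or Elasticsearch is the limiting stage.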
Logstash can only send as fast as Elasticsearch can accept the data. How have you determined that Logstash is the bottleneck and not Elasticsearch?
I ran a stress test on Elasticsearch (using "esrally"), and the write throughput was at least twice what I see when Logstash is writing. So I began to wonder whether Logstash is the bottleneck that limits the write speed.
Did you test with the type of data you are indexing or with one of the standard tracks? How do you measure indexing throughput?
- I think my testing may not have been thorough enough. I used a simpler structure with fewer fields than the real data, and some of the analyzed (tokenized) fields were not taken into account. I need to test with data close to the real thing before I can give you an answer.
- "esrally" produces a report after each test, and the throughput can also be seen in Kibana's monitoring.
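For a measurement that does not depend on a dashboard, the index stats API (`GET /<index>/_stats`) reports a cumulative `indexing.index_total` counter, and sampling it twice gives an average indexing rate. A minimal sketch, assuming the two counter values have already been fetched; the function name and the sample values are hypothetical.

```python
def indexing_rate(sample_a, sample_b):
    """Average docs indexed per second between two samples.

    Each sample is a (timestamp_seconds, index_total) pair, where
    index_total is the cumulative counter read from the stats API.
    """
    (t0, n0), (t1, n1) = sample_a, sample_b
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (n1 - n0) / elapsed


# Hypothetical samples taken 60 seconds apart.
rate = indexing_rate((0.0, 1_000_000), (60.0, 1_300_000))
print(f"{rate:.0f} docs/s")
```

Taking the samples once while Logstash is writing and once during the esrally run would let the two numbers be compared on the same basis.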
If you use documents per second as the measurement, it will vary quite a lot depending on document size, because Elasticsearch needs to do more work and more disk I/O for larger documents.
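To illustrate why documents per second can mislead, the sketch below reports both documents per second and megabytes per second for two runs. The document counts and sizes are made-up numbers chosen only to show the effect, not measurements from this cluster.

```python
def throughput(doc_count, total_bytes, elapsed_seconds):
    """Return (docs per second, megabytes per second) for one test run."""
    docs_per_s = doc_count / elapsed_seconds
    mb_per_s = total_bytes / elapsed_seconds / 1_000_000
    return docs_per_s, mb_per_s


# Hypothetical runs of 60 seconds each.
small = throughput(600_000, 600_000 * 500, 60)    # 500-byte documents
large = throughput(100_000, 100_000 * 3_000, 60)  # 3000-byte documents

print(f"small docs: {small[0]:.0f} docs/s, {small[1]:.1f} MB/s")
print(f"large docs: {large[0]:.0f} docs/s, {large[1]:.1f} MB/s")
```

In this made-up case both runs move the same number of bytes per second, yet the run with small documents shows six times the document rate. A benchmark with simplified documents can therefore look much faster than a pipeline carrying real logs.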
I see what you mean. As you said, my documents are sometimes larger, such as nginx logs or Java stack traces, which may affect the throughput.
I think I should test again with near-real data and observe the performance of Elasticsearch. Until then, thank you very much for your answer. I may ask you again after the test :100: