I have deployed an Elastic Stack, mostly using just Kibana and Elasticsearch (3 master, 5 data, 5 ingest nodes), version 7.5. I have about 20 application servers, and I installed Beats on each of them. It seemed to work for a while, but after a week or so the Beats would stop sending data and start storing it locally. This caused the application servers to crash because it consumed the majority of the disk drive (I used the default Beats configuration, with the only modification being the Elasticsearch and Kibana host URLs).
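For reference, the only part of the configuration I changed looks roughly like this (the hostnames below are placeholders, not my real endpoints):

```yaml
# filebeat.yml - everything else left at the defaults
output.elasticsearch:
  hosts: ["http://my-es-host:9200"]    # placeholder URL

setup.kibana:
  host: "http://my-kibana-host:5601"   # placeholder URL
```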
I should also mention that each individual host in the Elasticsearch cluster still has enough disk space.
I didn't seem to have this issue on older versions. Does this have to do with how I configured the Beats, or is it due to something else?
I can see that the logs are stored in /var/logs on the application servers. However, the Elasticsearch nodes still have over 40% of their space remaining, and I have a serverless function that expands the EBS volumes of the Elasticsearch servers.
I don't think it is the application's fault, because we have run the application for over a year now without any effect, and when we turned off the Beats and removed the locally stored logs, the application servers started working fine again and didn't display any issues.
I am using version 7.2 of Elasticsearch, Kibana, and Beats.
Do you mean that the Beats logs are stored in /var/logs and take up all the disk space?
This shouldn't happen. Could you check the size of the Beats logs, and whether something is flooding them?
How are you deploying Beats on these machines? If you are deploying them in Docker, or on machines with systemd, you can configure Beats to log to standard output/error, and then they don't log to files.
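As a rough sketch, something like this in the Beat's configuration should redirect logging away from files; these are the 7.x logging options as I remember them, so double-check them against the reference for your version:

```yaml
# Write logs to stderr (picked up by journald under systemd, or by
# Docker's log driver) instead of accumulating files under /var/log.
logging.to_stderr: true
logging.to_files: false
```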
The Beats logs are stored in /var/logs/filebeats/ (or under the corresponding Beat's name). They were installed on the machines via RPM. Some of the Beats logs were as big as 58GB. The Beats messages say that Elasticsearch is no longer available, even though we can see Elasticsearch still has plenty of space and room to spare.
We have destroyed the environment and spun up a new Elasticsearch cluster, but I am just trying to figure out what would have caused this to happen and how to avoid it.
I think you should focus on investigating this; Beats are going to continue logging errors as long as the connection problem with Elasticsearch persists.
In any case, it is weird that the Beats logs reached 58GB; they should have been rotated well before that according to the default log settings (see Configure logging | Metricbeat Reference [7.5] | Elastic). Have you tried modifying these logging options?
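For example, here is a sketch of the file-rotation options from that page; the values shown are the documented 7.x defaults, so tighten them if you need a smaller footprint:

```yaml
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat     # default logs path for the RPM install
  name: filebeat
  keepfiles: 7                # keep at most 7 rotated files
  rotateeverybytes: 10485760  # rotate at 10 MB
  permissions: 0600
```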