Hi, I am planning to deploy Metricbeat to monitor basic server metrics. Is it fine to ship Metricbeat data directly to the data nodes (without Logstash or Kafka) in this situation? Is there anything else I should take into consideration?
Elasticsearch/Metricbeat version: 7.8.1
Data nodes: 3
Monitored servers: 4000
Monitored server platforms: Linux & Windows
Metrics to collect: "system" dataset only
Metric sampling rate: 1 min
Yes, Metricbeat ingesting directly into an Elasticsearch cluster is fine and a supported architecture.
At your scale you will want a well-designed cluster with some extra capacity, and perhaps some ingest or coordinating nodes to absorb spikes and back pressure.
You can also do some tuning on both the Beats and Elasticsearch sides to help with spikes and back pressure.
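As a rough sketch of the knobs involved on the Metricbeat side (the hostnames are placeholders and the values are illustrative starting points to experiment with, not recommendations; the defaults are often fine until your POC says otherwise):

```yaml
# metricbeat.yml -- illustrative tuning knobs, not prescriptions
output.elasticsearch:
  # If you add ingest/coordinating nodes, point the agents at those
  # (hypothetical addresses) instead of the data nodes directly.
  hosts: ["https://ingest-1:9200", "https://ingest-2:9200"]
  worker: 2               # parallel bulk workers per configured host (default 1)
  bulk_max_size: 1600     # events per bulk request (default 50)
  compression_level: 1    # gzip bulk requests to cut network traffic (default 0)

# A larger in-memory queue gives each agent room to ride out short
# periods of back pressure before throughput suffers.
queue.mem:
  events: 8192            # default 4096
  flush.min_events: 1600  # default 2048
  flush.timeout: 5s       # default 1s
```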
Metrics tend to be less spiky than logs, but spikes can still happen, e.g. when agents reconnect after a network issue and flush their buffered events all at once.
If I were you, I would run a POC with 20+ hosts running Metricbeat and observe the amount of storage needed daily over a few days to a week.
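For reference, a minimal metricbeat.yml sketch for such a POC, assuming the common system metricsets at your 1-minute sampling rate (the output host is a placeholder, and some metricsets, such as load, are not available on Windows):

```yaml
metricbeat.modules:
  - module: system
    period: 1m
    metricsets:
      - cpu
      - load              # Linux only; drop on Windows hosts
      - memory
      - network
      - filesystem
      - process_summary
      - uptime

output.elasticsearch:
  hosts: ["https://es-data-1:9200"]  # placeholder address
```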
Then calculate the number of nodes you will need to ingest, store, and retain the metrics for all 4,000 hosts, and make sure you leave some extra capacity.
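To make that concrete with a made-up figure: if the POC shows, say, 200 MB per host per day (substitute whatever you actually observe), 4,000 hosts would produce roughly 800 GB/day, about 24 TB over a 30-day retention window, and double that with one replica, i.e. around 48 TB of usable disk plus headroom spread across your data nodes.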