I have a few machines that provide web services, and they use ES as a backend. Specifically, I install a node on each service machine (node.master: false, node.data: false, node.ingest: true) so I need not worry about which node to connect to; I just connect to localhost:9200.
Can I do this at a larger scale, about 200+ machines?
I currently have 200+ ETL machines submitting their logs to a single ES node. ES frequently takes over 5 minutes to respond (my timeout is set to 5 minutes). If each ETL machine had its own localhost node, ingestion would not have this problem, but other problems may appear because of the number of nodes.
My alternative is to have the ETL machines randomly choose a node to submit logs to, but this is not ideal: I would have to write code that identifies which nodes are available and keeps that list current over time, because most nodes are ephemeral.
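To make that alternative concrete, here is a minimal sketch of the "pick a random node and fall back to another" idea, assuming each ETL machine already has some list of candidate node URLs. The function names `post_bulk` and `submit_with_failover`, the timeout value, and the example URLs are hypothetical; only the `_bulk` endpoint and its NDJSON content type come from Elasticsearch itself. It does not solve the harder part, which is keeping the node list current.

```python
import random
import urllib.request


def post_bulk(node_url, payload, timeout=30.0):
    """POST an NDJSON payload (bytes) to the node's _bulk endpoint; return the raw body."""
    request = urllib.request.Request(
        node_url.rstrip("/") + "/_bulk",
        data=payload,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def submit_with_failover(nodes, payload, send=post_bulk, rng=random):
    """Try the candidate nodes in random order until one accepts the payload.

    Returns (node_url, response). Raises ConnectionError if every node fails.
    """
    candidates = list(nodes)
    rng.shuffle(candidates)  # spreads load across nodes without coordination
    failures = []
    for node in candidates:
        try:
            return node, send(node, payload)
        except OSError as exc:  # covers URLError, HTTPError, and socket timeouts
            failures.append(f"{node}: {exc}")
    raise ConnectionError("no node accepted the payload: " + "; ".join(failures))
```

A call might look like `submit_with_failover(["http://es-a:9200", "http://es-b:9200"], ndjson_bytes)`, where the host names are placeholders. The `send` parameter is injectable so the retry logic can be exercised without a live cluster.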
Thank you, I figured there may be a "few reasons".
Putting ingestion behind a load balancer is also an option; it is very similar to setting up an ingestion node just for logs. I just wanted to avoid yet another moving part in the constellation of machines.