I need help with the setup described below. Can someone please share some insight? [I searched a lot through previous discussions but was unable to find the exact details, hence posting here.]
I have around 5-6k nodes/devices/servers. I am planning to use:
1 Logstash instance
3 ES instances [one dedicated master node; the other two are data nodes that are also master-eligible]
My main concerns are as below:
- Where do I configure the Kafka broker? On Logstash?
- Is there any good information or a how-to available for integrating Logstash with Kafka?
- How does my single Logstash instance connect to the ES cluster?
- Are there any other steps I need to perform for data resiliency?
Logstash is a pipeline engine that allows you to configure one or more data pipelines, each of which has at least one input, zero or more filters, and at least one output.
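As a minimal illustration of that shape, a pipeline that reads lines from stdin and echoes them back out could look like this (a sketch for demonstration, not a production config):

```
input {
  stdin { }
}
filter {
  # no filters; events pass through unchanged
}
output {
  stdout { codec => rubydebug }
}
```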
If you have a Kafka topic that you wish to subscribe to, I suspect you would create a Logstash pipeline that looks something like this:
input {
  kafka {
    # kafka connection config
    # see: https://www.elastic.co/guide/en/logstash/6.7/plugins-inputs-kafka.html
  }
}
filter {
  # one or more filters to mutate and/or enrich the events
  # see: https://www.elastic.co/guide/en/logstash/6.7/filter-plugins.html
}
output {
  elasticsearch {
    # elasticsearch connection config
    # see: https://www.elastic.co/guide/en/logstash/6.7/plugins-outputs-elasticsearch.html
  }
}
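To make the input side concrete, a filled-in kafka input might look like the following. The broker hostnames, topic name, and consumer group here are placeholders, not values from your environment:

```
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"  # placeholder Kafka broker addresses
    topics => ["logs"]                              # topic(s) to subscribe to
    group_id => "logstash"                          # consumer group for this pipeline
    codec => "json"                                 # assumes events arrive JSON-encoded
  }
}
```

The Kafka brokers themselves are configured and run independently of Logstash; the input plugin only needs to know where to reach them.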
I see. And what should my output look like if I have a cluster? Should it send logs to the dedicated master?
Typically you would list several nodes in the cluster, which may or may not be masters. The output will balance the requests between them, and each Elasticsearch node will route the requests to the best node to handle the indexing.
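For example, with the three-node cluster described above, an output listing the two data nodes (hostnames here are placeholders) lets Logstash balance requests between them while keeping the dedicated master out of the indexing path:

```
output {
  elasticsearch {
    # list the data nodes, not the dedicated master;
    # the output distributes requests across these hosts
    hosts => ["http://es-data1:9200", "http://es-data2:9200"]
    index => "logs-%{+YYYY.MM.dd}"  # daily index; adjust to taste
  }
}
```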
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.