Hi All,
I have a use case where I fetch data from a Kafka topic with Logstash and push it to Elasticsearch, i.e. Logstash acts as a forwarder shipping logs to ES. How can we achieve failover in this setup? If one Logstash instance goes down, we need to fall back to another Logstash node. Is cluster formation possible in Logstash, like in Elasticsearch? Please suggest how we can achieve the above scenario.
Thanks
phani
Logstash currently doesn't have any clustering ability of the kind you describe. I don't know Kafka that well, but can't you point multiple Logstash instances at the same Kafka topic and/or partition or would that cause duplication?
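As a rough sketch of what I mean (broker addresses and the topic name are placeholders, and the exact option names vary between kafka input plugin versions, e.g. newer versions use topics while older ones used topic_id): if every Logstash instance joins the same Kafka consumer group, Kafka spreads the topic's partitions across them, and when one instance dies its partitions are reassigned to the surviving instances, which gives you failover without a Logstash cluster.

```
input {
  kafka {
    bootstrap_servers => "kafka1:9092"   # placeholder broker address
    topics            => ["app-logs"]    # placeholder topic name
    group_id          => "logstash"      # same group_id on every Logstash node
  }
}
```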
Hi @magnusbaeck,
No, duplication won't be a problem because we have designed the index that way (_id is unique), but I think Logstash may pull all the data from the topic each time. I have the following concerns about pointing multiple Logstash instances at the topic:
- How can we specify the _id when indexing to ES from Logstash? Let's say my id attribute has some value; if that is applied to _id, we can remove duplicates.
- How can we get only the latest logs from Kafka (or from any log file)? Please advise if there is a parameter on the Logstash side to do this.
No, duplication won't be a problem because we have designed the index that way (_id is unique), but I think Logstash may pull all the data from the topic each time.
Yes, that's what I meant.
How can we specify the _id when indexing to ES from Logstash? Let's say my id attribute has some value; if that is applied to _id, we can remove duplicates.
Put document_id => "%{id}"
in your elasticsearch output configuration.
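For context, a minimal elasticsearch output block using that setting might look like the following (the hosts and index name are placeholders, and id is assumed to be a field already present on each event):

```
output {
  elasticsearch {
    hosts       => ["localhost:9200"]  # placeholder ES address
    index       => "my-index"          # placeholder index name
    document_id => "%{id}"             # reuse the event's id field as _id
  }
}
```

With document_id set, re-sending an event with the same id overwrites the existing document instead of creating a duplicate.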
@magnusbaeck Thank you, I will try pointing multiple Logstash instances at Kafka and check my scenarios.
Also, can the scroll parameter on an input help me pull only the latest records each time?