Hello,
I'm collecting logs via filebeat and sending them to kafka, which is in the cloud.
I need a proxy for my server to reach the kafka output.
Is it possible?
There is nothing in the documentation regarding kafka output and proxy.
I tried proxy_url: 'http://ip:port', but filebeat does not even try to use the proxy.
I'm no expert on filebeat, but I see nothing in the filebeat documentation that makes it clear that the kafka output supports a proxy. By contrast, the docs are explicit about proxy support for filebeat's elasticsearch output (via proxy_url).
And logstash's Kafka output, for example, explicitly does not support a proxy, as set out in its documentation.
That's at least a little suggestive that you might be unlucky here.
Thanks for the answers.
The Kafka output is in Azure Confluent Cloud.
The server running filebeat does not have direct access to the internet. We can add firewall rules to reach the kafka server (this works), but the logs are not published because filebeat needs to communicate with the other kafka brokers as well. We added rules for those brokers too, and logs for some kafka topics started to be published, but not all, since there were other brokers in use that we were not aware of. We are worried that, from time to time, more brokers may be added or something may change, and the connection will stop working again. That is why we came up with the idea of a proxy that can reach all brokers: our HTTP proxy.
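To illustrate why some topics published and others did not: a Kafka client uses the configured address only to bootstrap, then connects directly to whichever broker leads each partition. Below is a minimal sketch of that effect. The broker names, topics, and allow-list are made up for illustration and do not come from filebeat or the real cluster.

```python
# Illustrative only: broker names, topics, and the allow-list are hypothetical.
from typing import Dict, List, Set


def blocked_topics(
    partition_leaders: Dict[str, List[str]], allowed: Set[str]
) -> Dict[str, List[str]]:
    """Return, per topic, the leader brokers that the firewall does not allow."""
    blocked = {}
    for topic, leaders in partition_leaders.items():
        missing = sorted(set(leaders) - allowed)
        if missing:
            blocked[topic] = missing
    return blocked


# Hypothetical cluster metadata: topic -> leader broker of each partition.
partition_leaders = {
    "app-logs": ["b0.example:9092", "b1.example:9092"],
    "audit-logs": ["b1.example:9092", "b2.example:9092"],
}

# Hypothetical firewall rules: only the brokers we happened to know about.
allowed = {"b0.example:9092", "b1.example:9092"}

result = blocked_topics(partition_leaders, allowed)
print(result)
```

With this made-up data, "app-logs" is fully reachable while "audit-logs" has a partition led by a broker outside the allow-list, which matches the "some topics work, others don't" behaviour described above.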
In the meantime, I did some more research and, as you said, an HTTP proxy cannot be used for Kafka. However, I found a Kafka proxy. According to some documentation I read, this is exactly what we need.
One of the situations where it can help is: "Network topologies preventing direct access to broker nodes".
Do you know whether it could really solve our issue?
Thanks
This isn't really an elastic stack / filebeat question now (that was answered).
It's more to do with your own topology, network setup, and dynamics of your environment/organisation.
That's pretty imprecise, but in any case it relates to some documentation about some other, unspecified, third-party tool?
Personally speaking, when I read about your worry that brokers may be added or changed without your knowledge, I'd consider that a communication issue within your organization/partners. If your fear is that something will be changed that impacts you, but such changes are outside your knowledge and control, there is no technical solution that can absolutely protect you from the problems that might create.
Yes, it is a third-party tool. There seem to be several "Kafka proxies"; one, for example, is from Confluent. Nevertheless, as you said, it is outside the scope of elastic/filebeat.
Yes, communication is the main issue here. There are many middlemen without enough information, so the best solution is to have something like a proxy that can connect to all the brokers on port 9092 without us needing to know their IPs.
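For what it's worth, the general idea behind a Kafka-aware proxy, as opposed to an HTTP proxy, is that it sits in the Kafka protocol path and replaces the broker addresses returned in metadata with its own listeners, so the client only ever needs to reach the proxy. Here is a rough sketch of that address mapping. The host names and ports are hypothetical, and actual proxy products may assign listeners differently.

```python
# Illustrative only: host names and ports are hypothetical, and real Kafka
# proxies differ in how they assign listeners to brokers.
from typing import Dict, List


def advertise_via_proxy(
    brokers: List[str], proxy_host: str, first_port: int
) -> Dict[str, str]:
    """Map each real broker address to a distinct listener on the proxy."""
    return {
        broker: f"{proxy_host}:{first_port + i}"
        for i, broker in enumerate(sorted(brokers))
    }


# Hypothetical broker list as the cluster would advertise it.
brokers = ["b2.example:9092", "b0.example:9092", "b1.example:9092"]

mapping = advertise_via_proxy(brokers, "kafka-proxy.internal", 19092)
for real, local in mapping.items():
    print(f"{real} -> {local}")
```

The firewall then only has to allow the proxy host itself; which brokers exist behind it becomes the proxy's concern rather than the client's.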
I found the answer: an HTTP proxy cannot be used. I will look for some other solution.