I have built a test environment with 1 master node, 2 data/master nodes, and 2 logstash client nodes (no master / no data).
The logstash client nodes are configured to send output using elasticsearch with the http protocol and send to the local host. I have elasticsearch on each of the logstash servers running and pointing to the 3 servers (master, data1, and data2).
elasticsearch {
host => "logstash1"
cluster => "elasticsearch"
protocol => "http"
}
When I look at the elasticsearch log on the 2 logstash servers, I see 'disconnected from [[master] due to explicit disconnect call. When I look at the master, I see '[[master] timed out waiting for all nodes to process published state [1010]...
The logstash nodes keep getting dropped and then added again about once or twice every two hours. Is this normal behavior?
There is a firewall between the logstash nodes and the ES servers. I have read that can cause issues, but I have tried setting the timeout window thresholds higher and setting up a heartbeat plugin. Nothing has stopped the disconnects.
What am I missing? Is there a better way to do this?
Also don't include the master nodes in the list. You can also try the transport protocol which uses port 9300. This may or may not stop your firewall from interfering.
One more note, you should change the cluster name to something else then elasticsearch to avoid any issues with unknown nodes joining your cluster accidentally.
Thank you. I was running elasticsearch on the logstash nodes as well to make the connection to the cluster. But with this information, I now see that it is unnecessary and I can use the elasticsearch to take care of the connection.
I do have one other question. In the documentation for the elasticsearch plugin, it says:
"If you want to set other Elasticsearch options that are not exposed directly
as configuration options, there are two methods:
Create an elasticsearch.yml file in the $PWD of the Logstash process
Pass in es.* java properties (java -Des.node.foo= or ruby -J-Des.node.foo=)"
I need to use the network.publish_host setting for elasticsearch (so that the other nodes can communicate back with the Logstash server). Given the above, I know that I have to place the elasticsearch.yml file in the $PWD but is there anything else I need to do? Does the elasticsearch plugin automatically use the file if it is there, or do I have to specify it somewhere?
Right now, I am getting 'failed to send join request to master [[master1]... nested: ConnectTimeoutException...' I am guessing that is because I had to set the network.publish_host and network.bind_host settings in elasticsearch.yml file before to get it to work.
Where logstash looks for elasticsearch.yml can be answered by using strace like this:
$ pwd
/opt/elastic/logstash
$ strace -f -s99 -o strace.log /opt/elastic/logstash/bin/logstash -f complete.conf
$ grep elasticsearch.yml strace.log
8959 stat("elasticsearch.yml", 0x7fc2cd4bdb60) = -1 ENOENT (No such file or directory)
8959 stat("/opt/elastic/logstash/config/elasticsearch.yml", 0x7fc2cd4bdb60) = -1 ENOENT (No such file or directory)
8959 stat("/opt/elastic/logstash/elasticsearch.yml", 0x7fc2cd4bce20) = -1 ENOENT (No such file or directory)
8959 stat("/opt/elastic/logstash/config/elasticsearch.yml", 0x7fc2cd4bce20) = -1 ENOENT (No such file or directory)
So its better to say that logstash looks in the $CWD of where you start logstash. So in this instance it would be /opt/elastic/logstash. If you are not sure what your CWD of your logstash process is, check ls -l /proc/logstash_pid/cwd.
Thanks for the clarification about $CWD and not $PWD. I am starting logstash as a service using an init script, so with the information you gave me, I was able to determine that I needed to add a cd "/opt/elastic/logstash" at the top of the init script to set the $CWD properly. It is now picking up the elasticsearch.yml file.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.