When I ingest documents via Logstash, I find that the documents are created twice.
Setup:
Filebeat -> Logstash -> Elasticsearch
Result:
The same messages are ingested into Elasticsearch, each with a different _id.
So I ran the following test:
Testing 1: Ingest documents directly from Filebeat
i.e. Filebeat -> Elasticsearch
Result: 1 document is created
Finding: the issue should be related to Logstash or Elasticsearch
Are you running Logstash as a service? Do you have more than one logstash config file in the config directory? Be aware that Logstash concatenates all config files it finds in the directory into a single logical pipeline, so if you have more than one file containing the same Elasticsearch output, all data will go to all outputs. If you therefore update one to use the fingerprint but not the other, you will see exactly what you are describing.
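For reference, the fingerprint approach mentioned above is typically set up along these lines. This is only a sketch: the `hosts` value and the choice of hashing the `message` field are assumptions, not details taken from this thread.

```conf
filter {
  # Hash the raw message so identical events always produce the same ID
  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]   # assumed host
    # Re-ingesting the same event overwrites the document instead of duplicating it
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

With this in place a duplicated event updates the existing document rather than creating a second one, but as described below it does not help if a second output without the fingerprint is still in the pipeline.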
If you want to use multiple configuration files and keep them separate you need to either control the flow through conditionals or configure them as multiple separate pipelines.
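A conditional-based layout might look like the sketch below, assuming each input tags its events (the tag names, ports, index names, and host are illustrative, not taken from this thread):

```conf
input {
  beats { port => 5044 tags => ["web"] }
  beats { port => 5055 tags => ["folder"] }
}
output {
  # Route each tagged stream to its own output, so events are not
  # sent to every output in the concatenated pipeline
  if "web" in [tags] {
    elasticsearch { hosts => ["localhost:9200"] index => "web-%{+YYYY.MM.dd}" }
  } else if "folder" in [tags] {
    elasticsearch { hosts => ["localhost:9200"] index => "folder-%{+YYYY.MM.dd}" }
  }
}
```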
My Logstash is running as a service. I installed it via yum and just upgraded to version 6.3.2.
When I stop Logstash, the listening port is closed; when I start it, the dedicated TCP port is opened and listening. I think only one process can listen on a given TCP port.
As for more than one process running, here is my thought:
Case 1: More than one Filebeat process is running
I have configured only one filebeat.yml on this machine, and in Logstash I can see that each message arrives only once.
Case 2: More than one Logstash process is running
I doubt this is possible, because a second process should fail to start while the TCP port is already held by the first. So I think only one Logstash process handles the messages sent via that TCP port.
I have two conf files under /etc/logstash/conf.d for two purposes:
Logstash Config 1
Opens and listens on TCP 5044.
Filebeat sends logs from nginx and apache to Logstash:5044.
Logstash Config 2
Opens and listens on TCP 5055.
Filebeat watches a folder and sends its logs to Logstash:5055.
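To make the failure mode concrete, here is a minimal sketch of two such files (the contents are assumed, not copied from the actual configs). Once Logstash concatenates them into one pipeline, every event from either input passes through both outputs:

```conf
# /etc/logstash/conf.d/config1.conf (sketch)
input  { beats { port => 5044 } }
output { elasticsearch { hosts => ["localhost:9200"] } }

# /etc/logstash/conf.d/config2.conf (sketch)
input  { beats { port => 5055 } }
output { elasticsearch { hosts => ["localhost:9200"] } }
```

Neither output has a conditional around it, so an event received on port 5044 is indexed once by each output, producing two documents with different _id values.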
I tried disabling Config 1 (i.e. leaving just one config file in /etc/logstash/conf.d) and found that Logstash ingests each message only once (i.e. the "duplicated messages" issue does NOT occur).
If I enable both config files, the "duplicated messages" issue appears.
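The other fix mentioned above is to declare the two files as separate pipelines in /etc/logstash/pipelines.yml, so each runs in isolation with its own inputs and outputs. A sketch, with illustrative pipeline IDs and file names:

```yaml
# /etc/logstash/pipelines.yml (sketch)
- pipeline.id: web-logs
  path.config: "/etc/logstash/conf.d/config1.conf"
- pipeline.id: folder-logs
  path.config: "/etc/logstash/conf.d/config2.conf"
```

Each pipeline then has its own event flow, so events arriving on one input are no longer sent through the other file's output.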
Note:
I searched for logstash.yml on the machine and found only one copy: /etc/logstash/logstash.yml.
Exactly. If you have two files there, each with an Elasticsearch output, these will be concatenated into one single pipeline and all events will go to all outputs. As mentioned earlier you need to use either conditionals or multiple pipelines to control this.
Let me double-check the Logstash log to see whether it shows the "concatenated" pipeline, or warns me that the same output appears twice and a single message will be sent to the same destination more than once.
That is how it works, and it will not warn you. If you have X-Pack monitoring installed, you should be able to see the resulting pipeline using the pipeline viewer and verify that it contains two separate Elasticsearch outputs.