Is my Data being received by Elasticsearch?


#1

I'm quite new to the ELK stack and have reached my limit searching for an answer. I'm trying to index some logs that I assume have been sent through Logstash, but Kibana gives me the error "Unable to fetch mapping. Do you have any indices matching the pattern?". A tcpdump shows traffic going to port 9200 on the loopback interface, so I assume this is Logstash sending my syslog traffic to Elasticsearch, since I don't have a Kibana web page open that would create that traffic.

The issue is that I don't know how to verify whether Elasticsearch has received my data. In elasticsearch.yml I declared a 'path.data', which I assume is where the logs should go, but none of the files there appear to be growing, which tells me, if I'm understanding this correctly, that Elasticsearch is not receiving the data. Does ES possibly store this data somewhere else? I also couldn't find a Kibana setting that tells it where to look for these logs, other than pointing it at port 9200 for ES. Is that all the configuration Kibana needs to find the data?

Also, the Kibana service runs under a user named kibana, and when logging into Kibana I'm using an account created for Kibana. Elasticsearch runs under a different user, so wouldn't the kibana user need permission to read that data as well?

Also, another question for confirmation. In the logstash.conf file, there are three sections: input, filter, output. I'm assuming that Logstash only sends the data to the output configuration if the log matches the filter configuration?

Sorry for jumping around, if I need to create another thread I will, thanks!
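
One way to check whether Elasticsearch holds any matching indices, independent of Kibana, is to ask its HTTP API directly. The sketch below is only an illustration: it assumes the node answers on 127.0.0.1:9200 as in the elasticsearch.yml shown later in the thread, that the _cat/indices endpoint is reachable without authentication, and that the indices follow the logstash-* naming from the output config. The helper names are made up for this example.

```python
import json
import urllib.request

# Assumption: matches network.host / http.port from elasticsearch.yml in this thread.
ES_URL = "http://127.0.0.1:9200"


def cat_indices_url(base_url, pattern="logstash-*"):
    """Build the _cat/indices URL for an index pattern, asking for JSON output."""
    return f"{base_url.rstrip('/')}/_cat/indices/{pattern}?format=json"


def summarize(rows):
    """Reduce _cat/indices rows to (index name, document count) pairs."""
    # docs.count can be missing or null (for example on a closed index); treat that as 0.
    return [(row["index"], int(row.get("docs.count") or 0)) for row in rows]


def list_indices(base_url=ES_URL, pattern="logstash-*"):
    """Query a running Elasticsearch node and return the summarized rows."""
    with urllib.request.urlopen(cat_indices_url(base_url, pattern), timeout=5) as resp:
        return summarize(json.load(resp))
```

Calling list_indices() on the ELK host and getting no rows back would be consistent with Kibana's "Unable to fetch mapping" message, since there would be no index for the pattern to match. A row with a growing document count would suggest the data is arriving and the problem is on the Kibana side.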


(Mark Walkom) #2

What do your various configs look like?


#4

These are the only configurations that have been made for the logstash.conf, logstash.yml, and elasticsearch.yml.


Logstash.conf:

input {
  tcp {
    port => 5044
    codec => "line"
    type => "WindowsEventLog"
  }
  syslog {
    type => 'syslog'
    port => 5544
  }
}

filter {
  if [type] == 'syslog' {
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
    manage_template => false
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}


logstash.yml file:

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
path.logs: /var/log/logstash


elasticsearch.yml file:

node.name: WFELK01
path.data: /home/data
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1
http.port: 9200


#5

Also, to add additional information, when running a debug on the configuration file, it appears logstash is starting up correctly. Here are the entries that appear at the end:

13:11:15.586 [Api Webserver] INFO logstash.agent - Successfully started Logstash API endpoint {:port=>9600}
13:11:20.566 [Ruby-0-Thread-11: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:514] DEBUG logstash.pipeline - Pushing flush onto pipeline
13:11:25.565 [Ruby-0-Thread-11: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:514] DEBUG logstash.pipeline - Pushing flush onto pipeline
13:11:30.564 [Ruby-0-Thread-11: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:514] DEBUG logstash.pipeline - Pushing flush onto pipeline
13:11:35.565 [Ruby-0-Thread-11: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:514] DEBUG logstash.pipeline - Pushing flush onto pipeline

These entries continue on until I stop the debug. I attempted to add stdout { codec => rubydebug }, to my output, but never received any additional logs.

I'm assuming this 'Pushing flush onto pipeline' is from the syslog entries that are being sent from Logstash. When I look at the Elasticsearch logs, though, I'm not seeing an entry for the creation of a new index.


(Mark Walkom) #6

How do you know data is being sent to the input ports in LS?


#7

We had a Nagios Log Server that we were moving away from, and on one of the machines that was sending logs to it we changed the syslog destination. A tcpdump confirms we are receiving traffic on that port, and I can verify the ports are at least listening on our ELK stack. The only configuration change we really made, beyond the defaults, is the location where Elasticsearch stores its data. I've triple-checked permissions on that directory too.


(Mark Walkom) #8

Ok, so add a stdout to your output and see if anything makes it through.


#9

I attempted this, but it doesn't appear that anything made it through. I'm in the process of rebuilding this from scratch.


(Mark Walkom) #10

Rather than go back to square one for everything, just start with a basic input and output config for LS and see what happens.


#11

I've attempted this, this was my .conf file:

input {
  syslog {
    type => 'syslog'
    port => 5544
  }
}

filter {

}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
    manage_template => false
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
  stdout { codec => rubydebug }
}

I do not know what I'm supposed to be expecting from the rubydebug entry. The debug only spits out the 'Pushing flush onto pipeline' over and over again. There is no documentation as to what this means, besides the pipeline.rb file which appears to be only a function that gets called in a method called @flusher_thread. I do not know enough about ruby to take it any further than that.


(Mark Walkom) #12

If you are not seeing anything else then that suggests the input is not receiving data.
Rubydebug should show each event that is received.
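
To take the remote sender out of the picture, one option is to push a single hand-built syslog line at the input and watch for it in the rubydebug output. The following is a minimal sketch, not a tested diagnostic: it assumes the syslog input is listening on port 5544 as configured above and accepts RFC 3164 style lines, and the function names and the default host "testhost" and program "probe" are invented for the example.

```python
import socket
from datetime import datetime


def rfc3164_line(message, host="testhost", program="probe",
                 facility=1, severity=6, when=None):
    """Format one RFC 3164 style syslog line (hypothetical helper)."""
    when = when or datetime.now()
    pri = facility * 8 + severity  # PRI value = facility * 8 + severity
    # RFC 3164 pads single-digit days with a space, e.g. "May  3".
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {host} {program}: {message}\n"


def send_tcp(line, host="127.0.0.1", port=5544):
    """Send one line to the listener over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("utf-8"))


def send_udp(line, host="127.0.0.1", port=5544):
    """Send one line to the listener over UDP (no delivery confirmation)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

Running send_tcp(rfc3164_line("hello from the probe")) on the ELK host while Logstash is in the foreground should, if the input is working, produce one event on stdout. If the input sets a host option, the listener is bound to that address, so the probe would need to target that address rather than 127.0.0.1.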


#13

Thank you. If that is the case: I've verified that traffic on 5544 is being received by the ELK stack server and that the local host is listening on 5544. I've flushed iptables and set it to accept all for troubleshooting, and have verified SELinux is disabled. What else could cause Logstash not to pick up the logs?


#14

Not exactly what I was expecting, but I received a handful of logs out of the thousands that were sent last night, which allowed me to fetch the mapping. Let me explain the setup real quick. I have a Snort sensor with a local.rule that essentially just monitors for any network traffic to a specific machine. I had constant traffic going to this machine, and could see the Snort sensor dumping its syslog messages to the ELK stack server. This generates a very large amount of syslog data. In my frustration last night I took a short break while all these services were still running, then came up with the idea of completely blowing out the output configuration. I got the idea from some post I found at random, and ended up with a configuration file like this:

input {
  syslog {
    type => 'syslog'
    port => 5544
    host => "192.168.1.100"
  }
}

filter {

}

output {
  elasticsearch {
  }

  stdout { }
}

I logged back into Kibana and to my surprise, the ability to index was there and everything seemed like it was working. So I went back to sleep with a feeling of accomplishment, with the knowledge that there must have been something in my output configuration that was causing this issue.

Because of how we are attempting to set this up and the end goal is to have clusters, I needed to figure out what my issue was with my output configuration. So I deleted my index, and started adding stuff one by one back into my output configuration. After the first attempt nothing happened, so I thought I had identified the issue, so I rolled back my output configuration to the configuration above that was working and nothing happened. I started and stopped again about 3 different times and never got anything to work again....

So I went back through my Logstash logs to identify when I began receiving logs, and realized that the config above never actually worked. What worked, a handful of times, was the earlier configuration, and only after Logstash had been running for about 15 minutes. It didn't catch all the logs either: it caught only about 3 of the hundreds that were sent between the time it started catching logs and the time I stopped it to load the blank output configuration, which was about 5-10 minutes. The configuration that was working was the one from my earlier post:

input {
  syslog {
    type => 'syslog'
    port => 5544
  }
}

filter {

}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
    manage_template => false
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
  stdout { codec => rubydebug }
}

So I also went back through logstash's log files when my output configuration was blank and there was nothing but the 'Pushing flush onto pipeline' messages, which I still do not know what it means.

I'm in the process of rolling back to the other configuration and I'm going to let it run for an hour or so, but does this help identify any issues? Am I sending too many logs for logstash to see due to some configuration of logstash? Or something like that?


#15

So after another full day of troubleshooting, I cannot get this to work. Still looking for assistance.


(Aaron Mildenstein) #16

The line document_type => "%{[@metadata][type]}" is my choice for the most likely suspect. I don't see anything filling this in the Logstash config you have here, so I presume it must be upstream from the Beats. Are you 100% certain that this field is populated? If it is not, then the event will not be able to be inserted into Elasticsearch, as the document type (_type) is necessary. I'm even guessing that an event without [@metadata][type] would result in mapping failures if you tried to send it to Elasticsearch.

This suggests why it worked when you had it going through an empty elasticsearch output configuration, but not with this extra configuration.

You can check in the stdout to see if the @metadata nested object has any fields in it by changing:

stdout { codec => rubydebug }

to

stdout { codec => rubydebug { metadata => true } }

This is the only way to see if the @metadata field is populated. This field is otherwise hidden in all other outputs.

All these things aside, when Elasticsearch 6.0 comes out later this year, you won't be able to have different types in the same index any more. I recommend arranging your data flow such that only one type exists per index now. You can name the document_type in the output, or have a type field (as set by type => foo in the input plugin), as that becomes the document_type if not overridden.
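
As an illustration of the one-type-per-index idea, the index name itself can carry the type, so each type lands in its own daily index. The sketch below only models the naming; the prefix-type-date scheme and the function name are assumptions made for this example, not something prescribed by Logstash. It follows the same daily, UTC-based date pattern as the logstash-%{+YYYY.MM.dd} setting in the config, and lowercases the type because Elasticsearch index names must be lowercase.

```python
from datetime import datetime, timezone


def index_name(event_type, timestamp, prefix="logstash"):
    """Build a daily index name that embeds the event type (hypothetical scheme)."""
    # The date part is taken in UTC, like the %{+YYYY.MM.dd} pattern in the output config.
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    # Index names must be lowercase, so normalize the type.
    return f"{prefix}-{event_type.lower()}-{day}"


print(index_name("syslog", datetime(2017, 5, 1, 23, 30, tzinfo=timezone.utc)))
print(index_name("WindowsEventLog", datetime(2017, 5, 1, 23, 30, tzinfo=timezone.utc)))
```

With a scheme like this, the two inputs from the original config (type 'syslog' and type "WindowsEventLog") would never share an index, and a Kibana index pattern could target either one or both.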


#17

Sorry for the late response, but I do not believe this is the case but I will attempt to try your suggestion. My wording was bad on my last explanation, but let me try to explain again. I was able to get 3 logs input of the several hundred that were sent when I had the following configuration, and none when I had a blank configuration:

input {
  syslog {
    type => 'syslog'
    port => 5544
    host => "192.168.1.100"
  }
}

filter {

}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
    manage_template => false
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
  stdout { codec => rubydebug }
}

This only worked 15-20 minutes after Logstash had started. I didn't realize it had worked with this configuration, because I had given up on that same config earlier and attempted to load the blank configuration. By the time I had loaded the blank configuration, the index had already been created by the previous config; at no time did I receive any logs with the blank configuration.

I'm beginning to narrow down the issue though, and I'm sure the issue has to do with either the input or the filter, as Mark had mentioned earlier. I can verify that the port is listening on the expected port through a netcat connection, as well as verifying through a netstat -ano | grep 5544, but I do not think my syslog messages are making it all the way to the output. I attempted to use the standard UDP input with a blank filter, and couldn't get any alerts pushed through there either:

Initially it was my understanding that the logstash input creates the listener, the filter parses the data, and the output ships the data to the destination. Is this true? Or does the Input have regular expressions like I'm assuming?


(Aaron Mildenstein) #18

Before I go any farther, my recommendation to drop the document_type => "%{[@metadata][type]}" and go with a statically named document_type still stands. Planning now for single-type indices will be better preparation for future releases.

That said:

To be specific, the syslog input has no processing with regular expressions on the input. In general, though, unless an input is specially coded to allow for the use of regular expressions (can't think of any off the top of my head), there is generally no processing done by filters (except codecs) on input streams. As data enters an input it gets processed in some marginal way and becomes an event as it exits the input. Filters and outputs can act on events.
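
The flow described above can be sketched as a toy model. This is not how Logstash is implemented; it is a simplified illustration with invented function names, under the assumption that a filter conditional only decides whether a filter touches an event, and that every event still reaches the outputs unless something explicitly drops it.

```python
def syslog_only_filter(event):
    """Stand-in for `if [type] == 'syslog' { ... }`: changes matching events only."""
    if event.get("type") == "syslog":
        event = dict(event, tagged_by_filter=True)
    return event  # non-matching events pass through unchanged


def run_pipeline(raw_lines, event_type, filters, outputs):
    """Toy input -> filter -> output flow (hypothetical model)."""
    for line in raw_lines:
        # The input turns raw data into an event and sets its type.
        event = {"message": line, "type": event_type}
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:  # models an explicit drop in the filter stage
                break
        else:
            # Reached only when no filter dropped the event.
            for emit in outputs:
                emit(event)


delivered = []
run_pipeline(["first"], "WindowsEventLog", [syslog_only_filter], [delivered.append])
run_pipeline(["second"], "syslog", [syslog_only_filter], [delivered.append])
print(delivered)
```

In this model both events end up in delivered; only the syslog one carries the extra field. An event that no filter matches is still sent to the output.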


#19

Before I go any farther, my recommendation to drop the document_type => "%{[@metadata][type]}" and go with a statically named document_type still stands. Planning now for single-type indices will be better preparation for future releases.

I definitely yanked this out. It was a recommendation I found in another walk-through, but I didn't see any benefit to it. For now I'm chalking this ELK Stack instance up as having issues beyond repair.

I spun up another CentOS instance and went the Filebeat route, and it seems to be working for the time being, but I still cannot get it to work with syslog, which will be an issue when I start including our infrastructure.

