The fact is that I get an index per day (great!), but an additional index for the current day (2017-07-07) is created with a lot of documents (it looks like the sum of all the others).
Is there any way to fix this, in order to get rid of the current-date index? Is this the expected behaviour?
No, this is not expected. Do you have the configuration snippet above in a file in /etc/logstash/conf.d? Do you have any other files in that directory? Show an example of a document that ends up in the current day index.
Hi Magnus! Thanks a lot for your response!!!
At this moment, I have the following files in a "pipeline" directory. The Logstash process is started pointing to that directory:
Just by way of explanation, Logstash's Elasticsearch output plugin does not actually create indices. I know. It sounds confusing, right?
Logstash sends repeated bulk requests made up of "index" actions. Each line in these bulk requests specifies the index into which Elasticsearch should write that event's data; in your example, that will be an index named "sas-%{+YYYY.MM.dd}". Logstash derives the values for YYYY.MM.dd directly from the @timestamp field of each event as it streams by.
After that, it is up to Elasticsearch to handle the bulk requests. If the specified index does not exist, Elasticsearch will automatically create it (unless you've disabled that feature).
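To make this concrete, here is a sketch of what one action in such a bulk request might look like (the field values are illustrative, not taken from your data):

```json
{ "index" : { "_index" : "sas-2017.07.06", "_type" : "log" } }
{ "@timestamp" : "2017-07-06T12:34:56.000Z", "message" : "original log line here" }
```

The index name on the action line is already resolved by Logstash from the event's @timestamp; Elasticsearch just writes where it is told, creating the index on the fly if needed.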
So, what you've described is completely normal and expected. Because Filebeat does not parse or convert any of the data, you must override the @timestamp value assigned at ingest time with the parsed value (which you do with your date filter).
This raises the question, "Why is some data going into an index with the wrong date?" The most likely explanation is that for those documents the @timestamp value is not being successfully overridden by the date filter, which means the time of ingest ends up in the bulk request instead. This happens whenever the grok filter and/or the date filter fail. I suggest having a look at the data in that index, which will show you what the problem is. In particular, look for a _grokparsefailure tag in the tags field.
Aaron, Magnus... Thanks a lot for your help! I'm on holidays right now, but as soon as I have a computer available, I will give that a try.
As I said before, I'm a bit confused, because it looks like the ELK stack creates the daily indexes correctly, but also creates an additional one with all the documents (or at least many of them).
I've included an exclude regexp in Filebeat in order to exclude certain patterns. In addition, there is the grok expression, but every entry not matching the grok pattern ends up in the index for the current date.
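For completeness, the Filebeat side of that filtering is done with exclude_lines; a minimal sketch (the path and regexps here are placeholders, not your actual config) might look like:

```yaml
filebeat.prospectors:
  - paths:
      - /var/log/sas/*.log        # hypothetical log location
    # Lines matching any of these regexps are dropped before shipping,
    # so they never reach Logstash at all.
    exclude_lines: ['^DEBUG', '^#']
```

Note that exclude_lines only prevents lines from being shipped; any line that does get shipped but fails the grok pattern in Logstash still reaches Elasticsearch, just with an unmodified ingest-time @timestamp.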