Filebeat: Wrong index structure

Hi.

I am having some difficulties with Filebeat's indexing: instead of using "filebeat-YYYY.MM.dd", it sporadically indexes as %{[@metadata][beat]}-%{+YYYY.MM.dd}, as if it does not handle the metadata:

**curl 'localhost:9200/_cat/indices?v'**

health status index pri rep docs.count docs.deleted store.size pri.store.size
green open %{[@metadata][beat]}-2016.01.06 5 1 15 0 132.3kb 66.1kb
green open filebeat-2016.01.07 5 1 2257 0 1.9mb 978.6kb
green open .kibana 1 1 4 2 49.8kb 24.9kb
green open filebeat-2016.01.06 5 1 772300 0 567.5mb 284mb
green open %{[@metadata][beat]}-2016.01.07 5 1 2 0 18.3kb 9.1kb

Logstash output conf:
> output {
>   elasticsearch {
>     hosts => [somehosts]
>     manage_template => false
>     index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
>     document_type => "%{[@metadata][type]}"
>     sniffing => true
>   }
> }

filebeat conf:

> ################### Filebeat Configuration Example #########################
>
> ############################# Filebeat ######################################
> filebeat:
>   # List of prospectors to fetch data.
>   prospectors:
>     -
>       paths:
>         # - /var/log/*.log
>         - C:/tools/wildfly-8.2.0.Final/standalone/log/server.log
>         - C:/tools/wildfly-8.2.0.Final/standalone/log/systemevent.log
>         - C:/tools/wildfly-8.2.0.Final/standalone/log/access.log
>       encoding: latin1
>       input_type: log
>       force_close_files: true
>
>   registry_file: "C:/ProgramData/filebeat/registry"
>
> ############################# Output ##########################################
> # Configure what outputs to use when sending the data collected by the beat.
> # Multiple outputs may be used.
> output:
>   ### Logstash as output
>   logstash:
>     # The Logstash hosts
>     hosts: ["mylogstashhostsip:andport"]
>
>     tls:
>       # List of root certificates for HTTPS server verifications
>       certificate_authorities: ["C:/Tools/filebeat-1.0.1/cicerologs_ca_systematic_com.crt","C:/Tools/filebeat-1.0.1/somecert.crt"]
>
> ############################# Logging #########################################
> logging:
>   to_files: true
>   files:
>     path: C:/logs/filebeat
>     name: filebeat.log
>     rotateeverybytes: 10485760 # = 10MB

Any insight??

Best Regards
Peter

Do you directly have filebeat -> logstash -> elasticsearch?

Can you share your full Logstash config (hope it's not too much)? Filebeat always outputs [@metadata][beat] and [@metadata][type]. In Logstash, if those fields are missing, the 'original pattern' is used literally. This makes me wonder if you have non-Filebeat inputs or if some filter is overwriting @metadata. Somewhere in Logstash, @metadata gets lost.
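One way to check (just a sketch with a temporary debug output, nothing specific to your setup): the rubydebug codec hides @metadata by default, but with metadata => true you can see whether [@metadata][beat] and [@metadata][type] actually arrive on the events that end up in the literal index.

> output {
>   # Temporary, for debugging only - remove once you've inspected the events.
>   stdout { codec => rubydebug { metadata => true } }
> }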

@Steffens.

Yeah, directly filebeat --> Logstash --> Elasticsearch.

I just merged the configuration into this (I will just add that it created the indexes before I added the filters as well):

> input {
>   beats {
>     codec => multiline {
>       pattern => "(^[a-zA-Z.]+(?:Error|Exception): .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
>       what => "previous"
>     }
>     port => 3515
>     ssl => true
>     ssl_certificate => "/etc/pki/tls/certs/somecert.crt"
>     ssl_key => "/etc/pki/tls/private/somepriv.key"
>   }
> }

> filter {
>         mutate {
>                 add_field => { "logstash_host" => "the logstash hostname for debug purposes"  }
>         }
> }

> filter {
>   if [source] {
>     grok {
>       match => { "source" => '(?<log_type>[^\\/:+?"]+$)' }
>     }
>   }
> }

> output {
>   elasticsearch {
>     hosts => ["Somehosts"]
>     manage_template => false
>     index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
>     document_type => "%{[@metadata][type]}"
>     sniffing => true
>   }
> }

Makes me wonder if multiline drops the @metadata.
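In the meantime, whether or not multiline is the culprit, I suppose the output could be guarded so an unresolved pattern never becomes an index name (rough sketch based on the output above, same host placeholder):

> output {
>   if [@metadata][beat] {
>     elasticsearch {
>       hosts => ["Somehosts"]
>       manage_template => false
>       index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
>       document_type => "%{[@metadata][type]}"
>       sniffing => true
>     }
>   } else {
>     # Events that arrive without the Filebeat metadata still land in a
>     # sensible daily index instead of a literal "%{[@metadata][beat]}-..." one.
>     elasticsearch {
>       hosts => ["Somehosts"]
>       manage_template => false
>       index => "filebeat-%{+YYYY.MM.dd}"
>       sniffing => true
>     }
>   }
> }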

Will move the topic to the Logstash forum.

I believe it did this before we added the multiline as well.

*edit: My colleague confirms that the indexes were seen before the multiline filter was added to the configuration as well.

I have the exact same issue.
It seems like only some messages sent from Filebeat do not contain the metadata information.
As far as I know, it is only an issue when this tag is present: beats_input_flushed_by_end_of_connection

Did anyone ever get an answer to this?

I did not get any answer; however, we have now updated to Filebeat 1.2 and hopefully this will fix it.

Hi @byoung0589

After running with Logstash and Elasticsearch for a while we have seen a pattern: every time we restart one of our Logstash servers it creates this index, and once the Logstash host is up to speed it allocates the data correctly.

So, I am not actually experiencing the "index" problem, but what I am seeing is that the tags on certain log messages are being set to "beats_input_flushed_by_end_of_connection". This completely overwrites the tags for some of the logs I am sending and makes searching by tags a complete PITA. I have googled the crap out of this and cannot find a clear indication as to why this is happening.

I do have a Logstash->Elasticsearch setup, but what's funny is most of my logs are coming in fine. But! I have not found a pattern as to why some of my logs are coming in with this new tag "beats_input_flushed_by_end_of_connection"

Does anyone know why this is happening?

Hmmm.... I am also seeing this "multiline_codec_max_lines_reached"

So I tracked the error down to being inside the logstash-input-beats Ruby gem. It seems to be called when there is a new thread/connection, but I am not sure why anyone would want to overwrite the existing tags with this message on a new thread? Kinda lame.
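For now, if the goal is just to keep that connection-management tag from polluting searches, a mutate filter can strip it again (rough sketch, nothing specific to my setup):

> filter {
>   if "beats_input_flushed_by_end_of_connection" in [tags] {
>     mutate {
>       # Remove the tag the beats input adds when it flushes pending data at
>       # end of connection, so searching by tag stays clean.
>       remove_tag => ["beats_input_flushed_by_end_of_connection"]
>     }
>   }
> }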

I am having the same issue with Filebeat 5.0.1 and Logstash 5.0.1.
Most messages are fine, but some are missing all metadata.

Can I ask, please: what are

> codec => multiline {
>   pattern => "(^[a-zA-Z.]+(?:Error|Exception): .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
>   what => "previous"
> }

and

> mutate {
>   add_field => { "logstash_host" => "the logstash hostname for debug purposes" }
> }

What do they do? I want to understand how I can configure a filter with them.
Thank you
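For reference, here is the same snippet from earlier in the thread with comments describing what each part does (the pattern, port, and field values come from that config; they are examples, not something you need verbatim):

> input {
>   beats {
>     port => 3515
>     # Any line matching the pattern (an exception header like "...Exception: ...",
>     # an "  at ..." frame, "... N more", or "Caused by: ...") is appended to the
>     # previous line, so a whole Java stack trace ends up in one Logstash event
>     # together with the log line it belongs to, instead of one event per line.
>     codec => multiline {
>       pattern => "(^[a-zA-Z.]+(?:Error|Exception): .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
>       what => "previous"
>     }
>   }
> }

> filter {
>   # mutate/add_field just adds a static field to every event; here it stamps
>   # each event with the Logstash host that processed it, for debugging.
>   mutate {
>     add_field => { "logstash_host" => "the logstash hostname for debug purposes" }
>   }
> }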

