Logs are overwritten in the specified index under the same _id

Hi there,

I'm using Logstash 6.5.1 and Elasticsearch 6.5.1.

Below is my filebeat.yml:

filebeat.prospectors:
- type: log
  paths:
    - var/log/message
  fields:
    type: apache_access
  tags: ["ApacheAccessLogs"]
- type: log
  paths:
    - var/log/indicate
  fields:
    type: apache_error
  tags: ["ApacheErrorLogs"]
- type: log
  paths:
    - var/log/panda
  fields:
    type: mysql_error
  tags: ["MysqlErrorLogs"]

output.logstash:
  # The Logstash hosts
  hosts: ["logstash:5044"]

Below is my Logstash config file:

input {
  beats {
    port => 5044
    tags => [ "ApacheAccessLogs", "ApacheErrorLogs", "MysqlErrorLogs" ]
  }
}
filter {
  if "ApacheAccessLogs" in [tags] {
    grok {
      match => [
        "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
        "message", "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
      ]
      overwrite => [ "message" ]
    }
    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
      convert => ["responsetime", "float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "apache-geoip" ]
    }
    date {
      match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "agent"
    }
  }
if "ApacheErrorLogs" in [tags] {
grok {
match => { "message" => ["[%{APACHE_TIME:[apache2][error][timestamp]}] [%{LOGLEVEL:[apache2][error][level]}]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message]}",
"[%{APACHE_TIME:[apache2][error][timestamp]}] [%{DATA:[apache2][error][module]}:%{LOGLEVEL:[apache2][error][level]}] [pid %{NUMBER:[apache2][error][pid]}(:tid %{NUMBER:[apache2][error][tid]})?]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message1]}" ] }
pattern_definitions => {
"APACHE_TIME" => "%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}"
}
remove_field => "message"
}
mutate {
rename => { "[apache2][error][message1]" => "[apache2][error][message]" }
}
date {
match => [ "[apache2][error][timestamp]", "EEE MMM dd H:m:s YYYY", "EEE MMM dd H:m:s.SSSSSS YYYY" ]
remove_field => "[apache2][error][timestamp]"
}
}
if "MysqlErrorLogs" in [tags] {
grok {
match => { "message" => ["%{LOCALDATETIME:[mysql][error][timestamp]} ([%{DATA:[mysql][error][level]}] )?%{GREEDYDATA:[mysql][error][message]}",
"%{TIMESTAMP_ISO8601:[mysql][error][timestamp]} %{NUMBER:[mysql][error][thread_id]} [%{DATA:[mysql][error][level]}] %{GREEDYDATA:[mysql][error][message1]}",
"%{GREEDYDATA:[mysql][error][message2]}"] }
pattern_definitions => {
"LOCALDATETIME" => "[0-9]+ %{TIME}"
}
remove_field => "message"
}
mutate {
rename => { "[mysql][error][message1]" => "[mysql][error][message]" }
}
mutate {
rename => { "[mysql][error][message2]" => "[mysql][error][message]" }
}
date {
match => [ "[mysql][error][timestamp]", "ISO8601", "YYMMdd H:m:s" ]
remove_field => "[apache2][access][time]"
}
}
}

output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_type => "apacheaccess"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "apacheerror"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_type => "sqlerror"
    }
  }
  stdout { codec => rubydebug }
}

The data is sent to Elasticsearch, but only three documents get created in the index, one for each document_id.

Every new incoming log overwrites the document with the same document_id, and the old one is lost.

Can you guys please help me out? @magnusbaeck

You are specifying multiple document types for the same index, which causes errors in recent Elasticsearch versions. You also have a fixed document ID specified for one output, which causes the same document to be updated repeatedly.

@Christian_Dahlqvist - What is the best way to split the data? Which field should I use instead of document_type or document_id?

Also, my exact output block is:

output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "apacheaccess"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "apacheerror"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "sqlerror"
    }
  }
  stdout { codec => rubydebug }
}

I'm using only document_id! How can I write my output block in order to avoid overwriting?

You cannot use a fixed document ID that way, as it is the unique identifier for each document. Remove it and let Elasticsearch assign one.
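For example, a minimal sketch of the output block with document_id removed (reusing the hosts and index names already shown above; a sketch, not necessarily the final fix):

output {
  if "ApacheAccessLogs" in [tags] {
    # Without document_id, Elasticsearch auto-generates a unique _id for
    # every event, so each log line becomes a new document instead of
    # overwriting a single one.
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
    }
  }
  stdout { codec => rubydebug }
}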

@Christian_Dahlqvist : That's right. But what if two fields with the same name come from two different sources? Won't they clash?

I need to put all the data into one index, and I need another field that helps me segregate the data of one source from the other two.

What is driving this requirement? You can run queries against multiple indexes. You are likely better off keeping different document types in different indexes. You should read about why document types are being removed from Elasticsearch.
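For example, a minimal sketch of the output block with one index per source (the index names here are illustrative, not from the original config):

output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache-access"    # separate index per log source
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache-error"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "mysql-error"
    }
  }
  stdout { codec => rubydebug }
}

You can still search across all of them at once with an index pattern such as apache-*,mysql-*.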

@Badger : OK, I shall use a separate index for each source.

Thanks! Specifying an individual index per source works well!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.