Whenever I index log data with Logstash, some of the events are wrongly indexed into a new index. My Logstash configuration file is as follows:
input {
  file {
    path => "/home/user/DATA/*.log"
    start_position => "beginning"
    #sincedb_path => "/dev/null"
  }
}
filter {
  if [message] =~ "^#" {
    drop {}
  }
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:serviceName} %{WORD:serverName} %{IP:serverIP} %{WORD:method} %{URIPATH:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:protocolVersion} %{NOTSPACE:userAgent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{NOTSPACE:requestHost} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response} %{NUMBER:bytesSent} %{NUMBER:bytesReceived} %{NUMBER:timetaken}"]
  }
  date {
    match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
    timezone => "UTC"
  }
  mutate {
    convert => ["bytesSent", "integer"]
    convert => ["bytesReceived", "integer"]
    convert => ["timetaken", "integer"]
    remove_field => [ "log_timestamp", "serviceName", "serverName", "serverIP", "port", "username", "protocolVersion", "requestHost", "subresponse", "win32response"]
  }
}
output {
  elasticsearch {
    index => "log-%{+YYYY.MM.dd}"
  }
}
The contents of the data folder are as follows:
u_ex180101.log u_ex180105.log u_ex180109.log u_ex180113.log u_ex180117.log
u_ex180102.log u_ex180106.log u_ex180110.log u_ex180114.log
u_ex180103.log u_ex180107.log u_ex180111.log u_ex180115.log
u_ex180104.log u_ex180108.log u_ex180112.log u_ex180116.log
When I run Logstash, the following indices are created:
health status index            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   log-2018.01.12   3DHyNchJTVaSQvtAiZ6TuA 5   1   1362426    0            1gb        1gb
yellow open   log-2018.01.17   ltxUNXFJSFyJVAVz_KG_YQ 5   1   1155733    0            864mb      864mb
yellow open   log-2018.01.05   WBwmPX3nQF2_ZvhkTPZ2zA 5   1   1552917    0            1.1gb      1.1gb
yellow open   log-2018.01.10   MKJRVuPmQ4qjnuqvwu9_YA 5   1   1391996    0            1gb        1gb
yellow open   log-2018.01.04   6jpc83lURHuGI8KkZPBMSg 5   1   1489481    0            1.1gb      1.1gb
yellow open   log-2018.01.08   ownjIazZRCCl63EYC6gwwA 5   1   1379748    0            1gb        1gb
yellow open   log-2018.01.11   n5adTn1MS5SK-Jiq7LPEXg 5   1   1337156    0            1015.6mb   1015.6mb
green  open   .kibana          KPWKemzqR86CFvXAcaWv5A 1   0   30         2            48.8kb     48.8kb
yellow open   test-2018.01.17  9XHtlc9URRq-MteTiW30MA 5   1   1336650    0            1018mb     1018mb
yellow open   log-2018.02.16   Ylm_0gphQq-ywc4eHK113w 5   1   2934940    0            955.7mb    955.7mb
yellow open   log-2018.01.03   Aa6K-Y-pQMKYeCiJZvYh3A 5   1   1511993    0            1.1gb      1.1gb
yellow open   log-2018.01.02   -uF0z-2VS_2s0rhgJeTh6A 5   1   203437     0            159.8mb    159.8mb
yellow open   test2-2018.02.20 xAyQ7xjfSXOWyx0A2O8KXg 5   1   5          0            32.2kb     32.2kb
yellow open   log-2018.01.06   OH-V5lFpT4al0DnPCvA4mA 5   1   1675121    0            1.2gb      1.2gb
yellow open   log-2018.01.14   4NlViLLqRPiSd996-KSjJA 5   1   1314180    0            997mb      997mb
yellow open   test-2018.02.20  IE7moiPPTkSA4cl18NQzog 5   1   5          0            32.2kb     32.2kb
yellow open   log-2018.01.15   7Ahq-r8SREepwKEs-Uv7BA 5   1   1264421    0            968.9mb    968.9mb
yellow open   test2-2018.01.16 bg2zFOZdQmWnElWLJVSNPA 5   1   180917     0            145.4mb    145.4mb
yellow open   log-2018.01.07   KIdM2tr1QVGmSCOjMo68WQ 5   1   1366416    0            1gb        1gb
yellow open   log-2018.01.13   YTjQscXeT66a97c1YtIabQ 5   1   1338467    0            1022.8mb   1022.8mb
yellow open   test2-2018.01.17 4nEBJvXsR1GcF6rB92Hj0w 5   1   1155733    0            868.9mb    868.9mb
yellow open   log-2018.01.09   njq_Ju3QQkuJPDoOHqMXoQ 5   1   1449409    0            1gb        1gb
yellow open   log-2018.01.16   Zs8FQwCDRpKgyf87x1fODA 5   1   1312545    0            998.2mb    998.2mb
For every log file, some of its events are indexed into a new index named after the current date instead of the date in the log entry. The contents of u_ex180101.log are sent entirely to that current-date index as well (no log-2018.01.01 index is created). After some testing, I found that around 5 events from every log file end up in the current-date index. Why is this happening, and what do I need to do to prevent it?
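A possible way to narrow this down, sketched under one assumption: that the stray events are lines the grok or date filter could not parse. In that case @timestamp keeps the time Logstash read the line, so `%{+YYYY.MM.dd}` resolves to today's date. The `_grokparsefailure` and `_dateparsefailure` tags are the defaults those filters add on failure; the file path below is hypothetical and only for illustration.

```
output {
  # Events that failed parsing still carry the ingestion time in @timestamp,
  # so divert them to a local file for inspection instead of indexing them.
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    file {
      path => "/tmp/logstash-unparsed.log"   # hypothetical path
    }
  } else {
    elasticsearch {
      index => "log-%{+YYYY.MM.dd}"
    }
  }
}
```

If the diverted file ends up holding roughly 5 lines per log file, that would suggest the grok pattern does not match those lines, rather than pointing to a problem with the index setting itself.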
Thanks in advance.