Hello,
A bit too often, Logstash fails to parse what it receives on input correctly, which skews the results in Elasticsearch.
For example, I feed it this:
root@s00vl9925220:/apps/toolboxes/backup_restore/logs/temp # cat tiths532.crs.160210.csv
tiths532;TSM-ARZ203;2016/02/10;02:25:32;00:24:59;607319;10690;4;1226.82;0%;4884.48;Voir fichier d rreurs;;TSM_backup_INCR_APPLI_20160210_3378.log;;INCR_APPLI;dsm_BT.opt;v4.0.0
tiths532;TSM-ARZ203;2016/02/10;02:30:01;00:03:25;606922;90;3;33.41;0%;181.68;Voir fichier derreurs;;TSM_backup_INCR_APPLI_20160210_3378.log;;INCR_APPLI;dsm_BT.opt;v4.0.0
Here is the filter that is applied:
if [source] =~ "backup_restore" {
  if [type] == "crs" {
    # case of the crs files
    csv {
      separator => ";"
      columns => [ "host","instance","date_injection","heure_injection","duree","inspected","backup","failed","transfert_time","taux_cp","volume","sauve","classarch","log","app_save","save_mode","dsm_opt","version_tbx" ]
    }
  }
  if [type] == "err" {
    # case of the err files
    csv {
      separator => ";"
      columns => [ "host","instance","date_injection","heure_injection","codeans","poids","fichier_err" ]
    }
  }
  date {
    match => [ "date_injection", "YYYY/MM/DD", "YY/MM/DD" ]
  }
  if "_dateparsefailure" in [tags] {
    drop { }
  }
  mutate {
    lowercase => [ "host" ]
    gsub => [ "date_injection", "/", "-" ]
    add_field => { "timestamp" => "%{date_injection} %{heure_injection}" }
    remove_field => [ "date_injection","heure_injection" ]
  }
}
date {
  match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
  timezone => "Europe/Paris"
  target => "@timestamp"
  remove_field => [ "timestamp" ]
}
mutate {
  remove_field => [ "[beat]","input_type","offset" ]
}
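One doubt I have about my own date pattern (flagging it as a hypothesis, not a confirmed fix): the date filter uses Joda-Time tokens, where DD is day-of-year and dd is day-of-month, so a variant like this might parse differently:

  # sketch only: day-of-month is "dd" in Joda-Time; "DD" means day-of-year
  date {
    match => [ "date_injection", "yyyy/MM/dd", "yy/MM/dd" ]
  }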
And the debug log:
{
    "message" => ".0.2",
    "@version" => "1",
    "@timestamp" => "2016-02-10T14:10:44.678Z",
    "count" => 1,
    "fields" => nil,
    "source" => "/apps/toolboxes/backup_restore/logs/s00va9933715.crs.160209.csv",
    "type" => "crs",
    "host" => ".0.2",
    "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_dateparsefailure"
    ],
    "timestamp" => "%{date_injection} %{heure_injection}"
}
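To pin down which lines actually fail, a minimal replay config could push a rejected file through the same csv filter (a sketch: replay.conf is a name I made up, and I dropped the [source] test since stdin events carry no source field):

  input {
    # feed a rejected file back in: cat tiths532.crs.160210.csv | bin/logstash -f replay.conf
    stdin { type => "crs" }
  }
  filter {
    csv {
      separator => ";"
      columns => [ "host","instance","date_injection","heure_injection","duree","inspected","backup","failed","transfert_time","taux_cp","volume","sauve","classarch","log","app_save","save_mode","dsm_opt","version_tbx" ]
    }
  }
  output {
    # print every event so the parsed fields can be checked by eye
    stdout { codec => rubydebug }
  }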
A normal run would give this:
{
    "message" => "s00va9933715;TSM-ARZ108;2016/02/09;07:06:04;00:01:12;15;15;0;70.96;0%;0.285547;;ARCH01M;TSM_archive_MBA1FRP0_30J_ListOra_20160209_24838318.log;;Ora_MBA1FRP0;dsm.opt;v5.0.2",
    "@version" => "1",
    "@timestamp" => "2016-02-09T06:06:04.000Z",
    "count" => 1,
    "fields" => nil,
    "source" => "/apps/toolboxes/backup_restore/logs/s00va9933715.crs.160209.csv",
    "type" => "crs",
    "host" => "s00va9933715",
    "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
    "instance" => "TSM-ARZ108",
    "duree" => "00:01:12",
    "inspected" => "15",
    "backup" => "15",
    "failed" => "0",
    "transfert_time" => "70.96",
    "taux_cp" => "0%",
    "volume" => "0.285547",
    "sauve" => nil,
    "classarch" => "ARCH01M",
    "log" => "TSM_archive_MBA1FRP0_30J_ListOra_20160209_24838318.log",
    "app_save" => nil,
    "save_mode" => "Ora_MBA1FRP0",
    "dsm_opt" => "dsm.opt",
    "version_tbx" => "v5.0.2"
}
I split the input into batches of roughly 2,500-3,000 log lines, 26,847 lines in total, and I get about 5% rejects (5,000+ hits). Some of the logs genuinely have content that doesn't match, but on content that looks fine I don't see why it should blow up.
The JVM weighs in at 500 MB; maybe that's not enough?
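In case the heap really is the problem, on Logstash 2.x it can be raised through the LS_HEAP_SIZE environment variable before starting the agent (a sketch; 1g is an arbitrary test value and the config path is mine):

  # raise the Logstash heap before launching (1g is a test value, not a recommendation)
  export LS_HEAP_SIZE=1g
  bin/logstash -f /etc/logstash/conf.d/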
Also, even though I have a drop in the filter:
  if "_dateparsefailure" in [tags] {
    drop { }
  }
or a guard in the output, the events are still sent to Elasticsearch anyway:
if [type] == "crs" or [type] == "err" {
  if "_dateparsefailure" not in [tags] or "_csvparsefailure" not in [tags] {
    elasticsearch {
      hosts => "10.255.55.91"
      action => "index"
      index => "sibr"
      document_type => "%{[@metadata][type]}"
    }
  }
}
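One thing I'm second-guessing in my own guard (noting it as a hypothesis): with or, an event tagged only _dateparsefailure still satisfies the "_csvparsefailure" not in [tags] half, so the condition only rejects events carrying both tags at once. A variant with and, just as a sketch:

  # skip events carrying either failure tag, not only those carrying both
  if "_dateparsefailure" not in [tags] and "_csvparsefailure" not in [tags] {
    elasticsearch {
      hosts => "10.255.55.91"
      action => "index"
      index => "sibr"
      document_type => "%{[@metadata][type]}"
    }
  }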
Any ideas?