I want to get the result of those 3 patterns in Elasticsearch in one line which has all the information, but the problem is that I get each pattern on a separate line, so in the Elasticsearch table I have 3 lines.
Can someone help me add a command to regroup the parsing results into one line?
Please read what you posted, it is really hard to read. I do not understand why you have an "Extraction\sbatch\sID" when none of your messages match it.
It sounds like an aggregate filter might be able to do this.
For the grok, personally I would prefer dissect, but that's not a big deal. Do a grok to split the timestamp off the message, then a second grok to discard the "ExtractDwhData" logger name and the log level. Then you really do have unstructured data that grok might work for.
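As a minimal sketch of that two-step split (the field names ts, rest, and text are placeholders I made up):
filter {
  # first grok: peel the timestamp off the front of the message
  grok { match => { "message" => "^%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:rest}" } }
  # second grok: drop the logger name and the log level, keep the free text
  grok { match => { "rest" => "^\[\s*%{DATA}\] \[%{LOGLEVEL}\s*\] %{GREEDYDATA:text}" } }
}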
if "_grokparsefailure" in [tags] {
drop {}
}
This tends to be a bad idea. If your configuration fails to parse the data then you will usually be better off tagging it for review than dropping it.
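For example, instead of dropping them, you could route the failures to a separate index for later review (a sketch; the index names are just examples):
output {
  if "_grokparsefailure" in [tags] {
    # keep unparsed events in their own index so they can be inspected later
    elasticsearch { index => "failed-%{+YYYY.MM.dd}" }
  } else {
    elasticsearch { index => "logstash-%{+YYYY.MM.dd}" }
  }
}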
The reason I prefer not to use grok is that a GREEDYDATA anywhere except at the end of the message, such as the ones in your patterns, can get really expensive. The regexp processor has to step through the message one character at a time, checking whether the rest of the message matches. This can lead to timeouts.
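If you do keep grok, two things help: anchor the pattern with ^ so that lines which do not match fail quickly, and set a timeout so that one pathological line cannot stall the pipeline. A sketch:
grok {
  # anchored: a line that does not start with a timestamp fails immediately
  match => { "message" => "^%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:rest}" }
  # grok adds _groktimeout to [tags] if matching takes longer than this
  timeout_millis => 5000
}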
Also, removing temporary fields (message2 etc.) should be deferred until you know the patterns are working. And do not even try to index them into elasticsearch until you get good output when you run logstash on the command line with 'output { stdout { codec => rubydebug } }'.
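Once the patterns are verified, the cleanup can be as simple as (field names taken from your grok):
mutate {
  # defer this until you know the patterns are working
  remove_field => [ "message1", "message2", "message3", "message4", "message5" ]
}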
2018-02-07T10:42:06,865 [ ExtractDwhData] [INFO ] Extraction batch ID : 28
2018-02-07T10:42:02,832 [ ExtractDwhData] [INFO ] Starting DWH Data Extraction with run timestamp : 2018-02-07 10:42:02
2018-02-07T12:24:45,167 [ ExtractDwhData] [INFO ] Solife DWH data EXTRACTION finished in 1 hours, 42 minutes, 42.368 seconds
Those are the correct lines, I made a mistake.
Thanks for the answer, but I have a log which contains 1600 lines. I posted those three lines because they are the ones that contain the information I need.
Yes, I want the result in one line in the index table, not three lines where some columns are empty on some lines.
So what would my code look like if I use the aggregate filter? Thanks a lot.
If there are ever multiple extractions occurring then this will not work, since there appears to be nothing in the log messages that would allow you to correlate which is which.
This would allow you to combine those three lines.
filter {
  dissect { mapping => [ "message", '%{ts} [%{f1}] [%{loglevel}] %{text}' ] }
  # constant task_id, so all three lines aggregate into the same map
  mutate { add_field => { "static" => "1" } }
  if [text] =~ /^Extraction batch ID/ {
    grok { match => [ "text", "Extraction batch ID : %{NUMBER:ID_extraction_globale}" ] }
    # stash the id in the map, then drop this event
    aggregate {
      task_id => "%{static}"
      code => "map['id'] = event.get('ID_extraction_globale')"
    }
    drop {}
  }
  if [text] =~ /^Starting DWH Data Extraction with run timestamp/ {
    grok { match => [ "text", "Starting DWH Data Extraction with run timestamp : %{TIMESTAMP_ISO8601:runtimestamp}" ] }
    # stash the run timestamp in the map, then drop this event
    aggregate {
      task_id => "%{static}"
      code => "map['runtimestamp'] = event.get('runtimestamp')"
    }
    drop {}
  }
  if [text] =~ /^Solife DWH data EXTRACTION finished/ {
    # rewrite "1 hours, 42 minutes, 42.368 seconds" as "1:42:42.368"
    mutate { gsub => [ "text", "Solife DWH data EXTRACTION finished in ", "", "text", " hours, ", ":", "text", " minutes, ", ":", "text", " seconds", "" ] }
    mutate { rename => { "text" => "duration" } }
    # the final line is kept, enriched with the data stashed from the first two
    aggregate {
      task_id => "%{static}"
      code => "event.set('ID_extraction_globale', map['id'])
               event.set('runtimestamp', map['runtimestamp'])"
      map_action => "update"
    }
  }
  date { match => [ "ts" , "YYYY-MM-dd'T'HH:mm:ss,SSS" ] }
}
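One caveat: the aggregate filter only works correctly with a single pipeline worker (run logstash with -w 1, or set pipeline.workers: 1), otherwise the three lines may be processed out of order and the map may not be filled in when the final line arrives.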
Look at line 36 of your configuration. You have joined the two lines into one without a semicolon to separate the two statements. That will get you "unexpected tIDENTIFIER" errors all day long.
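In the code option of the aggregate filter that means something like this (a sketch mirroring the configuration above):
aggregate {
  task_id => "%{static}"
  # two Ruby statements on one line must be separated by a semicolon
  code => "event.set('ID_extraction_globale', map['id']); event.set('runtimestamp', map['runtimestamp'])"
  map_action => "update"
}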
Yeah. I used different names and formatted temp_totales differently. The important thing is that it joins data from the three lines into one event. Which data does not really matter.
The first code I made works, but the problem is that I get the result in 3 lines. The GREEDYDATA fields where I put "message1", "message2", etc. are information I do not need; every piece of information I do need has a significant name (id_extraction_globale, END_time, etc.).
Could you modify it by adding the aggregate command in the correct places, so the result goes into one line in the index, without complicating it? Thank you so much Badger, I am really blocked at this step of my internship.
Hello, it works, but the problem is that I have a log which contains 2000 lines. When I feed the whole log in as input, it parses all the lines.
Which command should I add to parse only the 3 lines I want (something like dropping on grok parse failure)? Thank you for your time.
grok {
  match => [ "message", "%{TIMESTAMP_ISO8601:Start_time_conversion}%{GREEDYDATA:message1}\s+Starting\sstep\s2%{GREEDYDATA:message2}",
             "message", "%{TIMESTAMP_ISO8601:End_time_conversion}%{GREEDYDATA:message3}\s+End\sof\sXML\sto\sCSV\sconversion%{GREEDYDATA:message4}%{NUMBER:total_number}%{GREEDYDATA:message5}\s+were%{GREEDYDATA:status}\s+processed\sin\s%{GREEDYDATA:duration}"
           ]
}
And these are the lines:
2018-02-07T12:24:18,215 [ ExtractDwhData] [INFO ] Starting step 2 : XML transformation to CSV...
2018-02-07T12:24:36,071 [ XmlToCsvConverter] [INFO ] End of XML to CSV conversion. 31 XML files were successfully processed in 17.828 seconds
I need only:
Start_time_conversion
End_time_conversion
duration
status
How do I get those fields into one line in the index, like you did the first time? Thank you, I really appreciate it.