Hi, everyone! I'm new to ELK, so I need some help =)
Could someone help me?
I'm trying to extract some data from my email messages. I'm using the imap input plugin, but what should I use in the filter?
For example, with the rubydebug codec in stdout I see this:
"message-id" => "",
"from" => "",
"x-ms-exchange-organization-authas" => "Anonymous",
"x-ms-exchange-organization-authsource" => "",
"date" => "",
"@version" => "1",
"@timestamp" => ,
"to" => "",
"subject" => "Alert Summary: ",
"received" => "",
"content-type" => "text/html; charset=UTF-8",
"mime-version" => "1.0",
"message" => "SOME HTML TAGS",
"return-path" => "",
"content-transfer-encoding" => "7bit",
"type" => "new_type"
Now, if I see a certain value in "subject", I need to parse the HTML tags in "message". How can I do this?
Logstash has no built-in filter for HTML stripping. While you might be able to ignore the fact that it's HTML and just treat it as plain text (it depends on what you want to do) I'd probably write a custom filter plugin for stripping the HTML markup from the field.
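As a rough illustration of what such a filter would do internally, here is a minimal Ruby sketch. Everything in it is an assumption for illustration: `strip_html` and `add_plain_text` are hypothetical helpers, a plain Hash stands in for the Logstash event, `message_text` is a made-up field name, and the regex is a crude tag remover rather than a real HTML parser (it does not decode entities or handle `<` inside attribute values).

```ruby
# Hypothetical helper: crude tag stripping, not a real HTML parser.
def strip_html(html)
  html.gsub(/<[^>]*>/, " ")  # replace every tag with a space
      .gsub(/\s+/, " ")      # collapse runs of whitespace
      .strip
end

# Hypothetical helper: a plain Hash stands in for the Logstash event.
# Only messages whose subject starts with "Alert Summary" are processed.
def add_plain_text(event)
  if event["subject"].to_s.start_with?("Alert Summary")
    # "message_text" is an assumed field name, not something Logstash defines.
    event["message_text"] = strip_html(event["message"].to_s)
  end
  event
end

event = {
  "subject" => "Alert Summary: ",
  "message" => "<html><body><p>Alert <b>one</b></p></body></html>"
}
puts add_plain_text(event)["message_text"]
```

The same logic could live in a custom filter plugin or in a `ruby` filter block; in both cases the real event object would replace the Hash used here.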
Did I understand correctly that I must write a regex if I want to extract some data from the message? Or can I use something else? Because it would be a very huge regex =)
Could you please write an example config file for this case?
ohhh, ok,
thanks for the quick answers! I will read how to write my own plugins. I like the idea of deleting HTML tags and trying to parse the text itself. I hope it will not be very difficult....
Magnus! Can you help me again? =)
I'm stuck again. How can I parse the text now? I've made it so that all the keywords are separated by commas, and I know for sure that they will always be in the same order. How do I now pull out certain keywords and assign them to other fields?
For example:
"Some text ,some text,INEED-THIS-ONE,next text,AND-I-ALSO-NEED-THIS "