Logstash filtering. Extract data between two strings

I have a UNIX log that looks like this:

ACTION started Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s finished
when an unknown printer took a galley of type started and scrambled it to make a type specimen book. finished It has survived not only five centuries started but also the leap into electronic typesetting, remaining essentially unchanged started but also the leap into electronic typesetting, remaining essentially unchanged finished

I have to extract all the data between:

  1. the words started and finished (e.g. "and scrambled it to make a type specimen book.")
  2. if a started has no matching finished, then that started and the next started (e.g. "but also the leap into electronic typesetting, remaining essentially unchanged")

Thanks.

Maybe

 grok { match => { "message" => "started%{DATA:someText}(finished|started)" } }
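Note that a single grok expression captures only the first match in a message, and the (finished|started) alternation consumes the closing word, so a started that ends one span cannot also open the next. If every occurrence is needed, one possible approach is a ruby filter that scans the whole message. This is only a sketch and has not been tested against the log above; the field name "extracted" is a hypothetical choice.

```
filter {
  ruby {
    # Collect every span that follows "started" up to the next "finished"
    # or "started". The lookahead leaves the closing word unconsumed, so a
    # "started" that ends one span can also open the next one.
    # The /m flag lets "." match newlines in multi-line events.
    # "extracted" is a hypothetical field name chosen for this sketch.
    code => '
      text = event.get("message")
      if text.is_a?(String)
        spans = text.scan(/started(.*?)(?=finished|started)/m).flatten
        event.set("extracted", spans.map(&:strip))
      end
    '
  }
}
```

The result is an array field with one entry per span, in the order the spans appear in the message.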

Thank you Badger, I'll check it later.

Hi Badger,
What is the meaning of "someText" in this case? Could you please explain?
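For reference, grok references take the form %{SYNTAX:SEMANTIC}. DATA is the name of a predefined grok pattern (a non-greedy .*?), and someText is simply the name of the field that receives the matched text, so any field name can be used in its place. The same filter with comments, as a minimal sketch:

```
filter {
  grok {
    # %{DATA:someText}: match using the predefined DATA pattern (.*?)
    # and store whatever it matched in a field named "someText".
    # The literal words on either side anchor the capture.
    match => { "message" => "started%{DATA:someText}(finished|started)" }
  }
}
```

For an event that matches, someText should hold the text between the first started and the nearest following finished or started, including the surrounding spaces.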

I tried the grok. It seems to work, but if there are multiple rows between the started and finished words, Logstash creates a separate document in the index for each row.
I need the text between these two words to be in a single document.
Maybe the settings in the input plugin are incorrect?

Then you would have to use a multiline codec on the input or the multiline option in filebeat (if applicable) to combine them.
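A minimal sketch of the multiline codec approach, assuming (this is an assumption about the log, not something stated above) that every record begins on a line starting with "ACTION started". The file path is hypothetical.

```
input {
  file {
    # Hypothetical path; point this at the real log file.
    path => "/path/to/your.log"
    codec => multiline {
      # Assumption: a new record begins with "ACTION started".
      # Any line that does NOT match is appended to the previous line.
      pattern => "^ACTION started"
      negate => true
      what => "previous"
      # Flush the last buffered record after 2 seconds without new lines.
      auto_flush_interval => 2
    }
  }
}

filter {
  grok {
    # (?m) lets DATA match across the newlines inside the merged event.
    match => { "message" => "(?m)started%{DATA:someText}(finished|started)" }
  }
}
```

Without the (?m) prefix, DATA stops at the first newline, so the grok would fail on records where started and finished sit on different lines. If the records do not start with a fixed prefix, the pattern has to be adapted to whatever reliably marks the first line of a record.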


Thank you!
