@magnusbaeck the reason I am using GREEDYDATA there is this:
GREEDYDATA is .*, so it matches everything. I want everything between < and >, and I don't want to perform any validation. I believe that .* takes less effort than the other patterns.
My entire message is inside <tags>, so I have a start point < and an end point > for messages.
It's still inefficient since it's greedy: it will first attempt to stuff <data1> <data2> <data3> into the data1 field, but then it discovers that there's no text left for the two remaining GREEDYDATA patterns to match against, so it backtracks and tries to match <data1> <data2>, but then there's still one GREEDYDATA that doesn't get anything, so... you get the idea.
Using DATA should be much more efficient, but I'd still expect it to be outperformed by (?<data1>[^>]+).
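To make the comparison concrete, here is a rough sketch of the three variants as plain regexes. It assumes the grok expression has three bracketed fields separated by single spaces (the actual expression isn't quoted here, so that layout and the sample line are made up), and it uses Python's re module as a stand-in for the regex engine Logstash uses. GREEDYDATA expands to .* and DATA to .*?.

```python
import re

# Hypothetical sample line; the real message layout is an assumption.
line = "<alpha> <beta> <gamma>"

# Equivalent of <%{GREEDYDATA:data1}> <%{GREEDYDATA:data2}> <%{GREEDYDATA:data3}>
greedy = re.compile(r"<(?P<data1>.*)> <(?P<data2>.*)> <(?P<data3>.*)>")

# Equivalent of the same expression with DATA (lazy .*?) in each field
lazy = re.compile(r"<(?P<data1>.*?)> <(?P<data2>.*?)> <(?P<data3>.*?)>")

# Negated character class: each field stops at the first '>' and
# never has to give characters back.
negated = re.compile(r"<(?P<data1>[^>]+)> <(?P<data2>[^>]+)> <(?P<data3>[^>]+)>")


def parse(pattern, text):
    """Return the named fields as a dict, or None if the line does not match."""
    match = pattern.match(text)
    return match.groupdict() if match else None


results = {
    "greedy": parse(greedy, line),
    "lazy": parse(lazy, line),
    "negated": parse(negated, line),
}

for name, fields in results.items():
    print(name, fields)
```

On a well-formed line all three extract the same fields; the difference is only in how much backtracking the engine does to get there, which tends to matter most on long lines and on lines that don't match at all.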