I'm trying to load a file into a structured table in Athena. I am using a Grok pattern to load it into the table, but I have not been able to find the correct pattern. The file format is as follows:
L1127 ACTUALS 214171 ON 27649075 -00000000000000000409618.02 601 MBS DAILY VISION - CAN OS
L1127 ACTUALS 412821 ON 27649075 002060 -00000000000000000002657.33 521 MBS DAILY VISION - CAN OS
GROK pattern I'm using : (?.{5})%{SPACE}(?.{7})%{SPACE}(?.{6})%{SPACE}(?.{2})%{SPACE}(?.{8})%{SPACE}(?.{6})%{SPACE}(?.{27})%{SPACE}(?.{3})%{SPACE}(?.{35})
I'm having trouble when the ProductId (the sixth field, 002060 in the second sample line) has no value, as in the first sample line.
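You cannot use . everywhere, because . matches any character, including spaces, so a fixed-width group will always match something and will swallow part of the next field when a field is missing. If you use NOTSPACE for the fields and make the sixth field (together with the space that precedes it) optional by appending ? to that part of the pattern, then it will work. Note that the last field contains spaces, so it needs something like GREEDYDATA rather than NOTSPACE.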
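To illustrate the idea outside of Athena, here is a minimal sketch using a plain Python regular expression, where \S+ plays the role of NOTSPACE and \s+ the role of SPACE. Apart from product_id, every field name below is a hypothetical placeholder, since the original names are not shown. Restricting the optional sixth field to six digits and the amount to a signed decimal are assumptions based on the two sample lines; they keep the amount from being captured as the ProductId when the ProductId is absent.

```python
import re

# All group names except product_id are hypothetical placeholders.
LINE_RE = re.compile(
    r"^(?P<ledger>\S+)\s+"
    r"(?P<scenario>\S+)\s+"
    r"(?P<account>\S+)\s+"
    r"(?P<flag>\S+)\s+"
    r"(?P<unit>\S+)"
    # Optional sixth field, grouped together with its leading whitespace.
    # Assumption: a ProductId is exactly six digits.
    r"(?:\s+(?P<product_id>\d{6}))?"
    # Assumption: the amount is a decimal number with an optional sign.
    r"\s+(?P<amount>[-+]?\d+\.\d+)\s+"
    r"(?P<code>\d{3})\s+"
    # The last field may contain spaces, so take the rest of the line.
    r"(?P<description>.*\S)\s*$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


without_product = (
    "L1127 ACTUALS 214171 ON 27649075 "
    "-00000000000000000409618.02 601 MBS DAILY VISION - CAN OS"
)
with_product = (
    "L1127 ACTUALS 412821 ON 27649075 002060 "
    "-00000000000000000002657.33 521 MBS DAILY VISION - CAN OS"
)

print(parse_line(without_product)["product_id"])
print(parse_line(with_product)["product_id"])
```

In a Grok pattern the same structure would be an optional non-capturing group that wraps both the separator and the sixth field, followed by ?, so that a missing ProductId does not leave a stray required space behind.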