I am trying to parse a log entry that is comma separated. The problem is that when the SOMEUSER field is present, its value contains a comma of its own, and that comma should not be treated as a separator.
The fields that can appear, in some combination, are SOMEUSER, SOMENETWORK and SOMECOMPUTER.
Examples of the format of the logs are:
"Smith, John A. - Some Business Title (Smi,SOMENETWORK,SOMECOMPUTER"
"Smith, John A. - Some Business Title (Smi,SOMECOMPUTER"
"SOMECOMPUTER,SOMENETWORK"
"SOMECOMPUTER"
"SOMENETWORK"
I do have an identity_type field that tells me which kinds of identities are in the CSV list I need to split, but I am not sure how to skip the first comma when the AD User identity is present.
Formats I have seen in the logs:
SOMEUSER,SOMECOMPUTER
SOMEUSER,SOMENETWORK,SOMECOMPUTER
SOMECOMPUTER,SOMENETWORK
SOMEUSER,SOMENETWORK
SOMENETWORK
SOMECOMPUTER
SOMEUSER
I am looking to see if there is a way to skip the first comma if the SOMEUSER value exists.
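For illustration only, here is a minimal Python sketch of the "skip the first comma" idea, outside of Logstash. The function name and the has_user flag are hypothetical; the flag stands in for whatever check on identity_type says a user is present. It assumes the user value is always first and contains exactly one comma.

```python
def split_skipping_first_comma(value, has_user):
    """Split a comma-separated identity string.

    When has_user is True, the first comma is assumed to belong to the
    user's "Last, First ..." name, so the first two pieces are glued
    back together into a single field.
    """
    parts = value.split(",")
    if not has_user or len(parts) < 2:
        return parts
    # Rejoin the two halves of the user name with the comma restored.
    return [parts[0] + "," + parts[1]] + parts[2:]


if __name__ == "__main__":
    line = "Smith, John A. - Some Business Title (Smi,SOMENETWORK,SOMECOMPUTER"
    print(split_skipping_first_comma(line, has_user=True))
    print(split_skipping_first_comma("SOMECOMPUTER,SOMENETWORK", has_user=False))
```

If a user name could ever contain more than one comma, this would break, so it is only as reliable as that assumption.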
My thought was to use an if statement to process the value depending on which identity types are listed in the identity_type field. That works for everything except when SOMEUSER is present, because the comma inside the user name throws the split off.
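One way to picture the identity-type approach, again as a rough Python sketch rather than a Logstash config: if the number of identities is known, splitting from the right that many times minus one leaves any extra comma inside the first field. The function name and the type labels ("user", "network", "computer") are hypothetical, and this assumes the labels come in the same order as the values and that only the first value can contain a comma.

```python
def split_by_identity_types(value, identity_types):
    """Map each identity type label to its value.

    identity_types is a list of labels in the same order as the values,
    e.g. ["user", "network", "computer"] (hypothetical labels).
    Splitting from the right at most len(identity_types) - 1 times keeps
    the comma inside the user name in the first field.
    """
    count = len(identity_types)
    if count > 1:
        parts = value.rsplit(",", count - 1)
    else:
        parts = [value]
    return dict(zip(identity_types, parts))


if __name__ == "__main__":
    line = "Smith, John A. - Some Business Title (Smi,SOMENETWORK,SOMECOMPUTER"
    print(split_by_identity_types(line, ["user", "network", "computer"]))
    print(split_by_identity_types("SOMECOMPUTER,SOMENETWORK", ["computer", "network"]))
```

The right-hand split avoids needing a special case for the user field at all, as long as the identity count is trustworthy.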
That was the first thing I had used and it does not skip the first comma when the username is there. That is what I am trying to figure out how to handle.
Smith, John would be taken as the first and second fields instead of just the first field.
I have a feature request in to fix the upstream data, but I am not holding my breath.
User names are mixed character with spaces.
Computer names are uppercase letters and numbers, something like a Dell Service Tag.
Network names are a mix of upper and lower case and contain spaces.
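Those naming patterns could be turned into a rough check for telling a computer name apart from the other values. The Python sketch below assumes computer names are made up only of uppercase letters and digits, and the sample values in it are invented for illustration. A network name that happened to be all uppercase with no spaces would be misclassified, so this is a heuristic at best.

```python
import re

# Assumption: a computer name is one or more uppercase letters or digits,
# with no spaces or punctuation (similar to a Dell Service Tag).
COMPUTER_RE = re.compile(r"[A-Z0-9]+")


def looks_like_computer(token):
    """Return True if the whole token is uppercase letters and digits."""
    return COMPUTER_RE.fullmatch(token) is not None


if __name__ == "__main__":
    # Hypothetical sample values, not taken from real logs.
    for token in ["ABC1234", "Guest Wifi", "Smith, John A."]:
        print(token, looks_like_computer(token))
```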
I'll see if I can figure it out with grok, was just hoping for some awesome command I had not been able to find that says skip the first comma.