I have a slightly different requirement, where I have a directory structure something like this:
/data/aws-test1
/data/aws-test2
... and so on
Is there a way, when using the file input, to capture aws-test1 and aws-test2 and use that name as the name of the index when Logstash ingests files from those directories?
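What I have in mind is something along these lines (a rough sketch, not tested; the field name index_dir is a placeholder I made up, and I am assuming the classic "path" field from the file input, since newer Logstash versions with ECS enabled put it under [log][file][path] instead):

```
input {
  file {
    # One subdirectory per environment, e.g. /data/aws-test1/...
    path => "/data/aws-test*/*"
  }
}
filter {
  grok {
    # Pull the first directory component under /data/ (e.g. "aws-test1")
    # out of the "path" field the file input adds to each event.
    match => { "path" => "^/data/(?<index_dir>[^/]+)/" }
  }
}
output {
  elasticsearch {
    # Use the captured directory name as the index name.
    index => "%{index_dir}"
  }
}
```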
@magnusbaeck
One more problem I have now is ingesting the *.gz files from AWS CloudTrail. These gzip files contain JSON with no newline at the end, so Logstash keeps expecting more input, as if the file never ends.
`logstash.inputs.file - each: file grew` is what I see in the logs, and it repeats forever.
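For context, roughly what my input looks like (paths are illustrative); in the default tail mode the input watches files for growth and waits for newline-terminated lines, which these files never produce:

```
input {
  file {
    # Tail mode (the default) never sees a complete line in these
    # newline-less gzipped JSON files, so it just keeps waiting.
    path => "/data/aws-test*/*.gz"
  }
}
```

If the file input plugin is version 4.1.0 or newer, mode => "read" might be worth trying, since read mode consumes each file to EOF and has built-in gzip handling, but I have not verified that against these CloudTrail files.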
Thanks @magnusbaeck for the feedback. What I have now is a pre-processing step for the S3 files: decompress them, add a newline to every file with a script, and keep the existing *.gz files untouched so I can revert to the gz versions if needed. Ingestion is working fine now; a few fields show up in Kibana as unrecognized, but I believe those are mapping issues.
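For reference, a rough sketch of that pre-processing step (paths and details here are illustrative, not my exact script):

```
#!/bin/bash
# Decompress each CloudTrail .gz next to the original and make sure the
# JSON ends with a newline, keeping the .gz files so I can revert.
for f in /data/aws-test*/*.gz; do
  out="${f%.gz}.json"
  gunzip -c "$f" > "$out"                # leaves the original .gz in place
  if [ -n "$(tail -c 1 "$out")" ]; then
    echo >> "$out"                       # no trailing newline; add one
  fi
done
```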
@magnusbaeck I think I am stuck again. I am flipping back and forth between the file input and the s3 input to compare performance and the hassle of processing the data. Here is my config with the pattern I am trying to capture for index creation, but somehow it is not able to extract the value I want.
@magnusbaeck I was under the assumption that I could extract the value from the "prefix" option used in the s3 input. Now I suspect it has to be an actual event field. If that is the issue, is there a way to grab the value I set in the "prefix" option of the input? The reason I want this is that no other field in the CloudTrail events gives me this value.
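One workaround I am considering (bucket and prefix values below are placeholders): since "prefix" is a literal string in my config anyway, I could stamp it onto every event with the add_field common option that all inputs support, and then reference that field in the output:

```
input {
  s3 {
    bucket => "my-cloudtrail-bucket"            # placeholder
    prefix => "aws-test1/"                      # placeholder
    # Copy the prefix value onto each event, since the s3 input
    # does not put "prefix" on the event itself.
    add_field => { "s3_prefix" => "aws-test1" }
  }
}
output {
  elasticsearch {
    index => "%{s3_prefix}"
  }
}
```

I have also read that newer versions of the s3 input expose the object key under [@metadata][s3][key], which could be grokked for the same purpose, but I would need to check whether my plugin version has that.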