I need to read the filenames available in a directory on a Unix server and send them to Elasticsearch to create dashboards in Kibana. The filename itself contains all the information required to build my dashboards.
Each filename follows the format below:
...D<Date_MMDDYY>.T<Time_HHMMSS>.,..
I think I can use logstash with the file input plugin, tweak the multiline codec to read the whole file as one entry, and then use the "path" field in a grok filter to extract all the required data.
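Roughly what I have in mind is something like this (untested; the directory, index name, and the assumption that the path shows up in a plain `path` field are mine, and on newer versions with ECS compatibility it may be `[log][file][path]` instead):

```
input {
  file {
    path => "/data/incoming/*"                          # hypothetical directory
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb_filenames"
  }
}

filter {
  # Pull the date and time out of the file path itself;
  # the field may be [log][file][path] instead of path on newer versions.
  grok {
    match => { "path" => "D(?<file_date>\d{6})\.T(?<file_time>\d{6})" }
  }
  # The file contents are irrelevant, so drop them.
  mutate {
    remove_field => ["message"]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "filenames"
  }
}
```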
I could also use a shell script to first list all files in the directory and write the names to a separate file, then use logstash to send that file's data. Or I could write a shell script that sends the data directly to ES.
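For that route, another option might be to let logstash run the listing itself with the exec input instead of a separate script; a rough sketch (directory and interval are made up, and since exec emits the whole output as one event, the message has to be split into one event per filename):

```
input {
  # Run ls every 5 minutes; the whole listing arrives as a single event.
  exec {
    command => "ls /data/incoming"   # hypothetical directory
    interval => 300
  }
}

filter {
  # Split the multi-line listing into one event per filename.
  split {
    field => "message"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "filenames"
  }
}
```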
The folder is regularly updated with new files, and I want those new filenames to go into ES as well.
Can anyone suggest a better way to do this with logstash, or any other approach?
Set the document id as a fingerprint of the fields? See this thread for some ideas on how to do that.
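A minimal sketch of that, assuming the filename has already been parsed into a file_name field (the field name is just illustrative):

```
filter {
  # Hash the filename so the same file always maps to the same document id.
  fingerprint {
    source => "file_name"
    target => "[@metadata][fingerprint]"
    method => "MURMUR3"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "filenames"
    # Re-ingesting the same filename overwrites the existing document
    # instead of creating a duplicate.
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```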
If you do not care about the contents of the files (and you say you just need the names), then using a file input plugin and discarding the contents seems rather wasteful. However, periodically running an ls has the same problem.
You want to be able to say whether you have seen the file before. An ignore_older on the file input might help, but there will be a window for duplicates when logstash restarts. If you do not care about overwriting old entries then that may not be a problem.
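For example (value in seconds; this only limits how far back the input looks, it does not close the restart window mentioned above):

```
input {
  file {
    path => "/data/incoming/*"   # hypothetical directory
    # Skip files last modified more than a day ago.
    ignore_older => 86400
  }
}
```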
If I were doing this on Windows I would write a PowerShell script that listed all items added to a directory tree since the last time it had been run, then pipe that into something that could inject it into a logstash input. I cannot think of a similar scripted design on Linux off the top of my head.