I'm brand new to my Elastic journey. I am collecting some firewall logs, which accumulate really quickly. What I would like to do is create a new index based on the counts of some of my tags.
A daily record that says 10,000,000 'Standard - Denied by policy' takes up far less space, and makes it much easier to chart behaviour over a year, than storing 10,000,000 documents simply to count them.
My first thought is to run a cron/scheduled task on an outside server that simply runs the query and then puts the result into its own index, but I feel there must be a native way and I'm simply missing the terminology.
How can I run a daily journal based on counts of my tags? Or at least correct my terminology so I can find what I'm looking for in the amazing documentation.
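For reference, the cron idea above could be sketched roughly as follows in Python. This is only an illustration under assumptions that are not part of the question: the tag field is called `tags` and is aggregatable, the timestamp field is `@timestamp`, and the summary index is named `cyberstats` (index names must be lowercase). The helper names `build_count_query` and `summary_bulk_lines` are hypothetical. The sketch only builds the search body and the bulk lines; sending them to the cluster is left out.

```python
import json
from datetime import date, timedelta
from typing import List


def build_count_query(tag_field: str, day: date) -> dict:
    """Search body that counts documents per tag for one calendar day.

    "size": 0 asks for no hits, only the aggregation result.
    The "@timestamp" field name is an assumption.
    """
    next_day = day + timedelta(days=1)
    return {
        "size": 0,
        "query": {
            "range": {
                "@timestamp": {"gte": day.isoformat(), "lt": next_day.isoformat()}
            }
        },
        "aggs": {"tag_counts": {"terms": {"field": tag_field, "size": 100}}},
    }


def summary_bulk_lines(response: dict, day: date, target_index: str) -> List[str]:
    """Turn terms-aggregation buckets into bulk API lines (action line, then document).

    One summary document is produced per tag per day. The deterministic _id
    means a re-run for the same day overwrites instead of duplicating.
    """
    lines = []
    for bucket in response["aggregations"]["tag_counts"]["buckets"]:
        doc_id = f"{day.isoformat()}_{bucket['key']}"
        lines.append(json.dumps({"index": {"_index": target_index, "_id": doc_id}}))
        lines.append(
            json.dumps(
                {"day": day.isoformat(), "tag": bucket["key"], "count": bucket["doc_count"]}
            )
        )
    return lines


if __name__ == "__main__":
    # Hand-written sample shaped like a terms aggregation response.
    sample = {
        "aggregations": {
            "tag_counts": {
                "buckets": [{"key": "Standard - Denied by policy", "doc_count": 10000000}]
            }
        }
    }
    print(json.dumps(build_count_query("tags", date(2024, 1, 1)), indent=2))
    print("\n".join(summary_bulk_lines(sample, date(2024, 1, 1), "cyberstats")))
```

The scheduled task would send the first body to the search endpoint of the firewall indices and post the resulting lines to the bulk endpoint, so that each day adds only one small document per tag.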
I think you might want to check the index lifecycle policy settings.
There I saw an option called "maximum documents", which means the index will roll over to a new index when it reaches that document count.
I think I understand the Lifecycle Management portion (although I'm sure it could be optimized as I learn more). What I am trying to do is demonstrate firewall and network perimeter statistics, month over month and year over year.
I don't need the full firewall logs other than for a very short period for troubleshooting / investigations. 10,000,000 * 365 = 3.65 billion documents a year, which is a LOT.
I think what I'm after is a workflow kind of like this:
1. Create an hourly/daily schedule
2. Create a Watcher that:
   - Uses the created schedule
   - Runs an Aggregation for my Tags
   - Has an Index Action to Put into /CyberStats or whatever boardroom / powerpointy index I create
Then I have a daily journal that lets me show trending reports without maintaining a huge inventory of useless logs. Getting the syntax right looks like it will take a cup of coffee.
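A rough sketch of what such a watch definition might look like, built as a Python dictionary and printed as JSON so it can be pasted into a `PUT _watcher/watch/<id>` request. Everything specific here is an assumption made for illustration: the `firewall-*` index pattern, the `tags` and `@timestamp` field names, the lowercase `cyberstats` target index (index names must be lowercase), the 00:05 run time, and the helper name `build_tag_count_watch`. The Painless transform script is untested and would need checking against a real cluster.

```python
import json
from typing import List


def build_tag_count_watch(source_indices: List[str], tag_field: str, target_index: str) -> dict:
    """Build a watch body: daily schedule, terms aggregation, index action."""
    # Painless transform (untested assumption): reshape the buckets into a
    # "_doc" array so the index action writes one document per tag.
    # "run_at" records when the watch was scheduled, not the day being counted.
    transform_script = (
        "def docs = []; "
        "for (def b : ctx.payload.aggregations.tag_counts.buckets) { "
        "docs.add(['tag': b.key, 'count': b.doc_count, "
        "'run_at': ctx.trigger.scheduled_time]); "
        "} "
        "return ['_doc': docs];"
    )
    return {
        # Step 1: the schedule.
        "trigger": {"schedule": {"daily": {"at": "00:05"}}},
        # Step 2: the aggregation over the previous calendar day.
        "input": {
            "search": {
                "request": {
                    "indices": source_indices,
                    "body": {
                        "size": 0,
                        "query": {
                            "range": {"@timestamp": {"gte": "now-1d/d", "lt": "now/d"}}
                        },
                        "aggs": {
                            "tag_counts": {"terms": {"field": tag_field, "size": 100}}
                        },
                    },
                }
            }
        },
        # Step 3: the index action that writes the summary documents.
        "actions": {
            "index_tag_counts": {
                "transform": {"script": {"source": transform_script}},
                "index": {"index": target_index},
            }
        },
    }


if __name__ == "__main__":
    watch = build_tag_count_watch(["firewall-*"], "tags", "cyberstats")
    print(json.dumps(watch, indent=2))
```

The transform is there because the search input returns a single aggregation result, while the goal is one small document per tag per run.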