I am new to Elasticsearch and am looking for guidance on how to do
faceting on our fairly large log collection (1.2 billion syslog records
per month), which we are currently loading into ES. We only need to
keep 3 months' worth of logs (at most 6 billion records). My schema for
each syslog line is just a timestamp, a host IP address, and a message
field. However, I definitely need to do reporting (ranking) of the busiest
host IPs and the top 20 or even 50 log messages. Since the counts being
ranked can add up to hundreds of millions (and I only have a server with
48GB of RAM), I have read that there is a way to do this off heap.
I have tried following the instructions here,
but I am looking for more examples and help with the initial setup, especially
since our ES installation is basically new and I have not set up any mappings yet.
Any advice or recommended links would be helpful.
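
To make the question concrete, here is a rough sketch of the kind of mapping and ranking request being asked about. Everything in it is an assumption rather than a tested setup: the type name `syslog`, the field names `timestamp`, `host` and `message`, and the use of `doc_values` as the "off heap" mechanism are all guesses, and the exact setting names depend on the ES version. The code only builds the JSON bodies and does not talk to a cluster.

```python
import json


def build_mapping(doc_type="syslog"):
    """Build a mapping body (hypothetical names) for timestamp/host/message.

    Assumptions: the strings are indexed as exact values (not_analyzed) so
    that ranking counts whole values, and doc_values is enabled so the
    per-field data used for ranking is kept on disk instead of the JVM heap.
    """
    exact_string = {"type": "string", "index": "not_analyzed", "doc_values": True}
    return {
        doc_type: {
            "properties": {
                "timestamp": {"type": "date"},
                "host": dict(exact_string),
                "message": dict(exact_string),
            }
        }
    }


def build_top_terms_request(field, size):
    """Build a search body with a terms facet ranking the top `size` values."""
    return {
        "size": 0,  # only the facet counts are wanted, not the matching hits
        "query": {"match_all": {}},
        "facets": {
            "top_" + field: {"terms": {"field": field, "size": size}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
    print(json.dumps(build_top_terms_request("host", 20), indent=2))
    print(json.dumps(build_top_terms_request("message", 50), indent=2))
```

In newer ES versions the same ranking would presumably be expressed as a terms aggregation rather than a facet, so the request body above should be read as an illustration of the idea, not a recommendation.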