I'm relatively new to Elasticsearch, and we have a small cluster collecting all of our logging and metrics data. It's currently suffering from pretty bad performance (particularly memory usage), and after doing some reading I suspect the cause is all the fields that are getting indexed (which is basically everything). Some of our messages are also fairly heterogeneous: we use Serilog for .NET, which captures properties from message templates and emits them as fields in the log output, and ELMAH, which generates a large number of fields.
I'm trying to understand how to balance what should be indexed against what should be searchable. We do define message templates that assign appropriate data types to all of the fields, but the sparseness of some of those fields is definitely suboptimal. For example, when there is a critical error, ELMAH logs a full stack trace, a server-variable dump, etc., but these apply only to those messages. As an ops person looking at error messages over time, I might be interested in the stack traces and their contents (searching them for keywords, say), but it doesn't seem like I want everything in that blob indexed, or at least not indexed as that specific field. I'm not sure.
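To make that concrete, here is the kind of mapping I've been sketching (the field names are just examples from our setup, not anything authoritative). My understanding from the docs is that `text` fields get analyzed for full-text search, while `"enabled": false` on an object keeps it in `_source` without indexing any of its subfields; please correct me if I've misread that:

```json
PUT logs
{
  "mappings": {
    "properties": {
      "message": { "type": "text" },
      "elmah": {
        "properties": {
          "stackTrace":      { "type": "text" },
          "serverVariables": { "type": "object", "enabled": false }
        }
      }
    }
  }
}
```

The idea being that stack traces stay keyword-searchable but the server-variable dump is retrievable without costing anything at index time.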
Can anyone who is using ES as a logging endpoint for heterogeneous, sparse messages recommend specific actions I could take in designing my schema so that I don't blow out ES RAM requirements? For example, should I just log the ELMAH block as one big string field and let ES do text analysis on it for searchability? Or should I put those ELMAH messages in their own index and cross-reference them with the master log only when I need to, using a sort of join? What has been your experience using ES in this fashion?
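In case it clarifies the second option I'm asking about, I mean something like this (index and field names are hypothetical):

```json
PUT elmah-errors
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "correlationId": { "type": "keyword" },
      "timestamp":     { "type": "date" },
      "detail":        { "type": "text" }
    }
  }
}
```

where the main log index would carry the same `correlationId`, and I'd run a second query against `elmah-errors` only when I need the detail, since as far as I understand ES has no real joins across indices.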