I have a "message" field which is not appearing in Kibana when I select aggregation by "Terms" and open the "Field" drop-down.
Sadly I'm a pre-beginner at ES but I'm guessing it might have something to do with the mappings. Here is the beginning of the result of the query "GET logstash-2017.02.17":
Your guess is correct — it does have to do with mappings. Basically, the "message_field" mapping in the dynamic template is telling Elasticsearch to index the message field as type text. What this means is that the contents of the message field will be analyzed before they are indexed (as opposed to indexing the contents as-is, unanalyzed, as one large token).
By default, the terms aggregation only operates on unanalyzed fields (keyword fields, for example), so Kibana only offers those fields in the drop-down. That's why you don't see message as an option when you select the terms aggregation.
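To make the analyzed versus unanalyzed distinction concrete, here is a rough sketch in plain Python. It does not call Elasticsearch at all; the `analyze` function is only a simplified stand-in for what an analyzer does (lowercase, split into words), and the sample messages are made up for illustration.

```python
import re
from collections import Counter

def analyze(text):
    """Simplified stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def keyword(text):
    """Unanalyzed: the whole string is kept as a single token."""
    return [text]

# Hypothetical sample messages, not taken from the thread.
messages = [
    "User alice deleted row 42",
    "User alice deleted row 42",
    "User bob updated row 7",
]

# Count how many documents contain each term, which is roughly what a
# terms aggregation reports for each bucket.
analyzed_terms = Counter(t for m in messages for t in set(analyze(m)))
keyword_terms = Counter(t for m in messages for t in set(keyword(m)))

print(analyzed_terms.most_common(3))
print(keyword_terms.most_common(3))
```

With the analyzed version the buckets are individual words such as "user" or "row", which is rarely what you want from a log line. With the unanalyzed version each distinct message is its own bucket, which is what a terms aggregation on short, repetitive messages would need.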
Typically, when using Logstash, the contents of the message field tend to be a log line or some other long piece of text. It usually doesn't make sense to do aggregations (e.g. counts) on such long strings as-is. What sort of information is contained in your message field?
Thanks Shaunak. Well, it's a big mix. I'm funneling all of our logs into Elastic for easy viewing and combined searches. So the message field includes short logs from Heroku sinks, long error messages, and, in this particular case that I'm trying to visualize, very specific (and short) messages from custom database audits.
The reason I like having all these different things in the message field is that it allows for quick human readability via the Discover view. I can see lots of different types of messages at a glance. But they are very heterogeneous. I suppose I can duplicate the message in this case to another field for aggregation.
Just for my education, when you say "the contents of the message field will be analyzed before they are indexed", what does that mean? Is there a link where I can read more about this behavior?
It sounds like you want to keep the message field around for full-text searching and readability in the Discover view. But in the case of some message sources (e.g. from the custom database audits), it might also make sense to aggregate on these.
So, yes, I would suggest copying messages from these sources into a separate field so they could be aggregated. As you are using Logstash, you could perform this step in your filter section using an if condition on the message source (assuming you have some way to know this).
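As a rough sketch of what that filter could look like (the `type` value "db_audit" and the field name `audit_message` are placeholders I made up, so substitute whatever actually identifies your audit events):

```
filter {
  # Assumption: audit events can be recognized by type == "db_audit".
  if [type] == "db_audit" {
    mutate {
      # Copy the message into a separate field (hypothetical name) for aggregation.
      add_field => { "audit_message" => "%{message}" }
    }
  }
}
```

Whether the new field shows up for the terms aggregation depends on how it gets mapped. If your index uses the default Logstash template, new string fields should also get an unanalyzed `.keyword` sub-field (so `audit_message.keyword` here), but please check your own mappings rather than taking my word for it. You will likely also need to refresh the field list for the index pattern in Kibana before the new field appears.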