Now I want to create a histogram in Kibana with a result like this:
textA: 6
textB: 4
textC: 2
How can I achieve that within Logstash and/or Kibana?
I thought about extracting the individual text counts into separate fields (e.g. nTextA, nTextB, nTextC) per log entry. But how would I then create the histogram described above from these fields across all log entries?
Note: The above log entries are only examples. I'm looking for a solution for about 100K different texts, all of which can occur once or several times in one or multiple log entries.
I think the proposed solution only works for a few terms (see the example above). But I'm interested in a generic solution for a large number of terms (I don't even know how many terms there are in total). As a result, potentially many fields (nTextA, nTextB, nTextC, ..., nTextX-1, nTextX) per log entry would have to be created in Logstash. So far, so good.
But how can a histogram then be created in Kibana from these many fields (nTextA, nTextB, nTextC, ..., nTextX-1, nTextX), summing each field over all log entries? The proposed sum aggregation lets you select only a single field; you cannot select many fields with a wildcard pattern like "nText*".
@s7ygian yes, you make a good point. Perhaps a better way would be to structure a new index for the particular query that you need. You could create a document for each term using logstash split - https://www.elastic.co/guide/en/logstash/current/plugins-filters-split.html - which would make querying and counting with kibana very simple.
Please note that I'm linking to docs for the latest release and you might be running a different version.
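To illustrate the split approach: assuming each event carries an array field (here hypothetically named `texts`) listing every text occurrence, a minimal filter sketch could look like this. The field name is an assumption; adapt it to however your parsing stage collects the texts.

```
filter {
  # Emit one event per element of the "texts" array
  # (field name is hypothetical - use whatever field
  # holds your list of extracted texts).
  split {
    field => "texts"
  }
}
```

With one document per text occurrence, a standard terms aggregation on the resulting keyword field in Kibana yields exactly the textA/textB/textC counts shown above, with no need for per-term fields like nTextA.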