I have the following three example JSON log entries:
{
"hint_lists": [{
"hint_list": ["Entity X with number 1 is unknown!", "Entity Y with number 2 is unknown!", "Entity Z with number 3 is unknown!"]
}, {
"hint_list": ["Entity X with number 1 is unknown!", "Entity Y with number 2 is unknown!", "Entity Z with number 3 is unknown!"]
}
]
}
{
"hint_lists": [{
"hint_list": ["Entity X with number 1 is unknown!", "Entity Y with number 2 is unknown!"]
}, {
"hint_list": ["Entity X with number 1 is unknown!", "Entity Y with number 2 is unknown!"]
}
]
}
{
"hint_lists": [{
"hint_list": ["Entity X with number 1 is unknown!"]
}, {
"hint_list": ["Entity X with number 1 is unknown!"]
}
]
}
Now I want to create a histogram (unknown entity number -> count) in Kibana with the following result:
1: 6
2: 4
3: 2
How can I achieve that within Logstash and/or Kibana?
I'm not entirely sure I follow, but if you're trying to extract the numeric value from your text fields and then run a histogram on it, that's going to be tricky. The easiest approach is to write the numbers as separate fields when the data is written. If you can't do that, you may be able to use scripted fields to do the extraction and then compute some stats on the result, but I'm not sure.
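To illustrate the "write the numbers as separate fields" idea, here is a minimal Python sketch of the extraction step, run before the entry is shipped. The field name unknown_entity_numbers, the helper add_entity_numbers, and the assumption that every hint contains the phrase "with number <n>" are illustrative guesses, not anything Logstash or Kibana prescribes.

```python
import re

# Assumption: every hint contains the phrase "with number <digits>".
NUMBER_RE = re.compile(r"with number (\d+)")


def add_entity_numbers(entry):
    """Return a copy of the entry with the extracted numbers in a new field.

    "unknown_entity_numbers" is a hypothetical field name chosen for this sketch.
    """
    numbers = []
    for item in entry.get("hint_lists", []):
        for hint in item.get("hint_list", []):
            match = NUMBER_RE.search(hint)
            if match:
                numbers.append(int(match.group(1)))
    enriched = dict(entry)
    enriched["unknown_entity_numbers"] = numbers
    return enriched


entry = {
    "hint_lists": [
        {"hint_list": ["Entity X with number 1 is unknown!"]},
        {"hint_list": ["Entity X with number 1 is unknown!"]},
    ]
}
print(add_entity_numbers(entry)["unknown_entity_numbers"])  # [1, 1]
```

The regular expression keys on "with number" rather than on the word "unknown", so small wording differences at the end of a hint do not break the extraction.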
I also thought about extracting the count of every unknown entity into individual fields (e.g. unknown_1, unknown_2, unknown_3) per log entry. But how can I create the histogram mentioned above in Kibana out of these fields in the next step?
Note: The above log entries are only examples. There are about 100K entities with different entity numbers, all of which may be unknown once or several times in one or multiple log entries.
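With about 100K distinct entity numbers, one field per entity (unknown_1, unknown_2, ...) would mean a very large mapping. One possible alternative, sketched below in Python, is to emit one small record per occurrence with a single numeric field, so that counting records per value yields the desired histogram. The field name unknown_entity_number and the helpers occurrences and hints are hypothetical, and whether a terms aggregation on such a field fits your setup is an assumption to verify against your own index.

```python
import re
from collections import Counter

# Assumption: every hint contains the phrase "with number <digits>".
NUMBER_RE = re.compile(r"with number (\d+)")


def occurrences(entries):
    """Yield one flat record per unknown-entity hint found in the entries."""
    for entry in entries:
        for item in entry.get("hint_lists", []):
            for hint in item.get("hint_list", []):
                match = NUMBER_RE.search(hint)
                if match:
                    # Hypothetical field name; one record equals one occurrence.
                    yield {"unknown_entity_number": int(match.group(1))}


def hints(n):
    """Build the first n example hints from the question (X=1, Y=2, Z=3)."""
    return [f"Entity {'XYZ'[i]} with number {i + 1} is unknown!" for i in range(n)]


# The three example log entries: each has two identical hint lists.
entries = [
    {"hint_lists": [{"hint_list": hints(n)}, {"hint_list": hints(n)}]}
    for n in (3, 2, 1)
]

histogram = Counter(r["unknown_entity_number"] for r in occurrences(entries))
print(sorted(histogram.items()))  # [(1, 6), (2, 4), (3, 2)]
```

Counting per record rather than per original log entry matters here: entity number 1 appears twice in each of the three entries, so a count of log entries containing it would give 3 instead of 6.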