I'm trying to visualize the latest status of news articles using Kibana.
Here's a brief example of what I'm trying to do:
I have a database of news. Each piece of news contains a headline, a timestamp and a status of whether the article has been printed.
I want to get the latest (by timestamp) status for each unique headline and visualize it in Kibana (possibly as a pie chart).
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/news" -d '{
"mappings": {
"news": {
"properties": {
"headline": { "type": "string" },
"timestamp": { "type": "date" },
"status": { "type": "string" }
}
}
}
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"news","_type":"news"}}
{"status": "Pending", "headline": "Great news", "timestamp": "2015-07-28T00:07:29.000"}
{"index":{"_index":"news","_type":"news"}}
{"status": "Pending", "headline": "Great news", "timestamp": "2015-07-28T00:08:23.000"}
{"index":{"_index":"news","_type":"news"}}
{"status": "Pending", "headline": "Sports news", "timestamp": "2015-07-28T00:09:32.000"}
{"index":{"_index":"news","_type":"news"}}
{"status": "Printing", "headline": "Sports news", "timestamp": "2015-07-28T00:10:35.000"}
{"index":{"_index":"news","_type":"news"}}
{"status": "Printing", "headline": "Crazy news", "timestamp": "2015-07-28T00:11:54.000"}
{"index":{"_index":"news","_type":"news"}}
{"status": "Printed", "headline": "Crazy news", "timestamp": "2015-07-28T00:12:31.000"}
'
More specifically, I would like to take the latest status of every unique headline and count how many headlines are currently Pending, Printing and Printed, without returning anything else. Preferably the result would be a simple pie chart showing the counts for the three statuses. For instance, in the given example the stats would be:
- Pending = 1 (since "Great news" has latest pending status)
- Printing = 1 (since "Sports news" has latest printing status)
- Printed = 1 (since "Crazy news" has latest printed status)
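To make the expected result concrete, here is a minimal sketch of the computation outside Elasticsearch. It assumes the sample documents are flattened to `timestamp|headline|status` lines (a hypothetical format used only for this illustration), and `latest_status_counts` is a helper name I made up:

```shell
#!/bin/bash
# Reads "timestamp|headline|status" lines on stdin and prints one
# "status=count" line per status, counting only the newest entry per headline.
latest_status_counts() {
  LC_ALL=C sort -t'|' -k1,1 |          # oldest first (ISO timestamps sort as text)
  awk -F'|' '
    { latest[$2] = $3 }                # later lines overwrite earlier ones per headline
    END {
      for (h in latest) count[latest[h]]++
      for (s in count) print s "=" count[s]
    }
  ' | LC_ALL=C sort                    # stable output order
}

latest_status_counts <<'EOF'
2015-07-28T00:07:29.000|Great news|Pending
2015-07-28T00:08:23.000|Great news|Pending
2015-07-28T00:09:32.000|Sports news|Pending
2015-07-28T00:10:35.000|Sports news|Printing
2015-07-28T00:11:54.000|Crazy news|Printing
2015-07-28T00:12:31.000|Crazy news|Printed
EOF
# prints:
# Pending=1
# Printed=1
# Printing=1
```

This is the two-step logic I am after: first reduce each headline to its newest document, then group those newest documents by status.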
I tried writing a query for it in Elasticsearch as well, but could only get the latest document for each headline using the terms and top_hits aggregations. Also, if another terms aggregation on status was applied first, it returned the unique headlines within each status, so a headline that had been through several statuses showed up more than once.
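For reference, the query I mean has roughly the following shape. This is a sketch rather than my exact request: the aggregation names `by_headline` and `latest` and the two helper functions are illustrative, it assumes `headline` is indexed as a single term (not analyzed) so that the terms aggregation buckets whole headlines, and the `_source` filter syntax may differ between Elasticsearch versions.

```shell
#!/bin/bash
ELASTICSEARCH_ENDPOINT="${ELASTICSEARCH_ENDPOINT:-http://localhost:9200}"

# Print the request body: one bucket per headline, each holding only
# its newest document (sorted by timestamp, descending, size 1).
latest_per_headline_query() {
  cat <<'EOF'
{
  "size": 0,
  "aggs": {
    "by_headline": {
      "terms": { "field": "headline" },
      "aggs": {
        "latest": {
          "top_hits": {
            "size": 1,
            "sort": [ { "timestamp": { "order": "desc" } } ],
            "_source": { "include": [ "status" ] }
          }
        }
      }
    }
  }
}
EOF
}

# Send the query to the news index (needs a running Elasticsearch node).
search_latest_per_headline() {
  latest_per_headline_query |
    curl -s -XGET "$ELASTICSEARCH_ENDPOINT/news/_search?pretty" -d @-
}
```

Calling `search_latest_per_headline` returns one bucket per headline with its newest status inside, but the statuses are not counted across buckets, which is the part I am missing.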
So, how could I get the count of latest Pending, Printing and Printed statuses for every unique headline article without printing anything else? Any help would be appreciated!!