Getting Top 5 values from an Array


(Jon) #1

I have a document that represents a Task. It looks similar to the one below, but with several more fields than shown.
{
  "_index": "index",
  "_type": "type",
  "_id": "1245",
  "_version": 1,
  "_score": null,
  "_source": {
    "field": "example",
    "PeopleWorked": "[bob, fred, Jon]",
    "Tags": "[tag1, tag2, tag3]"
  },
  "fields": {
    "create_day": [
      "2018-10-24T00:00:00.000Z"
    ],
    "TotalResolutionDays": [
      0.5917958101851851
    ],
    "create_date": [
      "2018-10-24T10:22:08.000Z"
    ]
  },
  "sort": [
    1540376528000
  ]
}

I have two fields that are essentially arrays: PeopleWorked and Tags. Both are of the keyword type. A Task can have been worked by an arbitrary number of people and can have an arbitrary number of tags (the tag names aren't known in advance; they could be anything).

Is there a way to create a visualization that shows the top 5 (or any number of) people who have worked the most tasks?
Also, similarly, the top 5 tags that appear most often?

Right now, when I build a visualization with the Terms or Significant Terms aggregation, it returns the most common entire arrays. For example, it returns the number of times [Bob, Fred, Jon] is found, rather than the number of times just Jon is found.
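
Judging from the sample document above, one likely cause is that each field holds a single quoted string such as "[bob, fred, Jon]" rather than a real JSON array, so the whole string is counted as one term. The Python sketch below only illustrates the per-item counting being asked for; it does not query Elasticsearch, and the helper names parse_bracketed and top_values are made up for this example.

```python
from collections import Counter

def parse_bracketed(value):
    """Turn a string such as "[bob, fred, Jon]" into ["bob", "fred", "Jon"]."""
    inner = value.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]

def top_values(docs, field, n=5):
    """Count how many documents mention each individual item of `field`."""
    counts = Counter()
    for doc in docs:
        # set() so an item repeated inside one task is counted once per task
        counts.update(set(parse_bracketed(doc.get(field, ""))))
    return counts.most_common(n)

# Made-up sample tasks in the same shape as the document above
docs = [
    {"PeopleWorked": "[bob, fred, Jon]", "Tags": "[tag1, tag2, tag3]"},
    {"PeopleWorked": "[Jon]", "Tags": "[tag1]"},
    {"PeopleWorked": "[fred, Jon]", "Tags": "[tag3, tag1]"},
]

print(top_values(docs, "PeopleWorked"))
print(top_values(docs, "Tags", n=2))
```

If the fields were indexed as real arrays of keywords, for example ["bob", "fred", "Jon"], each item would be a separate term, which is the shape this sketch produces before counting.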

I can reformat this data if needed but would like to try to keep it all in one document if possible.

Thanks for any help!


#2

Look at using the split filter in Logstash:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-split.html

Split your array(s) to create one document for every item in the array.

Hope it helps.
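
To make the idea concrete, here is a rough Python sketch of the effect of splitting: one copy of the task per array item. It is not Logstash configuration, it assumes the field already holds a real list rather than a bracketed string, and the function name split_on_field and the task_id field are hypothetical.

```python
import copy

def split_on_field(doc, field):
    """Return one copy of `doc` per item in doc[field]."""
    values = doc.get(field)
    if not isinstance(values, list):
        # Nothing to split: pass the document through unchanged
        return [copy.deepcopy(doc)]
    result = []
    for value in values:
        clone = copy.deepcopy(doc)
        clone[field] = value  # each clone carries a single item
        result.append(clone)
    return result

# Made-up task with a real list in PeopleWorked
task = {"task_id": "1245", "field": "example",
        "PeopleWorked": ["bob", "fred", "Jon"]}

events = split_on_field(task, "PeopleWorked")
for event in events:
    print(event)
```

Keep in mind that after a split like this, every task appears once per person, so any visualization that counts documents would count the same task several times.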


(Jon) #3

For anyone who sees this in the future: I couldn't get any other solution to work, so I uploaded my Tags to a separate index as a separate type of item.
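
A minimal sketch of what that could look like, assuming one small document per (task, tag) pair sent through the bulk API. The index name task_tags and the field names task_id and tag are placeholders, not what was actually used, and the code only builds the request body without sending it.

```python
import json

def tag_bulk_lines(tasks, index_name="task_tags"):
    """Build a newline-delimited bulk body with one document per (task, tag)."""
    lines = []
    for task in tasks:
        for tag in task["Tags"]:
            # Action line, followed by the document source line
            lines.append(json.dumps({"index": {"_index": index_name}}))
            lines.append(json.dumps({"task_id": task["id"], "tag": tag}))
    # A bulk body has to end with a newline
    return "\n".join(lines) + "\n"

# Made-up tasks; "id" stands in for the task's document ID
tasks = [
    {"id": "1245", "Tags": ["tag1", "tag2", "tag3"]},
    {"id": "1246", "Tags": ["tag1"]},
]

body = tag_bulk_lines(tasks)
print(body)
```

With one document per tag, a plain Terms aggregation on the tag field gives the most frequent tags directly. Older Elasticsearch versions that still use mapping types may also need a type in the action line or in the request URL.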

