Hello, I have a big dataset (5+ million documents and growing), where each document is a combination of a server and an update that applies to it. For example:
{
"severity": "Important",
"erratum_id": "RHSA-2019:3878",
"hostname": "host1.local",
"issue_date": "2019-11-13T00:00:00Z",
"os": "RedHat 6.10",
"report_id": "cb75e0ec-c5c9-4826-a114-45ffe6dbc7ec",
"hostgroup": "G1",
"id": "5e7f4224-954c-4bcf-a411-f7bffa2a8306",
"source": "https://satellite.local",
"synopsis": "Important: kernel security update",
"host_id": 1602,
"timestamp": "2019-11-18T00:00:00+02:00"
},
{
"severity": "Important",
"erratum_id": "RHSA-2019:3878",
"hostname": "host2.local",
"issue_date": "2019-11-13T00:00:00Z",
"os": "RedHat 6.10",
"report_id": "cb75e0ec-c5c9-4826-a114-45ffe6dbc7ec",
"hostgroup": "G1",
"id": "d5b662f6-f085-487e-8a99-d305a602f2ee",
"source": "https://satellite2.local",
"synopsis": "Important: kernel security update",
"host_id": 966,
"timestamp": "2019-11-18T00:00:00+02:00"
},
What I need is a visualization that shows the number of documents within certain ranges. It would be similar to a date histogram, except based on the unique count of the hostname field.
In other words, I need to answer the question: "How many hosts have more than Y updates that apply to them?" The ranges should be something like 0-10, 10-100, 100+.
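To make the desired result concrete, here is a minimal Python sketch of the computation on a tiny in-memory sample. It only illustrates the output I am after and is not something that would scale to the real index. The function name hosts_per_range and the boundary handling (a count of exactly 10 lands in 10-100, exactly 100 lands in 100+) are my own assumptions.

```python
from collections import Counter


def hosts_per_range(docs, bounds=(10, 100)):
    """Count hosts whose number of applicable updates falls in each range."""
    # Number of documents (one per applicable update) for each hostname.
    per_host = Counter(doc["hostname"] for doc in docs)

    # Build labels such as "0-10", "10-100", "100+".
    edges = (0,) + tuple(bounds)
    labels = [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])]
    labels.append(f"{edges[-1]}+")
    result = dict.fromkeys(labels, 0)

    for count in per_host.values():
        # Pick the first range whose upper bound is strictly above the count.
        for i, upper in enumerate(bounds):
            if count < upper:
                result[labels[i]] += 1
                break
        else:
            # No upper bound matched, so the host is in the open-ended range.
            result[labels[-1]] += 1
    return result


# Two sample documents reduced to the fields that matter here.
docs = [
    {"hostname": "host1.local", "erratum_id": "RHSA-2019:3878"},
    {"hostname": "host2.local", "erratum_id": "RHSA-2019:3878"},
]
print(hosts_per_range(docs))  # {'0-10': 2, '10-100': 0, '100+': 0}
```

Note that hosts with no applicable updates have no documents at all, so they never show up in the first range.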
I tried to do it via Canvas as instructed in this answer: "Kibana: Range based on count", but the number of documents is too large and it crashes the browser.
Can this be done without transforming the underlying data?