Making custom ES queries from a Kibana plugin


(Tim Roes) #1

When writing a Kibana plugin (especially a visualization), you sometimes need to make custom ES queries, i.e. queries that do not use the aggregations you can define via Schemas in your visualization definition. You might still want these to integrate with the set filters, queries, auto-refresh times, etc.

An example: I have an index that contains tweets from Twitter. I would like to create a visualization that shows a random tweet from "the result". For the sake of the example, let's assume that it's not bad practice to create a visualization that relies heavily on the specific data in one index. What is "the result"? The index linked to the visualization when creating it should be used. I would also like to integrate with some of the dashboard features:

  • The filters set on the current dashboard should be applied to this ES query, so that they restrict which tweets can be shown in the visualization.
  • If the user enters a search query, it should also restrict which tweets can be shown.
  • If auto-refresh is set on the dashboard, the query should be run again (and determine new possible tweets) at that interval.
  • Only tweets from the selected time period should be returned.

In Kibana there is a courier service that manages ES calls and batches them together. It would be great if I could somehow use it to make regular ES queries but still have the integration mentioned above. Sometimes it would also be useful to make custom ES queries but apply only some of the filters mentioned above.
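To make the "apply only some of the filters" idea concrete, here is a minimal sketch of how such a request body could be assembled by hand. It is not a Kibana or courier API: `buildSearchBody`, the shape of `state`, and the option names are all hypothetical, and the `bool`/`filter` layout assumes an Elasticsearch version that supports it (2.0 or later).

```javascript
// Sketch only. Builds an Elasticsearch search body from dashboard-like state.
// buildSearchBody and the shape of `state`/`options` are hypothetical names,
// not part of Kibana.
function buildSearchBody(state, options) {
  // Each piece of dashboard state can be switched off individually.
  const use = Object.assign({ query: true, filters: true, timeRange: true }, options);
  const must = [];
  const filter = [];

  // Free-text query typed by the user.
  if (use.query && state.queryString) {
    must.push({ query_string: { query: state.queryString, analyze_wildcard: true } });
  }

  // Filters are assumed to already be Elasticsearch query clauses.
  if (use.filters && Array.isArray(state.filters)) {
    state.filters.forEach(function (clause) { filter.push(clause); });
  }

  // Time range on the configured time field.
  if (use.timeRange && state.timeRange) {
    const range = {};
    range[state.timeField] = { gte: state.timeRange.from, lte: state.timeRange.to };
    filter.push({ range: range });
  }

  if (must.length === 0) {
    must.push({ match_all: {} });
  }
  return { query: { bool: { must: must, filter: filter } } };
}

// Example: keep the query and filters, but ignore the dashboard time range.
const exampleBody = buildSearchBody(
  {
    queryString: 'kibana',
    filters: [{ term: { lang: 'en' } }],
    timeField: '@timestamp',
    timeRange: { from: 'now-15m', to: 'now' }
  },
  { timeRange: false }
);
```

A request that ignores the time range is then just a matter of passing `{ timeRange: false }`; the open question in this thread is where a plugin can reliably read the current query, filters and time range from.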

Another example, from a more real-world scenario: we have a monitoring list that works on logs sent by devices into ES. We determine all possible devices (terms aggregation over the device id field) and afterwards check the latest log message for every device, to see how long it has been since the device last logged to ES. It would be great to integrate that as a Kibana plugin, but we need to make several "chained" ES calls, which should still integrate with auto-refresh and possibly the set query/filters. These requests should not, however, take the time range into account, since the job is to determine the last time a device logged.

Are there any (clean) possibilities for this in Kibana yet? Are there any plans or ideas for how to implement such methods (e.g. would it be better to create a Kibana server API and have the server query ES)?

Discuss now ...


(Matt Bargar) #2

I believe the answer is yes, it is possible. I know @stormpython was working on a viz plugin that created a tag cloud, and I assume it worked with the filters and date ranges correctly, so he might be able to provide some info.

If I understand it correctly, I think you could accomplish your second example without writing a custom plugin. Creating a visualization with a terms agg on device id and the metric set to max for the timestamp field should give you what you want. Something like this:
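Expressed as a raw request, the terms-plus-max idea might look roughly like the sketch below. The function names, the aggregation names `devices` and `last_seen`, and the field names used in the example are assumptions for illustration; the response handling relies on the `max` aggregation returning a date field's value as epoch milliseconds.

```javascript
// Sketch only. Request body for "latest timestamp per device".
// lastSeenRequest/silentDevices and the aggregation names are hypothetical.
function lastSeenRequest(deviceField, timeField, maxDevices) {
  return {
    size: 0, // only the aggregation result is needed, no hits
    aggs: {
      devices: {
        terms: { field: deviceField, size: maxDevices },
        aggs: {
          last_seen: { max: { field: timeField } }
        }
      }
    }
  };
}

// Turns the aggregation response into a list of devices that have been
// silent for longer than thresholdMs, relative to nowMs.
function silentDevices(response, nowMs, thresholdMs) {
  return response.aggregations.devices.buckets
    .map(function (bucket) {
      return {
        device: bucket.key,
        lastSeen: bucket.last_seen.value,
        silentForMs: nowMs - bucket.last_seen.value
      };
    })
    .filter(function (entry) { return entry.silentForMs > thresholdMs; });
}

// Example with assumed field names and a hand-written response.
const lastSeenBody = lastSeenRequest('device_id', '@timestamp', 100);
const sampleResponse = {
  aggregations: {
    devices: {
      buckets: [
        { key: 'device-a', doc_count: 10, last_seen: { value: 1000000 } },
        { key: 'device-b', doc_count: 3, last_seen: { value: 400000 } }
      ]
    }
  }
};
const silent = silentDevices(sampleResponse, 1060000, 5 * 60 * 1000);
```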


(Mark Walkom) #3

You can find that tag cloud plugin here - https://github.com/stormpython/tagcloud


(Tim Roes) #4

I know that plugin (from its readme: "This visualization was inspired by Tim Roe's blog post on creating a tag cloud plugin for Kibana 4.") :wink:

The tagcloud plugin just defines Schemas and uses an aggregation to create the buckets for which tags are shown. Aggregations defined in the Schemas are perfectly well integrated with filters, queries, etc. The question is how to integrate non-aggregation queries with all of this.

For the second example: there was some more complex stuff involved that I didn't describe, but you are right. If you only need what I described in my post, you could solve it perfectly well with a bucket aggregation and a max value.
So let's reduce the second example to the following question: since I am especially interested in the offline devices, i.e. the ones that haven't logged for a longer time, this aggregation should be independent of the time range set on the dashboard. Even if I just want to see the logs of the last 15 minutes, I want the offline device list to fetch device information from a much larger time range.

So we could reduce the second example to more or less the opposite question: how can I use Schemas to make aggregations while detaching them from some settings like time range, filters, queries, etc.?


#5

I have a situation with very similar requirements for a time series type visualisation.

After the esResponse comes back to my custom plugin, I need to execute different ES queries to get more data to annotate the data I just got back. The queries I run depend on the data returned initially. In my case I check whether the metric being plotted is anomalous at that time and provide additional information depending on whether and how it is anomalous, among other things.

I need this second set of queries to use the search filter, if the user entered anything there, along with the index and aggregation information. I can get most of what I need from the Vis object.

However I have not been able to find any way to access the search query that the user typed. Is there any way to get this information in my custom plugin? Can I watch it like I can watch esResponse?


(Xiaodong Zhou) #6

That is exactly what I am looking for as well. Or, to put it another way: instead of adding a "saved search" to the dashboard and getting the traditional list view by columns and detail view by field/values, I want to show the results with a very different visual based on the specific data we have.


(Juan Ignacio Carniglia) #7

Just adding to this request, it would be really useful.

Let me explain my use case. I need to get a large result set (an array of numbers) all the way from ES to the client-side JavaScript, in order to run some computations on the numbers and show a statistical chart. These computations cannot be done in ES (there are no aggregations for them), but they can easily be done in JavaScript.

I think that what I am talking about here has something to do with the Elasticsearch plugin that handles, among other things, the health check. If I could just call "server.plugins.client.search()" from the Provider, maybe I could play around with the result set. I also think it is important that the search respects all filters (time/date, search, etc.) that were set on the dashboard.

While writing this I had an idea, and I think it might work.

I have in ES a collection of documents with:

{ 'value' : [a double type number], 'eventid' : [an id] }

If I set up a Table Vis, like so (Request) :

{
  "size": 0,
  "query": {
    "query_string": { "query": "*", "analyze_wildcard": true }
  },
  "aggs": {
    "2": {
      "terms": { "field": "value", "size": 50, "order": { "_count": "desc" } },
      "aggs": {
        "3": {
          "terms": { "field": "eventid", "size": 5, "order": { "_count": "desc" } }
        }
      }
    }
  }
}

I am getting all the events (a bucket for each VALUE and a collection of all the events that have that value):

{
  "3": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": 1, "doc_count": 1 },
      { "key": 2, "doc_count": 1 }
    ]
  },
  "key": 12,
  "doc_count": 2
}

This actually works. If I have the SAME VALUE in the SAME EVENT, the document count for that bucket increases:

{
  "3": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": 1, "doc_count": 2 },
      { "key": 2, "doc_count": 1 }
    ]
  },
  "key": 12,
  "doc_count": 3
}

What do you think?
Could this kill the server if I have thousands of results? (I know there is a bucket limit that can be tweaked, but I don't know to what extent.)

Thanks.


(Juan Ignacio Carniglia) #8

Ok, just for the record. I am sending over 6000 values (each from a different document) over to a boxplot plugin, and it's working.

I created two aggregations, keeping in mind one small detail: all REPEATED values have to be accounted for. In statistics every value counts, even if it is repeated (for computing mean and average, for instance).

So I have to take all the values that are repeated (doc_count > 1) and actually repeat them in the final result set.
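The repetition step can be sketched as follows. This is a simplified illustration rather than the plugin's actual code: `expandBuckets` is a hypothetical name, and it assumes buckets shaped like the ones in the earlier response, where `key` is the value and `doc_count` is how many documents carried it.

```javascript
// Sketch only. Expands terms buckets into a flat array of values, repeating
// each bucket key doc_count times so duplicates are not lost.
// expandBuckets is a hypothetical helper, not taken from the plugin.
function expandBuckets(buckets) {
  const values = [];
  buckets.forEach(function (bucket) {
    for (let i = 0; i < bucket.doc_count; i++) {
      values.push(bucket.key);
    }
  });
  return values;
}

// Example: the value 12 occurred in three documents, the value 7 in one.
const expanded = expandBuckets([
  { key: 12, doc_count: 3 },
  { key: 7, doc_count: 1 }
]);
// expanded is [12, 12, 12, 7]
```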

You can check out the plugin at https://github.com/JuanCarniglia/kbn_boxplot_violin_vis

This visualization computes quantiles, mean and other values in order to render both the boxplots, and also the "violins" (which are actually histograms drawn at a 90 degrees angle).
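For readers who want to see what computing those values involves, here is a minimal sketch of the five-number summary plus mean on a plain array. It uses linear interpolation between neighbouring values for the quantiles, which is one common convention and not necessarily the one the plugin uses; `quantile` and `boxplotStats` are hypothetical names.

```javascript
// Sketch only. Quantile of an already sorted array, using linear
// interpolation between the two nearest ranks.
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Returns the values a boxplot needs, or null for an empty input.
function boxplotStats(values) {
  if (values.length === 0) {
    return null;
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const sum = sorted.reduce(function (acc, v) { return acc + v; }, 0);
  return {
    min: sorted[0],
    q1: quantile(sorted, 0.25),
    median: quantile(sorted, 0.5),
    q3: quantile(sorted, 0.75),
    max: sorted[sorted.length - 1],
    mean: sum / sorted.length
  };
}

const stats = boxplotStats([5, 1, 4, 2, 3]);
// stats is { min: 1, q1: 2, median: 3, q3: 4, max: 5, mean: 3 }
```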


(Mark Walkom) #9

Awesome, thanks for sharing!

I'll get this added to our list of community plugins - https://github.com/elastic/kibana/issues/6746


(Brian Walsh) #10

+1. My use case is to pass the detailed query results back to an upstream pipeline via a plugin we've written. In the past we forked Kibana and customized doc_table. Is there a way to do that via a plugin?


(Juan Ignacio Carniglia) #11

If you are still dealing with this issue, contact me and maybe I can help out.

I think that if what you are looking for is an alternative to having the Kibana doc_table customized, a new visualization might be the solution.


(android.kc) #12

I ran into the same situation: I want to customize the requests sent to ES, but also integrate with all the Kibana features, e.g. the set filters, queries, auto-refresh times, etc. I would like to stay at the plugin level instead of forking Kibana. I more or less got it working as follows.

-- Use an Angular interceptor to add a customized payload to the requests sent from the Kibana vis plugin to ES.
-- On the Elasticsearch side, parse the payload inside my ES API extension plugin (https://www.elastic.co/guide/en/elasticsearch/plugins/current/api.html), which can also intercept all the requests sent to ES, then parse the payload and add my own logic before doing the actual query/aggs.
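The first step could look roughly like the sketch below. It only shows the general shape of an AngularJS request interceptor (an object with a `request(config)` function that returns the config). The header name `x-custom-payload`, the `/elasticsearch/` URL check and the factory name are assumptions made for illustration, not details of the actual setup, and whether a custom header survives the Kibana proxy is not something this sketch verifies.

```javascript
// Sketch only. AngularJS-style request interceptor that attaches a custom
// payload to requests that look like they are going to Elasticsearch.
// customPayloadInterceptor, the header name and the URL check are hypothetical.
function customPayloadInterceptor(getPayload) {
  return {
    request: function (config) {
      const isEsRequest =
        typeof config.url === 'string' && config.url.indexOf('/elasticsearch/') !== -1;
      if (isEsRequest) {
        config.headers = config.headers || {};
        config.headers['x-custom-payload'] = JSON.stringify(getPayload());
      }
      // Interceptors must return the (possibly modified) config.
      return config;
    }
  };
}

// In an AngularJS app this would be registered in a config block, e.g.:
//   $httpProvider.interceptors.push(function () {
//     return customPayloadInterceptor(function () { return { mode: 'custom' }; });
//   });

// Standalone example of what the interceptor does to a request config.
const interceptor = customPayloadInterceptor(function () { return { mode: 'custom' }; });
const esConfig = interceptor.request({ url: '/elasticsearch/_msearch', method: 'POST' });
const otherConfig = interceptor.request({ url: '/api/status', method: 'GET' });
```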

Hi @JuanCarniglia ,
I read the source code of the Kibana plugin, https://github.com/sasauz/kbn_boxplot_violin_vis. If I understood correctly, the plugin customizes the responses based on the options selected in the UI, but not the requests sent from the Kibana plugin to Elasticsearch. Am I correct?

Hi @timroes,
This topic is about one year old. Have you found any solution in the meantime?

Thanks!
KC

