Kibana server-side integration with R, Perl, and other tools

Is there some existing method to integrate processing between the Kibana/
Elasticsearch response JSON and the graphing?

For example, I have a Perl script that can convert an Elasticsearch JSON
response into a CSV, even reversing the response to put the oldest event
first (for gnuplot compatibility). I then have an R script that can accept
a CSV and perform custom statistical analysis from it. It can even
auto-detect the timestamp and ordering and reverse the CSV events (adapting
without change to either an Elasticsearch response as CSV, or a direct CSV
export from Splunk).
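For readers who want to try the first step, here is a minimal Python sketch of the JSON-to-CSV conversion described above. It is not the Perl script itself; the field names (host, response_ms) and the sample response are made up for illustration, and it assumes every event carries an ISO 8601 @timestamp in the same zone, so sorting the strings puts the oldest event first.

```python
import csv
import io
import json


def hits_to_csv(response_text, fields):
    """Convert an Elasticsearch search response (JSON text) to CSV, oldest event first."""
    response = json.loads(response_text)
    # Each hit carries the indexed document under "_source".
    rows = [hit["_source"] for hit in response["hits"]["hits"]]
    # ISO 8601 timestamps in one format and zone sort correctly as plain strings.
    rows.sort(key=lambda row: row.get("@timestamp", ""))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical response with the newest event first, as a descending sort would return it.
sample = json.dumps({"hits": {"hits": [
    {"_source": {"@timestamp": "2014-08-25T10:05:00Z", "host": "web2", "response_ms": 40}},
    {"_source": {"@timestamp": "2014-08-25T10:00:00Z", "host": "web1", "response_ms": 25}},
]}})
print(hits_to_csv(sample, ["@timestamp", "host", "response_ms"]))
```

Sorting rather than blindly reversing means the same function works whether the response arrives newest-first or oldest-first.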

I've shown the process to a few people, but all balk outright or else shy
away politely at the thought of going to Kibana's Info button, copying and
pasting the curl-based query, and then running it along with the Perl CSV
conversion script and R processing script from the command line. And I
can't blame them!

It may be that Kibana already has the capability to pipe data through
server-installed commands and scripts, but my lack of JavaScript experience
and lack of Kibana internals expertise doesn't seem to help me discover it.

Or perhaps this would be a great new addition to Kibana:

  1. Allow a server-side command to sit between the response and the
    charting.
  2. Deliver the response as a CSV with headers, including the @timestamp
    field of course, to the server-side command, along with the appropriate
    arguments and options for the particular panel.
  3. Document the graphite / graphviz / other format required to display the
    plots.
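To make the proposal concrete, here is a rough Python sketch of steps 1 and 2: handing a CSV (with headers) to an external command on stdin and collecting whatever it prints. This is only an illustration of the idea, not an existing Kibana feature. The run_filter helper is hypothetical, and a one-line Python program stands in for the R or Perl script.

```python
import subprocess
import sys


def run_filter(command, csv_text):
    """Feed CSV text to an external command on stdin and return what it writes to stdout."""
    result = subprocess.run(command, input=csv_text, capture_output=True,
                            text=True, check=True)
    return result.stdout


# Stand-in for an R or Perl script: a one-line program that counts the data rows.
count_rows = [sys.executable, "-c",
              "import sys; print(len(sys.stdin.readlines()) - 1)"]
csv_text = "@timestamp,host\n2014-08-25T10:00:00Z,web1\n2014-08-25T10:05:00Z,web2\n"
print(run_filter(count_rows, csv_text).strip())
```

Panel-specific arguments and options would simply be appended to the command list.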

Just a thought.

Brian

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/132cfc20-ea67-42c8-a518-48404593d35d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Brian,

I like the direction you are going down and am trying to do that myself.
However, being a Perl fledgling, I am still battling Dumper etc. I would
appreciate it if you could share your code to convert an ES query to CSV.
I want to use aggregations and print/report/graph results. Kibana is very
pretty and does the basics well, but I want to know who used web mail and
order it by volume of data sent by hour of day and either graph / tabulate
/ csv out the result. I just can't see how to do that with Kibana.
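As a rough illustration of that kind of report, the Python sketch below sums the data sent per user and hour of day and orders the result by volume. It assumes the events have already been fetched and filtered to web mail traffic, and the field names user and bytes_sent are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def bytes_by_user_and_hour(events):
    """Sum bytes sent per (user, hour of day), largest totals first."""
    totals = defaultdict(int)
    for event in events:
        hour = datetime.strptime(event["@timestamp"], "%Y-%m-%dT%H:%M:%SZ").hour
        totals[(event["user"], hour)] += event["bytes_sent"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up events, already restricted to web mail traffic.
events = [
    {"@timestamp": "2014-09-24T09:15:00Z", "user": "alice", "bytes_sent": 1200},
    {"@timestamp": "2014-09-24T09:45:00Z", "user": "alice", "bytes_sent": 800},
    {"@timestamp": "2014-09-24T14:05:00Z", "user": "bob", "bytes_sent": 500},
]
for (user, hour), total in bytes_by_user_and_hour(events):
    print(user, hour, total)
```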

Thanks

Ash

(In reply to Brian's original post of Monday, August 25, 2014 6:36:42 PM UTC-4.)

Ash,

JSON is a natural for Kibana's JavaScript to read and therefore emit as
CSV. So what I was really asking is whether Kibana is going to become a
serious contender and allow user-written commands to be inserted into the
pipeline between data query/response and charting. After my few weeks with
R, I have gotten it to far exceed gnuplot for plotting (even with the base
plotting functions; I haven't yet dived into the ggplot2 package), and to
far exceed Kibana as well. For example, setting up a custom Kibana dashboard
is tedious, and it's not easily customizable.

Now, I am not suggesting that the ELK stack turn into Splunk directly. But
since it wants to become a serious contender, I am strongly recommending
that the ELK team take the next step and allow a user-written command to be
run against the Kibana output and its charting. And I recommend that the
output be CSV because that's what R supports so naturally. And with R, I
can build out custom analysis scripts that are flexible (and not hard-coded
like Kibana dashboards).

For example, I have an R script that gives me the most-commonly used
functions that the Splunk timechart command offers. And with all of its
customizability: Selecting the fields to use as the analysis, the by field
(for example, plotting response time by host name), the statistics (mean,
max, 95th percentile, and so on), even splitting the colors so that the
plot instantly shows the distribution of load across 10 hosts that reside
within two data centers.
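To give a flavor of what such a timechart-style summary involves, here is a small Python sketch (not the R script, and not how Splunk implements timechart). It groups rows into fixed time buckets per value of a "by" field and reports mean, max, and a nearest-rank 95th percentile. The field names and the 5-minute span are illustrative assumptions.

```python
import math
from collections import defaultdict
from datetime import datetime
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timechart(rows, value_field, by_field, span_seconds=300):
    """Bucket rows by time span and by_field, then summarise value_field per bucket."""
    buckets = defaultdict(list)
    for row in rows:
        ts = datetime.strptime(row["@timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        epoch = int((ts - datetime(1970, 1, 1)).total_seconds())
        bucket_start = epoch - epoch % span_seconds
        buckets[(bucket_start, row[by_field])].append(float(row[value_field]))
    return {
        key: {"mean": mean(values), "max": max(values), "p95": percentile(values, 95)}
        for key, values in sorted(buckets.items())
    }


# Made-up rows, shaped like csv.DictReader output (all values are strings).
rows = [
    {"@timestamp": "2014-09-25T10:00:00Z", "host": "web1", "response_ms": "20"},
    {"@timestamp": "2014-09-25T10:01:00Z", "host": "web1", "response_ms": "40"},
    {"@timestamp": "2014-09-25T10:06:00Z", "host": "web2", "response_ms": "30"},
]
summary = timechart(rows, "response_ms", "host")
for (bucket_start, host), stats in summary.items():
    print(bucket_start, host, stats)
```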

This is an excellent (and free) book that shows what Splunk can do by way
of clear examples:

Exploring Splunk: Search Processing Language (SPL) Primer and Cookbook

Again, I don't suggest that Kibana duplicate this. But I strongly suggest
that Kibana gives me a way to insert my own commands into the processing so
that I can implement the specific functions that our group requires, and
can do it without my gorpy Perl script and copy-paste command mumbo-jumbo,
and instead in a much more friendly and accessible way that even the PMs
can run from their Windows laptops without touching the command line.

And as my part of the bargain, I will use Perl, R, or whatever else is at
my disposal to create custom commands that can run on the Kibana host and
perform all of the analysis that our group needs.

Brian

(In reply to Ashit Kumar's message of Wednesday, September 24, 2014 4:34:43 PM UTC-4.)

I think you could automate some of this with the Elasticsearch scripting
module:

Unfortunately none of that is accessible through Kibana yet, but we can
hold out hope for Kibana 4.0.

(In reply to Brian's message of Thursday, September 25, 2014 9:57:44 AM UTC-6.)

Brian,

I agree completely with your expectations. If I am to replace Splunk
(ridiculously overpriced in my opinion) or LogRhythm, I need to be able to:

  1. Generate alerts that need immediate reaction.
  2. Generate reports
    • Compliance related reporting
    • Perform aggregations on the fly (e.g. social-networking traffic by
      user by hour of day/week etc.)
    • Generate Trend reports (Logon failures by Source IP address by
      destination address by hour of day etc.)
  3. Provide an analyst a console with which he can quickly drill down to
    events of interest.

Of the three broad categories, Kibana does an admirable job of the third.

Alerts can be generated by Logstash or nxlog feeding ES, though this is
kludgy and hard to maintain at best. (Graylog2 streams?)

I was looking to generate reports by a scripted process: basically, create
a scripting framework and vary the query as necessary. But this is easier
said than done at the moment.

With respect to Kibana, it is a great product but I believe it took a step
back in functionality from Version 2 to version 3. The pace of development
appears to now be the same as OSSEC WEBUI and I do not see a roadmap for
future releases. It may be that folding the product into the Elasticsearch
umbrella has moved the company in a different direction (I smell a
commercial product coming soon).

I took a look at Graylog2 and I must say it has moved forward by leaps and
bounds over the last year and does have the ability to generate alerts. Not
sure how easy it would be to create custom reports as I have not tried.
What holds me back is that it only supports ES version 0.90.9. Considering
that version does not support aggregations (only facets), I am reluctant to
move my eggs into an old basket considering ES is pushing Ver. 1.3 and
upwards at the moment.

Until the smoke clears, I believe we will have to rely on building
scripting skills. I am personally struggling to parse JSON output with
nested fields into CSV effectively, so that I can feed it to R etc.
programmatically.
Regards

Ash Kumar

(In reply to Brian's message of Thursday, September 25, 2014 11:57:44 AM UTC-4.)

On 2014-09-25 11:57 am, Brian wrote:

And as my part of the bargain, I will use Perl, R, or whatever else is
at my disposal to create custom commands that can run on the Kibana
host and perform all of the analysis that our group needs.

Something to remember: The "Kibana host" is your browser. The current
version of Kibana runs entirely within the browser, making calls to
Elasticsearch for data, processing it and generating graphs all within
the browser. There is no server-side operating component, just static
files that get loaded into your browser.

Given that, calling back to a server-side R, Perl, etc. script is
problematic. The suggestion to use Elasticsearch-side scripting may
have value.

I have had great success using Kibana to drill into my data, identify
what I'm after, pull out an elasticsearch query, then use that in my own
tools to obtain data and present it as needed. It would be nice to have
a CSV output in Kibana, for example.
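As a sketch of that workflow in Python (standard library only): take the query body that Kibana shows for a panel and send it to the Elasticsearch _search endpoint yourself. The host, index name, and query below are placeholders, not values from this thread.

```python
import json
import urllib.request


def build_search_request(base_url, index, query_body):
    """Build (but do not send) a POST request for the Elasticsearch _search endpoint."""
    url = "{}/{}/_search".format(base_url.rstrip("/"), index)
    data = json.dumps(query_body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST")


def run_search(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder host, index, and query; paste the body from Kibana here instead.
query = {"query": {"match_all": {}}, "size": 10}
request = build_search_request("http://localhost:9200", "logstash-2014.09.26", query)
print(request.full_url)
# response = run_search(request)  # uncomment when an Elasticsearch node is reachable
```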

Kibana does what it does very well, but don't try to force it into a
mold it's not suited for.

--[Lance]


Lance et al.,

My post probably sounded more critical than intended.

Kibana is a great tool, no question about it. It is my go-to tool for most
of my work, getting a high level view and being able to quickly drill down
to specifics. I spend most of my time on it.

I understand that the roadmap of Kibana does not have to mirror mine. I am
simply looking for like-minded folk who have found a way to meet other
goals such as quick aggregation, graphing, di-graphing etc.

Since I rely heavily on it, I am naturally inclined to expect more from it
than intended as a consequence of a natural human failing. The more I rely
on a single tool, the more I want it to do more than it is intended to.
Having said that, it is precisely what drives innovation and improvement.
In the absence of that, we would still be using MS-DOS and WordStar :-)

Note: This post is intended to be treated lightly.

Thanks

Ash

(In reply to Lance A. Brown's message of Friday, September 26, 2014 2:51:38 PM UTC-4.)

It is quite easy to add a wrapper as a plugin in ES in the REST output
routine around search responses, see
https://github.com/jprante/elasticsearch-arrayformat

or

https://github.com/jprante/elasticsearch-csv

If the CSV plugin has deficiencies, I would like to get feedback on what is
missing or what could be added. With a bit of hacking, it is possible to
write ES plugin(s) that trigger the creation of graphviz, gnuplot, R etc.
plots instead of delivering temporary CSV files.

Jörg

(In reply to Brian's original post of Tue, Aug 26, 2014 at 12:36 AM.)

Thanks, Jörg. I will need to find some time to look into this, as it seems
exactly like what I was looking for.

Thanks again!

Brian

(In reply to Jörg Prante's message of Monday, September 29, 2014 12:21:00 PM UTC-4.)

Lance,

Thanks for the clarification. Yeah, the consensus seems to be to either
issue the same REST command off-line (not available to Windows PMs, since I
am not going to touch Windows with a pole shorter than 25m :-), or to write
a server plug-in (would allow even Windows users to invoke the scripts).

But one question: When I click on the Info button near the upper right of a
panel, it shows the JSON request as invoked by curl. But that's only a
suggestion, right? In other words, my browser is not using curl?

I've run into issues with curl's buffer limitations with large queries, and
am hoping that Kibana is only giving me a suggestion to use curl, but isn't
telling my browser to use curl.

Brian

(In reply to Lance A. Brown's message of Friday, September 26, 2014 2:51:38 PM UTC-4.)

On 2014-10-01 3:26 pm, Brian wrote:

But one question: When I click on the Info button near the upper right
of a panel, it shows the JSON request as invoked by curl. But that's
only a suggestion, right? In other words, my browser is not using
curl?

I've run into issues with curl's buffer limitations with large
queries, and am hoping that Kibana is only giving me a suggestion to
use curl, but isn't telling my browser to use curl.

Correct. Kibana formats the query as a curl command so you can more
easily cut-n-paste it into a terminal window someplace and run it. The
Kibana JavaScript makes its own HTTP calls to Elasticsearch to issue
the queries. It's not somehow running curl out of your browser. :-)

--[Lance]
