Splunk vs. Elastic search performance?

brian_yoder · July 2, 2014, 1:08pm

Patrick,

> > Well, I did answer your question. But probably not from the direction
> > you expected.
>
> hmm no, you didn't. My question was: "it looks like I can't
> retrieve/display [_all fields] content. Any idea?" and you replied with
> your logstash template where _all is disabled. I'm interested in disabling
> _all, but that was not my question at this point.

Fair enough. I don't know the inner details; I am just an enthusiastic end
user.

To the best of my knowledge, there is no content for the _all field; I view
this as an Elasticsearch pseudo field whose name is _all and whose index
terms are taken from all fields (by default), but still there is no actual
content for it.

And after I got into the habit of disabling the _all field, my hands-on
exploration of its nuances has ended. It's time for the experts to explain!

> Your answer to my second message, below, is informative and interesting
> but fails to answer my second question too. I simply asked whether I need
> to feed the complete modified mapping of my template or if I can just push
> the modified part (i.e., the _all:{enabled: false} part).

Again, I have never done this, so I can only tell you what I do. I just
cannot tell you all the nuances of what Elasticsearch is capable of.

My recommendation is to try it. Elasticsearch is great at letting you
experiment and then telling you clearly if your attempt succeeds or fails.

So, try your scenario. If it fails, then it didn't work or you did
something wrong. If it succeeds, then you can see exactly what
Elasticsearch actually accepted as your mapping. For example:

curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo

This particular query looks at one of my logstash-generated indices, and it
lets me verify that Elasticsearch and Logstash conspired to create the
mappings I expected. I used this command quite a bit until I finally got
everything configured correctly. (I actually verify the mapping via
Elasticsearch Head, but under the covers it's the same command.)
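
One way to try the "modified part only" idea is a small second template rather than a re-send of the full one. The sketch below assumes Elasticsearch 1.x behaviour, where a PUT to an existing template name replaces that template, while several templates matching the same index pattern are merged in `order` when an index is created. The template name `logstash_no_all`, the `order` value and the localhost URL are placeholders, not anything from Patrick's setup.

```shell
#!/bin/sh
# Hypothetical overlay template: carries only the _all setting.
# Assumes ES 1.x template syntax ("template" pattern, "_default_" mapping).
overlay_template() {
cat <<'EOF'
{
  "template": "logstash-*",
  "order": 1,
  "mappings": {
    "_default_": {
      "_all": { "enabled": false }
    }
  }
}
EOF
}

# Print the body so it can be reviewed before anything is sent.
overlay_template

# To try it against a test node (assumed to listen on localhost:9200):
#   overlay_template | curl -XPUT 'http://localhost:9200/_template/logstash_no_all' -d @-
```

Templates only affect indices created after they are stored, so the result would show up in the next day's logstash index, which can be checked with the _mapping request above.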

Brian

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8eaefd0e-f684-4f44-9fcb-3137812a99d3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Steve_Mayzak · July 2, 2014, 3:22pm

All,

This seems apropos to the current discussion and could help clear up some
confusion on recommendations etc. We, Elasticsearch, are hosting a Webinar
on ELK, given by the Logstash creator, Jordan Sissel.

It's today, in 40 minutes.

Steve_Mayzak · July 3, 2014, 1:47am

In the latest version of Logstash, you can use the elasticsearch output and
just set the protocol to http. The elasticsearch_http output will be
removed eventually.

On Monday, June 23, 2014 9:22:28 AM UTC-7, Ivan Brusic wrote:

I agree. I thought elasticsearch_http was actually the recommended route.
Also, I have seen no reported issues with different client/server versions
since 1.0. My current logstash setup (which is not production level, simply
a dev logging tool) uses Elasticsearch 1.2.1 with Logstash 1.4.1 using the
non-HTTP interface.

--
Ivan

On Fri, Jun 20, 2014 at 3:29 PM, Mark Walkom <ma...@campaignmonitor.com> wrote:

I wasn't aware that the elasticsearch_http output wasn't recommended.
When I spoke to a few of the ELK devs a few months ago, they indicated
that there was minimal performance difference, with the greater benefit of
not being locked to specific LS+ES versioning.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 21 June 2014 02:43, Brian <brian....@gmail.com> wrote:

Thomas,

Thanks for your insights and experiences. As I am someone who has
explored and used ES for over a year but is relatively new to the ELK
stack, your data points are extremely valuable. Let me offer some of my own
views.

Re: double the storage. I strongly recommend that ELK users disable the
_all field. The entire text of the log events generated by logstash ends up
in the message field (and not @message as many people incorrectly
post). So the _all field is just redundant overhead that adds no value. The
result is a dramatic drop in database file sizes and a dramatic increase in
load performance. Of course, you then need to configure ES to use the
message field as the default field for Lucene queries issued from Kibana.
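
A rough sketch of what that combination could look like as an index template, assuming Elasticsearch 1.x template syntax and the `index.query.default_field` index setting (which, as far as I know, is what query_string queries fall back to when no field is given). The template name `logstash_message_default` and the localhost URL are placeholders, not part of the setup described here.

```shell
#!/bin/sh
# Sketch: disable _all and make "message" the default query field.
# Assumptions: ES 1.x template syntax; names and URL are illustrative only.
message_default_template() {
cat <<'EOF'
{
  "template": "logstash-*",
  "order": 1,
  "settings": {
    "index.query.default_field": "message"
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false }
    }
  }
}
EOF
}

# Print the body for review.
message_default_template

# To store it on a test node (assumed at localhost:9200):
#   message_default_template | curl -XPUT 'http://localhost:9200/_template/logstash_message_default' -d @-
```

Only indices created after the template is stored pick up the change; existing daily indices keep their old mapping and settings.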

During the year that I've used ES and watched this group, I have been on
the front line of a brand new product with a smart and dedicated
development team working steadily to improve the product. Six months ago,
the ELK stack eluded me and reports weren't encouraging (with the sole
exception of the Kibana web site's marketing pitch). But ES has come a long
way since six months ago, and the ELK stack is much more closely integrated.

The Splunk UI is carefully crafted to isolate users from each other and
prevent external (to the Splunk db itself, not to our company) users from
causing harm to data. But Kibana seems to be meant for a small cadre of
trusted users. What if I write a dashboard with the same name as someone
else's? Kibana doesn't even begin to discuss user isolation. But I am
confident that it will.

How can I tell Kibana to set the default Lucene query operator to AND
instead of OR? Google is not my friend: I keep getting references to the
Ruby versions of Kibana; that's ancient history by now. Kibana is cool and
promising, but it has a long way to go for deployment to all of the folks
in our company who currently have access to Splunk.

Logstash has a nice book that's been very helpful, and logstash itself
has been an excellent tool for prototyping. The book has been invaluable in
helping me extract dates from log events and handling all of our different
multiline events. But it still doesn't explain why the date filter needs a
different array of matching strings to get the date that the grok filter
has already matched and isolated. And recommendations to avoid the
elasticsearch_http output and use elasticsearch (via the Node client)
directly contradict the fact that logstash's 1.1.1 version of the ES client
library is not compatible with the most recent 1.2.1 version of ES.

And logstash is also a resource hog, so we eventually plan to replace it
with Perl and Apache Flume (already in use) and pipe it into my Java bulk
load tool (which is always kept up-to-date with the versions of ES we
deploy!!). Because we send the data via Flume to our data warehouse, any
losses in ES will be annoying but won't be catastrophic. And the front-end
following of rotated log files will be done using GNU tail with its
uppercase -F option, which follows rotated log files perfectly. I doubt
that logstash can do the same, and we currently see that neither can Splunk
(so we sporadically lose log events in Splunk too). So GNU tail -F piped
into logstash with the stdin input works perfectly in my evaluation setup
and will likely form the first stage of any log forwarder we end up
deploying.
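
A minimal sketch of that first stage, assuming a Logstash 1.4-style config and command line. The log path, the config location and the choice of the elasticsearch output with `protocol => "http"` (the option suggested later in this thread) are assumptions for illustration, not the actual evaluation setup.

```shell
#!/bin/sh
# Sketch: write a Logstash config that reads events from stdin.
# Assumptions: Logstash 1.4-style syntax; all paths are placeholders.
CONF="${TMPDIR:-/tmp}/stdin-forwarder.conf"

cat > "$CONF" <<'EOF'
input {
  stdin { }
}
output {
  elasticsearch {
    protocol => "http"
    host => "localhost"
  }
}
EOF

echo "wrote $CONF"

# GNU tail's uppercase -F follows the file by name and retries, so it picks up
# the new file after rotation; lowercase -f keeps following the old descriptor.
#   tail -F /var/log/myapp/app.log | bin/logstash -f "$CONF"
```

Multiline events would still need a multiline codec or filter on top of this; the sketch only covers the hand-off from tail to Logstash.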

Brian

On Thursday, June 19, 2014 8:48:34 AM UTC-4, Thomas Paulsen wrote:

We had a 2.2 TB/day installation of Splunk and ran it on VMware with 12
indexers and 2 search heads. Each indexer had a guaranteed 1000 IOPS
assigned. The system is slow but OK to use.

We tried Elasticsearch and we were able to get the same performance
with the same number of machines. Unfortunately, with Elasticsearch you need
almost double the amount of storage, plus a LOT of patience to make it run.
It took us six months to set it up properly, and even now the system is quite
buggy and unstable, and from time to time we lose data with Elasticsearch.

I don't recommend ELK for a critical production system. For dev work
alone it is OK, if you don't mind the hassle of setting up and operating
it. The money you save by not buying a Splunk license you have to invest
in consultants to get it up and running. Our dev teams hate Elasticsearch
and prefer Splunk.

On Saturday, April 19, 2014 00:07:44 UTC+2, Mark Walkom wrote:

That's a lot of data! I don't know of any installations that big but
someone else might.

What sort of infrastructure are you running splunk on now, what's your
current and expected retention?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 19 April 2014 07:33, Frank Flynn faultle...@gmail.com wrote:

We have a large Splunk instance. We load about 1.25 TB of logs a
day. We have about 1,300 loaders (servers that collect and load logs -
they may do other things too).

As I look at Elasticsearch / Logstash / Kibana does anyone know of a
performance comparison guide? Should I expect to run on very similar
hardware? More? or Less?

Sure it depends on exactly what we're doing, the exact queries and
the frequency we'd run them but I'm trying to get any kind of idea before
we start.

Are there any white papers or other documents about switching? It
seems an obvious choice but I can find very few performance
comparisons (I did see that Elasticsearch just hired "the former VP of
Products at Splunk, Gaurav Gupta" - but there were few numbers in that
article either).

Thanks,
Frank

Oneti_Messo · July 11, 2014, 7:17pm

I am new to this subject. I noticed that Rsyslog also has an elasticsearch output module for sending traditional syslog and other application logs (i.e., tail -f any text file) to elasticsearch directly. Does that mean I can skip the logstash middleman and create a system using just elasticsearch and kibana?

Onetimesso

otisg · July 19, 2014, 5:33am

Hi Oneti,

Yes, you can use omelasticsearch and index logs directly from rsyslog to
ES. No need for Logstash.
We have some documentation about how to index logs into Logsene over
at https://sematext.atlassian.net/wiki/display/PUBLOGSENE/Sending+Events+to+Logsene
and the piece that sounds like you may be after is
at https://sematext.atlassian.net/wiki/display/PUBLOGSENE/rsyslog .

You should be able to use pretty much all the information there to index
your logs to your own ES cluster.
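
For orientation, a minimal rsyslog fragment might look roughly like the sketch below. It is not taken from the linked pages: it assumes an rsyslog version with the omelasticsearch module available (v7 or later), and the index name `syslog`, the JSON field names, the server address and the file path are all placeholders.

```shell
#!/bin/sh
# Sketch: write an rsyslog config fragment that sends messages to Elasticsearch.
# Assumptions: rsyslog v7+ with omelasticsearch installed; names are illustrative.
CONF="${TMPDIR:-/tmp}/30-elasticsearch.conf"

cat > "$CONF" <<'EOF'
module(load="omelasticsearch")

# Build one JSON document per message; field names are illustrative.
template(name="es-json" type="list") {
  constant(value="{\"@timestamp\":\"")
  property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"host\":\"")
  property(name="hostname")
  constant(value="\",\"message\":\"")
  property(name="msg" format="json")
  constant(value="\"}")
}

action(type="omelasticsearch"
       server="localhost"
       serverport="9200"
       template="es-json"
       searchIndex="syslog"
       bulkmode="on")
EOF

echo "wrote $CONF"

# After review, the fragment would typically go under /etc/rsyslog.d/
# and rsyslog would be restarted to pick it up.
```

Application logs that are not already in syslog would additionally need the imfile input module to tail the text files.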

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

Skender_Kollcaku · April 30, 2015, 4:01pm

Hi everyone,

I recently started using Splunk Enterprise for my company's needs and I am
new to Elasticsearch (and the other open-source alternatives mentioned here).
But I am also interested in this analysis (and focus group) about these two
solutions.
One of the first analyses I read on the web is from Risk Focus, which is a
Splunk partner, and here is the link:
http://riskfocus.com/splunk-vs-elk-part-1-cost/

I hope new topics will emerge from the article (and maybe critical views!).

I look forward to hearing more experiences with the two tools!

Skender Kollcaku
System Engineer
Consoft Sistemi
Milan, Italy
