What tools to use for indexing ES in production?

We want to index our sales records as they come in from our apps. We are
using a Quartz job right now, which is slow and not really real-time.

We will be implementing a message bus soon for firing sales events.

The process is:
1. Read from a message queue.
2. Grab some extra data from MySQL.
3. Do some ETL to construct the document.
4. Index it to ES.
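
The four steps above can be sketched in plain Python. This is only an
illustration under stated assumptions, not a recommendation of any tool:
the names `build_document` and `run_pipeline`, and the fields `sale_id`,
`event_id`, and `venue`, are made up, and the queue, the MySQL lookup, and
the ES client are passed in as plain callables so the flow can be tried
without any of them running.

```python
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]
Document = Dict[str, Any]


def build_document(event: Event, extra: Dict[str, Any]) -> Document:
    """ETL step: merge the queue event with the extra data looked up for it."""
    doc = dict(extra)   # start from the looked-up data ...
    doc.update(event)   # ... so fields from the event win on conflict
    return doc


def run_pipeline(
    events: Iterable[Event],
    lookup: Callable[[Event], Dict[str, Any]],
    index: Callable[[Document], None],
) -> int:
    """Process every event and return how many documents were indexed."""
    count = 0
    for event in events:                    # 1. read from a message queue
        extra = lookup(event)               # 2. grab extra data (e.g. MySQL)
        doc = build_document(event, extra)  # 3. construct the document
        index(doc)                          # 4. index it to ES
        count += 1
    return count


if __name__ == "__main__":
    # In-memory stand-ins for the queue, the database, and ES.
    events = [{"sale_id": 1, "event_id": 10, "amount": 25.0}]
    venues = {10: {"venue": "Example Hall"}}
    indexed: List[Document] = []
    run_pipeline(events, lambda e: venues.get(e["event_id"], {}), indexed.append)
    print(indexed)
```

Keeping the lookup and the indexing call as parameters means the same loop
could sit behind a queue consumer, a Logstash-style filter, or a Spark job;
which of those is appropriate is exactly the open question here.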

I've been reading about Apache Spark, ES rivers, and Logstash.

My question is: what kind of tools are right for the job here?
Is Apache Spark overkill?
Is DIY a better option here?
What are you guys using?

Please advise and point me to the right things to read.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/949b0c40-e8b1-4d40-bced-68e5c005c713%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

This depends on how the app is made and what options you have to extract
data from it.

On 16 February 2015 at 20:28, Kevin Liu kevin@ticketfly.com wrote:

> We want to index our sales records as they come in from our apps. [...]


Well, the sales messages are coming through Kafka. We need to extract some
info from the database. We can do anything, really. I'm just not sure what
the common practice is here. There seem to be so many options. What kind of
questions am I not asking here?

On Monday, February 16, 2015 at 2:30:45 AM UTC-8, Mark Walkom wrote:

> This depends on how the app is made and what options you have to extract
> data from it.


Kafka is fine. You could also use Logstash to read this data from Kafka and
send it to ES.

NB: the Kafka input is in the Logstash 1.5 beta release, so use it with
caution.

Otherwise, take a look at the other Logstash inputs and see if there is
something suitable you can leverage. There are also a number of official
clients if you want to roll your own. There isn't an official method, but
those last two are pretty common.
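
If you do roll your own, a common pattern is to batch documents and send
them through the bulk API rather than indexing one at a time. The sketch
below only builds the newline-delimited request body, so it runs without a
cluster. The helper names `chunked` and `bulk_body`, the index name
`sales`, and the `sale_id` field are assumptions for illustration. Note
that ES 1.x also expects a type, either in the URL or as `_type` in the
action line, which is left out here.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

Document = Dict[str, Any]


def chunked(docs: Iterable[Document], size: int) -> Iterator[List[Document]]:
    """Yield lists of at most `size` documents."""
    batch: List[Document] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def bulk_body(docs: Iterable[Document], index: str, id_field: str) -> str:
    """Render documents as a _bulk body: one action line, then one source line."""
    lines: List[str] = []
    for doc in docs:
        action = {"index": {"_index": index, "_id": doc[id_field]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sales = [{"sale_id": i, "amount": 10.0 * i} for i in range(5)]
    for batch in chunked(sales, 2):
        print(bulk_body(batch, "sales", "sale_id"), end="")
```

Each body would then be POSTed to the `_bulk` endpoint, or the batches
handed to the bulk helper of whichever official client you choose.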

On 19 February 2015 at 20:07, Kevin Liu kevin@ticketfly.com wrote:

> Well, the sales messages are coming through Kafka. [...]
