Elasticsearch index strategies under high traffic


(Onur BARAN) #1

We use Elasticsearch for the real-time metrics and analytics part of our tool.
Elasticsearch is very cool and fast when we query our data
(statistical facets and terms facets).

But we have a problem when we try to index our hourly data. We collect our
metric data from other services and first store it in RabbitMQ. But when the
queue worker runs, not all of our hourly data gets indexed into ES. Usually
about 40% of the data is indexed and the rest is lost.

So what would you suggest for indexing into ES under high traffic?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

Hi Onur,

On Mon, Sep 23, 2013 at 6:18 PM, Onur BARAN baranonur@gmail.com wrote:

When the queue worker runs, not all of our hourly data gets indexed into ES.
Usually about 40% of the data is indexed and the rest is lost.

So what would you suggest for indexing into ES under high traffic?

Even under high traffic, ES shouldn't lose documents. Can you give more
details about the errors you are seeing?
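
One way to gather those details on the client side is to tally what each index request returned. The following is a minimal sketch, assuming the worker keeps every index response as a dict and that each response carries the `_id` and `_version` fields Elasticsearch returns for an index operation. `summarize_index_responses` is a hypothetical helper, not part of any client library, and the sample responses are made up for illustration. A `_version` above 1 suggests the write replaced an existing document instead of adding a new one.

```python
def summarize_index_responses(responses):
    """Tally index responses into new documents and overwrites.

    Assumption: each response is a dict with "_id" and "_version" keys,
    as returned by an Elasticsearch index operation.
    """
    summary = {"requests": 0, "new_documents": 0, "overwrites": 0}
    for response in responses:
        summary["requests"] += 1
        if response["_version"] == 1:
            summary["new_documents"] += 1
        else:
            # Version > 1: a document with this id already existed.
            summary["overwrites"] += 1
    # Upper bound on how many documents these requests can leave in the index.
    summary["distinct_ids"] = len({r["_id"] for r in responses})
    return summary


# Hypothetical responses: three requests, two of them sharing one id.
responses = [
    {"_id": "metric-1", "_version": 1},
    {"_id": "metric-2", "_version": 1},
    {"_id": "metric-1", "_version": 2},
]
summary = summarize_index_responses(responses)
print(summary)
```

If `requests` matches the number of messages taken from the queue but `distinct_ids` is lower, the missing documents were probably overwritten and not dropped.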

--
Adrien Grand



(Onur BARAN) #3

There is no error in the logs or in the index result. I tried indexing dummy
documents (100,000 docs). When the indexing process finished, ES had indexed
only ~70,000 docs.

On Monday, September 23, 2013 at 19:18:59 UTC+3, Onur BARAN wrote:

When the queue worker runs, not all of our hourly data gets indexed into ES.
Usually about 40% of the data is indexed and the rest is lost.

So what would you suggest for indexing into ES under high traffic?



(Jörg Prante) #4

You should check whether the doc IDs overlap. If so, you'll accidentally
overwrite previous docs.
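
A quick way to run that check before indexing is to count how often each ID occurs in a batch. This is a minimal sketch, assuming each document is a dict that carries its intended ID under an `id` key. `find_overlapping_ids`, the field names, and the sample documents are hypothetical and only illustrate the idea.

```python
from collections import Counter


def find_overlapping_ids(docs, id_field="id"):
    """Return {doc_id: count} for every id used by more than one document."""
    counts = Counter(doc[id_field] for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical hourly metrics whose ids are built from the hour only,
# so two services reporting in the same hour collide.
docs = [
    {"id": "2013-09-23T18", "service": "api", "value": 10},
    {"id": "2013-09-23T18", "service": "web", "value": 7},
    {"id": "2013-09-23T19", "service": "api", "value": 12},
]
overlaps = find_overlapping_ids(docs)
# Each extra use of an id replaces a document instead of adding one.
expected_doc_count = len(docs) - sum(n - 1 for n in overlaps.values())
print(overlaps)
print(expected_doc_count)
```

If the IDs are meant to be unique, `expected_doc_count` should equal the batch size. Indexing without an explicit ID, so that Elasticsearch generates one, is another way to avoid collisions.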

Jörg


