How to speed up indexing by using Python API


(潘飞) #1

Hi all,

I am trying to index my logs using the Elasticsearch Python API,
but I only get an indexing speed of about 600 records/s.

On the same ES cluster, with the same data, Logstash (redis -> logstash
-> elasticsearch) indexes at about 3000 records/s.

Any advice on how to speed up indexing with the Python API?

Thanks very much.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1b957d36-b7ad-4671-999a-de06aaa74407%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Honza Král) #2

Hi,

What method are you using in your Python script? Have you looked at
the bulk and streaming_bulk helpers in elasticsearch-py?

http://elasticsearch-py.readthedocs.org/en/master/helpers.html
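
For illustration, a minimal sketch of how the bulk helper could be used for log lines. The index name "logs", the "message" field, the localhost URL and the chunk size of 500 are all assumptions, not details from this thread, and the exact action fields depend on the client and cluster version (clusters from the time of this thread also expected a _type in each action).

```python
def log_actions(lines, index_name="logs"):
    """Yield one bulk action per log line.

    index_name and the "message" field are placeholders, not from the thread.
    """
    for line in lines:
        yield {
            "_index": index_name,
            "_source": {"message": line.rstrip("\n")},
        }


def index_logs(lines, hosts=("http://localhost:9200",), chunk_size=500):
    """Send the log lines to Elasticsearch in batches of chunk_size."""
    # Imported lazily so log_actions can be tried without the client installed.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(list(hosts))
    # helpers.bulk consumes the generator and issues one _bulk request per chunk.
    # It returns (number_of_successes, list_of_errors).
    return helpers.bulk(es, log_actions(lines), chunk_size=chunk_size)
```

Calling index_logs(open("app.log")) would then stream the file through the helper without loading it into memory; "app.log" is a hypothetical file name. streaming_bulk accepts the same kind of action iterable and yields a per-document result instead of a summary.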

Hope this helps,
Honza

(Sent in reply to 潘飞's message of Thu, May 22, 2014 at 11:09 AM.)


(emeschitc) #3

Hi,

Use bulk indexing!
This will speed you up by at least an order of magnitude.
em
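
To show where the gain comes from, here is a rough sketch of what a bulk request body looks like and how batching cuts the number of HTTP round trips. It builds the newline-delimited body by hand rather than through the client; the helper names chunked and bulk_body, the index name "logs" and the batch size of 500 are made up for the example.

```python
import json


def chunked(docs, size):
    """Group an iterable of documents into lists of at most `size` items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(batch, index_name="logs"):
    """Build a _bulk request body: an action line, then the document, per item."""
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [{"message": "line %d" % i} for i in range(1200)]
bodies = [bulk_body(batch) for batch in chunked(docs, 500)]
print(len(docs), "documents ->", len(bodies), "bulk requests")
```

Indexing the same 1200 documents one at a time would take 1200 requests instead of 3, which is where most of the per-record overhead goes.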


