Some elasticsearch thrift performance tests - python


(Alberto Paro-2) #1

I'm writing an elasticsearch "driver" in python.
I started the project after looking at pyelasticsearch, and I created a new project
because I didn't like the previous API and wanted to experiment with a new approach.

The project is in a very alpha state, but I have implemented a connection first using the standard library via HTTP,
then using the Thrift interface.

This is a small dump of results.

(pyes)MBPAlbertoParo:tests alberto$ python generate_dataset.py 10000
samples.shelve generated with 10000 samples

Urllib

(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:08.652321
(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:08.282428
(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:08.889818
(pyes)MBPAlbertoParo:tests alberto$

Thrift

(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:04.448639
(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:04.812295
(pyes)MBPAlbertoParo:tests alberto$ python performance.py
time: 0:00:04.301892
(pyes)MBPAlbertoParo:tests alberto$
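For context, a minimal sketch of what a timing harness like the `performance.py` above could look like. Everything here is an assumption: the client object, its `index()` signature, and the shelve layout are hypothetical stand-ins, not the actual pyes code.

```python
import shelve
from datetime import datetime

def run_benchmark(client, path="samples.shelve"):
    """Index every sample in the shelve file and print the elapsed time.

    The client and its index() signature are hypothetical; the real
    pyes API may differ.
    """
    samples = shelve.open(path, flag="r")
    start = datetime.now()
    for key, doc in samples.items():
        client.index(doc, "test-index", "test-type", key)  # hypothetical call
    elapsed = datetime.now() - start
    samples.close()
    print("time:", elapsed)
    return elapsed
```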

This is a first step. I'm investigating using multiprocessing and a producer/consumer pattern to increase parallelism/throughput.
I'm open to suggestions and hints.
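One way the producer/consumer idea could be sketched with the standard library's `multiprocessing` module. This is only an illustration under assumptions: the indexing call is a commented placeholder, and `run_parallel`/`worker` are names invented here, not part of pyes.

```python
import multiprocessing

def worker(queue, counter):
    """Consumer: process (key, doc) pairs until a None sentinel arrives."""
    while True:
        item = queue.get()
        if item is None:  # sentinel: no more work
            break
        key, doc = item
        # A real worker would call something like:
        # client.index(doc, "test-index", "test-type", key)
        with counter.get_lock():
            counter.value += 1  # shared tally of processed docs

def run_parallel(samples, num_workers=4):
    """Producer: feed all samples to a pool of consumer processes."""
    queue = multiprocessing.Queue()
    counter = multiprocessing.Value("i", 0)
    workers = [multiprocessing.Process(target=worker, args=(queue, counter))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in samples:
        queue.put(item)
    for _ in workers:
        queue.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return counter.value
```

Whether this beats a single process depends on where the time goes; if the bottleneck is network round-trips, a pool of workers sharing one queue should help.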

Ciao,
Alberto


(Shay Banon) #2

Looks good, Alberto! I see you already have it on github
(http://github.com/aparo/pyes); I added it to the projects page on the site.

p.s. Love the name (pyes)!

On Sat, Sep 11, 2010 at 12:14 AM, Alberto Paro alberto.paro@gmail.com wrote:



(system) #3