ElasticSearch as a blazing fast primary key ticket server


(Clinton Gormley) #1

Hi all

I thought some of you may be interested in a blog post I've just
published about how to use ElasticSearch as a primary key ticket server
(think auto-increment in MySQL).

http://blogs.perl.org/users/clinton_gormley/2011/10/elasticsearchsequence---a-blazing-fast-ticket-server.html

Using ES as a backend, I can get almost double the performance that I
get out of MySQL
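
For anyone who wants the gist without reading the post, here is a rough Python sketch (the module in the blog post is Perl, so this is not its actual code). It assumes the ticket is taken from the `_version` that ES reports each time the same document is re-indexed, and that a block of IDs is reserved with one bulk request. The index, type and sequence names are made up for illustration, and nothing is sent over HTTP here: the sketch only builds the bulk body and reads IDs out of a response shaped like a bulk reply.

```python
import json


def build_bulk_body(index, doc_type, seq_name, count):
    """Build a newline-delimited bulk body that re-indexes one empty doc `count` times."""
    # Hypothetical names: `index`, `doc_type` and `seq_name` are whatever you choose.
    action = json.dumps({"index": {"_index": index, "_type": doc_type, "_id": seq_name}})
    source = json.dumps({})  # the document body is irrelevant, only its version matters
    lines = []
    for _ in range(count):
        lines.append(action)
        lines.append(source)
    # The bulk format expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


def extract_ids(bulk_response):
    """Use the version reported for each index action as the next ticket."""
    return [item["index"]["_version"] for item in bulk_response["items"]]


# A hand-written response in the assumed shape, standing in for a real reply.
sample_response = {
    "items": [
        {"index": {"_id": "invoice_id", "_version": 41}},
        {"index": {"_id": "invoice_id", "_version": 42}},
        {"index": {"_id": "invoice_id", "_version": 43}},
    ]
}
reserved = extract_ids(sample_response)
```

The body from `build_bulk_body` would be POSTed to the `_bulk` endpoint; reserving IDs in blocks like this is what keeps the number of round trips down.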

clint


(David Pilato) #2

Thanks Clint. It's a nice post and shows that we can use ES to do many things other than search!

You write that you have to use bulk requests to be fast. I wonder whether you could also group your MySQL calls with a single commit at the end and, if so, whether MySQL is still "slow"?

BTW, I strongly believe that in the coming years ES will be used as a primary database, and not only for indexing and search.

David :wink:



(Clinton Gormley) #3

Hi David

> Thanks Clint. It's a nice post and shows that we can use ES to do
> many things other than search!

Thanks :slight_smile:

> You write that you have to use bulk requests to be fast. I wonder
> whether you could also group your MySQL calls with a single commit at
> the end and, if so, whether MySQL is still "slow"?

According to the MySQL docs:

    If you insert multiple rows using a single INSERT statement,
    LAST_INSERT_ID() returns the value generated for the first
    inserted row only. The reason for this is to make it possible to
    reproduce easily the same INSERT statement against some other
    server. 

So it looks like the answer is no: a multi-row INSERT only reports the
first generated ID.

But my point wasn't to show that MySQL was slow, but to show that ES can
keep up with an already fast solution, and in fact do better.

> BTW, I strongly believe that in the coming years ES will be used as a
> primary database, and not only for indexing and search.

Yeah, the only thing in my way now is the 1-second delay before a newly
indexed doc becomes searchable, i.e. something I have to code around
rather than having it "just work".

clint



(Karussell) #4

> Yeah, the only thing in my way now is the 1-second delay before a newly
> indexed doc becomes searchable, i.e. something I have to code around
> rather than having it "just work".

Couldn't you use the realtime get feature instead of search for
DB-like commands?

Regards,
Peter.

BTW: I blogged about ES as the new kid on the NoSQL horizon some
time ago :wink:


(Clinton Gormley) #5

Hi Peter

On Sun, 2011-10-23 at 14:26 -0700, Karussell wrote:

>> Yeah, the only thing in my way now is the 1-second delay before a newly
>> indexed doc becomes searchable, i.e. something I have to code around
>> rather than having it "just work".

> Couldn't you use the realtime get feature instead of search for
> DB-like commands?

You can't use get when you don't know what you're getting. A simple
example is: the user adds a new comment, you return the page, showing
all comments (which should include the new comment left by the user).

A simple query won't include your newly added comment (or might,
depending on whether the index was refreshed automatically or not).
Instead you need to include the newly added comment in the list
manually.
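
To make that workaround concrete, here is a minimal Python sketch, assuming each comment is a dict with hypothetical `id` and `created` fields and that `search_hits` is whatever the (possibly stale) query returned. It is only an illustration of the idea, not code from any real application.

```python
def comments_for_page(search_hits, just_added=None):
    """Return the comments to render, making sure a just-added one is present."""
    comments = list(search_hits)
    # The index may or may not have refreshed yet, so only append the new
    # comment if the query did not already return it (avoids duplicates).
    if just_added is not None and all(c["id"] != just_added["id"] for c in comments):
        comments.append(just_added)
    # Keep the display order stable regardless of where the comment came from.
    comments.sort(key=lambda c: c["created"])
    return comments


# Stale search result (new comment not yet visible) plus the comment just saved.
stale_hits = [{"id": 1, "created": 100}, {"id": 2, "created": 200}]
new_comment = {"id": 3, "created": 300}
page = comments_for_page(stale_hits, new_comment)
```

The duplicate check is the part that matters: the same code has to behave whether or not the refresh happened between indexing and searching.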

I haven't yet thought through all the consequences of this 1 second
delay - they may all be as simple as the example I give above, but maybe
not.

In my current framework, I use memcached in front of the database, so
there is an added complication of cache expiry of collections when
adding new objects.
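
Roughly, the complication looks like this. It is a sketch only: plain dicts stand in for both memcached and the database, and the class, method and key names are made up for illustration.

```python
class CachedComments:
    """Cache-aside lookup of a comment collection, expired whenever it changes."""

    def __init__(self, store):
        self.store = store   # stand-in for the database: page_id -> list of comments
        self.cache = {}      # stand-in for memcached
        self.misses = 0      # counts how often we had to go to the datastore

    def get_comments(self, page_id):
        key = "comments:%s" % page_id
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = list(self.store.get(page_id, []))
        return self.cache[key]

    def add(self, page_id, comment):
        self.store.setdefault(page_id, []).append(comment)
        # Adding an object means every cached collection containing it must
        # be expired, otherwise readers keep seeing the old list.
        self.cache.pop("comments:%s" % page_id, None)


comments = CachedComments({"p1": ["first"]})
before = comments.get_comments("p1")
comments.add("p1", "second")
after = comments.get_comments("p1")
```

With one collection per page this is easy; the bookkeeping grows once the same object appears in several cached collections.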

Some queries are frequently used, and faster to return from memcached
than from the DB. As fast as ES is, the same may be true when using just
ES as your datastore.

Perhaps not. I very much doubt that doing a query in ES can be as fast
as getting cached results from memcached. But maybe it is fast enough
(and scalable enough) to allow me to remove memcached from the
infrastructure.

If so, that would really simplify the architecture: no RDBMS, no
memcached, no cache expiry.

But my gut instinct says that there is always likely to be a role for
caching.

I'm definitely going down the route of using ES as my only data store
for my next application, so I'll report back my experiences when I've
collected them :slight_smile:

clint

