Node replication

Vinicius_Carvalho · August 27, 2012, 2:24am

Hello there! We are using ES to replace our search (Endeca) stack, and we
liked it so much that we decided to use it for other things too. We
have a requirement for a token service, and instead of using MongoDB or
memcached we decided to use ES: it has the JSON document model of MongoDB
and the TTL that a cache gives you. Besides, 90% of the queries will be
simple GETs on an index (/tokens/token/{id}); a few will be queries on two
indexed fields.

Our search is not realtime, so we do not care about consistency across the
nodes within a window of a few minutes, but the token service has to be
consistent.

So, my question is: if I have a cluster of 3 nodes with replication (no
sharding, since the index should be small, < 200k elements per day, and we
will use a 1d TTL), let's say the token server adds a token on node A, and a
few seconds later a client queries node B for that token. If node B does not
find it (using GET), will it check the other nodes before failing? Is there
any way to control this?

I know you can control the search type, but what about the GET operation?

Regards

--

dadoonet · August 27, 2012, 6:06am

It won't fail. If your document exists somewhere in the cluster, you will get it.

You don't have to worry about the node you query on.
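
For reference, a minimal sketch of what this looks like from the client side. The node that receives a GET forwards it to a node holding a copy of the shard that the id maps to, so the path is the same whichever node you call. The path is the one from the question; the host name node-b and the helper token_get_url are hypothetical. The preference and realtime query parameters are, as far as I know, options of the GET API in Elasticsearch releases of this period, so treat them as an assumption and check the documentation for your version.

```python
from urllib.parse import quote, urlencode


def token_get_url(base_url, token_id, preference=None, realtime=None):
    """Build the GET URL for /tokens/token/{id} on any node in the cluster.

    `preference` and `realtime` are optional GET parameters (assumed to be
    supported by the Elasticsearch version in use).
    """
    url = "{}/tokens/token/{}".format(base_url.rstrip("/"), quote(token_id, safe=""))
    params = {}
    if preference is not None:
        params["preference"] = preference
    if realtime is not None:
        params["realtime"] = "true" if realtime else "false"
    if params:
        # Sort the parameters so the query string is stable.
        url += "?" + urlencode(sorted(params.items()))
    return url


# The same path works on node A and node B (host name is hypothetical).
print(token_get_url("http://node-b:9200", "abc123"))
# Optionally ask for the primary copy of the shard (assumed parameter value).
print(token_get_url("http://node-b:9200", "abc123", preference="_primary"))
```

With no extra parameters the helper just returns the plain /tokens/token/{id} URL, which is all the common case should need.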

HTH

David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--

Clinton_Gormley · August 27, 2012, 8:23am

Hi Vinicius

On Mon, 2012-08-27 at 08:06 +0200, David Pilato wrote:

It won't fail. If your document exists somewhere in the cluster, you
will get it.

You don't have to worry about the node you query on.

To add to the above, take a look at one of my Perl modules:
https://metacpan.org/module/ElasticSearchX::UniqueKey

which has a lot in common with what you want to do.

Specifically, have a look at how I create the index:
https://metacpan.org/source/DRTECH/ElasticSearchX-UniqueKey-0.03/lib/ElasticSearchX/UniqueKey.pm#L164

I'm not using TTL in this case, as my unique keys are supposed to
endure, but disabling the _all and _source fields will work well with what
you want to achieve.
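
As a rough sketch of such an index definition for the token case (this is not the code from the module linked above): the helper token_index_body and its keep_source flag are hypothetical, and the _all, _ttl and _source mapping entries are assumed to be available in the Elasticsearch version in use, since later releases dropped some of them. One shard with two replicas is an assumption that matches the 3-node cluster in the question, so that every node holds a copy.

```python
import json


def token_index_body(ttl="1d", replicas=2, keep_source=True):
    """Return a create-index body for a small, fully replicated token index."""
    mapping = {
        # No catch-all field: tokens are fetched by id or by specific fields.
        "_all": {"enabled": False},
        # Expire documents automatically (assumed _ttl mapping syntax).
        "_ttl": {"enabled": True, "default": ttl},
    }
    if not keep_source:
        # Without _source a GET can confirm the id exists but cannot
        # return the original JSON body.
        mapping["_source"] = {"enabled": False}
    return {
        "settings": {"number_of_shards": 1, "number_of_replicas": replicas},
        "mappings": {"token": mapping},
    }


print(json.dumps(token_index_body(), indent=2, sort_keys=True))
```

The sketch keeps _source by default because a token service usually wants the JSON back from a GET; pass keep_source=False if only the existence of the id matters, as with unique keys.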

clint

--