Node replication


(Vinicius Carvalho) #1

Hello there! We are using ES to replace our search (Endeca) stack, and we
loved it so much that we decided to use it for other things too. We have a
requirement for a token service, and instead of using MongoDB or memcached
we decided to use ES: it has the JSON documents of Mongo and the TTL that a
cache gives you. Besides, 90% of the queries will be simple GETs on an
index, /tokens/token/{id}; a few will be queries on two indexed fields.

Our search is not realtime, so we do not care about consistency across the
nodes within a window of a few minutes, but the token service has to be
consistent.

So, my question is this. Say I have a cluster of 3 nodes with replication
(no sharding; the index should be small, < 200k elements per day, and we
will use a 1d TTL). The token server adds a token via node A, and a few
seconds later a client queries node B for that token. If node B does not
find it (using GET), will it check the other nodes before failing? Is there
any way to control this?

I know you can control the search type, but what about the GET operation?

Regards

--


(David Pilato) #2

It won't fail. If your document exists somewhere in the cluster, you will get it.

You don't have to worry about the node you query on.
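As a rough illustration of the point above, and of the "is there any way to control this?" part of the question, here is a minimal Python sketch that only builds the GET-by-id URL. The host name `node-b` and the helper `token_get_url` are hypothetical. The `preference` and `realtime` query parameters are assumed to be available on the GET API, as they were in Elasticsearch releases of that era; check the documentation for the version you run, since later releases changed or removed some values.

```python
from urllib.parse import quote, urlencode

# Hypothetical node address: any node can coordinate the request and
# forward it to a shard copy that holds the document.
NODE_B = "http://node-b:9200"


def token_get_url(base_url, token_id, preference=None, realtime=None):
    """Build the URL for a GET-by-id on /tokens/token/{id}.

    preference: e.g. "_primary" or "_local" (assumed to be supported by
                the Elasticsearch version in use).
    realtime:   True/False to set the GET API's realtime flag explicitly;
                None leaves the server default in place.
    """
    params = {}
    if preference is not None:
        params["preference"] = preference
    if realtime is not None:
        params["realtime"] = "true" if realtime else "false"
    url = f"{base_url}/tokens/token/{quote(token_id, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    # Plain GET through node B: no need to know which node indexed the token.
    print(token_get_url(NODE_B, "abc123"))
    # Ask for the primary copy of the shard only.
    print(token_get_url(NODE_B, "abc123", preference="_primary"))
```

The resulting URL can be fetched with any HTTP client. The sketch stops at building the URL so that it runs without a cluster.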

HTH

David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--


(Clinton Gormley) #3

Hi Vinicius

On Mon, 2012-08-27 at 08:06 +0200, David Pilato wrote:

It won't fail. If your document exists somewhere in the cluster, you
will get it.

You don't have to worry about the node you query on.

To add to the above, take a look at one of my Perl modules:
https://metacpan.org/module/ElasticSearchX::UniqueKey

which has a lot in common with what you want to do.

Specifically, have a look at how I create the index:
https://metacpan.org/source/DRTECH/ElasticSearchX-UniqueKey-0.03/lib/ElasticSearchX/UniqueKey.pm#L164

I'm not using TTL in this case, as my unique keys are supposed to
endure, but disabling the _all and _source fields will work well for
what you want to achieve.
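A minimal sketch of what such an index definition could look like for the token use case, written in Python as a plain request body. This is not the code from the Perl module linked above. The helper name `token_index_body`, the shard and replica counts (one shard with a copy on each of the 3 nodes), and the `_ttl` mapping with a `1d` default are assumptions based on the question, using the mapping syntax of Elasticsearch versions from that period.

```python
import json


def token_index_body(ttl="1d"):
    """Return a create-index body for a small, fully replicated token index."""
    return {
        "settings": {
            # Assumption: 1 primary shard plus 2 replicas, so each of the
            # 3 nodes holds a full copy of the index.
            "number_of_shards": 1,
            "number_of_replicas": 2,
        },
        "mappings": {
            "token": {
                # No catch-all field: tokens are fetched by id or by
                # specific fields, not by free-text search.
                "_all": {"enabled": False},
                # Note: with _source disabled, a GET returns no document
                # body; keep it enabled if the token service needs the
                # JSON back.
                "_source": {"enabled": False},
                # Expire documents automatically after the default TTL.
                "_ttl": {"enabled": True, "default": ttl},
            }
        },
    }


if __name__ == "__main__":
    # Body to send when creating the "tokens" index.
    print(json.dumps(token_index_body(), indent=2))
```

Whether to disable `_source` depends on what the token service reads back: it saves space, but the service then has to rely on the id alone or on individually stored fields.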

clint

--

