Read\Write paths


(Keidi) #1

Hi,

I'm new to ElasticSearch but from what I have seen so far I couldn't
figure out whether it can support a read\write path architecture.

Thanks!


(Lukáš Vlček) #2

Hi,

If you mean search or get for the read path, then you might want to check the
GET REST API:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/get/
or the search REST API:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/search/uri_request/
There is also a more advanced search REST API:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/search/body_request/

If you mean indexing documents for the write path, then you can check the
index REST API:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/index/
or bulk updates:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/bulk/
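
To make the read and write calls concrete, here is a minimal Python sketch that only builds the HTTP requests (nothing is sent). The `localhost:9200` address, the `twitter`/`tweet` names and the helper function names are assumptions for illustration. The `/{index}/{type}/{id}` layout follows the API of that era; newer Elasticsearch versions use `_doc` in place of the type.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: a single local node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def index_request(index: str, doc_type: str, doc_id: str, doc: dict) -> Request:
    """Write path: PUT a JSON document at /{index}/{type}/{id}."""
    url = f"{BASE_URL}/{quote(index)}/{quote(doc_type)}/{quote(doc_id)}"
    body = json.dumps(doc).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def get_request(index: str, doc_type: str, doc_id: str) -> Request:
    """Read path: GET a single document by its id."""
    url = f"{BASE_URL}/{quote(index)}/{quote(doc_type)}/{quote(doc_id)}"
    return Request(url, method="GET")


def search_request(index: str, query: str) -> Request:
    """Read path: URI search, with the query passed in the `q` parameter."""
    url = f"{BASE_URL}/{quote(index)}/_search?{urlencode({'q': query})}"
    return Request(url, method="GET")
```

Passing any of these `Request` objects to `urllib.request.urlopen` would send it to a running node.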

But there is much more to this... what exactly are you after?

Regards,
Lukas


(Keidi) #3

Hi Lukas,

I believe you didn't follow my meaning. When I say read/write paths I mean
separating nodes that act as write-only nodes (write path) from nodes
that are read-only (read path). The idea is to take as much load as possible
off the read path for better query performance. In other words, we don't want
heavy indexing requests to take resources away from query requests.

This of course requires some mechanism that efficiently and regularly
replicates data from the write path to the read path. I'm guessing this is
not implemented by ElasticSearch, but we could implement it ourselves (for
example, by using different indexes for the different paths and migrating
data in bulk).

The question is: does this kind of approach go against everything
ElasticSearch stands for? I'm not sure, but I did get the feeling that the
whole idea is to solve performance problems not by separating
read\write paths but by scaling out (i.e. adding more machines).
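
The "different indexes, migrate in bulk" idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not something ElasticSearch provides: the `read_index` name and both helper functions are hypothetical, and the code only builds the newline-delimited body that the bulk API accepts (one action line followed by one source line per document). Older versions also expect a `_type` in the action line.

```python
import json
from typing import Iterable, Iterator, List, Tuple

Doc = Tuple[str, dict]  # (document id, document source)


def batches(items: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Group documents into lists of at most `size` items."""
    batch: List[Doc] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def bulk_body(target_index: str, docs: Iterable[Doc]) -> str:
    """Build a bulk request body that indexes `docs` into `target_index`."""
    lines: List[str] = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": target_index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

A migration job would read changed documents from the write-path index, pass them through `batches`, and POST each `bulk_body(...)` result to the `_bulk` endpoint of the read-path cluster.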

Thanks,
Shahar

View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Read-Write-paths-tp1694631p1700006.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


(Shay Banon) #6

Hi,

Yea, this is not how elasticsearch works. Basically, an index operation is
replicated to all the shard replicas and executed there as well. More
replicas will give you better search scalability, but they will also do
indexing. I think what you mean is something like periodically pulling the
changed Lucene index to "read only" replicas so they can be searched on.
That is certainly something that can be done, but it goes against the current
model of (near) real time, atomicity, and others.
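
The point that replicas add search capacity but also repeat every indexing operation can be illustrated with a toy model. This is not elasticsearch code: the class names are made up, and spreading searches round-robin over the copies is a simplifying assumption.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, Optional


@dataclass
class ShardCopy:
    """One copy of a shard (the primary or a replica) with simple op counters."""
    name: str
    docs: Dict[str, dict] = field(default_factory=dict)
    index_ops: int = 0
    search_ops: int = 0


class ReplicatedShard:
    """Toy shard: writes go to every copy, each search goes to a single copy."""

    def __init__(self, replicas: int) -> None:
        self.copies = [ShardCopy("primary")] + [
            ShardCopy(f"replica-{i}") for i in range(1, replicas + 1)
        ]
        self._next_copy = cycle(self.copies)  # assumption: round-robin reads

    def index(self, doc_id: str, doc: dict) -> None:
        # The index operation is executed on the primary and on each replica.
        for copy in self.copies:
            copy.docs[doc_id] = doc
            copy.index_ops += 1

    def search(self, doc_id: str) -> Optional[dict]:
        # Only one copy does the work for any given search.
        copy = next(self._next_copy)
        copy.search_ops += 1
        return copy.docs.get(doc_id)
```

With two replicas, ten writes cost every copy ten index operations, while nine searches cost each copy only three: adding replicas divides the search load but not the indexing load.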

But that pull model incurs its own overhead and has downsides, both in terms
of freshness (how often do you commit Lucene in order to get those
changes in) and in the periodic load on nodes from moving data around (which
can be substantial).
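
Both costs can be put into rough numbers. The following is back-of-the-envelope arithmetic only; the function names and parameters are hypothetical, and the model assumes a fixed sync interval with a constant copy time and a constant amount of changed data per sync.

```python
def worst_case_staleness(sync_interval_s: float, transfer_s: float) -> float:
    """Seconds before a write becomes searchable on the read path, worst case.

    A document written just after a sync starts waits one full interval
    for the next sync, plus the time that sync takes to copy the data.
    """
    return sync_interval_s + transfer_s


def bytes_moved_per_hour(changed_bytes_per_sync: float, sync_interval_s: float) -> float:
    """Data copied to each read-only node per hour at the given sync interval."""
    syncs_per_hour = 3600 / sync_interval_s
    return changed_bytes_per_sync * syncs_per_hour
```

A shorter interval improves freshness but raises the number of transfers per hour, which is the trade-off described above.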

Is there a chance that this is premature optimization? Have you
actually seen problems?

-shay.banon


(Keidi) #7

Hi Shay,

We haven't seen real performance problems. To tell the truth, we still
have not been able to overload ElasticSearch (and we were using big
guns :)). But we are still only experimenting.
The reason I raised this issue is that using read\write paths was one
of the architectural options for our project. I just wanted to confirm
that this kind of architecture goes against ElasticSearch's design.
If we choose ES, I'm positive we will not be using read\write
paths.

Thanks for the answer,
Shahar

