Get document by Key faster


(Fernandojose Boiton Tello) #1

Hello! At work we are starting to use ElasticSearch. We have a
.NET web application that indexes a document. Using MSMQ, we notify our
service bus of an event, and when processing that event the service bus
calls ES to get the previously indexed document. We are having some
problems because ES does not have the document available yet.

Doing some research, we found the "refresh" option, but it is described as
something that will impact performance. We also saw the "flush" operation,
which occurs every 5000 documents. Are we lost on this?
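
For reference, a minimal sketch of the two forms the "refresh" option usually takes over the REST API: a `refresh=true` parameter on the index request itself, and an explicit call to the index's `_refresh` endpoint. The host, port, and the `orders`/`order` names are assumptions made up for illustration, and nothing here is sent over the network; the helpers only build the requests.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default HTTP port


def index_with_refresh_url(index, doc_type, doc_id, base_url=BASE_URL):
    """URL for indexing one document and refreshing right after the write."""
    query = urllib.parse.urlencode({"refresh": "true"})
    return f"{base_url}/{index}/{doc_type}/{doc_id}?{query}"


def refresh_index_request(index, base_url=BASE_URL):
    """Prepare (but do not send) an explicit refresh of a whole index."""
    return urllib.request.Request(
        f"{base_url}/{index}/_refresh", data=b"", method="POST"
    )
```

Refreshing on every write is the variant that tends to cost the most, which matches the performance warning mentioned above.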

What configuration do you recommend? Any particular advice on how to
set shards/replicas for faster document availability?
Does action.write_consistency make any difference here?

We are using NEST as the .NET client, and elasticsearch 0.16.0 is running as
a service on two Windows Server 2008 machines, installed with
elasticsearch-setup (https://github.com/rgl/elasticsearch-setup/downloads).

Thank y'all in advance!


(Shay Banon) #2

Use a later version of elasticsearch (the latest 0.18 is best). It has a feature called real-time get, so you don't need to refresh in order to get the latest indexed document (get by id).
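
A minimal sketch of the index-then-get-by-id flow this relies on, written against the REST API rather than NEST. The host, port, index name `orders`, type name `order`, and the document fields are all assumptions chosen for illustration. The helpers only prepare requests; `send` is the single place that touches the network.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, default HTTP port


def doc_url(index, doc_type, doc_id, base_url=BASE_URL):
    """Build the document URL in the /{index}/{type}/{id} form."""
    return f"{base_url}/{index}/{doc_type}/{doc_id}"


def index_request(index, doc_type, doc_id, source):
    """Prepare a PUT that indexes `source` under a known id."""
    return urllib.request.Request(
        doc_url(index, doc_type, doc_id),
        data=json.dumps(source).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_request(index, doc_type, doc_id):
    """Prepare a get-by-id for the same document."""
    return urllib.request.Request(doc_url(index, doc_type, doc_id), method="GET")


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running, `send(index_request("orders", "order", "1", {"total": 42}))` followed by `send(get_request("orders", "order", "1"))` would exercise the path the service bus takes. On a version with real-time get, the second call should not need a refresh in between.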

(In reply to Fernandojose Boiton Tello, Monday, January 30, 2012 at 6:47 PM)


(Fernandojose Boiton Tello) #3

Thanks!

We have to interact with a previously installed ElasticSearch that is
running version 0.16. That's why we installed that version.

Are there any known issues when running different elasticsearch versions in
a cluster?

(In reply to Shay Banon, Mon, Jan 30, 2012 at 2:09 PM)

--
Fernandojosé Boiton
http://www.fboiton.com/


(Shay Banon) #4

You can't run nodes of different major versions in the same cluster. It would also make little sense, since real-time get would be available on only some of the nodes (if it worked at all).

(In reply to Fernandojose Boiton Tello, Monday, January 30, 2012 at 11:22 PM)


(Fernandojose Boiton Tello) #5

Thanks! :)

(In reply to Shay Banon, Tue, Jan 31, 2012 at 9:52 AM)

