Newbie question - distributed ranking


(solmyr72) #1

Hi,

Could anyone please point me to documentation on how elasticsearch
works "under the hood", especially how it ranks results when
collecting them from several nodes?
Could the ranking differ slightly from what we'd get on a single
centralized index? For example, would we have to give up Lucene's
"IDF" weighting (the boost for rare words)?

The background: I have an application that uses Lucene, and we
expect to need sharding because of a leap in the amount of data (moving
from the "startup / proof of concept" stage to an internationally
marketed, stable application).
I believe in understanding how things work before basing my
application on them...

Thanks :)


(ppearcy) #2

Hey,
This should explain things:
http://www.elasticsearch.org/guide/reference/api/search/search-type.html

There are 4 different search types that control how a search gets
distributed across shards; a couple of them handle distributed term
frequencies.
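
To make the IDF concern concrete, here is a rough Python sketch (not
elasticsearch code) of why per-shard and global term statistics can give
different scores. It assumes Lucene's classic TF-IDF formula,
idf = 1 + ln(numDocs / (docFreq + 1)); the two toy shards and all function
names are made up for illustration. As far as I understand the page linked
above, the "dfs_" search types (dfs_query_then_fetch, dfs_query_and_fetch)
add an initial round that gathers term frequencies from every shard so
scoring can use the global numbers, while the plain types score with each
shard's local numbers.

```python
import math
from collections import Counter

def idf(doc_freq, num_docs):
    # Assumed formula: Lucene's classic TF-IDF similarity.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def shard_stats(docs):
    # Return (number of docs, per-term document frequency) for one shard.
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))
    return len(docs), df

def local_idf(term, shard):
    # What a shard computes when it only sees its own documents.
    num_docs, df = shard_stats(shard)
    return idf(df[term], num_docs)

def global_idf(term, shards):
    # What you get once statistics from all shards are summed first.
    stats = [shard_stats(s) for s in shards]
    num_docs = sum(n for n, _ in stats)
    doc_freq = sum(df[term] for _, df in stats)
    return idf(doc_freq, num_docs)

# Hypothetical toy data: "rare" is uncommon on shard 0, common on shard 1.
shards = [
    ["lucene search", "lucene index", "lucene shard", "lucene rare"],
    ["rare words", "rare terms", "rare lucene", "rare idf"],
]

if __name__ == "__main__":
    for i, shard in enumerate(shards):
        print("shard", i, "local idf:", round(local_idf("rare", shard), 3))
    print("global idf:", round(global_idf("rare", shards), 3))
```

With data this skewed the same term gets a noticeably different weight on
each shard. When documents are spread evenly over the shards the local
numbers tend to sit close to the global ones, so the difference mostly
matters for small or unevenly distributed indices.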

Regards,
Paul

On Oct 3, 8:13 am, solmyr72 solmy...@yahoo.com wrote:


(system) #3