Need some insight on how distributed scoring works in elasticsearch


(ajasuja18) #1

Hey Everyone,

In elasticsearch, as we know, a search can be executed in several ways.
One choice is Query_and_fetch vs. Query_then_fetch.
In addition, either of them can be combined with a dfs phase.

I watched a video of Shay Banon explaining the dfs query, and he
mentioned that in the dfs case there is an additional phase in which
frequencies from all the shards are gathered.

*I am not sure what he means by frequencies. Is it just the distributed
term frequencies, or does it also take into account the distributed
inverse document frequencies?*

Thanks in advance.


(simonw-2) #2

On Monday, July 30, 2012 8:14:58 PM UTC+2, ajasuja wrote:

Hey Everyone,

In elasticsearch, as we know, a search can be executed in several ways.
One choice is Query_and_fetch vs. Query_then_fetch.
In addition, either of them can be combined with a dfs phase.

I watched a video of Shay Banon explaining the dfs query, and he
mentioned that in the dfs case there is an additional phase in which
frequencies from all the shards are gathered.

*I am not sure what he means by frequencies. Is it just the distributed
term frequencies, or does it also take into account the distributed
inverse document frequencies?*

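In the DFS phase, ES collects statistics for the query terms from all
shards so that it can calculate global, consistent document frequencies
(DF) and, in turn, inverse document frequencies (IDF). It does not
collect the term frequency (the number of occurrences of a term in a
document), since that is a per-document value that the shard holding the
document already has.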
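
To make the difference concrete, here is a minimal sketch of the idea, not
of how ES implements it. The toy shards, the helper names (shard_stats,
local_idf, global_idf) and the classic Lucene-style formula
1 + ln(N / (df + 1)) are assumptions chosen for illustration. It shows that
the same term gets a different IDF on each shard when only local statistics
are used, and a single consistent IDF once the document counts and document
frequencies are summed across shards.

```python
import math
from collections import Counter

# Hypothetical toy index: two shards, each a list of documents (lists of terms).
SHARDS = [
    [["quick", "fox"], ["quick", "dog"], ["quick", "cat"]],
    [["lazy", "fox"], ["lazy", "dog"], ["quick", "lazy"]],
]


def shard_stats(shard):
    """Return (document count, document frequency per term) for one shard."""
    df = Counter()
    for doc in shard:
        df.update(set(doc))  # a term counts once per document, however often it occurs
    return len(shard), df


def idf(doc_count, doc_freq):
    """Assumed classic Lucene-style IDF: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


def local_idf(term):
    """IDF of a term on each shard, using only that shard's own statistics."""
    return [idf(n, df[term]) for n, df in map(shard_stats, SHARDS)]


def global_idf(term):
    """IDF of a term from statistics summed over all shards (the DFS idea)."""
    stats = [shard_stats(shard) for shard in SHARDS]
    total_docs = sum(n for n, _ in stats)
    total_df = sum(df[term] for _, df in stats)
    return idf(total_docs, total_df)


if __name__ == "__main__":
    print("local IDF per shard:", local_idf("quick"))
    print("global IDF:", global_idf("quick"))
```

Term frequency never appears in the exchanged statistics here: it would only
be needed when scoring an individual document, which happens on the shard
that owns it.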

simon


