Getting position of document in search results


(Wojciech Durczyński) #1

Hello. I use Elasticsearch for search (Java API) and would like to
achieve the following functionality:
Given a document id and query criteria (query, filters and sort
specification), I'd like to get the position of the document with this id
in the search results for those criteria.
I also need the ids of the documents neighbouring it in these search
results.
I know I can query for the ids of all documents matching the criteria and
then find the document with the given id in this sequence (together with
its position and its neighbours). But that would be very slow if the
result set is very large.
How can I achieve this in an optimal way?


(Ridvan Gyundogan) #2

Hm,
this would be interesting to me too.

I am thinking of something like this, though I don't know whether it is possible:

  1. First query: add a filter on id=yourDocumentId to get the exact score
    of your document, scoreDoc.
  2. Second query: add a filter score<scoreDoc and take the first result, which
    is the document after.
  3. Third query: add a filter score>scoreDoc and take the last result, which
    is the document before.

I am a newbie here, but I think this should be doable.
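
The three steps above can be sketched as follows. This is only an in-memory illustration in Python, not Elasticsearch calls: `docs` stands in for the matching documents with their scores, and `rank_by_counting` is a hypothetical helper. It assumes results are sorted by descending score with ties broken by id, and that counting the documents that sort before the target can be done cheaply on the server side, which is exactly the part I am not sure about.

```python
from typing import List, Optional, Tuple


def rank_by_counting(
    docs: List[Tuple[str, float]], doc_id: str
) -> Tuple[int, Optional[str], Optional[str]]:
    """Return (0-based position, id before, id after) for doc_id.

    docs is an in-memory stand-in for the matching documents as
    (id, score) pairs; a real index would answer each step with a query.
    """
    scores = dict(docs)
    # Step 1: look up the score of the target document.
    # Sort key: higher score first, ties broken by id (an assumption).
    target_key = (-scores[doc_id], doc_id)
    keys = [(-score, other_id) for other_id, score in docs]

    # Step 3 in the list: everything that sorts before the target.
    before = [k for k in keys if k < target_key]
    # Step 2 in the list: everything that sorts after the target.
    after = [k for k in keys if k > target_key]

    position = len(before)  # number of documents ranked ahead
    prev_id = max(before)[1] if before else None  # last one before
    next_id = min(after)[1] if after else None  # first one after
    return position, prev_id, next_id
```

The position is simply the count of documents that sort ahead of the target, so nothing has to be paged through as long as that count is available.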

Best


(Clinton Gormley) #3

I think about something like, I don't know whether it is possible:

  1. First query - add id=yourDocumentId with a filter to get the exact
    score of your document, scoreDoc.
  2. Second query, add filter score<scoreDoc and get the first result
    for the document after.
  3. Third query, add filter score>scoreDoc and get the last result for
    the document before.

Can you explain your use case?

clint


(Ridvan Gyundogan) #4

I don't have a use case at the moment, but I could imagine the following
situation:
I sell products from several suppliers on my website.
Customers search the site for some words, for example "best product
ever".

In the admin panel for "supplier1" I want to show this information: your
"product1" appears at position 137 for the user's search "best product
ever".


(Clinton Gormley) #5

On Sat, 2011-04-16 at 22:15 +0300, Ridvan Gyundogan wrote:

I don't have use case at the moment but I could have the following
situation:
I sell products from several suppliers on my website.
Direct clients search for some words on the site. For example "best
product ever".

On the admin panel for the "supplier1" I want to show information:
your "product1" appears on position 137 for the user's search "best
product ever".

Well, then I'd suggest pulling back enough results until you find the
record you are after.
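
A minimal sketch of that approach in Python, under stated assumptions: `fetch_page` is a hypothetical callable wrapping your own search request (offset and page size in, ordered list of document ids out), and `max_results` is an arbitrary cap so the loop never pages too deep.

```python
from typing import Callable, List, Optional


def find_position(
    fetch_page: Callable[[int, int], List[str]],
    doc_id: str,
    page_size: int = 100,
    max_results: int = 10_000,
) -> Optional[int]:
    """Page through results until doc_id shows up.

    Returns the 0-based position, or None if the document is not found
    within the first max_results hits.
    """
    offset = 0
    while offset < max_results:
        ids = fetch_page(offset, page_size)
        if doc_id in ids:
            return offset + ids.index(doc_id)
        if len(ids) < page_size:
            # Short page: the result set is exhausted.
            return None
        offset += page_size
    return None
```

This works well when the document is expected near the top of the results; the cost grows with the position.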

clint


(K.B.) #6

@clint: that sounds fine only for positions 1 to 200, but what would you
suggest when the position is, e.g., 2734 or even 1,500,237?

I have a similar problem: I need to find the position of a document in
a given query in order to get the documents next to it (the one before
and the one after). The only solution I have found so far is to save
the whole query and apply from(int) to it (Java API). However, this only
works if the query was executed before, so that the index position
could be saved. If the query was not executed before, I have no idea
what to do.
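
For the case where the position is already known, the from/size window can be sketched like this in Python. Both helper names are hypothetical, and positions are assumed to be 0-based; the saved query itself would be re-run with the returned offset and size.

```python
from typing import List, Optional, Tuple


def neighbour_window(position: int) -> Tuple[int, int]:
    """Return (from, size) covering the document and both neighbours."""
    start = max(position - 1, 0)
    # Three hits normally; two when the document is the very first hit.
    size = position - start + 2
    return start, size


def split_window(
    ids: List[str], position: int
) -> Tuple[Optional[str], Optional[str]]:
    """Pick (previous id, next id) out of the ids fetched for the window."""
    start = max(position - 1, 0)
    idx = position - start  # index of the document inside the window
    prev_id = ids[idx - 1] if idx > 0 else None
    next_id = ids[idx + 1] if idx + 1 < len(ids) else None
    return prev_id, next_id
```

For position 137 this asks for from=136 and size=3, so only three hits come back to the client, although the server still has to sort up to that offset.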

(Imagine a catalog page leading to a detail view that should allow
scrolling between items; one could open the detail view without going
through the catalog beforehand.)

Best


(Clinton Gormley) #7

On Mon, 2011-04-18 at 01:45 -0700, K.B. wrote:

@clint: sounds only good for pos 1 to 200 - but what would you suggest
to do when the position is e.g.: 2734 or even 1.500.237 ?

OK, so let's say you have 5 shards in an ES index, and you want to find
the doc at position 1,500,000.

Each shard has to order 1,500,000 of its local docs and return them to the
node handling the request, which then has to combine 7,500,000 results
into the right order, before discarding all but 10 of them.

This puts an enormous strain on your server.
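
The arithmetic can be sketched in Python as below. `deep_paging_cost` is a hypothetical helper, and it assumes each shard returns its top offset + size entries to the coordinating node, so the exact figures come out slightly above the rounded numbers used here.

```python
from typing import Dict


def deep_paging_cost(shards: int, offset: int, size: int = 10) -> Dict[str, int]:
    """Estimate how many entries are sorted and merged for one deep page."""
    per_shard = offset + size  # entries each shard must order and return
    merged = shards * per_shard  # entries the coordinating node combines
    return {
        "per_shard": per_shard,
        "merged": merged,
        "discarded": merged - size,  # everything except the requested page
    }


cost = deep_paging_cost(shards=5, offset=1_500_000, size=10)
print(cost)
```

With 5 shards and an offset of 1,500,000 the coordinating node merges a little over 7.5 million entries to return 10.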

Much better to keep track of these things in other ways.

clint


(David B.) #8

On Apr 16, 6:25 pm, Ridvan Gyundogan ridva...@gmail.com wrote:

Hm,
this would be interesting to me too.

I think about something like, I don't know whether it is possible:

  1. First query - add id=yourDocumentId with a filter to get the exact score
    of your document, scoreDoc.

I originally thought something like this might be possible using the
percolator but I see it is not - i.e., percolating documents against
an index just returns the list of percolators that match, and doesn't
give any indication as to the quality of the match.

A priori there doesn't seem to be any reason why this information
shouldn't come out with the percolator (for my use case, I would only
want rough score data - I wouldn't need an exact score that would be
guaranteed to stay the same as other documents were added to the
index). Is there any way this could be exposed or are percolator
queries strict pass/fail?


(dmitry.polushkin) #9

Hi David,

Have you found some solution?

Regards,
Dmitry

