Deep Paging in ES


(Brett Wooldridge) #1

I asked this question the other day but got no response, so possibly I
wasn't clear enough.

I'm trying to understand whether ES supports "Deep Paging" similar to the
new deep paging feature added to Lucene (3.5, 4.0) and Solr (4.0).

From the Solr docs:

The following list of parameters are supported by SearchHandler (http://wiki.apache.org/solr/SearchHandler)
...

pageDoc and pageScore

See https://issues.apache.org/jira/browse/SOLR-1726. If you expect to be
paging deeply into the results (say beyond page 10, assuming rows=10) and
you are sorting by score, you may wish to add the pageDoc and pageScore
parameters to your request. These two parameters tell Solr (and Lucene)
what the last result (Lucene internal docid and score) of the previous page
was, so that when scoring the query for the next set of pages, it can
ignore any results that occur higher than that item. To get the Lucene
internal doc id, you will need to add [docid] to the &fl list.

Example: q=*:*&start=10&pageDoc=5&pageScore=1.345&fl=[docid],score
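
To make the idea in the passage above concrete, here is a minimal Python sketch of "continue after the last hit of the previous page". It is not Solr or Lucene code: the names `Hit`, `sort_key` and `next_page` are hypothetical, and it assumes hits are ordered by descending score with ties broken by ascending docid. The point it shows is that the collector only ever needs to keep `rows` candidates, no matter how deep the page is.

```python
import heapq
from typing import Iterable, List, Optional, Tuple

Hit = Tuple[int, float]  # (docid, score); hypothetical stand-in for a scored hit


def sort_key(hit: Hit) -> Tuple[float, int]:
    """Rank by descending score, then ascending docid (assumed tie-break)."""
    docid, score = hit
    return (-score, docid)


def next_page(hits: Iterable[Hit], rows: int, after: Optional[Hit] = None) -> List[Hit]:
    """Return the next `rows` hits that rank strictly after `after`.

    `after` plays the role of pageDoc/pageScore: anything ranking at or
    above it is skipped, so only `rows` candidates are ever kept.
    """
    if after is not None:
        boundary = sort_key(after)
        hits = (h for h in hits if sort_key(h) > boundary)
    return heapq.nsmallest(rows, hits, key=sort_key)


hits = [(1, 2.0), (2, 1.5), (3, 1.5), (4, 0.5), (5, 3.0)]
page1 = next_page(hits, rows=2)
page2 = next_page(hits, rows=2, after=page1[-1])
```

Each call passes the last hit of the previous page as `after`, which is the same information the pageDoc and pageScore parameters carry.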

Specifically, this feature is enabled by LUCENE-2215 (https://issues.apache.org/jira/browse/LUCENE-2215) with a new method on TopDocs. Does ES support deep paging?

Brett


(Otis Gospodnetić) #2

Hi Brett,

That sounds like regular paging. Yes, ES supports it:
http://www.elasticsearch.org/guide/reference/api/search/from-size.html
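
For reference, a minimal Python sketch of what a from/size request body looks like. The helper names `page_request` and `hits_to_collect` are hypothetical, and the `match_all` query is only a placeholder. The second helper illustrates why this differs from the Lucene feature: with from/size, every shard still has to collect `from + size` hits before the requested page can be cut out.

```python
import json
from typing import Any, Dict


def page_request(query: Dict[str, Any], page: int, page_size: int = 10) -> Dict[str, Any]:
    """Build a search body for a zero-based page using from/size."""
    return {"from": page * page_size, "size": page_size, "query": query}


def hits_to_collect(body: Dict[str, Any]) -> int:
    """Number of top hits each shard must gather to serve this page."""
    return body["from"] + body["size"]


body = page_request({"match_all": {}}, page=3)
payload = json.dumps(body)  # what would be sent as the search request body
```

Here page 3 with 10 rows asks for hits 30 to 39, so each shard gathers 40 hits; the cost grows with the page depth.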

Otis

Search Analytics - http://sematext.com/search-analytics/index.html
Scalable Performance Monitoring - http://sematext.com/spm/index.html

On Wednesday, July 11, 2012 10:33:48 PM UTC-4, Brett Wooldridge wrote:

[...] Does ES support deep paging?


(hazzadous) #3

Although I assume that will not provide any scoring optimisations?
Depending on your use case, the scan search type and the constant_score
query may help.
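
A rough Python sketch of how those two pieces could be combined, assuming the `search_type=scan` and `scroll` URL parameters as they existed in Elasticsearch releases of that era (scan was removed in later versions). The index name `myindex`, the field `user`, its value and the helper `scan_request` are all hypothetical. The sketch only builds the request; it does not talk to a cluster.

```python
import json
from typing import Any, Dict, Tuple
from urllib.parse import urlencode


def scan_request(index: str, field: str, value: str, size: int = 50) -> Tuple[str, str]:
    """Return (path, body) for an unscored scan over documents matching a term."""
    params = urlencode({"search_type": "scan", "scroll": "1m"})
    body: Dict[str, Any] = {
        "size": size,
        # constant_score wraps a filter, so no per-document relevance is computed
        "query": {"constant_score": {"filter": {"term": {field: value}}}},
    }
    return f"/{index}/_search?{params}", json.dumps(body)


path, body = scan_request("myindex", "user", "brett")
```

This trades relevance ordering for cheap iteration over a large result set, so it only fits use cases that do not need score-sorted pages.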

On Thursday, July 12, 2012 3:43:28 AM UTC+1, Otis Gospodnetic wrote:

Hi Brett,

That sounds like regular paging. Yes, ES supports it:
http://www.elasticsearch.org/guide/reference/api/search/from-size.html

