Using the aggregation framework with a large set of doc IDs as the query? (+ bypassing the scoring part)


(nicolas) #1

Hi guys,

I use my own framework and already have the top N results from a previous
processing step.

I would like to use the ES aggregation framework (facets and related
features) on those results.

The documents are already indexed in ES.

What ES query should I use to skip the scoring process and run only the
aggregations, given the IDs of a set of documents as the query, knowing
that N could be large (N = 1K)?

JAVA API

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0b6b4e52-11fe-4914-b1bb-ed4b69421c08%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Radu Gheorghe) #2

Hello,

One way to do it would be to store all those IDs in an Elasticsearch
document. Then, you can use the terms filter with the terms lookup
mechanism to have ES fetch all the terms for you:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html#_terms_lookup_mechanism

As you can see there, you have quite a lot of options for caching.
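
A minimal sketch of what the two request bodies could look like, built as plain JSON strings in Java so that no particular client version is assumed. The lookup document stores the IDs under a field named `ids`; the search body uses `size: 0` and a `constant_score` wrapper so that no hits are returned and no relevance scoring is needed. The names `doc_id`, `category`, `by_category`, `lookups`, `result_set` and `query-42` are hypothetical, and the lookup parameters (`index`, `type`, `id`, `path`) follow the terms lookup page linked above; newer Elasticsearch versions may not accept `type`.

```java
import java.util.List;
import java.util.stream.Collectors;

final class TermsLookupSketch {

    // Body of the lookup document: {"ids":["1","4",...]}.
    // Assumes the IDs contain no characters that need JSON escaping.
    static String lookupDocument(List<String> ids) {
        String values = ids.stream()
                .map(id -> "\"" + id + "\"")
                .collect(Collectors.joining(","));
        return "{\"ids\":[" + values + "]}";
    }

    // Search body: no hits (size 0), a non-scoring terms filter that reads
    // its terms from the lookup document, and one example aggregation.
    // "doc_id" and "category" are hypothetical field names.
    static String searchBody(String lookupIndex, String lookupType, String lookupId) {
        return "{"
            + "\"size\":0,"
            + "\"query\":{\"constant_score\":{\"filter\":{\"terms\":{"
            + "\"doc_id\":{"
            + "\"index\":\"" + lookupIndex + "\","
            + "\"type\":\"" + lookupType + "\","
            + "\"id\":\"" + lookupId + "\","
            + "\"path\":\"ids\"}}}}},"
            + "\"aggs\":{\"by_category\":{\"terms\":{\"field\":\"category\"}}}"
            + "}";
    }

    public static void main(String[] args) {
        // First index this as the lookup document, then send the search body.
        System.out.println(lookupDocument(List.of("1", "4", "100")));
        System.out.println(searchBody("lookups", "result_set", "query-42"));
    }
}
```

Both strings can be sent with whatever HTTP or Java client is already in use; the sketch only shows the shape of the requests.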

Best regards,
Radu

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



(nicolas) #3

Thanks Radu,

Just to be sure I understand:

I have a query from a user and run a process A that returns a list of IDs
specific to that query. I would like to use ES to enrich these IDs with
aggregated info from the related (and already indexed) documents.

So the list of IDs in the results is a priori unknown; it depends on the
user's query.

To use only the aggregation framework, you propose that, for each query, I
first index the results of process A (the list of IDs) as a lookup document,
and then query ES using a terms filter + the lookup mechanism.

Is that right?



(Radu Gheorghe) #4

Hi,

Yes, that's what I said, but I didn't know you wanted to use ES to enrich
the results only.

Plus, since you have the IDs already, you might use a huge multi-get
request:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-multi-get.html#docs-multi-get
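
As a rough illustration, a multi-get body for a list of IDs could be assembled like this, using the `docs` array form with `_index` and `_id` per entry. The index name `articles` is hypothetical, and the sketch assumes the IDs need no JSON escaping. Note that a multi-get returns the documents themselves, so any counting or grouping would then happen on the client side.

```java
import java.util.List;
import java.util.stream.Collectors;

final class MultiGetSketch {

    // Builds {"docs":[{"_index":"...","_id":"..."},...]} for the _mget endpoint.
    // Assumes index name and IDs contain no characters that need JSON escaping.
    static String mgetBody(String index, List<String> ids) {
        String docs = ids.stream()
                .map(id -> "{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}")
                .collect(Collectors.joining(","));
        return "{\"docs\":[" + docs + "]}";
    }

    public static void main(String[] args) {
        // "articles" is a hypothetical index name.
        System.out.println(mgetBody("articles", List.of("1", "4", "100")));
    }
}
```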

Best regards,
Radu

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



(system) #5