Removing results from same domain in search query result

Devashish_Tyagi · June 17, 2013, 11:33am

I have an index containing around 1 million HTML documents. The index
mapping has the following fields:

file : Holds the actual HTML document
url : Holds the source URL of the HTML document
meta-data : Some metadata associated with the document

I want to do the following:

1. Query the index for a result set of size 100.
2. Prune the result set so that it contains at most 2 results from any
   one domain.

For the pruning part, I was wondering whether I can use a script filter.
Is that possible, and if so, how? Is there any other way to do what I
want?

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ivan · June 18, 2013, 6:32pm

My first thought would be a custom facet/collector, but that would require
a fair amount of code. Perhaps Igor's facet script could help? I have never
used it.

--
Ivan
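
For comparison, the pruning step described in the question can also be done on the client after the search returns, with no custom facet. The following is only a minimal sketch under some assumptions: the hits have already been pulled out of the search response into a ranked list of dicts, each with a "url" key holding an absolute URL, and "domain" means the URL's host name. The function name prune_by_domain and its parameters are made up for illustration and are not part of any Elasticsearch API.

```python
from collections import defaultdict
from urllib.parse import urlparse


def prune_by_domain(hits, max_per_domain=2, url_field="url"):
    """Return hits in their original order, keeping at most
    max_per_domain entries per host name (hypothetical helper)."""
    seen = defaultdict(int)  # host name -> number of hits kept so far
    kept = []
    for hit in hits:
        # .hostname is already lower-cased; it is None for URLs without
        # a scheme, in which case all such hits share the "" bucket.
        domain = urlparse(hit[url_field]).hostname or ""
        if seen[domain] < max_per_domain:
            seen[domain] += 1
            kept.append(hit)
    return kept


# Usage: hits are assumed to be in ranked order, best first.
hits = [
    {"url": "http://a.example/1"},
    {"url": "http://a.example/2"},
    {"url": "http://b.example/1"},
    {"url": "http://A.example/3"},  # third hit from a.example, dropped
]
pruned = prune_by_domain(hits)
```

Because pruning can only shrink the list, a query of size 100 may leave fewer than 100 results; requesting more hits than needed and truncating after pruning is one way around that.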


