Pagination over facet values

Hi Team,

Could you let us know whether we can paginate over facet values? Our requirement is to retrieve all the values for a particular field. Since we may have hundreds of thousands of values for a particular field, it would be nice if ES offered a pagination API, so that we could fetch the available values not all at once but in smaller chunks of 50, 100, 200, and so on.

Currently we use TermsFacetBuilder to get the top 100 or 200 values for a field, but this has limitations: as the number of values grows, memory consumption becomes high. The following snippet shows how we form the facet queries:

for (String tf : termFacets) {
    TermsFacetBuilder termsFacetBuilder = FacetBuilders.termsFacet(tf);
    termsFacetBuilder.field(tf);
    //termsFacetBuilder.allTerms(Boolean.TRUE);
    termsFacetBuilder.order(TermsFacet.ComparatorType.COUNT);
    termsFacetBuilder.size(topDataCount);
    builder.addFacet(termsFacetBuilder);
}

Here builder is an instance of SearchRequestBuilder and topDataCount is an int variable that holds the size. After executing the query, we iterate over the SearchResponse object to get the facet values.

The problem with this approach is that we have to set the size in advance, and for large sizes there is a danger of high processing time and a high memory footprint.

So please let us know if there is some kind of pagination mechanism for facets that lets us iterate over the values in smaller chunks.

Thanks

There isn't a way to paginate over a terms facet.

On Fri, May 4, 2012 at 3:14 PM, ElasticUsers kranti_123@rediffmail.com wrote:

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Pagination-over-facets-values-tp3962094.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

Since paginating a terms facet is impossible, what would you recommend doing instead?

Kranti, I am facing a very similar issue. What did you end up doing to
resolve it?

On Friday, May 4, 2012 7:51:11 PM UTC+6, kimchy wrote:

There isn't a way to paginate over a terms facet.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I would run a second query that returns only the values of this field,
limited by size.
The returned field values could then be used to do the faceting.
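
As a rough illustration of that suggestion, here is a minimal sketch of the client-side half only. It assumes the second query has already been executed and its field values collected into a plain list; the class FieldValuePager and its methods are hypothetical names, not part of the Elasticsearch API. It counts the returned values and then serves them in chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not an Elasticsearch class: it only post-processes
// field values that a separate, size-limited query has already returned.
public class FieldValuePager {

    /** Counts how often each value occurs among the returned field values. */
    static Map<String, Long> countValues(List<String> fieldValues) {
        Map<String, Long> counts = new TreeMap<>();
        for (String value : fieldValues) {
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }

    /** Returns one chunk of (value, count) entries, ordered by count descending, then by value. */
    static List<Map.Entry<String, Long>> page(Map<String, Long> counts, int from, int size) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Clamp the window so that an out-of-range page yields an empty list.
        int start = Math.min(Math.max(from, 0), sorted.size());
        int end = Math.min(start + Math.max(size, 0), sorted.size());
        return new ArrayList<>(sorted.subList(start, end));
    }
}
```

Calling page(counts, 0, 50), then page(counts, 50, 50), and so on walks through the values in chunks. Note that the counts only cover the documents the second query actually returned, so they match a real terms facet only if every matching document was fetched.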
