Facets Refactor

Can you please provide some information on the faceting refactor?
What do the "collector" and "post" modes do, and when would you use one
over the other?

--
Thanks,
Matt Weber

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

There was no mention of the facet refactoring in the 0.90 Beta release
notes. The commits are there, so hopefully more details will emerge.

--
Ivan

On Tue, Feb 26, 2013 at 5:53 PM, Matt Weber matt@mattweber.org wrote:

Can you please provide some information on the faceting refactor?
What do the "collector" and "post" modes do, and when would you use one
over the other?

Hi, the first phase of the refactoring is included in 0.90. We have decided to move the second phase to post 0.90 because we wanted to get the version out.

If you are a custom facet developer, the first phase doesn't break too much. In FacetExecutor, you now need to implement the collection of facets by implementing the #collector method (we have abstracted it away).

The new mode (which we are keeping internal for now) controls whether the facets are executed as part of the query execution, while it "collects" hits, or as a post phase on the aggregated hits. This is implemented automatically on top of the collector implementation.
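
As a rough illustration of the two modes described above, here is a minimal sketch in plain Java. It does not use the Elasticsearch or Lucene APIs: HitCollector, TermCountCollector, FacetModes and the toy termByDoc array are all hypothetical names made up for this example. The only point it tries to show is that the same per-hit collect logic can be driven either while hits are being matched or afterwards over the gathered hits, and that both paths should give the same counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a per-hit facet collector; NOT the Elasticsearch API.
interface HitCollector {
    void collect(int docId);
}

// Counts how often each term occurs among the hits it is shown.
final class TermCountCollector implements HitCollector {
    private final String[] termByDoc; // toy "field data": docId -> term
    private final Map<String, Integer> counts = new TreeMap<>();

    TermCountCollector(String[] termByDoc) {
        this.termByDoc = termByDoc;
    }

    @Override
    public void collect(int docId) {
        counts.merge(termByDoc[docId], 1, Integer::sum);
    }

    Map<String, Integer> counts() {
        return counts;
    }
}

final class FacetModes {
    // "collector" mode: the facet sees each hit while the query is matching.
    static Map<String, Integer> collectorMode(int[] matchingDocs, String[] termByDoc) {
        TermCountCollector facet = new TermCountCollector(termByDoc);
        for (int doc : matchingDocs) {
            facet.collect(doc); // interleaved with (simulated) query execution
        }
        return facet.counts();
    }

    // "post" mode: the query gathers its hits first, then the facet replays them.
    static Map<String, Integer> postMode(int[] matchingDocs, String[] termByDoc) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : matchingDocs) {
            hits.add(doc); // query phase only, no facet work yet
        }
        TermCountCollector facet = new TermCountCollector(termByDoc);
        for (int doc : hits) {
            facet.collect(doc); // post phase over the aggregated hits
        }
        return facet.counts();
    }
}
```

Calling FacetModes.collectorMode and FacetModes.postMode with the same arguments returns equal maps; which of the two is cheaper in a real engine would depend on things this sketch does not model, such as how many hits there are and how the hits are stored.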

Is there a good place for someone to start learning custom facet
development?

On Thursday, February 28, 2013 12:01:51 PM UTC+4, kimchy wrote:

Hi, the first phase of the refactoring is included in 0.90. We have
decided to move the second phase to post 0.90 because we wanted to get the
version out.

If you are a custom facet developer, the first phase doesn't break too
much. In FacetExecutor, you now need to implement the collection of facets
by implementing the #collector method (we have abstracted it away).

The new mode (which we are keeping internal for now) controls whether the
facets are executed as part of the query execution, while it "collects"
hits, or as a post phase on the aggregated hits. This is implemented
automatically on top of the collector implementation.

If possible, I would wait for the refactor. Right now, there are a lot of
boilerplate classes that need to be created for a custom facet.

For starters, look at the terms facet classes such as
TermsStringFacetCollector/TermsIntFacetCollector and the helper classes
TermsFacetProcessor/Builder.

--
Ivan

On Tue, Mar 5, 2013 at 1:17 AM, Mo mohammady.mahdy@gmail.com wrote:

Is there a good place for someone to start learning custom facet
development?

Thank you for the advice and the reply.

I did look there. However, I am not familiar with the primitives used,
things like field caches, the cache recycler, or what an index reader
really does. I was just wondering if there is a resource that has a
walkthrough or a more detailed description of the best practices, or of
why things are currently implemented the way they are. For now I am
starting from the basics by getting to know Lucene better, and then I will
delve into the ES code base.

Many thanks to you once more!

On Wednesday, March 6, 2013 12:19:01 AM UTC+4, Ivan Brusic wrote:

If possible, I would wait for the refactor. Right now, there are a lot of
boilerplate classes that need to be created for a custom facet.

For starters, look at the terms facet classes such as
TermsStringFacetCollector/TermsIntFacetCollector and the helper classes
TermsFacetProcessor/Builder.

--
Ivan

I will try to sum it up in a few sentences from what I understand so far...

Facets are a well-known feature of search engines that can return
voluminous result sets; they let users decide how to proceed with
their query. One decision may be to refine the query so that only
part of the result set is kept. Because they help the user refine the
search without knowing the subtleties of a query language, facets are a
very comfortable way for a search product to offer UI actions for
filtering result sets. Facets need to display additional information,
mostly of a statistical nature, so that the user can make an informed
choice.

In Elasticsearch, facets are implemented as part of the search
module in the package org.elasticsearch.search and cover a broad range
of different approaches to analyzing result sets statistically
(counting, histogram, geo, ...). They operate on field cache data in
order to be fast. Evaluation follows a scatter/gather (or
map/reduce-style) algorithm that works in several stages: first the
facet criteria are sent out, then the data in the result set is
aggregated, and finally the results are collected into a small, handy,
reduced data structure that can be returned as part of the search
response. This process works on several layers, the shard layer and the
node layer, across one or more indices.
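
To make the gather/reduce stage a little more concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: FacetReduce, reduce and top are hypothetical names, and the per-shard maps simply stand in for whatever partial term counts each shard would send back. It only shows the idea of merging partial counts and keeping a small top-N structure for the response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, NOT part of Elasticsearch: merges per-shard term counts.
final class FacetReduce {

    // Reduce step: sum the partial counts reported by each shard.
    static Map<String, Long> reduce(List<Map<String, Long>> shardCounts) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> shard : shardCounts) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    // Keep only the n most frequent terms, highest count first.
    static List<Map.Entry<String, Long>> top(Map<String, Long> counts, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return entries.subList(0, Math.min(n, entries.size()));
    }
}
```

A caller would build one map per shard, pass the list to FacetReduce.reduce, and then use FacetReduce.top to trim the merged counts before putting them into the response. A real implementation would also have to deal with shards that only report their own top terms, which this sketch ignores.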

The facet implementation of Elasticsearch is now being refactored to
allow better customization. Right now, only "static" facet execution is
possible, and unfortunately some of the facet code is tied to other
code in Elasticsearch. Igor Motov wrote a facet with powerful scripting
capabilities, https://github.com/imotov/elasticsearch-facet-script, to
demonstrate how facet execution phases could be customized. With the
new framework, plugin authors will be able to program new algorithms
for creating facets, also on custom data types, by implementing custom
"executors" that can operate on the data in the map/reduce phases just
as they want...

So, in the future, you will see a standard set of facet implementations
in the Elasticsearch core, and in plugins you will see how
Elasticsearch extensions can introduce new kinds of facets with a
minimum of boilerplate code.

Jörg

On 06.03.13 12:37, Mo wrote:

I was just wondering if there is a resource that has a walkthrough or
a more detailed description of the best practices, or of why things
are currently implemented the way they are.
