Search results grouping (aka field combining/collapsing, distinct, de-dup) or alternate solution

Hi!

I just built the current 0.20.0 snapshot and I was wondering what the
status is of the issue discussed here:

Field Collapsing/Combining · Issue #256 · elastic/elasticsearch · GitHub

Is it possible to use anything of this sort with the current 0.20.0
snapshot?

I'll try to explain our issue to see if it can be resolved in another way
if search grouping is still far from implemented:

We have users that can have one or more roles, and currently each
user-role is indexed as an independent document. We want to search by user
name and get only one result per user (it does not matter which one). So
ideally we would like to group by user name. I have been reading about
nested documents and parent/child relationships. (Are they related?) Which
one would better cover our use case? Note that we might index different
user-roles at different times, so perhaps parent/child indexing is better
suited. Is parent mapping always done at type level? (Can we index a
document and tell ES which document is its parent, or is it always inferred
from their respective types?)

Thanks for such an excellent search engine and thanks in advance for any
clarifications on our issue!
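
Until server-side grouping exists, one fallback for the "one result per user" requirement is to de-duplicate hits on the client. The following is only a minimal sketch: the field name user_name and the hit layout (a dict with _id and _source) are assumptions for illustration, not taken from the poster's actual index.

```python
from typing import Dict, Iterable, List


def one_hit_per_user(hits: Iterable[Dict], key: str = "user_name") -> List[Dict]:
    """Keep the first hit seen for each distinct value of `key`, preserving order."""
    seen = set()
    unique = []
    for hit in hits:
        value = hit["_source"][key]  # `user_name` is an assumed field name
        if value not in seen:
            seen.add(value)
            unique.append(hit)
    return unique


# Hypothetical user-role documents, one per (user, role) pair.
hits = [
    {"_id": "1", "_source": {"user_name": "alice", "role": "admin"}},
    {"_id": "2", "_source": {"user_name": "alice", "role": "editor"}},
    {"_id": "3", "_source": {"user_name": "bob", "role": "viewer"}},
]
unique_hits = one_hit_per_user(hits)
```

Note that this collapses only the hits that were actually fetched, so page sizes and total counts no longer match what the server reports.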

Sorry for the premature questions on parent-child relationships. I think
we'll go for a solution implemented using parent field mapping and some
ORed has_child filters (à la
Fun with elasticsearch's children and nested documents - Space Vatican
).
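
For readers wondering what such a setup might look like, here is a rough sketch that only builds the request bodies as Python dicts, without sending anything. The type names user and user_role, the fields name and role, and the helper names are all hypothetical. The general shape (a _parent mapping declared on the child type, a parent parameter given per document at index time, and has_child filters combined with an or filter inside a filtered query) reflects the query DSL of the 0.19/0.20 era as I understand it, so check it against the documentation for your version.

```python
from typing import Dict, List

# Hypothetical type names, chosen only for this sketch.
PARENT_TYPE = "user"
CHILD_TYPE = "user_role"


def child_mapping() -> Dict:
    """Mapping for the child type: the parent *type* is declared once, at type level."""
    return {CHILD_TYPE: {"_parent": {"type": PARENT_TYPE}}}


def child_index_path(index: str, role_id: str, user_id: str) -> str:
    """URL path for indexing one child; the parent *document* is named per request."""
    return f"/{index}/{CHILD_TYPE}/{role_id}?parent={user_id}"


def users_with_any_role(name_query: str, roles: List[str]) -> Dict:
    """Search body returning each matching user once if it has a child with any role."""
    role_filters = [
        {"has_child": {"type": CHILD_TYPE, "query": {"term": {"role": role}}}}
        for role in roles
    ]
    return {
        "query": {
            "filtered": {
                # `name` is an assumed field on the parent (user) documents.
                "query": {"query_string": {"default_field": "name", "query": name_query}},
                "filter": {"or": role_filters},
            }
        }
    }


body = users_with_any_role("alex", ["admin", "editor"])
path = child_index_path("people", "r1", "u1")
```

Because the search runs against the parent type, each user comes back at most once no matter how many of its role documents match, which is the effect grouping by user name was meant to achieve.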

Lucene 4 will have (has) a grouping API:
http://lucene.apache.org/core/4_0_0-ALPHA/grouping/index.html

It might be worthwhile to adhere to the Lucene API and not recreate
something that is not portable. Shay's call. That said, Lucene 4 is
still only in alpha, and since the API might change, coding against
it might be a bit premature. Perhaps Simon W has some more insight.

Cheers,

Ivan

Thanks for the feedback. I read that version 0.20.0 was getting some
refactoring to allow for this kind of query, but I guess waiting for
the Lucene 4 API to freeze makes sense anyway.

Lucene 4.0 is now in beta:
http://search-lucene.com/m/9LhyoLfdKY&subj=+ANNOUNCE+Apache+Lucene+4+0+beta+released+

Hopefully the full release will happen on schedule around September/October.

I'm also eagerly awaiting this feature. I'm developing a catalog of
applications (think app store). Let's say you'd like to page through a
list of applications where the results displayed contain the highest
version of the app you have access to (based on a number of factors). This
is very difficult to do outside of the search engine while maintaining the
appropriate paging values.

As you say, if there's a way to do this with the 0.19 or 0.20 beta
releases, I'd love to understand it as well.
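
To make the paging difficulty concrete, here is a small sketch of doing the collapse outside the search engine. The field names app_id and version, the dotted numeric version format, and the helper names are assumptions made up for this example.

```python
from typing import Dict, Iterable, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def latest_per_app(hits: Iterable[Dict]) -> List[Dict]:
    """Keep only the highest-version hit for each app, in first-seen app order."""
    best: Dict[str, Dict] = {}
    for hit in hits:
        app = hit["app_id"]  # assumed field name
        current = best.get(app)
        if current is None or parse_version(hit["version"]) > parse_version(current["version"]):
            best[app] = hit
    return list(best.values())


def page(items: List[Dict], page_number: int, page_size: int) -> List[Dict]:
    """Zero-based paging over the already collapsed list."""
    start = page_number * page_size
    return items[start:start + page_size]


# Hypothetical hits the current user is allowed to see.
hits = [
    {"app_id": "mail", "version": "1.2.0"},
    {"app_id": "mail", "version": "1.10.0"},
    {"app_id": "maps", "version": "2.0"},
    {"app_id": "chat", "version": "0.9"},
    {"app_id": "maps", "version": "2.1"},
]
collapsed = latest_per_app(hits)
first_page = page(collapsed, 0, 2)
second_page = page(collapsed, 1, 2)
```

The catch is that every accessible hit has to be fetched before the first page can be shown, because a later hit may replace an earlier one. That is the cost that grouping inside the search engine would avoid.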
