Does the completion suggester need n! input combinations?

Hi there,

The completion API looks really nice! It has nearly everything I need,
such as payloads, weighting, etc.

But take the hotel example [1]. How would you handle the fact that the user
can type the terms in any order: 'Mercure Hotel Munich', 'Mercure Munich',
'Munich Mercure', ...? Should I specify 3*2*1 input values? What if my input
is an address that can consist of 6 terms (country, county, town/village,
district, zipcode, street) plus POI information, which can easily lead to
over 1000 combinations per entry? Would I be better off with a prefix query
or edgeNGram? Another possibility would be the facet workaround [2], but
that is not nice because I need to include a payload value.
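
To put numbers on the growth described above, here is a minimal Python sketch that enumerates every ordering of every non-empty subset of an entry's terms. The helper name `all_inputs` and the placeholder address terms are made up for illustration; nothing here is part of the Elasticsearch API.

```python
from itertools import permutations


def all_inputs(terms):
    """Return every ordering of every non-empty subset of `terms`."""
    inputs = []
    for size in range(1, len(terms) + 1):
        # permutations(terms, size) yields ordered selections of `size` terms
        for combo in permutations(terms, size):
            inputs.append(" ".join(combo))
    return inputs


hotel = ["Mercure", "Hotel", "Munich"]
print(len(list(permutations(hotel))))  # full-length orderings only: 3! = 6
print(len(all_inputs(hotel)))          # orderings of all subsets: 3 + 6 + 6 = 15

# Six address terms plus one POI term (placeholder strings).
address = ["country", "county", "town", "district", "zipcode", "street", "poi"]
print(len(list(permutations(address))))  # 7! = 5040
```

Even when only full-length orderings are counted, seven terms already give 5040 inputs per entry.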

Also, highlighting and (geo) filtering are currently not supported, right?

Regards,
Peter.

[1]

[2]
https://groups.google.com/forum/#!searchin/elasticsearch/suggestion/elasticsearch/76VIOu32J9Q/zfwTjYLDocgJ

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/53ec62bd-d038-451f-8d00-d6377f6e5de2%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Does anyone have an idea about this?

This is probably not such a big problem for languages that use spaces as
separators, because you can easily split on spaces and feed in some of the
combinations. But how do I make sure completion works properly for other
languages such as Chinese? Why not use a shingle token filter?
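
For reference, a shingle filter emits contiguous word n-grams, so it would cover inputs that start mid-phrase but not reordered ones. The sketch below imitates that behaviour in plain Python; it is not Lucene's implementation, and `word_shingles` is a hypothetical helper. It relies on whitespace splitting, which is exactly what is not available for Chinese text.

```python
def word_shingles(text, min_size=2, max_size=3):
    """Return contiguous word n-grams of `text`, shortest first."""
    tokens = text.split()  # whitespace tokenization; an assumption of this sketch
    shingles = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            shingles.append(" ".join(tokens[start:start + size]))
    return shingles


print(word_shingles("Mercure Hotel Munich"))
# ['Mercure Hotel', 'Hotel Munich', 'Mercure Hotel Munich']
```

Note that 'Munich Mercure' is not among the results, so shingles alone would not handle arbitrary term order.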

Regards,
Peter.

On Thursday, November 28, 2013 2:23:14 PM UTC+1, Karussell wrote:

[...]

We will likely add other suggesters that solve the infix problem. In a lot
of cases this suggester solves problems very efficiently, and I don't think
we should advertise combinatorial inputs, since they require different
handling on the algorithmic side. There is stuff in the pipeline that helps
here, but I don't think we can just throw shingles at it and expect it to
work. I experimented with that, but it can quickly explode, which is another
thing we need to take care of. I agree we need to make the interface better.
For free text this suggester will not work at this point, but I don't think
that is its purpose.

simon

On Friday, November 29, 2013 7:49:42 PM UTC+1, Karussell wrote:

[...]

Thanks Simon,

I actually get relatively good results with a normal search combined with
prefix matching. It also seems to be very fast, and additionally I can
apply filters. Plus, the suggestions give nearly the same results as the
normal search.
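
As a rough illustration of what a "normal search plus prefix" request could look like: the sketch below builds a query body in which every completed word must match and the last, possibly unfinished, word is treated as a prefix, with an optional geo filter. This is a guess at the general shape, not the exact setup used here. The field names `name` and `location` and the helper `build_suggest_query` are hypothetical, and the `bool`/`filter` syntax follows current Elasticsearch query DSL rather than the 2013 release.

```python
import json


def build_suggest_query(user_input, lat=None, lon=None, distance="10km"):
    """Build a search body: finished words via match, last word via prefix."""
    # Lowercase because term-level prefix queries are not analyzed
    # (assumes the hypothetical `name` field uses a lowercasing analyzer).
    words = user_input.lower().split()
    if not words:
        raise ValueError("user_input must contain at least one word")
    *complete, last = words

    # Every finished word must occur in the name, in any order.
    must = [{"match": {"name": word}} for word in complete]
    # The word still being typed only has to be a prefix of some term.
    must.append({"prefix": {"name": last}})

    query = {"bool": {"must": must}}
    if lat is not None and lon is not None:
        # Optional geo restriction on the hypothetical `location` field.
        query["bool"]["filter"] = [{
            "geo_distance": {
                "distance": distance,
                "location": {"lat": lat, "lon": lon},
            }
        }]
    return {"query": query, "size": 5}


print(json.dumps(build_suggest_query("Munich Merc", lat=48.14, lon=11.58), indent=2))
```

Because each word is a separate clause, 'Munich Merc' and 'Merc Munich' would both match an entry named 'Mercure Hotel Munich' without indexing every ordering.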

I'll keep you informed about the limitations of my approach ;)

Regards,
Peter.

On Friday, November 29, 2013 10:42:52 PM UTC+1, simonw wrote:

[...]