The completion API looks really nice! It has nearly everything I need,
such as payloads and weighting.
But take the hotel example [1]. How would you handle the fact that users
can type the terms in any order: 'Mercure Hotel Munich', 'Mercure Munich',
'Munich Mercure', ...? Should I specify 321 input values? What if my input
is an address, which can consist of 6 terms (country, county, town/village,
district, zipcode, street) plus POI information? That can easily lead to
over 1000 combinations per entry! Would it be better to use a prefix query
or edgeNGram? Another possibility would be the facet workaround [2], but
that is not ideal because I need to include a payload value.
Also, highlighting and (geo) filtering are currently not supported, right?
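To put rough numbers on the combinations described above, here is a minimal Python sketch. It assumes that "every possible combination" means every ordered arrangement of every non-empty subset of the terms; that reading, and the helper names `count_orderings` and `all_inputs`, are illustrative assumptions and not part of the completion API.

```python
from itertools import permutations
from math import perm  # available in Python 3.8+


def count_orderings(n_terms: int) -> int:
    """Count ordered arrangements of every non-empty subset of n_terms terms."""
    return sum(perm(n_terms, k) for k in range(1, n_terms + 1))


def all_inputs(terms):
    """Yield each ordered arrangement of each non-empty subset as one string."""
    for k in range(1, len(terms) + 1):
        for combo in permutations(terms, k):
            yield " ".join(combo)


if __name__ == "__main__":
    print(count_orderings(3))  # 15
    print(count_orderings(6))  # 1956 (six address terms)
    print(count_orderings(7))  # 13699 (six address terms plus one POI term)
    for text in all_inputs(["Mercure", "Hotel", "Munich"]):
        print(text)
```

Under that assumption, six address terms already give 1956 candidate inputs per entry, which is consistent with the "over 1000" estimate.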
This is probably not such a big problem for languages that use spaces as
separators, because you can simply split on spaces and feed in some of the
combinations. But how do I make sure completion works properly for other
languages, such as Chinese? Why not use a shingle token filter?
Regards,
Peter.
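For readers unfamiliar with shingles, the following is a plain-Python stand-in for what a shingle token filter produces: contiguous runs of tokens. It is only a sketch and not Lucene's implementation; the function name `shingles` and its parameters are made up for illustration, and the token list is assumed to come from a tokenizer suited to the language (a whitespace split is used here only because the example is in English).

```python
def shingles(tokens, min_size=2, max_size=3):
    """Return word shingles: contiguous token runs of min_size..max_size tokens."""
    out = []
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` tokens over the token list.
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


if __name__ == "__main__":
    tokens = "mercure hotel munich".split()
    print(shingles(tokens))
    # ['mercure hotel', 'hotel munich', 'mercure hotel munich']
```

Note that shingles keep the original token order, so 'munich mercure' is not among the outputs. They cover inputs that start in the middle of the text, but not reordered input.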
On Thursday, November 28, 2013 2:23:14 PM UTC+1, Karussell wrote:
Hi there,
The completion API looks really nice! It has nearly everything I need,
such as payloads and weighting.
But take the hotel example [1]. How would you handle the fact that users
can type the terms in any order: 'Mercure Hotel Munich', 'Mercure Munich',
'Munich Mercure', ...? Should I specify 321 input values? What if my input
is an address, which can consist of 6 terms (country, county, town/village,
district, zipcode, street) plus POI information? That can easily lead to
over 1000 combinations per entry! Would it be better to use a prefix query
or edgeNGram? Another possibility would be the facet workaround [2], but
that is not ideal because I need to include a payload value.
Also, highlighting and (geo) filtering are currently not supported, right?
We will likely add other suggesters that solve the infix problem. In a lot
of cases this suggester solves problems very efficiently, and I don't think
we should advertise using combinatorial inputs, since they require
different handling on the algorithmic side. There is stuff in the pipeline
that helps here, but I don't think we can just throw shingles at it and
expect it to work. I experimented with that, but it can quickly explode,
which is another thing we need to take care of. I agree we need to make the
interface better, and for free text this one will not work at this point,
but I don't think that is its purpose.
simon
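The difference between prefix and infix matching mentioned above can be illustrated with a small Python sketch. It uses a sorted list and binary search as a stand-in for a prefix-only suggester; this is not how the completion suggester is implemented internally, and the names `prefix_matches` and `infix_matches` are hypothetical.

```python
from bisect import bisect_left


def prefix_matches(sorted_inputs, prefix):
    """Return inputs that start with `prefix` (list must be sorted)."""
    start = bisect_left(sorted_inputs, prefix)
    out = []
    for item in sorted_inputs[start:]:
        if not item.startswith(prefix):
            break  # sorted order: no later item can match either
        out.append(item)
    return out


def infix_matches(inputs, prefix):
    """Return inputs in which any whitespace-separated word starts with `prefix`."""
    return [s for s in inputs if any(w.startswith(prefix) for w in s.split())]


if __name__ == "__main__":
    inputs = sorted(["mercure hotel munich", "munich airport", "hilton munich"])
    print(prefix_matches(inputs, "mun"))
    # ['munich airport']
    print(infix_matches(inputs, "mun"))
    # ['hilton munich', 'mercure hotel munich', 'munich airport']
```

A prefix-only lookup misses the two hotels because "munich" is not at the start of their input strings, which is why reordered inputs had to be listed explicitly in the question.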
On Friday, November 29, 2013 7:49:42 PM UTC+1, Karussell wrote:
Does anyone have an idea about this?
This is probably not such a big problem for languages that use spaces as
separators, because you can simply split on spaces and feed in some of the
combinations. But how do I make sure completion works properly for other
languages, such as Chinese? Why not use a shingle token filter?
Regards,
Peter.
I actually get relatively good results with a normal search combined with
prefix matching. It also appears to be very fast, and I can additionally
apply filters. Plus, the suggestions give nearly the same results as the
normal search.
I'll keep you informed about the limitations of my approach!
Regards,
Peter.
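The post does not show the actual query, so the following Python sketch is only one possible reading of "normal search combined with prefix matching": complete words go into a `match` query and the last, possibly unfinished word goes into a `prefix` query, with optional filters. The function name `build_suggest_query` and the field name `name` are hypothetical, the `bool` query with a `filter` clause follows the query DSL of later Elasticsearch releases (a 2013 release would need a different wrapper), and lowercasing assumes the field is indexed with a lowercasing analyzer.

```python
import json


def build_suggest_query(user_input, field="name", filters=None):
    """Build a query dict: match on complete words, prefix on the last word."""
    tokens = user_input.lower().split()
    if not tokens:
        return {"match_all": {}}
    *complete, last = tokens
    must = []
    if complete:
        # All fully typed words must occur, in any order.
        must.append(
            {"match": {field: {"query": " ".join(complete), "operator": "and"}}}
        )
    # The word still being typed is matched as a term prefix.
    must.append({"prefix": {field: last}})
    query = {"bool": {"must": must}}
    if filters:
        # Caller-supplied filter clauses, e.g. a geo filter (assumption).
        query["bool"]["filter"] = list(filters)
    return query


if __name__ == "__main__":
    print(json.dumps(build_suggest_query("Munich Merc"), indent=2))
```

Because the complete words are matched in any order, 'Munich Merc' and 'Mercure Mun' can both reach 'Mercure Hotel Munich' without listing reordered inputs.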
On Friday, November 29, 2013 10:42:52 PM UTC+1, simonw wrote:
We will likely add other suggesters that solve the infix problem. In a lot
of cases this suggester solves problems very efficiently, and I don't think
we should advertise using combinatorial inputs, since they require
different handling on the algorithmic side. There is stuff in the pipeline
that helps here, but I don't think we can just throw shingles at it and
expect it to work. I experimented with that, but it can quickly explode,
which is another thing we need to take care of. I agree we need to make the
interface better, and for free text this one will not work at this point,
but I don't think that is its purpose.
simon