White Space Issue / Field Concatenation


(samuel dean) #1

Hi there

This is the issue that we're currently looking to resolve.

Currently we have no issue performing the following.

Searching "cross trainer" returns results for both "cross trainer" and "crosstrainer".

However, we also want a search for "crosstrainer" to return "cross trainer".

Another example

Searching "London Restaurant" gives us both "London Restaurant" and
"LondonRestaurant", but it won't work vice versa,

so searching "LondonRestaurant" will not return "London Restaurant"

We've seen that others have the same issue, and we would like to know if it's
possible to get a fix for this.

It's the last outstanding issue we have at the moment.

Thanks for all the help

Sam!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/2c8af5f4-f5f1-47c6-acb7-343f7949c2f6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Brusic) #2

I have never tried, but I wonder if you can use a shingle filter with an
empty token separator. If not, you can always output bigrams yourself. Of
course, this would only work for adjacent terms, so a field with "London
Restaurant Chinese" will not match the term "LondonChinese".

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-shingle-tokenfilter.html
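
To make the shingle idea above concrete, here is a rough sketch in Python. It is untested against a real cluster: the filter name "joined_bigrams" and the analyzer name "joined_words" are made up for illustration, and the helper function is only a plain-Python approximation of the terms such an analyzer might emit, not the filter itself.

```python
# Hypothetical index settings: a shingle filter that emits two-word shingles
# joined by an empty separator, alongside the original single words.
# "joined_bigrams" and "joined_words" are illustrative names only.
SETTINGS = {
    "analysis": {
        "filter": {
            "joined_bigrams": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 2,
                "output_unigrams": True,
                "token_separator": "",
            }
        },
        "analyzer": {
            "joined_words": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "joined_bigrams"],
            }
        },
    }
}


def joined_bigrams(text):
    """Approximate the indexed terms: lowercased words plus adjacent pairs
    glued together with no separator. Ordering differs from the real filter."""
    words = text.lower().split()
    pairs = [first + second for first, second in zip(words, words[1:])]
    return words + pairs


if __name__ == "__main__":
    print(joined_bigrams("London Restaurant Chinese"))
```

With terms like these in the index, a lowercased query for "LondonRestaurant" would have something to match, while "LondonChinese" still would not, because only adjacent words are joined.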

If you know the potential terms that might be conjoined ahead of time, you
can use synonyms.
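
As a sketch of the synonym route, assuming the joined forms are known up front: the filter definition below uses the "a => b" rule syntax of the synonym token filter, and the two rules are just the examples from this thread. The expand function is a plain-Python stand-in showing what the rewrite does to a query, not how Elasticsearch applies it internally.

```python
# Hypothetical synonym filter definition; the rule list is illustrative and
# would need to be maintained by hand for every joined form you care about.
SYNONYM_FILTER = {
    "type": "synonym",
    "synonyms": [
        "crosstrainer => cross trainer",
        "londonrestaurant => london restaurant",
    ],
}


def expand(query, rules=SYNONYM_FILTER["synonyms"]):
    """Rewrite each lowercased query word using the 'left => right' rules."""
    mapping = {}
    for rule in rules:
        left, right = (side.strip() for side in rule.split("=>"))
        mapping[left] = right
    return " ".join(mapping.get(word, word) for word in query.lower().split())


if __name__ == "__main__":
    print(expand("LondonRestaurant"))
```

Words that are not in the rule list pass through unchanged, which is why this only helps for terms you can list ahead of time.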

Cheers,

Ivan

(In reply to samuel dean, sam@pricesearcher.com, Fri, Apr 25, 2014 at 1:34 PM)



(Ivan Brusic) #3

Forgot to add the compound word filter:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-compound-word-tokenfilter.html

I found maintaining a synonym list easier than maintaining a list of
compound-word candidates for my use case.
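
For comparison, a rough sketch of the compound word approach. The filter definition uses the dictionary_decompounder type with a word_list; the filter name "split_compounds" and the word list itself are assumptions for illustration. The decompound function is a simplified plain-Python imitation of dictionary-based splitting, so its behaviour may differ from the real filter in edge cases.

```python
# Hypothetical filter definition; "split_compounds" is an illustrative name
# and the word list would have to be maintained by hand.
COMPOUND_FILTER = {
    "split_compounds": {
        "type": "dictionary_decompounder",
        "word_list": ["cross", "trainer", "london", "restaurant"],
    }
}


def decompound(token, word_list, min_subword_size=2):
    """Return the token followed by every dictionary word found inside it."""
    token = token.lower()
    words = {word.lower() for word in word_list}
    found = [token]
    for start in range(len(token)):
        for end in range(start + min_subword_size, len(token) + 1):
            if token[start:end] in words:
                found.append(token[start:end])
    return found


if __name__ == "__main__":
    word_list = COMPOUND_FILTER["split_compounds"]["word_list"]
    print(decompound("LondonRestaurant", word_list))
```

Either way a hand-maintained list is involved, which is the trade-off against the synonym list mentioned above.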

--
Ivan

(In reply to Ivan Brusic, ivan@brusic.com, Mon, Apr 28, 2014 at 10:05 PM)


