Phrases With Stop Words


(timscott) #1

I notice that phrase searches containing stop words never match
any document. For example, a field or term search for "Bank of
America" would have no hits. I understand why this is, but it's
useless to try to explain it to a user.

Is there a reasonable solution without indexing stop words?

I tried one way, but it does not seem to work. I thought if I could
get a list of stop words in use by the current analyzer, I could
remove them from search phrases. So if the user searches for "Bank of
America" I would remove "of" and search for the phrase "Bank
America". Because the analyzer had removed "of" I was thinking that
ES would actually see "Bank" and "America" as adjacent words and match
on the "Bank America" search phrase. It seems this does not work.
Perhaps the index leaves a "hole" where the stop word was and so does
not see "Bank" and "America" as adjacent?

If the aforementioned solution did actually work, then a search for
"Bank of America" would also match instances of "Bank and America",
"Bank the America" and so forth. I can live with that, no problem.
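
To make the "hole" idea concrete, here is a minimal toy model in Python. It is not Elasticsearch or Lucene code; the stop list, the analyze function and the phrase_matches function are all hypothetical, and whether a real analyzer keeps the gap depends on its version and settings. The sketch only assumes that a removed stop word may still consume a token position.

```python
# Toy model of phrase matching over token positions.
# STOP_WORDS, analyze and phrase_matches are illustrative names only.
STOP_WORDS = {"of", "and", "the"}


def analyze(text, keep_holes=True):
    """Return (position, term) pairs with stop words dropped.

    With keep_holes=True a dropped stop word still advances the position
    counter, leaving a gap; with keep_holes=False the gap is closed.
    """
    tokens = []
    position = 0
    for word in text.lower().split():
        if word in STOP_WORDS:
            if keep_holes:
                position += 1
            continue
        tokens.append((position, word))
        position += 1
    return tokens


def phrase_matches(doc_tokens, query_tokens):
    """True if the query terms occur in the document at the same relative offsets."""
    if not query_tokens:
        return False
    positions = {}
    for pos, term in doc_tokens:
        positions.setdefault(term, set()).add(pos)
    first_pos, first_term = query_tokens[0]
    for start in positions.get(first_term, ()):
        if all(start + (pos - first_pos) in positions.get(term, ())
               for pos, term in query_tokens):
            return True
    return False


doc = analyze("Loans from Bank of America")

# Stripping "of" by hand gives adjacent query terms, but the document has a gap.
print(phrase_matches(doc, analyze("Bank America")))
# Running the full phrase through the same analysis keeps the gap on both sides.
print(phrase_matches(doc, analyze("Bank of America")))
print(phrase_matches(doc, analyze("Bank and America")))
```

In this model, removing "of" by hand fails because the query terms end up adjacent while the indexed terms are two positions apart. Analyzing the whole phrase the same way as the document matches, and so does any other stop word in the same slot.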

By the way, I seem to recall a recent thread on this topic, but I
could not find it, thus a new thread.


(srrin) #2

Hi Shay,
I am having the same issue with my search capability. Can you provide any workaround for this issue?

Thanks
srrin


(Clinton Gormley) #3

Srrin

On Wed, 2011-06-15 at 06:08 -0700, srrin wrote:

Hi Shay, I am having the same issue with my search capability. Can you
provide any work around for this issue. Thanks srrin

This is not a useful email.

We don't know what the issue is.

Please read http://www.elasticsearch.org/help for advice about how to
ask questions on the mailing list.

clint


(Shay Banon) #4

Tim,

Sorry for the late response; I must have missed your mail. In 0.16.2, the new text family of queries should help. Give it a go.

-shay.banon
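
For readers trying this suggestion, the following is a rough sketch of what a phrase request body from the text query family might look like, built in Python. The field name "name" and the helper text_phrase_query are hypothetical, and the exact query syntax should be checked against the documentation for the Elasticsearch version in use.

```python
import json


def text_phrase_query(field, phrase):
    """Build a search body using the text_phrase query type.

    The query text is analyzed by the server, so stop words are handled
    by the field's analyzer rather than stripped by the client.
    """
    return {"query": {"text_phrase": {field: phrase}}}


# "name" is an assumed field name, used only for illustration.
body = text_phrase_query("name", "Bank of America")
print(json.dumps(body, indent=2))
```

The body would be sent to the index's _search endpoint. The point of the design is that the phrase is passed through unchanged and analysis happens on the server side.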

In reply to Tim Scott's message of Saturday, January 22, 2011 at 9:23 PM (post #1 above).

