Strange behavior with match query and synonyms

Hi all,

I have a problem when trying to do a match query with synonyms. Using
Inquisitor, I can see my synonym filter working and the query being
expanded. I can also see that my match query works when using the or operator:

"match": {
  "autocomplete.ngram": {
    "query": "saint john",
    "slop": 0,
    "operator": "or"
  }
}

Explain line:

  • description: ConstantScore(autocomplete.ngram:st
    autocomplete.ngram:st. autocomplete.ngram:saint autocomplete.ngram:john),
    product of:

However, setting the operator to "and" returns no results. I think the
issue is that match is not respecting the token positions for the synonyms.
In Inquisitor I can see all the synonyms as token #1 and john as token #2,
as expected, but the query is clearly doing something different. So the
question is: am I doing something incorrectly? Is there another query type
I should be looking at instead of match? I really appreciate anyone who can
help.

Thanks,

Jorge

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Are you applying the synonyms during indexing and querying, or only during
indexing?

--
Ivan

On Wed, Jun 19, 2013 at 11:53 AM, Jorge T
<jorge.alberto.trujillo@gmail.com> wrote:

I have a problem when trying to do a match query with synonyms. [...]

Hi Ivan,

Synonyms are currently applied only at query time, the reason being that
the content will be very large and it won't be feasible to reindex for
every synonym change.

Thanks,

Jorge

On Wednesday, June 19, 2013 3:10:23 PM UTC-4, Ivan Brusic wrote:

Are you applying the synonyms during indexing and querying, or only during
indexing?

--
Ivan


Then AND is not working for you because query-time synonyms are simply not in your index.

For example, say you have "milk cow" indexed and milk has dairy as a synonym. Your match AND query will effectively expand your "milk" query to "milk AND dairy", and that combination is not in your index.
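
To make that concrete, here is a small Python sketch of the boolean logic only. It is a toy model, not how Elasticsearch is implemented; the document and the milk/dairy synonym pair come from the example above, and the function names are made up for illustration.

```python
# Toy model of matching a flat list of query tokens against one indexed document.
# It imitates only the AND/OR logic, nothing else about the search engine.

def matches_all(doc_tokens, query_tokens):
    """Operator "and": every query token must be present in the document."""
    return all(tok in doc_tokens for tok in query_tokens)

def matches_any(doc_tokens, query_tokens):
    """Operator "or": at least one query token must be present."""
    return any(tok in doc_tokens for tok in query_tokens)

# The index holds only the original text; no synonyms were added at index time.
indexed_doc = {"milk", "cow"}

# Query "milk cow" after the query-time synonym filter has added "dairy".
expanded_query = ["milk", "dairy", "cow"]

print(matches_any(indexed_doc, expanded_query))  # True
print(matches_all(indexed_doc, expanded_query))  # False: "dairy" is not indexed
```

With "or" the extra synonym is harmless, while with "and" it becomes one more required term that the document cannot satisfy.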


Hey Alex,

Thanks, that confirms what I believed was the problem. That said, isn't
this behavior the opposite of what it should be? If I have two documents,
"milk cow" and "dairy cow", the whole point in my view is to enable the
user to type "dairy cow" and get both results. What this means is that
query-time synonyms are not useful, at least with match queries. In my
view, the correct and logical behavior would be for match to expand to
"(milk OR dairy) AND cow".

I'll play with bool and term or match queries with individual terms now to
see if I can get the desired results, but this seems like an unnecessary
workaround.
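
A minimal sketch of what such a hand-built query might look like, written in Python so the request body can be generated. It assumes the synonym groups are already known and already in their analyzed form (term queries are not analyzed); the helper name `build_grouped_query` is hypothetical, and the field name is the one from the original post.

```python
import json

def build_grouped_query(field, groups):
    """Build a bool query of the form (a OR b) AND (c) from groups of alternatives.

    Each inner list holds terms that are interchangeable (same token position).
    """
    must = []
    for alternatives in groups:
        # Inside one group any single term is enough, so use "should" clauses.
        should = [{"term": {field: term}} for term in alternatives]
        must.append({"bool": {"should": should}})
    # Every group has to be satisfied, so the groups go under "must".
    return {"query": {"bool": {"must": must}}}

query = build_grouped_query("autocomplete.ngram", [["milk", "dairy"], ["cow"]])
print(json.dumps(query, indent=2))
```

This relies on a bool query that has only should clauses requiring at least one of them to match, which is worth verifying against the Elasticsearch version in use.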

Regards,

Jorge

On Wednesday, June 19, 2013 4:41:42 PM UTC-4, AlexR wrote:

Then AND is not working for you because query time synonyms are simply not
in your index

For example you have "milk cow" indexed and say milk has dairy as a
synonym. Your match AND query will effectively expand your "milk" query to
"milk AND dairy" which is not in your index


Well, I guess the match query just gets the tokens and does either AND or
OR on them; it has no knowledge of how they were produced. It would have
been nice if it could treat same-position tokens as OR.

I had similar issues when stemming, even though I kept both the stemmed and
original words (minus dups) in the same field.
Guess you could analyze your search phrase to get it expanded with synonyms
and then produce your query yourself using
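
A rough Python sketch of the first half of that idea: grouping analyzed tokens by position so that same-position tokens can later be ORed together. It assumes the analyzer output is available as a dict with a "tokens" list whose entries carry "token" and "position" fields, as the analyze API returns; the sample data below is hand-written from the positions reported in the original post, and `group_by_position` is a made-up helper name.

```python
from collections import defaultdict

def group_by_position(analyze_response):
    """Return lists of tokens that share a position, ordered by position."""
    by_position = defaultdict(list)
    for tok in analyze_response["tokens"]:
        group = by_position[tok["position"]]
        # Skip duplicates so each alternative appears once per position.
        if tok["token"] not in group:
            group.append(tok["token"])
    return [by_position[pos] for pos in sorted(by_position)]

# Hand-written sample for the query "saint john" (not real server output).
sample = {
    "tokens": [
        {"token": "st", "position": 1},
        {"token": "st.", "position": 1},
        {"token": "saint", "position": 1},
        {"token": "john", "position": 2},
    ]
}

print(group_by_position(sample))  # [['st', 'st.', 'saint'], ['john']]
```

Each inner list could then become one group of optional clauses, with all groups required, which gives the "(st OR st. OR saint) AND john" behavior asked for earlier in the thread.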

On Wed, Jun 19, 2013 at 5:16 PM, Jorge T
<jorge.alberto.trujillo@gmail.com> wrote:

Thanks, that confirms what I believed was the problem. [...]