Generate synonyms at index time or at search time?

I want to use the synonym token filter, but I wonder whether I should
"generate" my synonyms at index time or at search time.

My current plan is to index like this:
stop_words -> lowercase -> synonyms -> stemmer -> asciifolder

and parse searches like this:
stop_words -> lowercase -> hunspell -> stemmer -> asciifolder

So, should the synonyms be generated at index time or at search time?
I am leaning toward generating them while indexing, but I prefer to ask ;)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,

I tend to prefer index-time synonyms. If you apply synonyms only at
search time, the scores will be biased toward documents that contain the
least frequent synonym, because rarer terms have a higher IDF (inverse
document frequency). For example, if "television" and "TV" are synonyms
and "television" is much more frequent than "TV" in your index, searching
for either "TV" or "television" will make documents that contain "TV"
appear first in the results.
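
The bias is easy to see with a little arithmetic. The sketch below is only an
illustration: the document counts are made up, and the formula is the classic
Lucene-style IDF, 1 + ln(N / (df + 1)), which may differ from the similarity
your version uses. It also assumes that no document originally contains both
words.

```python
import math


def idf(num_docs, doc_freq):
    """Classic Lucene-style inverse document frequency (an assumption here)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index statistics, chosen only for illustration.
NUM_DOCS = 1000
DF_TELEVISION = 200  # documents containing "television"
DF_TV = 10           # documents containing "TV"

# Search-time synonyms: the index holds only the original terms, so each
# synonym keeps its own document frequency and therefore its own IDF.
search_time = {
    "television": idf(NUM_DOCS, DF_TELEVISION),
    "tv": idf(NUM_DOCS, DF_TV),
}

# Index-time synonyms: every document containing either word is indexed
# with both, so both terms share one document frequency (assuming no
# document originally contained both words).
df_expanded = DF_TELEVISION + DF_TV
index_time = {
    "television": idf(NUM_DOCS, df_expanded),
    "tv": idf(NUM_DOCS, df_expanded),
}

for term in ("television", "tv"):
    print(f"{term:<10} search-time idf={search_time[term]:.2f}  "
          f"index-time idf={index_time[term]:.2f}")
```

With search-time expansion the rarer term "tv" gets the larger IDF, so
documents containing it score higher. With index-time expansion both terms
end up with the same IDF and the bias disappears.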

--
Adrien Grand

Yeah, from my experience, I agree. Index-time synonyms, while not as
flexible (you need to rebuild the index if the synonym list changes), end
up giving a much better experience.

Do the expansion only in the field's index analyzer, not in its search
analyzer.
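
As a rough sketch of that setup, the snippet below builds an index-creation
body in which only the index analyzer contains the synonym filter. Everything
named here is an assumption for illustration: the analyzer names `text_index`
and `text_search`, the filter name `my_synonyms`, the field name `body`, and
the one-line synonym list. The mapping keys follow recent Elasticsearch
versions (`analyzer` plus `search_analyzer`); older releases, including those
current when this thread was written, used different keys (`index_analyzer`,
as far as I recall), so check the documentation for your version. The
hunspell step from the question is left out because it needs dictionary
settings that are not given in the thread.

```python
import json

# Token filter chains, mirroring the order given in the question.
# "my_synonyms" is a hypothetical custom filter defined below; the other
# names are built-in Elasticsearch token filters.
INDEX_FILTERS = ["stop", "lowercase", "my_synonyms", "stemmer", "asciifolding"]
SEARCH_FILTERS = ["stop", "lowercase", "stemmer", "asciifolding"]


def build_index_body(synonyms):
    """Return an index-creation body with synonyms applied at index time only."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {"type": "synonym", "synonyms": synonyms},
                },
                "analyzer": {
                    "text_index": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": INDEX_FILTERS,
                    },
                    "text_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": SEARCH_FILTERS,
                    },
                },
            }
        },
        "mappings": {
            "properties": {
                "body": {
                    "type": "text",
                    "analyzer": "text_index",          # used when indexing
                    "search_analyzer": "text_search",  # used when searching
                }
            }
        },
    }


body = build_index_body(["tv, television"])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the request body when creating the
index. Changing the synonym list later means reindexing, which is the loss of
flexibility mentioned above.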

Best Regards,
Paul

(In reply to Adrien Grand's post of Friday, July 26, 2013 3:30:14 AM UTC-6.)

Thanks, guys!

--

"If people do not believe that mathematics is simple, it is only because
they do not realize how complicated life is."

- John von Neumann
