Hello,
I am running 0.17.8 in our dev environment and 0.16.5 in QA and
production. I am doing some work around synonyms and noticed that they
don't seem to be working on 0.17.8 or 0.17.7 (I had updated to .8 to
check if this had been fixed).
Here is a gist that shows what I am observing:
I was testing around punctuation in the synonym list, but even the
simple case of work <=> business doesn't work on 0.17.8.
Please let me know if there are any other details I can provide.
Hopefully, I'm not missing something obvious.
FYI, this was caused by changes in Lucene 3.4 to handle synonym files
more efficiently. I made some tweaks and sent a pull request:
As a side note, it'd be very nice to define a list of token filters to
apply to the synonym list to ensure the behavior is in sync with the
analysis that has occurred up to that point on the data.
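To illustrate the side note above, here is a minimal Python sketch of the idea. It is not how Elasticsearch or Lucene implement synonyms. The filter functions, the "a, b" rule format handling, and all names (`build_synonym_map`, `expand`, `FILTERS`) are hypothetical and only handle single-word synonyms. The point is that the synonym list is normalized with the same filter chain as the incoming text, so the two stay in sync.

```python
from typing import Callable, Dict, List, Set

# A token filter takes a list of tokens and returns a new list of tokens.
TokenFilter = Callable[[List[str]], List[str]]


def lowercase(tokens: List[str]) -> List[str]:
    return [t.lower() for t in tokens]


def strip_punctuation(tokens: List[str]) -> List[str]:
    # Drop non-alphanumeric characters and discard tokens that become empty.
    cleaned = ("".join(ch for ch in t if ch.isalnum()) for t in tokens)
    return [t for t in cleaned if t]


def apply_filters(tokens: List[str], filters: List[TokenFilter]) -> List[str]:
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


def build_synonym_map(rules: List[str], filters: List[TokenFilter]) -> Dict[str, Set[str]]:
    """Parse 'a, b' style rules into a symmetric map, normalizing each term
    with the given filters (hypothetical helper, single-word terms only)."""
    synonyms: Dict[str, Set[str]] = {}
    for rule in rules:
        terms = []
        for raw in rule.split(","):
            normalized = apply_filters(raw.split(), filters)
            if len(normalized) == 1:  # this sketch skips multi-word terms
                terms.append(normalized[0])
        for term in terms:
            synonyms.setdefault(term, set()).update(t for t in terms if t != term)
    return synonyms


def expand(text: str, synonyms: Dict[str, Set[str]], filters: List[TokenFilter]) -> List[str]:
    """Whitespace-tokenize, run the filters, then emit each token followed by its synonyms."""
    out: List[str] = []
    for token in apply_filters(text.split(), filters):
        out.append(token)
        out.extend(sorted(synonyms.get(token, ())))
    return out


FILTERS = [lowercase, strip_punctuation]
# The synonym rules go through the same filters as the text.
SYNONYMS = build_synonym_map(["Work, Business", "U.S., usa"], FILTERS)
# For comparison: the same rule loaded without any normalization.
RAW_SYNONYMS = build_synonym_map(["Work, Business"], [])
```

With the normalized map, `expand("Work in the U.S.", SYNONYMS, FILTERS)` matches both rules. With `RAW_SYNONYMS`, the lowercased token `work` never matches the rule key `Work`, which is the kind of mismatch the side note is about.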
Hi,
I'm not an expert here, but I took a quick look at the Lucene 3.4/Solr code, and
it seems to me that the synonym filter code has been ported from Solr to Lucene.
On Thu, Oct 13, 2011 at 3:07 PM, Robert Muir rcmuir@gmail.com wrote:
This is not true. This was rewritten from scratch.
Hi Robert,
thanks for the clarification!
So if I understand it correctly, Solr's SynonymFilterFactory delegates
either to SlowSynonymFilter (if using Lucene < 3.4) or to SynonymFilter (via
FSTSynonymFilterFactory), which is located in the Lucene module. And that Lucene
module was written from scratch, correct? Which means that the SynonymFilter
JavaDoc mentioned by @ppearcy is probably the best source of up-to-date
documentation about the synonym filter for now (speaking as an
Elasticsearch user).
Lukas
Exactly. Lucene didn't have a SynonymFilter before (except a very
limited single-word one in the wordnet package, which has been replaced
by a WordNet parser for this filter).
But Solr had a synonym filter before. There are a couple of corner
cases where it didn't make sense to try to emulate the old Solr
functionality exactly for backwards compatibility, so instead we did
this delegation trick so that Solr users can continue to use the exact
version they had before in those cases.
For Lucene users it's all new functionality, and the javadocs are really
the documentation for Lucene, since it's a library.