Synonym

hi all --

I'm trying to set up a synonym filter in ES. here's what I did so far:

  1. I created a file named elasticsearch.json (and deleted the default
    elasticsearch.yml) with this information:

    {
      "index" : {
        "analysis" : {
          "analyzer" : {
            "synonym" : {
              "tokenizer" : "whitespace",
              "filter" : [ "synonym" ]
            }
          },
          "filter" : {
            "synonym" : {
              "type" : "synonym",
              "synonyms_path" : "synonym.txt"
            }
          }
        }
      }
    }

  2. added a file named synonym.txt in the same folder, with information like:

    candle, candles
    candle holder, candle holders
    tea light, tea lights, tealights

  3. reindexed all data

so now I expect a (query-string) search for "red candle" and one for "red
candles" to return the same results, since "candle" and "candles" are
synonyms, but instead I get different results.

what am I doing wrong? and how can I tell which tokens were actually
indexed?

thanks,

Igal

--
Igal Sapir
Railo Core Developer
http://getRailo.org/

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

  1. don't remove the yml file. It contains node settings, not index settings.
    Use the create index REST API to create your index with your analyzer.
  2. if you want to manage plural forms, why not use a stemmer instead?
  3. use the analyze API to see how Elasticsearch breaks your strings into tokens.
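
To make points 1 and 3 concrete, here is a minimal sketch that only builds the two HTTP requests (it does not send them). The analysis block is copied from the elasticsearch.json in the original post. The base URL, the index name "products", and the helper names are hypothetical. The JSON body for _analyze follows the shape used by recent Elasticsearch releases; 2013-era releases took analyzer and text as query-string parameters instead, so check the docs for your version. Also note that a field only uses the analyzer if its mapping references it, and the mapping syntax is version-dependent, so it is left out here.

```python
import json

# Analysis settings copied from the elasticsearch.json shown in the thread.
# synonyms_path is resolved relative to the Elasticsearch config directory.
ANALYSIS = {
    "analyzer": {
        "synonym": {"tokenizer": "whitespace", "filter": ["synonym"]}
    },
    "filter": {
        "synonym": {"type": "synonym", "synonyms_path": "synonym.txt"}
    },
}


def create_index_request(base_url, index):
    """Return (method, url, body) for creating an index that carries the analyzer.

    base_url and index are placeholders chosen by the caller.
    """
    body = {"settings": {"index": {"analysis": ANALYSIS}}}
    return "PUT", f"{base_url}/{index}", json.dumps(body, indent=2)


def analyze_request(base_url, index, text):
    """Return (method, url, body) asking the index to analyze `text`.

    Assumption: JSON request body as accepted by recent versions; older
    versions expected ?analyzer=...&text=... on the URL instead.
    """
    body = {"analyzer": "synonym", "text": text}
    return "POST", f"{base_url}/{index}/_analyze", json.dumps(body)


if __name__ == "__main__":
    # Hypothetical node address and index name, for illustration only.
    for request in (
        create_index_request("http://localhost:9200", "products"),
        analyze_request("http://localhost:9200", "products", "red candles"),
    ):
        print(*request, sep="\n", end="\n\n")
```

Sending the second request for both "red candle" and "red candles" and comparing the returned tokens is a quick way to check whether the synonym filter is being applied at all.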

HTH

David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

In reply to the message from "Igal @ getRailo.org" igal@getrailo.org, sent 8 June 2013 at 03:33.


thanks for the reply.

so I can keep both elasticsearch.yml and elasticsearch.json and they will
both be read? in what order -- yml first? or are you suggesting that I
will not use elasticsearch.json at all? I personally don't like yml much
-- json seems clearer to me.

this is not only for plurals, so stemming will not work for me.
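
to show what I expect from the rules, here is a toy sketch (plain Python, not how Elasticsearch implements its synonym filter). It assumes the file contains only comma-separated equivalence groups, splits input on whitespace like the analyzer above, and only matches single-word input tokens; the function names are made up for illustration.

```python
def parse_synonym_lines(lines):
    """Parse comma-separated equivalence groups, skipping blanks and '#' comments."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        groups.append([term.strip() for term in line.split(",") if term.strip()])
    return groups


def expand_tokens(text, groups):
    """Map each whitespace-separated token to the sorted list of its equivalents.

    Toy limitation: multi-word entries such as "tea light" can appear in the
    output, but a multi-word phrase in the input is not recognized.
    """
    lookup = {}
    for group in groups:
        for term in group:
            if " " not in term:
                lookup.setdefault(term, set()).update(group)
    return [sorted(lookup.get(token, {token})) for token in text.split()]


groups = parse_synonym_lines([
    "candle, candles",
    "candle holder, candle holders",
    "tea light, tea lights, tealights",
])

# Both queries expand to the same token alternatives.
print(expand_tokens("red candle", groups))
print(expand_tokens("red candles", groups))
# A case a stemmer would not cover: one word versus two words.
print(expand_tokens("red tealights", groups))
```

with rules like these, "red candle" and "red candles" end up with the same alternatives, which is the behaviour I was hoping to get from the index.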

thanks again,

Igal

In reply to the message from David Pilato, sent Friday, June 7, 2013 8:40:05 PM UTC-7.
