Using the Elasticsearch synonym filter to retrieve rows matching a defined synonym rule


(Yang Beaty) #1

Hi,

I am looking into Elasticsearch for a feasibility study. I would like to
see whether Elasticsearch's synonym filter can help pick up rows of data
that share similar wording in a column.

I have implemented a simple prototype:

  1. I downloaded accounts.json from elasticsearch.org.
  2. I placed synonym.txt in the config directory under the Elasticsearch
    install, with a single line of content as follows:
    Dona=>Dale
  3. I modified an online example from "Rafat Kuc-3" as follows:
    curl -XPOST 'localhost:9200/bank/account' -d '
    {
      "settings": {
        "index" : {
          "analysis" : {
            "analyzer" : {
              "synonym" : {
                "tokenizer" : "whitespace",
                "filter" : ["synonym"]
              }
            },
            "filter" : {
              "synonym" : {
                "type" : "synonym",
                "synonyms_path" : "synonym.txt"
              }
            }
          }
        }
      },
      "mappings" : {
        "account" : {
          "properties" : {
            "firstname" : { "type" : "string", "index" : "analyzed", "analyzer" : "synonym" }
          }
        }
      }
    }'

This was successful.

  4. I loaded the accounts data with:
    curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json

This was also successful.

The problem I have is this: I used elasticsearch/_plugin/head to run a
query string search with account.firstname equal to "Dale", hoping to get
back two rows, one with firstname "Dale" and one with "Dona". But I only
get one row back. Can someone tell me whether I am missing anything needed
for the synonym filter to work correctly?

I also see that the head plugin has a query type called "fuzzy", which
returns rows with "Hale" and "Dale". Where is that rule defined, and how
can I insert my own rule?
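
To make the expectation concrete, here is a minimal Python sketch of what I assume the rule Dona=>Dale should do when combined with the whitespace tokenizer. It is not Elasticsearch code and does not talk to a cluster; `parse_rules` and `analyze` are hypothetical helpers written only for illustration, and the three sample first names stand in for rows of accounts.json.

```python
# Illustrative simulation only: mimics an explicit "a => b" synonym mapping
# applied after whitespace tokenization. Not Elasticsearch's implementation.

def parse_rules(lines):
    """Parse explicit 'a, b => c' mappings into {source: [targets]}."""
    rules = {}
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and lines that are not explicit mappings.
        if not line or line.startswith("#") or "=>" not in line:
            continue
        left, right = line.split("=>", 1)
        targets = [t.strip() for t in right.split(",") if t.strip()]
        for source in left.split(","):
            source = source.strip()
            if source:
                rules[source] = targets
    return rules


def analyze(text, rules):
    """Split on whitespace, then replace each token that has a mapping."""
    tokens = []
    for token in text.split():
        tokens.extend(rules.get(token, [token]))
    return tokens


# The single line from synonym.txt.
rules = parse_rules(["Dona=>Dale"])

# Hypothetical sample of firstname values, analyzed at "index time".
index = {name: analyze(name, rules) for name in ["Dale", "Dona", "Hale"]}

# The query text goes through the same analysis.
query_tokens = analyze("Dale", rules)
matches = [name for name, toks in index.items() if set(toks) & set(query_tokens)]
print(matches)  # ['Dale', 'Dona']
```

In this sketch "Dona" is rewritten to the token "Dale" when it is analyzed, so a search for "Dale" matches both rows, while "Hale" is left untouched. That is the result I was hoping to see from the real index. Note that the matching is case-sensitive, since the analyzer above has no lowercase filter.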


