How to set up synonyms in ES 5.6.2

Hi,

Can someone help me set up synonyms in ES 5.6.2?
I need to import data from Oracle into an ES index.

Currently, this is what I'm doing:

Step 1: Logstash pipeline, file logstash-ora-01.conf

input {
  jdbc {
    jdbc_validate_connection => true
    jdbc_connection_string => "jdbc:oracle:thin:@192.168.1.43:1521/PDBORCL"
    jdbc_user => "propadvisor"
    jdbc_password => "propadvisor"
    jdbc_driver_library => "ojdbc8.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    statement => "SELECT my_property_id, my_long_address_t from myproperty"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "pa_index22"
    document_type => "logsData22"
  }
}

Step 2: file synonyms.txt

taman, tman, tmn
jalan, jln, jlan
kampung, kg, kampg, kmpg, kampong
bukit, bkt
batu, bt
peti surat, pt
lorong, lrg, lorg

My query:

POST pa_index22/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "my_long_address_t": "tmn" }}
      ]
    }
  }
}

Result:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I don't know where to put the synonym configuration in my existing index.
Please help.

You need to add a synonym filter to the index's analysis settings. Here is a more general example of its usage:

        $params['body']['settings'] = [
          'analysis' => [
            'filter' => [
              'filter_synonym' => [
                'type' => 'synonym',
                'synonyms_path' => 'some_dir/synonym.txt',
                'tokenizer' => 'keyword',
                'expand' => true
              ]
            ]
          ]
        ];

The synonyms_path is resolved relative to the Elasticsearch config directory (/etc/elasticsearch here).

More on the synonym token filter.
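
A filter defined this way only takes effect once an analyzer references it. As a rough illustration, here is a minimal Python sketch that builds a settings body containing both pieces. The names `address_synonyms` and `address_with_synonyms` are made up for this example, and the `standard` tokenizer plus `lowercase` filter are assumptions, not something the thread prescribes.

```python
import json


def build_synonym_settings(synonyms_path,
                           filter_name="address_synonyms",        # hypothetical name
                           analyzer_name="address_with_synonyms"):  # hypothetical name
    """Return an index-settings body with a synonym filter and an analyzer that uses it."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    filter_name: {
                        "type": "synonym",
                        # Path is relative to the Elasticsearch config directory.
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    analyzer_name: {
                        # Assumed choices: standard tokenizer, lowercase before synonyms.
                        "tokenizer": "standard",
                        "filter": ["lowercase", filter_name],
                    }
                },
            }
        }
    }


body = build_synonym_settings("synonyms.txt")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body when creating an index. The analyzer still has to be used somewhere, either in a field mapping or named explicitly in a query.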

Hi, thanks for the reply.

I already followed these steps:

  1. POST /pa_index22/_close

  2. PUT /pa_index22/_settings
{
  "index": {
    "analysis": {
      "filter": {
        "synonym": {
          "type": "synonym",
          "synonyms_path": "synonyms.txt",
          "tokenizer": "whitespace"
        }
      },
      "analyzer": {
        "synonym": {
          "tokenizer": "whitespace",
          "filter": ["synonym"]
        }
      }
    }
  }
}

  3. POST /pa_index22/_open

But the configuration still does not take effect.
When I run this command, the mapping shows no sign of the analyzer:

GET /pa_index22/_mapping?pretty

{
  "pa_index22": {
    "mappings": {
      "logsData22": {
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "@version": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "my_long_address_t": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "my_property_id": {
            "type": "long"
          }
        }
      }
    }
  }
}

Please help me find what I'm doing wrong.
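
A note on reading this output: analysis settings and field mappings are stored separately. Defining an analyzer in the index settings does not attach it to any field, so GET _mapping only shows an `analyzer` entry on a field that declares one, while the analyzer definition itself appears under GET /pa_index22/_settings. The following Python sketch, with a hypothetical helper name `field_analyzers`, checks a mapping response for a declared analyzer. It runs on a trimmed copy of the response above, not against a live cluster.

```python
def field_analyzers(mapping_response, field):
    """Map 'index/type' to the analyzer declared on `field`, or None if none is declared."""
    found = {}
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            props = type_body.get("properties", {})
            if field in props:
                found[f"{index_name}/{type_name}"] = props[field].get("analyzer")
    return found


# Trimmed copy of the GET /pa_index22/_mapping response shown in the post.
mapping_response = {
    "pa_index22": {
        "mappings": {
            "logsData22": {
                "properties": {
                    "my_long_address_t": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                    "my_property_id": {"type": "long"},
                }
            }
        }
    }
}

result = field_analyzers(mapping_response, "my_long_address_t")
print(result)  # {'pa_index22/logsData22': None}
```

A value of None means the field falls back to the default analyzer, so the synonym analyzer is only applied if a query names it explicitly.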

Never mind, I figured it out myself.

synonyms.txt
#-----------------------------------------------------------------
taman, tmn, tamn, tman
jalan, jln, jlan
kampung, kg, kampg, kmpg, kampong, kmpg
bukit, bkt, bkt
batu, bt, bt
peti surat, pt, pt
lorong, lrg, lorg
#--------------------------------------------------------

  1. Close the current index: POST /pa_index/_close

  2. Update the index with these settings:

PUT /pa_index/_settings
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_graph": {
            "tokenizer": "standard",
            "filter": ["synonym_graph"]
          }
        },
        "filter": {
          "synonym_graph": {
            "type": "synonym_graph",
            "synonyms_path": "synonyms.txt"
          }
        }
      }
    }
  }
}

  3. Reopen the index: POST /pa_index/_open

  4. Verify whether the settings have taken effect:
    GET /pa_index/_analyze?analyzer=synonym_graph&text=lrg

Output:
{
  "tokens": [
    {
      "token": "lorong",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 0
    },
    {
      "token": "lorg",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 0
    },
    {
      "token": "lrg",
      "start_offset": 0,
      "end_offset": 3,
      "type": "",
      "position": 0
    }
  ]
}
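
To make this expansion easier to reason about, here is a simplified Python model of what the analyzer does with a single token. It is only a sketch: `load_synonym_groups` and `expand` are hypothetical helpers, each comma-separated line is treated as one group of equivalent terms, and multi-word entries such as "peti surat" are matched only as a whole string. The real Lucene filter works on token streams and is more involved.

```python
def load_synonym_groups(lines):
    """Parse comma-separated synonym lines, skipping blanks and '#' comments."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        groups.append([term.strip().lower() for term in line.split(",") if term.strip()])
    return groups


def expand(term, groups):
    """Return the term plus every synonym from any group that contains it."""
    term = term.lower()
    expanded = {term}
    for group in groups:
        if term in group:
            expanded.update(group)
    return expanded


# Same entries as the synonyms.txt used in this post.
SYNONYM_LINES = [
    "#-----------------",
    "taman, tmn, tamn, tman",
    "jalan, jln, jlan",
    "kampung, kg, kampg, kmpg, kampong, kmpg",
    "bukit, bkt, bkt",
    "batu, bt, bt",
    "peti surat, pt, pt",
    "lorong, lrg, lorg",
]

groups = load_synonym_groups(SYNONYM_LINES)
print(sorted(expand("lrg", groups)))      # ['lorg', 'lorong', 'lrg']
print(sorted(expand("meranti", groups)))  # ['meranti']
```

Duplicate entries in a line (for example "bkt, bkt") do not change the result in this model, since the expansion is collected into a set.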

  5. The analyzer now picks up the entries from synonyms.txt: analyzing lrg yields lrg, lorong, and lorg.

  6. We can also ask Elasticsearch to explain how the query is rewritten:
    GET /pa_index/_validate/query?explain
    {
      "query": {
        "match": {
          "text": {
            "query": "lrg",
            "analyzer": "synonym_graph"
          }
        }
      }
    }

Output:
{
  "valid": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "explanations": [
    {
      "index": "pa_index",
      "valid": true,
      "explanation": "Synonym(text:lorg text:lorong text:lrg)"
    }
  ]
}

  7. Finally, run the search query:
    POST pa_index/_search
    {
      "query": {
        "query_string": {
          "query": "lrg meranti",
          "analyzer": "synonym_graph",
          "fields": [ "my_long_address_t" ],
          "auto_generate_phrase_queries": true
        }
      }
    }

Output:
{
"took": 17,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 315873,
"max_score": 14.444332,
"hits": [
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9NhdcDQlb49cpwsCNW",
"_score": 14.444332,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:35:37.812Z",
"my_long_address_t": "4344C, LORONG MERANTI,LORONG MERANTI, LORONG MER",
"my_property_id": 1936
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9Ni631Qlb49cpws5ue",
"_score": 14.415924,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:42:00.514Z",
"my_long_address_t": "43343A, LORONG MERANTI,LORONG MERANTI, LORONG MERANTI",
"my_property_id": 1286
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9Niw7cQlb49cpws0JU",
"_score": 13.769295,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:41:19.805Z",
"my_long_address_t": "2344C,LORONG MERANTI, LORONG MERANTI",
"my_property_id": 1290
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9OOcgJQlb49cpw9WB1",
"_score": 13.769295,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T11:52:10.458Z",
"my_long_address_t": "54349,LORONG MERANTI, LORONG MERANTI",
"my_property_id": 21389
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9OOZh_Qlb49cpw9Vhk",
"_score": 13.769295,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T11:51:58.304Z",
"my_long_address_t": "27BB,LORONG MERANTI, LORONG MERANTI",
"my_property_id": 2125
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9NifSWQlb49cpwspvy",
"_score": 13.671318,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:40:07.527Z",
"my_long_address_t": "54449,LORONG MERANTI, LORONG MERANTI",
"my_property_id": 1079
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9Nh7x3Qlb49cpwsVh5",
"_score": 13.451371,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:37:42.088Z",
"my_long_address_t": "443C, LORONG MERANTI,LORONG MERANTI, LORONG MERANTI, 55200",
"my_property_id": 16652
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9NixqMQlb49cpws0cY",
"_score": 13.364031,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:41:22.796Z",
"my_long_address_t": "10A, LORONG MERANTI 2,LORONG MERANTI, LORONG MERANTI, 55200",
"my_property_id": 16702
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9OO-YNQlb49cpw9eZi",
"_score": 13.364031,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T11:54:29.214Z",
"my_long_address_t": "47CAC, LORONG MERANTI,LORONG MERANTI, LORONG MERANTI, 55200",
"my_property_id": 21001
}
},
{
"_index": "pa_index",
"_type": "logsData",
"_id": "AV9Nh_sRQlb49cpwsYGd",
"_score": 13.355483,
"_source": {
"@version": "1",
"@timestamp": "2017-10-24T08:37:58.114Z",
"my_long_address_t": "183 LORONG MERANTI,LORONG MERANTI, LORONG MERANTI, 554200",
"my_property_id": 13961
}
}
]
}
}
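
For anyone issuing this search from application code, the request body can be assembled programmatically. Below is a minimal Python sketch; the helper name `build_address_search` is hypothetical, and the defaults simply mirror the analyzer and field names used in this post. It only builds the JSON body and does not contact a cluster.

```python
import json


def build_address_search(text, analyzer="synonym_graph", field="my_long_address_t"):
    """Build a query_string search body that applies the synonym analyzer at search time."""
    return {
        "query": {
            "query_string": {
                "query": text,
                "analyzer": analyzer,
                "fields": [field],
                "auto_generate_phrase_queries": True,
            }
        }
    }


search_body = build_address_search("lrg meranti")
print(json.dumps(search_body, indent=2))
```

Keep in mind that query_string parses its input as query syntax, so raw user input containing operators or special characters may need escaping first.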
