Understanding ELK Analyzer


(Cyril BOGNOU) #1

Hi all,
I am new to the ELK 5.1.1 stack and have a few questions, just for my understanding.
I have set up the stack with basically the standard analyzers and filters, and everything works great.
My data source is a MySQL backend that I index using Logstash.
I would like to handle queries containing accents, and I am hoping the asciifolding token filter can help achieve this.

First, I learned how to create a custom analyzer and save it as a template.
Right now, when I query the URL http://localhost:9200/_template?pretty I get 2 templates: the Logstash default template, named logstash, and my custom template, whose settings are:

"custom_template" : {
    "order" : 1,
    "template" : "doo*",
    "settings" : {
        "index" : {
            "analysis" : {
                "analyzer" : {
                    "myCustomAnalyzer" : {
                        "filter" : [
                            "standard",
                            "lowercase",
                            "asciifolding"
                        ],
                        "tokenizer" : "standard"
                    }
                }
            },
            "refresh_interval" : "5s"
        }
    },
    "mappings" : { },
    "aliases" : { }
}

Searching for the keyword Yaoundé returns 70 hits, but when I search for Yaounde I get no hits.
Below is my query for the second case:

{
    "query": {
        "query_string": {
            "query": "yaounde",
            "fields": [
                "title"
            ]
        }
    },
    "from": 0,
    "size": 10
}

Can somebody please help me figure out what I am doing wrong here?
Also, given that my data is analyzed by Logstash during indexing, do I really have to specify that the analyzer myCustomAnalyzer should be applied at search time, as in this second query?

{
    "query": {
        "query_string": {
            "query": "yaounde",
            "fields": [
                "title"
            ],
            "analyzer": "myCustomAnalyzer"
        }
    },
    "from": 0,
    "size": 10
}

Here is the output part of my Logstash config file:

output {
    stdout { codec => json_lines }
    if [type] == "announces" {
        elasticsearch {
            hosts => "localhost:9200"
            document_id => "%{job_id}"
            index => "dooone"
            document_type => "%{type}"
        }
    } else {
        elasticsearch {
            hosts => "localhost:9200"
            document_id => "%{uid}"
            index => "dootwo"
            document_type => "%{type}"
        }
    }
}

Thank you


(Cyril BOGNOU) #2

My problem is finally solved.
For anyone who runs into the same problem: the solution was to define the field settings in the mappings section of the template ("mappings" : {#here}), which I had left empty.
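
The final template was not posted, so what follows is only a rough sketch of what such a fix could look like. It reuses the analyzer definition from the first post and adds a mapping that assigns myCustomAnalyzer to the title field used in the queries. The _default_ mapping type and the "text" field type are assumptions made for illustration; the actual mapping may have named the document types and other fields explicitly.

```json
{
    "order": 1,
    "template": "doo*",
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "myCustomAnalyzer": {
                        "tokenizer": "standard",
                        "filter": ["standard", "lowercase", "asciifolding"]
                    }
                }
            },
            "refresh_interval": "5s"
        }
    },
    "mappings": {
        "_default_": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "myCustomAnalyzer"
                }
            }
        }
    }
}
```

Templates are applied only when an index is created, so the existing dooone and dootwo indices would need to be deleted and re-indexed from Logstash before the mapping takes effect. Since a field's analyzer is used at both index time and search time by default, the query should then no longer need its own "analyzer" parameter.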

