How to apply an Analyzer to many Indices?


(Cyril BOGNOU) #1

Hello everyone,
I need help.
I am new to Elasticsearch. I have installed ES 5.1.1 and it works great. I use Logstash 5.1.1 to index data from a MySQL database into 2 ES indices.
My users are mostly francophone, and I have an issue with queries containing accents.
I read many books and found that the asciifolding filter could help.

I learned how to create an analyzer using a JSON template file. Here is the content of the job_template.json file that I want to apply:
{
  "template": "job_template",
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "myCustomAnalyzer": {
            "tokenizer": "standard",
            "filter": ["standard", "lowercase", "asciifolding"]
          }
        }
      }
    }
  }
}

But I am a bit confused about how to achieve this.
My questions are:

  1. When should an analyzer be applied in ES (before or after indexing)?
  2. If it should be before indexing, how can I use that configuration in my Logstash config file when I start Logstash with the command logstash -f conf-logstash.conf?

The following is the content of the file conf-logstash.conf:
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/appp-dev"
    jdbc_user => "root"
    jdbc_password => "cyrilb"

    schedule => "* * * * *" # At the 5th minute of every hour (as written, "* * * * *" actually runs every minute). More details here: http://www.thegeekstuff.com/2011/07/cron-every-5-minutes/
    # Pagination. Logstash loads the data page by page (the page size is set by jdbc_page_size below)
    jdbc_paging_enabled => true
    jdbc_page_size => "5000"
    # Set this value to true to reload the whole database every time Logstash starts. Very useful after emptying Elasticsearch
    clean_run => true
    statement => "SELECT uid, creation_date, phone, dial_code, email, username, first_name, last_name, gender, title, country, language from doopins WHERE update_date > :sql_last_value"
  }
}

filter {
  #...
}

output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:8270"
    document_id => "%{uid}"
    index => "test-index"
    document_type => "users"
  }
}


(Mark Walkom) #2

Before.

You can set a template that LS reads - https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template - and then put your template path in there.

Or upload the template to ES prior to starting LS.
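
For the second option, a minimal Python sketch of the upload could look like the following. It assumes the analyzer definition from the first post, the host and port from the Logstash output (localhost:8270), a hypothetical template name job_template, and an index pattern equal to the index name used in the Logstash output (test-index). It only builds the PUT request for the _template endpoint; the commented-out last line is what would actually send it.

```python
import json
import urllib.request

# Assumption: Elasticsearch listens where the Logstash output points.
ES_URL = "http://localhost:8270"

# Analyzer definition from the first post. The "template" pattern is set to
# the index name from the Logstash output (an assumption for this sketch).
template_body = {
    "template": "test-index",
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "myCustomAnalyzer": {
                        "tokenizer": "standard",
                        "filter": ["standard", "lowercase", "asciifolding"],
                    }
                }
            }
        }
    },
}


def build_template_request(es_url, template_name, body):
    """Build (but do not send) a PUT request for the index template API."""
    return urllib.request.Request(
        url=f"{es_url}/_template/{template_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "job_template" is a hypothetical name under which the template is stored.
request = build_template_request(ES_URL, "job_template", template_body)
print(request.get_method(), request.full_url)

# To actually upload the template before starting Logstash:
# urllib.request.urlopen(request)
```

The template has to be stored before Logstash creates the index, because templates are only applied at index creation time.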


(Cyril BOGNOU) #3

Thank you for the reply.

I have already tried both solutions.
I used PHP curl to post the template, as you can see in these images.


I also set it via LS in the template file before indexing; see the following image.

If I search for the keyword secrétaire I get my records, but when I try the keyword secretaire I get no results.
So my other question is: should I specify the analyzer when querying ES (I use query_string, for instance), or is the analyzer applied by default since the template already exists in ES?
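
On the question above: by default Elasticsearch analyzes the query text with the analyzer of the field being searched, unless the query names another one, so what matters most is that the documents themselves were indexed with the folding analyzer. The Python sketch below is only a rough simulation of that idea, not how Elasticsearch is implemented. The helpers fold, analyze and matches are hypothetical, the sample field value is made up, and asciifolding is approximated with unicodedata.

```python
import unicodedata


def fold(token):
    """Rough stand-in for asciifolding: drop combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text, folding):
    """Hypothetical analyzer: whitespace split + lowercase, optional folding."""
    tokens = text.lower().split()
    return [fold(token) for token in tokens] if folding else tokens


def matches(indexed_terms, query, folding):
    """True if any analyzed query term is among the indexed terms."""
    return any(term in indexed_terms for term in analyze(query, folding))


document = "Secrétaire de direction"  # made-up field value

index_plain = set(analyze(document, folding=False))   # indexed without folding
index_folded = set(analyze(document, folding=True))   # indexed with folding

print(matches(index_plain, "secrétaire", folding=False))  # True
print(matches(index_plain, "secretaire", folding=True))   # False
print(matches(index_folded, "secretaire", folding=True))  # True
print(matches(index_folded, "secrétaire", folding=True))  # True
```

The second line is the situation described above: folding only the query cannot match terms that were stored with their accents, so the index has to be built with the folding analyzer first.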

What am I missing here? I have been stuck for more than 5 days just because of this issue :confused:

Thanks for your help.


(Mark Walkom) #4

Please don't post pictures of text, they are difficult to read and some people may not even be able to see them :slight_smile:

Looking at the _template request in your browser, you need to fix the "template": "doojob_template" value so that it matches the name of the index, i.e. doojob.
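
To illustrate the point about the pattern: in Elasticsearch 5.x the "template" field is an index name pattern that is compared with the name of each newly created index, and * acts as a wildcard, which is also how a single template can cover several indices. The sketch below approximates that matching with Python's fnmatch. This is an assumption for illustration only, since Elasticsearch uses its own matcher, and doojob-users is a made-up index name.

```python
from fnmatch import fnmatchcase


def template_applies(pattern, index_name):
    """Approximate check: does the template pattern match the index name?"""
    return fnmatchcase(index_name, pattern)


print(template_applies("doojob_template", "doojob"))  # False
print(template_applies("doojob", "doojob"))           # True
print(template_applies("doojob*", "doojob"))          # True
print(template_applies("doojob*", "doojob-users"))    # True
```

A pattern such as doojob* would therefore be one way to apply the same analyzer settings to every index whose name starts with doojob.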


(Cyril BOGNOU) #5

Still no change even after this update :confused:
Here is my new template
{
  "template": "doojob",
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "myCustomAnalyzer": {
            "tokenizer": "standard",
            "filter": ["standard", "lowercase", "asciifolding"]
          }
        }
      }
    }
  }
}
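
Two things may be worth checking at this point. A template is only applied when an index is created, so an index that already exists has to be deleted and re-indexed before it picks up new settings. Also, the template above only defines myCustomAnalyzer; nothing in it assigns the analyzer to a field, so text fields would still be analyzed with the standard analyzer. The Python sketch below builds a variant of the template that adds a mapping. The users type comes from document_type in the Logstash output, while the choice of first_name and last_name as the fields to fold is purely an assumption, and build_template is a hypothetical helper.

```python
import json


def build_template(index_pattern, doc_type, folded_fields):
    """Return a 5.x-style index template that defines the folding analyzer
    and assigns it to the given text fields."""
    analyzer_name = "myCustomAnalyzer"
    return {
        "template": index_pattern,
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        analyzer_name: {
                            "tokenizer": "standard",
                            "filter": ["standard", "lowercase", "asciifolding"],
                        }
                    }
                }
            }
        },
        # Assumption: these fields should be searchable without accents.
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "text", "analyzer": analyzer_name}
                    for field in folded_fields
                }
            }
        },
    }


template = build_template("doojob", "users", ["first_name", "last_name"])
print(json.dumps(template, indent=2))
```

Alternatively, Elasticsearch treats an analyzer named default as the index-wide default for fields that do not specify one, so renaming myCustomAnalyzer to default may be a simpler route than listing fields one by one.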

