Hello everyone,
I need some help.
I am new to Elasticsearch. I have installed ES 5.1.1 and it works great. I use Logstash 5.1.1 to index data from a MySQL database into two ES indices.
My users are mostly francophone and I have an issue with queries containing accented characters.
I have read quite a bit and found that the asciifolding token filter could help.
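If I understand correctly, asciifolding replaces accented characters with their ASCII equivalents, so that for example "métier" would be indexed as "metier". I think (though I am not 100% sure of the exact 5.x syntax) this can be tested with the _analyze API, something like this (8270 is the HTTP port of my node):

curl -XPOST 'http://localhost:8270/_analyze?pretty' -H 'Content-Type: application/json' -d '
{
  "tokenizer": "standard",
  "filter": ["lowercase", "asciifolding"],
  "text": "Métier développeur"
}'

and I would expect to get the tokens "metier" and "developpeur" back.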
I learned how to create an analyzer using a JSON template file. Here is the content of the job_template.json file that I want to apply:
{
  "template": "job_template",
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "myCustomAnalyzer": {
            "tokenizer": "standard",
            "filter": ["standard", "lowercase", "asciifolding"]
          }
        }
      }
    }
  }
}
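From what I have read, I think the template has to be registered in Elasticsearch before the index is created, maybe with something like this (I am not sure the "template" value "job_template" is the right pattern to match my index name):

curl -XPUT 'http://localhost:8270/_template/job_template' -H 'Content-Type: application/json' -d @job_template.json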
But I am a bit confused about how to actually apply this.
My questions are:
- When should an analyzer be applied in ES: before or after indexing?
- If it has to be in place before indexing, how can I use this configuration from my Logstash config file, so that it is applied when I start Logstash with logstash -f conf-logstash.conf? (My guess is sketched just below.)
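Looking at the documentation of the Logstash elasticsearch output plugin, I think the manage_template, template and template_name options might let Logstash push the template itself. My guess for the output section would be something like this (the path to job_template.json is just a placeholder):

elasticsearch {
  hosts => "localhost:8270"
  index => "test-index"
  document_type => "users"
  document_id => "%{uid}"
  manage_template => true
  template => "/path/to/job_template.json"
  template_name => "job_template"
  template_overwrite => true
}

Is that the right approach, or does the template have to be created directly in Elasticsearch first?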
For reference, here is the current content of my conf-logstash.conf:
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/appp-dev"
    jdbc_user => "root"
    jdbc_password => "cyrilb"
    # Cron-style schedule (here: every minute). More details: http://www.thegeekstuff.com/2011/07/cron-every-5-minutes/
    schedule => "* * * * *"
    # Pagination: Logstash loads the data in pages of jdbc_page_size rows
    jdbc_paging_enabled => true
    jdbc_page_size => "5000"
    # Set to true to reload the whole database on every Logstash run. Very useful after emptying Elasticsearch
    clean_run => true
    statement => "SELECT uid, creation_date, phone, dial_code, email, username, first_name, last_name, gender, title, country, language from doopins WHERE update_date > :sql_last_value"
  }
}
filter {
  #...
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:8270"
    document_id => "%{uid}"
    index => "test-index"
    document_type => "users"
  }
}