How to define a custom analyzer that emits one term per letter


(Ngọc Phạm) #1

Hi. I'm a newbie with Elasticsearch and I have a question. Please help me :slightly_smiling:

I'm trying to load data from a Postgres database into Elasticsearch. Before the data goes into the Elasticsearch server, I want a space inserted between each letter.

Example: I have three fields like this:
id    code    name
1     10      John
2     19      Lina

I want spaces between the letters of the values in the name field, so that John --> J o h n and Lina --> L i n a.
Does anybody have an idea how to do this with an Elasticsearch analyzer?
I'm using Elasticsearch 1.7.3.
Thanks for your help :x :x


(David Pilato) #2

I'm not sure I understand what you want to do.

If you want the change applied before indexing, so that _source reflects it, you have to do it before the data reaches Elasticsearch, which means in your client, or in Logstash if you are using Logstash.
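For the client-side route, a minimal sketch in Python could look like the following. It assumes the rows have already been read from Postgres as dictionaries; the helper names `space_letters` and `prepare_row` are hypothetical and only illustrate the idea of rewriting the field before the document is sent.

```python
def space_letters(value):
    """Insert a single space between every character of a string."""
    # str.join iterates over the string character by character.
    return " ".join(value)


def prepare_row(row, field="name"):
    """Return a copy of the row with the chosen field spaced out.

    `row` is assumed to be a dict built from one database record.
    """
    prepared = dict(row)
    prepared[field] = space_letters(str(row[field]))
    return prepared


if __name__ == "__main__":
    rows = [
        {"id": 1, "code": 10, "name": "John"},
        {"id": 2, "code": 19, "name": "Lina"},
    ]
    for row in rows:
        print(prepare_row(row))
```

Because the value is changed before indexing, the spaced text is what ends up stored in _source.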

If instead you want a field value such as john to become the terms j, o, h, n at index time, look at the ngram tokenizer and set both min_gram and max_gram to 1.
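As a rough sketch of that second option, the snippet below builds an index-creation body with a one-letter ngram tokenizer and prints it as JSON. The names `single_letter`, `letters`, and the `person` mapping type are assumptions made for illustration, and the `lowercase` filter and `token_chars` setting are optional extras, not something required by the approach. The mapping uses the 1.x-style `string` field type.

```python
import json


def build_index_body(min_gram=1, max_gram=1):
    """Build an index body whose analyzer emits one term per letter."""
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    # Hypothetical tokenizer name; min_gram = max_gram = 1
                    # yields single-character terms.
                    "single_letter": {
                        "type": "ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                        # Optional: keep letters and digits only, so
                        # whitespace is not emitted as a term.
                        "token_chars": ["letter", "digit"],
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name.
                    "letters": {
                        "type": "custom",
                        "tokenizer": "single_letter",
                        "filter": ["lowercase"],
                    }
                },
            }
        },
        "mappings": {
            # Hypothetical mapping type and field layout.
            "person": {
                "properties": {
                    "name": {"type": "string", "analyzer": "letters"}
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
```

The printed JSON can be sent as the request body when creating the index. With this approach _source still holds the original value (John); only the indexed terms are split into letters.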

Not sure that I understood the use case though.


(Ngọc Phạm) #3

Thanks for your help. I will try the ngram tokenizer.

