How to make a field analyzed without splitting it?

Hi Experts,

I have one field, name=vikas_gopal. I am making the message field analyzed (the Elasticsearch default analyzed field), but the problem is that when I show it in a table it looks like this:
message       count
abc           1
bcd           1
asd           1
com           1
name          1
vikas         1
gopal         1

I know I can use the .raw field, but I do not want to do that. Any idea how I can achieve this? I mean the field should be analyzed, but it should not split on special characters like - or . — for example:

message       count
address       1
vikas_gopal   1


You can use a custom analyzer to split the data the way you want. But it seems you have key-value data, so why don't you use a Logstash filter to split it before sending it to ES?
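For the key-value approach, a minimal Logstash kv filter sketch might look like the following; the field names and split characters here are assumptions based on the `name=vikas_gopal` sample and may need adjusting for the real data:

```
filter {
  kv {
    # parse key=value pairs out of the message field
    source      => "message"
    # pairs are separated by spaces, keys from values by "="
    field_split => " "
    value_split => "="
  }
}
```

With input like `name=vikas_gopal`, this would produce a separate `name` field with the value `vikas_gopal`, so the value never goes through the analyzer at all.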

Thanks @anhlqn

Sorry, I was not clear with my query. Yes, I am aware that I can use the kv filter, but what I want is to check which word or string repeats itself, so that I can replace it. Something like: if a word or string is repeated more than 20 times, I'll replace it with a shorter string. I thought that if I made the message field an analyzed field and visualized it in tabular form, I would get the words with their respective counts. I did achieve that, but the only problem is that it splits whenever it finds a special character.

How can I force Elasticsearch not to split words on special characters? Splitting on spaces is fine.

Is there a specific custom analyzer that can solve my problem?


Just create a custom analyzer and choose the tokenizer, char filter, and token filters that match your purpose.
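Since you want to split on spaces only, one option is a custom analyzer built on the whitespace tokenizer, which does not break on `_`, `-`, or `.`. A minimal sketch of the index settings (the index name `my_index` and analyzer name `whitespace_lowercase` are placeholders, and the exact mapping syntax depends on your ES version):

```
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_lowercase": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

You would then set `"analyzer": "whitespace_lowercase"` on the message field in the mapping. With this analyzer, a value like `vikas_gopal` should come out as a single term instead of `vikas` and `gopal`.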

Thanks this helps.

You may be interested in the keyword tokenizer/analyzer (depending on your version).
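You can try the keyword tokenizer quickly with the `_analyze` API before changing any mappings; it emits the entire input as one token. A sketch (request format varies slightly between versions):

```
POST _analyze
{
  "tokenizer": "keyword",
  "text": "name=vikas_gopal"
}
```

Note that with the keyword tokenizer the whole field value becomes a single term, including spaces, so if you still want space-separated words the whitespace tokenizer is the better fit.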