I used the whitespace tokenizer with a lowercase filter, but it turned out to take longer than the standard analyzer. However, I need to split sentences only on whitespace, and I also need lowercasing. The standard analyzer would be right for that, except that it also removes special characters.
I need to search for data containing special characters.
The standard analyzer has a stop words option, but I think that just removes terms that match the stop list.
I hope you can help me with this.
Thank you in advance!
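For reference, the custom analyzer I'm describing is roughly this (the index name my_index and analyzer name lowercase_whitespace are just placeholders):

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_whitespace": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

This splits only on whitespace (so special characters stay in the tokens) and lowercases the result, which is the behavior I need; the problem is just the performance.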
I use the default standard analyzer with about 1 TB per index. When I search for data from the last five minutes, it takes about a minute. But when I switched to the whitespace analyzer, it took much longer (roughly 20 minutes, I think), so I had to revert.