How to use TermQueryBuilder to search for a Chinese word?

KennFalcon · August 29, 2015, 11:53pm

I indexed a Chinese word in Elasticsearch, but when I search for it using TermQueryBuilder in the Java API, I can't find the data. How should I do this?

Igor_Motov · September 1, 2015, 4:32pm

Unless you configured a different analyzer, your Chinese words are split into tokens, one character per token. The TermQueryBuilder doesn't analyze the query that you pass to it; it tries to find a token that matches your query exactly as given. So, for example, if your text contains "黄山", it will be indexed as two terms, "黄" and "山". If you pass "黄山" to the term query builder, it will search for a term "黄山", which doesn't exist in your index. In other words, your search should use the same analyzer that you used during indexing. Try replacing TermQueryBuilder with MatchQueryBuilder, which analyzes the query text before searching and so gives more reasonable results. However, if you are dealing only with Chinese text, you will get much better results from a dedicated Chinese analyzer.
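
To make the difference concrete, here is a minimal plain-Java sketch of the behaviour described above. It does not call Elasticsearch: the class `TermVsMatchSketch` and its methods are hypothetical names, and the one-token-per-character `analyze` method is only a toy stand-in for the default analysis of Chinese text.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: not the Elasticsearch API, just the lookup logic it implies.
public class TermVsMatchSketch {

    // Assumed stand-in for default analysis of Chinese text:
    // emit one token per non-whitespace character.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // "Indexing" a document stores the analyzed tokens, not the original string.
    public static Set<String> index(String text) {
        return new HashSet<>(analyze(text));
    }

    // Term-style lookup: the query string is used as-is, with no analysis.
    public static boolean termQuery(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    // Match-style lookup: the query is analyzed first, then each token is looked up.
    // Here a hit on any token counts as a match.
    public static boolean matchQuery(Set<String> indexedTerms, String query) {
        for (String token : analyze(query)) {
            if (indexedTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> terms = index("黄山");
        System.out.println(termQuery(terms, "黄山"));  // false: no single term "黄山" was indexed
        System.out.println(termQuery(terms, "黄"));    // true: "黄" is one of the indexed terms
        System.out.println(matchQuery(terms, "黄山")); // true: the query is split into "黄" and "山"
    }
}
```

In the real Java API the equivalent switch would be from `QueryBuilders.termQuery(field, "黄山")` to `QueryBuilders.matchQuery(field, "黄山")`, where `field` stands for whatever field name you indexed the text into.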