Java API substring matching

(moyang) #1

Hello guys, I am new to Elasticsearch. I am trying to use the Java API to search and retrieve files.
The problem is that I cannot figure out a good way to do substring searching.

For example, if the file's content is:


then I can use the "matchQuery()" or "regexpQuery()" to search for "aaaa" or "bbbb" and get the target file.
However, if the file's content changes to:


I just cannot find it with a substring! Whether I use matchQuery, regexpQuery, or any other kind of query, I cannot get this file by searching for "aaaa", "bbbb", or "cccc".

I don't understand why this is happening, because if I use Kibana and search from the GUI, I get the expected result.

Could anyone help me out, please? I would really appreciate it. Thanks in advance.

(Dan Tuffery) #2

I assume you are using the default analyzer (the standard analyzer) to index the field. The standard analyzer does not split the term on full stop characters during analysis, so a value like aaaa.bbbb.cccc is indexed as a single term and a search for "aaaa" alone does not match it. You can test this using the Analyze API:

curl -XGET 'localhost:9200/{index_name}/_analyze?analyzer=standard&pretty=true' -d 'aaaa.bbbb.cccc'

The simple analyzer does split the term on the full stop character:

curl -XGET 'localhost:9200/{index_name}/_analyze?analyzer=simple&pretty=true' -d 'aaaa.bbbb.cccc'

In your index mapping, specify the simple analyzer for the field and re-index the document.
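
To make the difference between the two analyzers concrete without a running cluster, here is a minimal plain-Java sketch. It is not Elasticsearch's or Lucene's actual code: the class AnalyzerSketch and both method names are made up for illustration, and the "standard-like" method only approximates the standard analyzer for this particular input (it splits on whitespace and lowercases, which is enough to show that a full stop between letters does not end a token).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of any Elasticsearch API.
public class AnalyzerSketch {

    // Approximates the simple analyzer: every non-letter ends the current
    // token, and letters are lowercased.
    static List<String> simpleLikeTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Rough stand-in for the standard analyzer on inputs like the one in this
    // thread: split on whitespace and lowercase, so "aaaa.bbbb.cccc" stays whole.
    static List<String> standardLikeTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String content = "aaaa.bbbb.cccc";
        System.out.println(standardLikeTokens(content)); // [aaaa.bbbb.cccc]
        System.out.println(simpleLikeTokens(content));   // [aaaa, bbbb, cccc]
    }
}
```

A match query for "aaaa" looks for the term aaaa in the index. With the first tokenization the only indexed term is aaaa.bbbb.cccc, so nothing matches; with the second, aaaa is a term of its own and the document is found.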
