This tokenizes input strings correctly, i.e. java.io.File becomes (java, java.io, java.io.File), and I expected to be able to search for java.io and get back e.g. java.io.File, java.io.Reader, and java.io.Writer. However, I’ve realized that when a query string includes a field that uses the java_classname_analyzer, e.g. the query
class:java.io
in Kibana I get many more hits than I asked for since the search term itself is tokenized into (java, java.io) and I’m actually getting hits for everything matching java.*.
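To make the over-matching concrete, here is a rough Python simulation of what seems to be happening. It is only a sketch: it assumes the analyzer behaves like a path-hierarchy tokenizer with "." as the delimiter and that the query tokens are ORed together. The function names and the sample class list are made up for illustration and are not part of Elasticsearch.

```python
def path_hierarchy_tokens(text, delimiter="."):
    """Emit every prefix of a dotted name: java.io.File -> java, java.io, java.io.File."""
    parts = text.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


def matches(indexed_value, query, analyze_query):
    """Return True if any query token equals one of the indexed tokens (OR semantics)."""
    index_tokens = set(path_hierarchy_tokens(indexed_value))
    # With analyze_query=False the query is kept as one token, like a keyword analyzer.
    query_tokens = path_hierarchy_tokens(query) if analyze_query else [query]
    return any(token in index_tokens for token in query_tokens)


# Hypothetical sample data, not taken from the real index.
classes = ["java.io.File", "java.io.Reader", "java.util.List", "javax.swing.JFrame"]

# Query analyzed with the same tokenizer: "java" alone is enough to match.
hits_analyzed = [c for c in classes if matches(c, "java.io", analyze_query=True)]

# Query kept as a single token: only classes under java.io match.
hits_keyword = [c for c in classes if matches(c, "java.io", analyze_query=False)]

print(hits_analyzed)
print(hits_keyword)
```

With the query analyzed, java.util.List is returned as well because it shares the java token. With the query left as a single token, only the java.io classes come back, which is the behaviour I was after.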
Is there a way to avoid this? With the query DSL I guess I could use a term query rather than e.g. a match query, but since the use case is Kibana it needs to be a query string.
Thanks @jpountz, that looks exactly like what I'm looking for (could've sworn I looked at that documentation page the other day). However, I don't get the result I'm looking for. I updated my index template so that the logstash-2015.05.27 index uses the keyword analyzer for a number of fields and verified that the actual mapping of the index looks okay:
Is this by any chance because ES doesn't analyze the query for each index being searched but in this case uses the analyzer specified in the mappings of the dozens of other indexes that don't use the keyword analyzer for that field?
Hmm, this looks like a bug! What does your mapping template look like? Did you actually modify the search_analyzer and not the search_quote_analyzer?
Regarding your other question, Elasticsearch actually analyzes the query string per shard, so your change to new indices should work on these new indices.
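For anyone following along, the kind of template under discussion might look roughly like the sketch below, written as a Python dict so it can be dumped to JSON. This is not the poster's actual template: the "logstash-*" pattern, the _default_ type, the tokenizer name, and the choice of a path_hierarchy tokenizer are all assumptions, and the 1.x-style "index_analyzer" key would become "analyzer" on 2.x.

```python
import json

# Hypothetical index template body; only the field name "class", the analyzer name
# "java_classname_analyzer" and the keyword search analyzer come from the thread.
template = {
    "template": "logstash-*",  # assumed index pattern
    "settings": {
        "analysis": {
            "analyzer": {
                "java_classname_analyzer": {
                    "type": "custom",
                    "tokenizer": "java_classname_tokenizer",  # hypothetical name
                }
            },
            "tokenizer": {
                "java_classname_tokenizer": {
                    "type": "path_hierarchy",  # assumed from the token output described
                    "delimiter": ".",
                }
            },
        }
    },
    "mappings": {
        "_default_": {
            "properties": {
                "class": {
                    "type": "string",
                    # Index time: split into java, java.io, java.io.File.
                    "index_analyzer": "java_classname_analyzer",
                    # Search time: keep the query term as one token.
                    "search_analyzer": "keyword",
                }
            }
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

The point of the split is that prefixes are generated only at index time, so a query for java.io is compared as a single term against the indexed prefixes.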
We have the same problem with ES 1.7.0. We specify search_analyzer in the mapping template but the actual mapping has search_quote_analyzer and it does not work as expected.
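A quick way to spot affected fields is to walk the "properties" section of the mapping that Elasticsearch reports back and flag anything that has search_quote_analyzer but no search_analyzer. The helper below is a hypothetical sketch, not an Elasticsearch API, and the sample mapping is invented to mimic the symptom described here.

```python
def find_misplaced_search_analyzers(properties, prefix=""):
    """Return dotted field paths that have search_quote_analyzer but no search_analyzer."""
    suspicious = []
    for name, field in properties.items():
        path = prefix + name
        if "search_quote_analyzer" in field and "search_analyzer" not in field:
            suspicious.append(path)
        # Recurse into object fields that carry their own properties.
        if "properties" in field:
            suspicious.extend(
                find_misplaced_search_analyzers(field["properties"], path + ".")
            )
    return suspicious


# Invented example of what the stored mapping might look like when the bug hits.
reported_properties = {
    "class": {
        "type": "string",
        "index_analyzer": "java_classname_analyzer",
        "search_quote_analyzer": "keyword",
    },
    "message": {"type": "string"},
    "source": {
        "properties": {
            "logger": {
                "type": "string",
                "index_analyzer": "java_classname_analyzer",
                "search_analyzer": "keyword",
            }
        }
    },
}

flagged = find_misplaced_search_analyzers(reported_properties)
print(flagged)
```

Only the class field is flagged here; source.logger is left alone because its search_analyzer survived.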
I just saw the same problem in 2.1. In the process of migrating from 1.5.2 to 2.1, I used the same mapping, and in 2.1 my search_analyzer gets applied as search_quote_analyzer.
Actually, I think it is a 1.5.2 bug, probably fixed in a later version.
So, in the process of migrating, I changed all my mappings to no longer use "index_analyzer" and just use "analyzer", so all my fields have "search_analyzer" and "analyzer". However, when I do that on 1.5.2, my index mapping ends up like this: