And the question is exactly: "what do you want to achieve?"
Sounds like you need an analyzer that does not break your text into terms but keeps it as is.
Maybe you should use a keyword tokenizer, or not analyze that field at all.
On a side note, you should avoid using wildcards in queries. When building your index, prefer ngram filters over query-time wildcards.
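For instance, mapping the field as not analyzed stores the whole value, asterisk included, as a single term. A rough sketch (the type name "mytype" is hypothetical; "c101" is the field from the question below):

```json
{
    "mappings" : {
        "mytype" : {
            "properties" : {
                "c101" : { "type" : "string", "index" : "not_analyzed" }
            }
        }
    }
}
```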
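Something along these lines when creating the index (a sketch; the names "my_ngram" and "ngram_analyzer" and the gram sizes are placeholders to adjust):

```json
{
    "settings" : {
        "analysis" : {
            "filter" : {
                "my_ngram" : { "type" : "nGram", "min_gram" : 3, "max_gram" : 10 }
            },
            "analyzer" : {
                "ngram_analyzer" : {
                    "type" : "custom",
                    "tokenizer" : "standard",
                    "filter" : [ "lowercase", "my_ngram" ]
                }
            }
        }
    }
}
```

If "c101" is indexed with "ngram_analyzer", plain match queries can then find substrings without any wildcard.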
I want to index and search data which may contain an asterisk (*). I read
somewhere that we can search for special characters by including the text
in escaped quotes like ""mytext"".
So I have two questions:
What kind of analyzers/tokenizers should I use so that the indexed data
includes the asterisk or any other special character?
How should I search for data containing an asterisk (*) or some other
special character?
Right now I am trying to fire a query, assuming the data contains an
asterisk (*), as follows:
{"query" : {
    "query_string" : {
        "default_field" : "c101",
        "query" : "\"mail.google.*com\""
    }
}}
This does not return a hit as expected.
But if I fire ""mail.google.com*"", it returns a hit.
I guess the search data is getting tokenized around the asterisk. That
means if I query for:
"query" : ""mail.google.*com""
it may be breaking it into two tokens, 'mail.google.' and 'com', and it
searches for both tokens in 'mail.google.com' and fails.
But when I query ""mail.google.com*"", it may be keeping it as one
token, 'mail.google.com', and it matches a hit.
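For what it's worth, if the field is left unanalyzed as suggested above, an exact term query should match the stored value literally, asterisk and all. A sketch (assuming "c101" is mapped as not_analyzed):

```json
{
    "query" : {
        "term" : {
            "c101" : "mail.google.*com"
        }
    }
}
```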