I created an index "some" and a type "type" with default settings,
analyzers, and mapping.
I have indexed the following two docs:
{"annotationText": "here are for you"}
{"annotationText": "come as you are"}
I issue the following search:
{
  "query": {
    "query_string": {
      "query" : ""
    }
  },
  "highlight": {
    "fields": {
      "annotationText": {}
    }
  }
}
I get the following results:
{
  "_source": {
    "annotationText": "here are for you"
  },
  "highlight": {
    "annotationText": [
      "here are <some> for you"
    ]
  }
},
{
  "_source": {
    "annotationText": "come as you are"
  },
  "highlight": {
    "annotationText": [
      "come as you are"
    ]
  }
}
As you can see, the search for "" returned completely irrelevant results.
Escaping < and > does not help; it produces an error about an
unrecognised escape character.
What is happening here? How can I issue a search query that contains < or >
and get sane results?
The < sign at the beginning is interpreted by Lucene as a range operator, so
the query becomes an open-ended range query [:some] on the field _all. I didn't
spend much time looking at the code, but at first glance I don't see a way to
escape it. To be honest, I think you will be much better off
with Elasticsearch's match query: it treats the input as text and
doesn't parse it for operators. This works for me:
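As a minimal sketch of that approach, not necessarily the exact request used above: the Python snippet below builds a search body that swaps query_string for a match query on the field from the question. The helper name build_match_search and the search text "<some>" are assumptions made for illustration. The printed JSON is what would be sent to the index's _search endpoint.

```python
import json


def build_match_search(field, text):
    """Build a search body using a match query with highlighting on one field.

    Hypothetical helper for illustration; the match query passes `text`
    to the field's analyzer instead of parsing it for operators.
    """
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {}}},
    }


# "<some>" is an assumed search text containing < and >.
body = build_match_search("annotationText", "<some>")
print(json.dumps(body, indent=2))
```

Because the text is only analyzed, not parsed, the angle brackets should not trigger a range query. With the default analyzer they are most likely dropped during tokenization, so the search would look for the plain term some.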