I am trying to run a filter on a field that contains abbreviations for US states. My index contains documents where state=WA and where state=OR. The search works correctly on state=WA, but if I run the identical search for state=OR, I get no hits. No errors -- just no hits.
Is this because "or" is a reserved word and, if so, then how do I work around this issue?
On Tue, 2011-01-11 at 19:30 -0800, searchersteve wrote:
> I am trying to run a filter on a field that contains abbreviations for US
> states. My index contains documents where state=WA and where state=OR. The
> search works correctly on state=WA, but if I run the identical search for
> state=OR, I get no hits. No errors -- just no hits.
>
> Is this because "or" is a reserved word and, if so, then how do I work
> around this issue?
Correct. OR is a boolean operator in the Lucene query syntax.
http://lucene.apache.org/java/3_0_0/queryparsersyntax.html
Given that state abbreviations are well-defined values, you probably
want to index them as not_analyzed:
http://www.elasticsearch.com/docs/elasticsearch/mapping/core_types/#String
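For illustration, here is a minimal sketch of what such a mapping body might look like, built in Python so it can be printed and checked. The type name "listing" and the helper name are hypothetical; the only parts taken from the advice above are the string type and the not_analyzed index setting on the state field.

```python
import json

# Hypothetical type name; substitute the type used in your own index.
TYPE_NAME = "listing"


def build_state_mapping(type_name):
    """Return a mapping body that leaves the `state` field un-analyzed."""
    return {
        type_name: {
            "properties": {
                # not_analyzed: the value is indexed as one untouched term.
                "state": {"type": "string", "index": "not_analyzed"}
            }
        }
    }


# Serialize to JSON so it can be sent as the body of a put-mapping request.
mapping_body = json.dumps(build_state_mapping(TYPE_NAME), indent=2)
print(mapping_body)
```

The printed JSON would be the body of a put-mapping request for that type. Existing documents would presumably need to be reindexed for the new setting to take effect.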
Then, instead of including the state abbreviation in the query string,
you probably want to filter your results with a term filter:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/term_filter
e.g.:
curl -XGET 'http://127.0.0.1:9200/_all/_search' -d '
{
  "query" : {
    "filtered" : {
      "query" : {
        "field" : {
          "_all" : "keywords to find"
        }
      },
      "filter" : {
        "term" : {
          "state" : "OR"
        }
      }
    }
  }
}
'
Note: terms are not analyzed at all, so if you index the state as 'OR',
then you must filter for it as 'OR', not 'Or' or 'or'.
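To make the case-sensitivity point concrete, here is a toy model in Python. It is not Elasticsearch code: it assumes that analysis merely splits on whitespace and lowercases (real analyzers can do more, such as dropping stop words), and that a term filter compares its value to the stored terms exactly. The function names are made up for the example.

```python
def stored_terms(value, analyzed):
    """Return the set of terms a field value would be indexed as (toy model)."""
    if analyzed:
        # Assumption: analysis splits on whitespace and lowercases each token.
        return {token.lower() for token in value.split()}
    # not_analyzed: the whole value is kept as a single untouched term.
    return {value}


def term_filter(indexed_docs, term):
    """Keep the ids of documents whose stored terms contain `term` exactly."""
    return [doc_id for doc_id, terms in indexed_docs.items() if term in terms]


states = {1: "WA", 2: "OR"}
analyzed = {doc_id: stored_terms(v, analyzed=True) for doc_id, v in states.items()}
verbatim = {doc_id: stored_terms(v, analyzed=False) for doc_id, v in states.items()}

print(term_filter(verbatim, "OR"))  # [2]
print(term_filter(verbatim, "or"))  # []
print(term_filter(analyzed, "OR"))  # []
print(term_filter(analyzed, "or"))  # [2]
```

In this model the filter value is never transformed, so it only matches when it has exactly the same case as the term that was indexed.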
clint
Thanks!!! I was, in fact, searching for state with a filter, but I did not know about the not_analyzed solution. I am realizing that I have a number of fields that are like this (discrete and predictable values), and I suspect that I could improve indexing speed quite a bit if I selected not_analyzed.
I'll try out your suggestion and let you know if problems arise. Thanks again for the schooling.
Steve