Simple query string and upper case prefixes


(morus.walter.ml) #1

Hi,

I'm trying to do a prefix query using the simple_query_string query.

The field I'm searching on is analyzed using an analyzer named "unstemmed",
which basically does lowercasing, Unicode normalisation and accent removal
(à is treated as a).

If I do a search for an uppercase prefix (e.g. Project*) I get no results,
while lowercase prefixes work.

The documentation for the simple query string query mentions a
lowercase_expanded_terms option, but setting it explicitly causes a
query parsing exception (see below; the key part of the error message is
QueryParsingException[[jobs_v0002] [simple_query_string] unsupported field
[lowercase_expanded_terms]])

But the default should be true anyway, which should allow uppercase input.

So my questions:
Do I misunderstand something here?
Am I doing something wrong?
Or is it broken?

The ES version is 1.0.1. See below for the full query.

Another question: could (and should) the simple query string query be
enhanced to allow specifying two analyzers, one for normal terms and one
for prefixes and fuzzy queries?
My problem is that even if lowercasing worked, there would still be cases
where the prefix contains accents and would fail to expand
(e.g. à* would never find anything, since à must be normalized to a).
Would that be possible?
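
For illustration, the normalisation that a prefix term skips can be approximated on the client before the query is sent. This is only a sketch: the helper names normalize_prefix and build_query are hypothetical, and the code assumes the "unstemmed" analyzer amounts to lowercasing plus accent removal, which may not match the real analyzer chain exactly.

```python
import unicodedata


def normalize_prefix(term: str) -> str:
    """Lowercase a term and strip combining accents (assumed to mimic 'unstemmed')."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", term.lower())
    # Drop the combining marks and keep everything else (including '*').
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_query(user_input: str) -> dict:
    """Build the simple_query_string body from the post with a pre-normalised query."""
    return {
        "query": {
            "simple_query_string": {
                "analyzer": "unstemmed",
                "default_operator": "and",
                "fields": ["fulltext.text"],
                "query": normalize_prefix(user_input),
            }
        }
    }
```

Lowercasing the whole input is harmless here because simple_query_string uses symbols rather than words as operators; with query_string the same trick would also lowercase AND/OR.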

best
Morus

PS: The query I send is
{
  "query": {
    "simple_query_string": {
      "analyzer": "unstemmed",
      "default_operator": "and",
      "fields": [
        "fulltext.text"
      ],
      "query": "Project*"
    }
  }
}

The attempt to make lowercase_expanded_terms explicit is
{
  "query": {
    "simple_query_string": {
      "analyzer": "unstemmed",
      "default_operator": "and",
      "lowercase_expanded_terms": true,
      "fields": [
        "fulltext.text"
      ],
      "query": "Project*"
    }
  }
}
The full error is
SearchPhaseExecutionException[Failed to execute phase [query_fetch], all
shards failed; shardFailures {[x34iS8VlRiyhDFKkiVUFTw][jobs_v0002][0]:
SearchParseException[[jobs_v0002][0]: from[-1],size[-1]: Parse Failure
[Failed to parse source
[{"query":{"simple_query_string":{"analyzer":"unstemmed","default_operator":"and","lowercase_expanded_terms":true,"fields":["fulltext.text"],"query":"Project*"}}}]]];
nested: QueryParsingException[[jobs_v0002] [simple_query_string]
unsupported field [lowercase_expanded_terms]]; }]
status: 400

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f8bc56e6-7e34-445b-a542-5c55cf57835a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Binh Ly-2) #2

Sorry, lowercase_expanded_terms was added in 1.1; that documentation
page should be fixed. About your other question, you can experiment with
the query_string query instead, which has an analyze_wildcard flag. You
still can't specify two analyzers, but you can have it try to analyze
wildcards in your search terms.
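
As a rough sketch of that suggestion, the original request could be rebuilt as a query_string query with analyze_wildcard switched on. The helper name build_query_string is hypothetical; the analyzer and field names are taken from the query earlier in the thread, and whether the "unstemmed" analyzer then normalises the prefix as hoped would need to be checked against the actual index.

```python
import json


def build_query_string(user_input: str) -> dict:
    """Build a query_string body that asks ES to analyze wildcard terms."""
    return {
        "query": {
            "query_string": {
                "analyzer": "unstemmed",
                "default_operator": "AND",
                "fields": ["fulltext.text"],
                # Run wildcard/prefix terms through the analyzer as well.
                "analyze_wildcard": True,
                "query": user_input,
            }
        }
    }


# Print the JSON body that would be sent to the _search endpoint.
print(json.dumps(build_query_string("Project*"), indent=2))
```

Note that query_string uses the full Lucene query syntax, so user input that was safe for simple_query_string may now raise parse errors and might need escaping.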



(morus.walter.ml) #3

Hello Binh,

OK. Thanks a lot for the explanation.

Actually, I saw the 1.1 indicator on another new parameter (locale) for
simple_query_string and felt safe regarding the version...

best
Morus


