Special characters combined with wildcards issue


(Erling Wegger Linde) #1

Hi,

I've got an index with lots of different fields. Say that one of the
many fields contains "Mr. x worked with C#, .Net and Java for 2 years"

Let's say I add a * to the search queries, to be able to return
results while the user is typing, i.e. a search for "Jav" is turned
into "Jav*", so that it matches "Java".

{"query":{"bool":{"must":[{"query_string":{"query":"_all:Jav*"}}]}}},
returns the correct document
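
For illustration, the rewrite described above can be sketched in Python. The helper name `prefix_query` and its `field` parameter are made up for this example and are not part of any Elasticsearch client; the sketch only builds the request body and does not send it.

```python
import json


def prefix_query(user_input, field="_all"):
    """Build a search-as-you-type body by appending '*' to the typed text.

    Hypothetical helper: the name and the `field` parameter are assumptions
    made for this sketch only.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": f"{field}:{user_input}*"}}
                ]
            }
        }
    }


if __name__ == "__main__":
    # Compact separators reproduce the single-line form used in this post.
    print(json.dumps(prefix_query("Jav"), separators=(",", ":")))
```

Calling `prefix_query("C#")` in the same way produces the `_all:C#*` query shown further down.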

However, when the user has completed spelling Java, the search query
is then changed into "Java*". That still works fine though.

{"query":{"bool":{"must":[{"query_string":{"query":"_all:Java*"}}]}}},
returns the correct document

When I search for ".Net" or "C#" with the appended "*" this does not
seem to work though.

{"query":{"bool":{"must":[{"query_string":{"query":"_all:C#*"}}]}}},
returns nothing

{"query":{"bool":{"must":[{"query_string":{"query":"_all:.Net*"}}]}}},
returns nothing

If I omit the "*" it works fine:

{"query":{"bool":{"must":[{"query_string":{"query":"_all:C#"}}]}}},
returns the correct document

{"query":{"bool":{"must":[{"query_string":{"query":"_all:.Net"}}]}}},
returns the correct document

Obviously, I would like the "C#" and ".Net" queries to work. I've
tried specifying a "whitespace" tokenizer on the analyzer, but to me
it seems the issue is with how the query is parsed/handled/escaped.
Any advice on how to solve this issue would be greatly appreciated.
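
To make the escaping question concrete, here is a minimal Python sketch of escaping user input before the "*" is appended. The character set is an assumption: it mirrors what I understand the Lucene classic query parser to treat as special, and the helper name `escape_query_text` is made up for this example.

```python
import re

# Assumed set of characters with special meaning in the Lucene classic
# query syntax. Note that '#' and '.' are not in this set.
_RESERVED = re.compile(r'[\\+\-!():^\[\]"{}~*?|&/]')


def escape_query_text(text):
    """Prefix every assumed-reserved character with a backslash.

    Hypothetical helper for this sketch; not an Elasticsearch API.
    """
    return _RESERVED.sub(r"\\\g<0>", text)


def escaped_prefix_query_string(text, field="_all"):
    """Escape the typed text, then append the trailing wildcard."""
    return f"{field}:{escape_query_text(text)}*"
```

Under this assumed character set, "C#" and ".Net" pass through unchanged, so `escaped_prefix_query_string("C#")` is still `_all:C#*`, whereas an input such as "C++" becomes `_all:C\+\+*`.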

Thanks,
Erling

