I have a trivial document, indexed with the default (i.e. dynamic) mapping and analyzer:
PUT example/doc/1
{"s":">hello"}
Tokenization discards the >, but it's a reserved character, so you'd expect a naive unescaped query string containing it to have problems:
GET example/_search
{"query":{"query_string":{"query":"(s:(>hello))"}}}
And it does, sort of: there are no errors, but no hits either. "query":"(s:(%hello))" does match the doc, and % isn't reserved, so the fact that > is reserved seems to be the reason.
Where it gets weird is that "(s:(\\>hello))", i.e. one JSON-escaped backslash followed by >, doesn't match the doc either. "(s:(\\\\>hello))", which looks like it ought to be a Lucene-escaped backslash followed by an unescaped >, does match. So do "(s:(\\\\\\>hello))", "(s:(\\\\\\\\>hello))" and so on ad infinitum.
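To make the two escaping layers explicit, here is a minimal Python sketch of what each variant looks like after JSON decoding alone, i.e. the string I assume the query_string parser actually receives. It doesn't talk to Elasticsearch, and the helper name decoded_backslashes is just something made up for illustration.

```python
import json

# The "query" values exactly as they appear in the JSON request body.
# Raw strings are used so Python adds no escaping layer of its own.
variants = [
    r'"(s:(>hello))"',
    r'"(s:(\\>hello))"',
    r'"(s:(\\\\>hello))"',
    r'"(s:(\\\\\\>hello))"',
]


def decoded_backslashes(json_literal):
    """Hypothetical helper: decode one JSON string literal and return
    the decoded text plus the number of backslashes left in it."""
    decoded = json.loads(json_literal)
    return decoded, decoded.count("\\")


for literal in variants:
    decoded, n = decoded_backslashes(literal)
    print(f"{literal:28} -> {decoded:20} ({n} backslash(es) before '>')")
```

So after JSON decoding, the four variants carry 0, 1, 2 and 3 backslashes before the >. Of those, only the ones with two or more match the doc. That rules out the JSON layer as the source of confusion, but doesn't explain why a single backslash fails to escape the >.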
Can anyone make any sense of this? As a newbie I've been banging my head against it without result, and colleagues with much more ES experience are similarly stumped.