I'm trying to test how our system (and ES) actually turns what a user submits into a query, and what the expected response should be. I have a post whose content contains "Nee$ha", so I submitted the search query "nee$ha". In the response, I noticed that the highlighted text that comes back is:
"[bold]nee[/bold]$[bold]ha[/bold]"
When I run nee$ha through the analyzer, I notice it gets split into two tokens:
curl -X GET "localhost:9200/_analyze" -H 'Content-Type: application/json' -d'
{
"analyzer": "standard",
"text": "nee$ha"
}'
Response:
{"tokens":[{"token":"nee","start_offset":0,"end_offset":3,"type":"<ALPHANUM>","position":0},{"token":"ha","start_offset":4,"end_offset":6,"type":"<ALPHANUM>","position":1}]}
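To make the link between those offsets and the highlighted output concrete, here is a minimal Python sketch. It is not how ES implements anything: `approx_standard_tokens` is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters (which happens to reproduce the tokens above for this input), and `highlight` simply wraps each token's offset range in the bold markers.

```python
import re

def approx_standard_tokens(text):
    """Rough approximation (an assumption, not the real standard analyzer):
    lowercase each run of ASCII letters/digits and record its offsets."""
    return [
        {
            "token": m.group().lower(),
            "start_offset": m.start(),
            "end_offset": m.end(),
            "position": i,
        }
        for i, m in enumerate(re.finditer(r"[A-Za-z0-9]+", text))
    ]

def highlight(text, tokens, pre="[bold]", post="[/bold]"):
    """Wrap each token's offset range in pre/post tags, leaving the
    characters between tokens (here the '$') untouched."""
    out, cursor = [], 0
    for t in tokens:
        out.append(text[cursor:t["start_offset"]])
        out.append(pre + text[t["start_offset"]:t["end_offset"]] + post)
        cursor = t["end_offset"]
    out.append(text[cursor:])
    return "".join(out)

tokens = approx_standard_tokens("nee$ha")
result = highlight("nee$ha", tokens)
print(tokens)
print(result)
```

Under these assumptions the `$` never becomes part of any token, which would explain why it sits outside the bold markers.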
What I can't quite figure out is what is actually happening in ES. I think it's turning it into an AND query, requiring both nee and ha in the same result, and that the $ is essentially treated as a delimiter between words, much like a . or a , would be. It's being passed into the query as "query": "content:(\"nee$ha\")", so it hasn't been explicitly turned into an AND, and our default operator is OR.
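One way to tell the readings apart is to compare an AND over the two tokens with a phrase match (tokens adjacent and in order), since the query text is wrapped in quotes. The Python sketch below is only a simulation under assumptions: `tokens`, `matches_and`, and `matches_phrase` are hypothetical helpers, the tokenizer is a crude approximation of the standard analyzer, and whether ES actually treats the quoted text as a phrase is something to confirm against a real index.

```python
import re

def tokens(text):
    # Crude stand-in for the standard analyzer (assumption):
    # lowercase runs of ASCII letters/digits, everything else is a delimiter.
    return [m.group().lower() for m in re.finditer(r"[A-Za-z0-9]+", text)]

def matches_and(doc, query):
    # AND reading: every query token appears somewhere in the document.
    return set(tokens(query)) <= set(tokens(doc))

def matches_phrase(doc, query):
    # Phrase reading: query tokens appear consecutively and in order.
    d, q = tokens(doc), tokens(query)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

docs = ["Nee$ha", "nee ha", "ha and nee", "nee"]
results = {
    doc: (matches_and(doc, "nee$ha"), matches_phrase(doc, "nee$ha"))
    for doc in docs
}
for doc, (as_and, as_phrase) in results.items():
    print(f"{doc!r}: AND={as_and} phrase={as_phrase}")
```

Indexing a test document such as "ha and nee" and running the real query against it would show which reading applies: an AND would match it, a phrase match would not.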
Is this the right interpretation? And, more importantly, how do I get it to actually search for the $ and not ignore it?