Thanks, good to know.
From: "Igor Motov-3 [via ElasticSearch Users]" <ml-node+s115913n3853948h92@n3.nabble.com>
Date: Sat, 24 Mar 2012 10:24:42 -0500
To: Jimmy Chen <jchen@sugarcrm.com>
Subject: Re: wildcard and slashes
A better way to get translated queries is coming in 0.19.2 and 0.20.0. See https://github.com/elasticsearch/elasticsearch/pull/1811 for details.
On Thursday, March 22, 2012 3:52:32 PM UTC-4, Igor Motov wrote:
Query 3 gets broken into separate queries by the query parser.
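To illustrate why query 3 splits and query 5 does not, here is a minimal Python sketch of the idea. It is a toy model, not Lucene's query parser: it assumes the parser splits the query string on whitespace into clauses, analyzes ordinary terms, and leaves terms ending in "*" unanalyzed (the default behavior described in this thread). The names `analyze` and `parse` are hypothetical helpers made up for this example.

```python
import re

def analyze(text):
    # Toy stand-in for an analyzer: lowercase and split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())

def parse(query, field="_all"):
    # Toy stand-in for the query parser: split on whitespace first,
    # then handle each chunk separately.
    clauses = []
    for chunk in query.split():
        if chunk.endswith("*"):
            # Wildcard/prefix terms are kept as-is (not analyzed by default).
            clauses.append(f"{field}:{chunk}")
        else:
            # Ordinary terms are analyzed; a chunk like "/" yields no tokens.
            clauses.extend(f"{field}:{token}" for token in analyze(chunk))
    return clauses

query3 = parse("24 / a*")   # whitespace gives three chunks: "24", "/", "a*"
query5 = parse("24/a*")     # no whitespace, so one unanalyzed prefix chunk
print(query3)
print(query5)
```

In this model the whitespace in query 3 is what produces separate clauses; query 5 has no whitespace, so the whole string "24/a*" stays a single prefix term.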
I am not aware of any simple way to get the translated queries. When I need to figure out what's actually going on with my queries, I just start elasticsearch under a debugger, place a breakpoint at https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/search/query/QueryPhase.java#L176 and execute my search. The query variable there points to the actual Lucene query.
On Thursday, March 22, 2012 2:02:31 PM UTC-4, chimingc wrote:
Igor,
Thanks for the response. Really helpful.
I have a few more questions though.
Neither query 3 nor 5 is being analyzed, so why does 3 get broken down into 2 tokens but 5 doesn't?
Also, how did you get the translated queries? Is there any way to query elasticsearch to get them? I think it's very helpful to know what the query strings eventually become.
Thanks again,
jimmy
From: "Igor Motov-3 [via ElasticSearch Users]" <[hidden email]>
Date: Wed, 21 Mar 2012 21:11:28 -0500
To: Jimmy Chen <[hidden email]>
Subject: Re: wildcard and slashes
Assuming that you are using the standard analyzer, this is what these 5 queries are translated into at the Lucene level:
1: _all:24* - prefix query for terms that start with "24"
2: _all:24 _all:account - query for the term "24" or the term "account"
3: _all:24 _all:a* - query for the term "24" or prefix query for terms that start with "a"
4: _all:a* - prefix query for terms that start with "a"
5: _all:24/a* - prefix query for terms that start with "24/a"
Cases 2-4 are obvious, but cases 1 and 5 probably require some explanation. By default, wildcard terms are not analyzed. This is why the 5th case is translated into a prefix query with the prefix "24/a". As you correctly noticed, "24/account" is indexed as two tokens, "24" and "account". So there are no tokens in the index that start with "24/a", and therefore the 5th case doesn't return any results. In case 1, wildcard terms are analyzed, and "24/a" is translated into two tokens, "24" and "a". The token "a" is a stopword, so it is dropped, and the query is translated into a prefix query for terms that start with "24".
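The difference between cases 1 and 5 can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the actual analyzer: `analyze` is a hypothetical stand-in that lowercases, splits on non-alphanumeric characters, and drops a small illustrative stopword set, and `prefix_matches` is a hypothetical helper that mimics a prefix query against the indexed terms.

```python
import re

# Small illustrative stopword subset (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the"}

def analyze(text):
    # Toy analyzer: lowercase, split on non-alphanumerics, drop stopwords.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def prefix_matches(index_terms, prefix):
    # Mimics a prefix query: indexed terms that start with the prefix.
    return sorted(t for t in index_terms if t.startswith(prefix))

# Indexing "24/account" produces the terms "24" and "account".
index_terms = set(analyze("24/account"))

# Case 5: the wildcard term is not analyzed, so the prefix is the literal "24/a".
case5 = prefix_matches(index_terms, "24/a")

# Case 1: with analyze_wildcard, "24/a" is analyzed first; "a" is dropped
# as a stopword and the prefix query runs on the remaining token "24".
surviving = analyze("24/a")
case1 = prefix_matches(index_terms, surviving[-1])

print("case 5:", case5)
print("case 1:", case1)
```

In this model case 5 finds nothing because no indexed term contains a slash, while case 1 matches through the surviving token "24" rather than through "a*".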
On Wednesday, March 21, 2012 6:29:49 PM UTC-4, chimingc wrote:
I've indexed this: "24/account".
I understand that it's been tokenized into "24" and "account", which is not
a problem for me.
However, when I query "24/a*", it finds no match.
Then I tried the following cases:
1. This works
"query_string": {
    "analyze_wildcard": true,
    "query": "24/a*"
}
2. This works
"query_string": {
    "query": "24/account"
}
3. This works
"query_string": {
    "query": "24 / a*"
}
4. This works
"query_string": {
    "query": "a*"
}
5. This doesn't work
"query_string": {
    "query": "24/a*"
}
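For anyone reproducing these cases: the fragments above show only the query_string clause, which in a full search request body sits under a top-level "query" key. A minimal Python sketch that builds the five bodies follows; `search_body` is a hypothetical helper made up for this example, and sending the bodies to a cluster is left out.

```python
import json

def search_body(query, analyze_wildcard=None):
    # Build a search request body around a query_string clause.
    clause = {"query": query}
    if analyze_wildcard is not None:
        # Only include the option when it is set explicitly.
        clause["analyze_wildcard"] = analyze_wildcard
    return {"query": {"query_string": clause}}

cases = [
    search_body("24/a*", analyze_wildcard=True),  # case 1
    search_body("24/account"),                    # case 2
    search_body("24 / a*"),                       # case 3
    search_body("a*"),                            # case 4
    search_body("24/a*"),                         # case 5
]

for number, body in enumerate(cases, start=1):
    print(number, json.dumps(body))
```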
I can't explain why 5 doesn't work. Perhaps without setting analyze_wildcard
to true, elasticsearch simply removes the slash and searches for "24a*"?
What exactly does analyze_wildcard do when set to true?
As you can see 3 and 4 work without setting analyze_wildcard to true. So
when do we need to set it to true?
Thanks,
jimmy
--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/wildcard-and-slashes-tp3847057p3847057.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.