Treat "Dot" as a normal character in query_string query


(Curt Hu) #1

How can I treat the dot '.' as a normal character in a query_string query? I want to search for "www.google.com" as a whole string, but the results I currently get are strange.


(vineeth mohan-2) #2

Hello Curt,

I believe you are chasing the wrong solution.
I think what you need is a change in the analyzer rather than in the search query.
Can you paste the output you are seeing?

Thanks
Vineeth

On Fri, Jul 18, 2014 at 11:26 AM, Curt Hu zhongting.hu@gmail.com wrote:

How can I treat the Dot '.' as the normal character in the query_string, as I want to search "www.google.com" as the whole string in the query_string, the current results for me are so strange..

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Treat-Dot-as-a-normal-character-in-query-string-query-tp4060154.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/1405662997693-4060154.post%40n3.nabble.com
.
For more options, visit https://groups.google.com/d/optout.


(Curt Hu) #3

Yeah, thanks.

The results look strange to me.
What I am doing is just a simple query_string query on a specific field; let's call it "domain".
If I run the query_string query for "www.google.com", I get about 10k results, which looks good, since the "domain" field in the results is always "www.google.com" (I have not checked all 10k results; I just randomly picked some). If I change it to a non-existent string like "www.abcdefg.com", I get 0 hits, which is also good. But if I search for something else like "www.loyal3.com", things get strange: I get about 40k results. In some of them the domain field is "www.loyal3.com", but others are unrelated, like "eatmywords365.com".

So, how does the query_string query treat the dot? Or is something else wrong?
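
For reference, the kind of request described above might look roughly like the sketch below. The exact request was not posted in this thread, so the helper name `query_string_body` and the `as_phrase` option are assumptions for illustration; only the "domain" field name and the search strings come from the description.

```python
import json


def query_string_body(text, field="domain", as_phrase=False):
    """Build a query_string search body restricted to a single field.

    Hypothetical helper: the thread does not show the real request.
    With as_phrase=True the text is wrapped in double quotes so the
    query parser treats it as one phrase.
    """
    query = '"%s"' % text if as_phrase else text
    return {"query": {"query_string": {"default_field": field, "query": query}}}


if __name__ == "__main__":
    # Body that would be sent to the index's _search endpoint.
    print(json.dumps(query_string_body("www.loyal3.com"), indent=2))
```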


(vineeth mohan-2) #4

Hello Curt,

I believe the issue is as follows.

If you haven't done anything with the analyzer, the default behavior is this:

When www.google.com is indexed, it is tokenized into [ "www" , "google" , "com" ] and stored.
The same analysis is applied when you search, so if a document has any of www, google, or com, then it is a match.
So you mostly won't get what you want.

You need to do some tweaking in the analyzers to add this feature.
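
A minimal sketch of one such tweak, under stated assumptions: a custom analyzer built from the built-in `keyword` tokenizer and `lowercase` filter keeps the whole domain as a single token. The analyzer name `domain_exact`, the mapping type name `doc`, and the `string` field type (matching the Elasticsearch 1.x era of this thread; later versions use `text`) are illustrative choices, not something confirmed here.

```python
import json

# Hypothetical analyzer name. "keyword" and "lowercase" are built-in
# Elasticsearch tokenizer / token filter names.
ANALYZER_NAME = "domain_exact"


def build_index_body(field="domain", field_type="string"):
    """Return index settings and a mapping that keep the domain as one token."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    ANALYZER_NAME: {
                        "type": "custom",
                        "tokenizer": "keyword",   # emit the entire input as a single token
                        "filter": ["lowercase"],  # so matching ignores case
                    }
                }
            }
        },
        "mappings": {
            "doc": {  # hypothetical mapping type name
                "properties": {
                    field: {"type": field_type, "analyzer": ANALYZER_NAME}
                }
            }
        },
    }


if __name__ == "__main__":
    # Body that would be sent when creating the index.
    print(json.dumps(build_index_body(), indent=2))
```

An analyzer change only affects documents indexed after it is applied, so existing data would need to be reindexed.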

Thanks
Vineeth


(Curt Hu) #5

Thanks very much. I have not done anything to the analyzer, so it should be using the default.
So, as you said:
"When www.google.com is indexed, it is tokenized into [ "www" , "google" , "com" ] and stored.
The same analysis is applied when you search, so if a document has any of www, google, or com, then it is a match.
So you mostly won't get what you want."

Then my question is: why doesn't my query for "www.abcde.com" return any hits? At least it contains "www" and "com", so I should get hits for "www" and "com".
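
One possible explanation, not confirmed anywhere in this thread: if the default tokenizer follows the Unicode word-break rules (UAX #29), a dot between two letters does not split a token, while a dot between a digit and a letter does. Under that assumption "www.abcde.com" stays one token (so no "www" or "com" term exists to match), whereas "www.loyal3.com" splits into "www.loyal3" and "com", and "com" also matches "eatmywords365.com". The sketch below is a simplified Python approximation of that rule for ASCII input, not Lucene's tokenizer; the function names are hypothetical.

```python
def simple_word_break(text):
    """Rough approximation of UAX #29 word breaking for letters, digits and '.'.

    A '.' stays inside a token only when it sits between two letters or
    between two digits; any other non-alphanumeric character ends the token.
    """
    tokens, current = [], ""
    for i, ch in enumerate(text):
        if ch.isalnum():
            current += ch
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        joins = ch == "." and (
            (prev_ch.isalpha() and next_ch.isalpha())
            or (prev_ch.isdigit() and next_ch.isdigit())
        )
        if joins:
            current += ch
        elif current:
            tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return [t.lower() for t in tokens]


def matches(query, indexed_value):
    """OR semantics: a document matches if it shares any token with the query."""
    return bool(set(simple_word_break(query)) & set(simple_word_break(indexed_value)))


if __name__ == "__main__":
    for value in ("www.google.com", "www.abcde.com", "www.loyal3.com", "eatmywords365.com"):
        print(value, "->", simple_word_break(value))
    print(matches("www.loyal3.com", "eatmywords365.com"))
    print(matches("www.abcde.com", "www.google.com"))
```

To check this against a real cluster, the actual tokens can be inspected by running the same strings through the index's `_analyze` API.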

