Exclamation point breaks field queries


(Nick Hoffman) #1

Hey guys. I just noticed that a field query that includes an exclamation
point ("!") results in a failed query. Eg:

curl 'localhost:9200/development_products/product/_search?pretty=true' -d '
{
  "query" : {
    "dis_max" : {
      "queries" : [
        { "field" : {"name" : "Arise!"}},
        { "field" : {"catalog.name" : "Arise!"}}
      ]
    }
  }
}'

Here's a gist with more details:
https://gist.github.com/1654247

Is this supposed to happen?

Thanks,
Nick


(Clinton Gormley) #2

On Sat, 2012-01-21 at 14:18 -0800, Nick Hoffman wrote:

Hey guys. I just noticed that a field query that includes an
exclamation point ("!") results in a failed query. Eg:

Yes - the query parameter should be in the Lucene query parser syntax:
http://lucene.apache.org/java/3_5_0/queryparsersyntax.html

If you don't want to use the Lucene query parser syntax features
(i.e. field: + - AND OR NOT ~ ^ ()), then use a 'text' query instead.
If you DO want to use these features, then you need to preparse your
query string to remove or escape any illegal characters.

In the Perl API, I provide a module to do just that:
https://metacpan.org/module/ElasticSearch::QueryParser

clint
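
For illustration, here is a minimal Python sketch of the first option Clint describes: the same dis_max request, but with 'text' queries in place of 'field' queries, so the value is analyzed instead of being parsed as Lucene syntax. The helper name build_text_dis_max is hypothetical, and the sketch only builds and prints the request body; it does not contact a server.

```python
import json


def build_text_dis_max(value, fields):
    """Build a dis_max request body with one 'text' query per field.

    Hypothetical helper for illustration only; 'text' is the query type
    suggested in the reply above.
    """
    return {
        "query": {
            "dis_max": {
                "queries": [{"text": {field: value}} for field in fields]
            }
        }
    }


body = build_text_dis_max("Arise!", ["name", "catalog.name"])
# The printed JSON can be passed to curl with -d, as in the original request.
print(json.dumps(body, indent=2))
```

Because the value is no longer run through the query parser, the "!" needs no escaping here. In later Elasticsearch releases the 'text' query was renamed 'match'.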



(Nick Hoffman) #3

Awesome. Thanks for the explanation and links, Clint! Those're very helpful.


(vallabh-2) #4

Hi Clint,

An exclamation point ("!") results in a failed query.
Not only the exclamation point, but also *, {, }, etc.

I have artist names like "!!! (chk chk chk)" and "ke$ha".
When I search for "chk", it returns the artist name, but when I search for only
an exclamation mark, it fails.
How can I force an exclamation mark in a search query?

I tried the code below, but it didn't work for me:

curl -X PUT 'http://localhost:9200/admin/?pretty=true' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "artist_analyzer" : {
          "tokenizer" : "standard",
          "filter" : ["standard", "lowercase",
                      "artist_metaphone", "asciifolding"]
        },
        "synonym" : {
          "tokenizer" : "whitespace",
          "filter" : ["synonym"]
        }
      },
      "filter" : {
        "artist_metaphone" : {
          "type" : "phonetic",
          "encoder" : "metaphone",
          "replace" : false
        },
        "synonym" : {
          "type" : "synonym",
          "synonyms" : [
            "kesha => ke$ha",
            "! => !!! (chk chk chk)"
          ]
        }
      }
    }
  }
}
'
Am I doing something wrong?
Any help is highly appreciated.

Thanks and regards,
Vallabh
On Monday, January 23, 2012 3:03:02 AM UTC+5:30, Nick Hoffman wrote:

Awesome. Thanks for the explanation and links, Clint! Those're very
helpful.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #5

You have to escape Lucene's special characters in a query.

    + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Jörg
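
As a rough illustration of the escaping described above, here is a minimal Python sketch that prefixes each special character with a backslash. The function name escape_lucene is hypothetical, and the character set is an assumption modeled on what Lucene's QueryParser.escape does (it escapes single characters, so each & and | is escaped on its own).

```python
import json
import re

# Characters escaped one by one: \ + - ! ( ) : ^ [ ] " { } ~ * ? | &
# Assumption: this mirrors Lucene's QueryParser.escape, which works on
# single characters (so "&&" becomes "\&\&").
_LUCENE_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&])')


def escape_lucene(text):
    """Return text with each Lucene special character backslash-escaped."""
    return _LUCENE_SPECIAL.sub(r"\\\1", text)


for raw in ["Arise!", "!!! (chk chk chk)", "Death (protopunk"]:
    query = {"field": {"name": escape_lucene(raw)}}
    # json.dumps adds the extra JSON-level backslash automatically.
    print(json.dumps(query))
```

Escaping only stops the query parser from rejecting the string. Whether a term such as "!" can actually match still depends on the analyzer, since a tokenizer that drops punctuation leaves nothing to search for.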



(vallabh-2) #6

Hi Jörg,

Thanks for the quick response.
But I cannot escape these characters, as they appear in my artist
name list.
Is there any way I can handle these characters?

Thanks and regards,
Vallabh

On Friday, November 1, 2013 6:46:33 PM UTC+5:30, Jörg Prante wrote:

You have to escape Lucene's special characters in a query.

    + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Jörg



(Jörg Prante) #7

The artist name list you provide for the synonym search is correct.

In your queries to ES, such as the single exclamation mark you described,
you have to escape Lucene's special characters.

Jörg



(Matt Weber) #8

Are you trying to support Lucene query syntax in your query? If yes, you
need to escape special characters as Jörg mentioned. If not, you should be
using a MatchQuery.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-query.html

Thanks,
Matt Weber

On Fri, Nov 1, 2013 at 7:04 AM, joergprante@gmail.com <joergprante@gmail.com> wrote:

You provide a correct artist name list for synonym search.

In your queries to ES, you described it as single exclamation mark, you
have to escape Lucene's special characters.

Jörg



(vallabh) #9

Thanks for the suggestions.
Yes, I am using the "elasticsearch-analysis-phonetic/1.6.0" plugin, and I think
it uses Lucene.
I have an artist named "Death (protopunk band)".
When I search for "Death (protopunk", including the opening parenthesis "(", the
query fails.
On the other hand, when I search for "(protopunk band)", including both the
opening and closing parentheses, it gives the expected result.
I am a little confused by how Elasticsearch queries behave.

On Friday, November 1, 2013 8:37:32 PM UTC+5:30, Matt Weber wrote:

Are you trying to support lucene query syntax in your query? If yes, you
need to escape special characters as Jorg mentioned. If not, you should be
using a MatchQuery.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-query.html

Thanks,
Matt Weber



(vallabh) #10

Hi Everyone,

I have escaped Lucene's special characters and used a synonym search. Now it
is working as required.
But my concern is that I had to change "tokenizer" : "standard" to
"tokenizer" : "whitespace" in the code to make the synonym search work.

And now that I have changed the tokenizer from standard to whitespace,
I am not getting results for one particular artist name, i.e. Jaz-z
(including the hyphen).

With the standard tokenizer, when I search for Jaz z (without the hyphen, only
a space), it gives me Jay-z as the output.
But this no longer happens with the whitespace tokenizer.
Is there a way I can use both tokenizers?

Below is my code:

curl -X PUT 'http://localhost:9200/admin/?pretty=true' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "artist_analyzer" : {
          "tokenizer" : "whitespace",
          "filter" : ["standard", "lowercase", "synonym",
                      "artist_metaphone", "asciifolding"]
        }
      },
      "filter" : {
        "artist_metaphone" : {
          "type" : "phonetic",
          "encoder" : "metaphone",
          "replace" : false
        },
        "synonym" : {
          "type" : "synonym",
          "synonyms_path" : "/var/www/html/elasticsearch-master/synonyms.txt"
        }
      }
    }
  }
}
'

Any help is very much appreciated.
Thanks,
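
To make the difference between the two tokenizers concrete, here is a rough Python approximation. It is only a sketch: the real standard tokenizer follows Unicode word-boundary rules, which the regex \w+ imitates only loosely, and both function names are hypothetical.

```python
import re


def whitespace_tokens(text):
    """Approximate the whitespace tokenizer: split on whitespace only."""
    return text.split()


def standard_like_tokens(text):
    """Loose stand-in for the standard tokenizer: keep runs of word
    characters and drop punctuation such as '-' and '!'."""
    return re.findall(r"\w+", text)


for name in ["Jaz-z", "Jaz z", "!!! (chk chk chk)"]:
    print(repr(name), whitespace_tokens(name), standard_like_tokens(name))
```

Under the standard-like split, "Jaz-z" and "Jaz z" yield the same tokens, which is why the search without the hyphen matched. Under the whitespace split, "Jaz-z" stays a single token, while only the whitespace split keeps "!!!" at all. One option that may be worth trying, as an assumption and not something confirmed in this thread, is to index the name into two fields with different analyzers and search both.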

On Friday, November 8, 2013 12:57:57 PM UTC+5:30, Vallabh Bothre wrote:

Thanks for the suggestions,
Yes i am using "elasticsearch-analysis-phonetic/1.6.0" plugin and i think
it uses lucene.
I do have a artist name "Death (protopunk band)"
When i search for "Death (protopunk" including open brace "(" - query
fails.
On the other hand when i search "(protopunk band)" including both open and
close brace ( ) then it gives expected result.
I am little confused in elasticsearch query.


