Wildcard search


(ajan) #1

curl http://mysearchhost:9200/test/_search?q=Identifier:6099d3e1
returns successfully with hits, however

curl http://mysearchhost:9200/test/_search?q=Identifier:ABC6099d3e1*
does not return any hits

The Identifier value is actually
tag:mytest:Test::ABCTest-6099d3e1-474a-49e2-bc14-ae816cf719ac

Is it incorrect to use multiple "*" wildcards in the request?


(Ivan Brusic) #2

First of all, you should avoid using leading wildcards in a query, for
performance reasons.

http://www.elasticsearch.org/guide/reference/query-dsl/wildcard-query.html

Second, how is the Identifier field analyzed? Make sure it is not
analyzed, or you would need to analyze the wildcard query, which is
also not recommended.
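
To illustrate why analysis matters here, the following Python sketch mimics an analyzer very roughly: it lowercases the value and splits it on non-alphanumeric characters. That splitting rule is an assumption made for illustration, not the actual behaviour of the default analyzer, and the names `crude_tokens` and `matching_tokens` as well as the multi-token pattern are hypothetical. The point it shows is that a wildcard term is compared against individual indexed tokens, so a pattern that spans several tokens finds nothing.

```python
import re
from fnmatch import fnmatchcase

# The Identifier value quoted in the original question.
IDENTIFIER = "tag:mytest:Test::ABCTest-6099d3e1-474a-49e2-bc14-ae816cf719ac"


def crude_tokens(value):
    """Lowercase and split on non-alphanumerics (a rough stand-in for an analyzer)."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]


def matching_tokens(pattern, value=IDENTIFIER):
    """Return the tokens of `value` that the wildcard `pattern` matches in full."""
    return [t for t in crude_tokens(value) if fnmatchcase(t, pattern)]


if __name__ == "__main__":
    print(crude_tokens(IDENTIFIER))
    # A pattern that equals one token finds that token.
    print(matching_tokens("6099d3e1"))
    # A hypothetical pattern spanning two tokens matches no single token.
    print(matching_tokens("*abctest-6099d3e1*"))
    # Against the whole, unanalyzed value the same kind of pattern does match.
    print(fnmatchcase(IDENTIFIER, "*ABCTest-6099d3e1*"))
```

Under these assumptions, the plain term search succeeds because `6099d3e1` is a token on its own, while any wildcard that relies on the hyphens or on text from neighbouring tokens cannot match.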

Are leading wildcards enabled by default in ElasticSearch? They are not in Lucene.

Ivan

On Thu, May 3, 2012 at 6:47 AM, ajan jan.afzal@gmail.com wrote:


(ajan) #3

Thanks for your response Ivan.

I'm in total agreement that one should avoid the use of leading wildcards
in a query. However, how would one stop the user from providing such a
query?

Second, this and all other fields are analyzed by the default
analyzer; there is no specific analyzer for these fields.

I'm not sure whether leading wildcards are enabled in ES.

Jan

On May 4, 3:45 am, Ivan Brusic i...@brusic.com wrote:


(Ivan Brusic) #4

True, sometimes leading wildcard searches are unavoidable, but I tend
to use them in diagnostic queries and not in a query that is meant to be
executed repeatedly. Alternatively, use ngrams.
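
As a rough picture of the ngram idea, the Python sketch below indexes every 3-character substring of a value, so that a substring search becomes a set of exact lookups instead of a leading-wildcard scan. It is an in-memory illustration only, not how ElasticSearch implements ngram tokenization; the function names, the choice of n = 3, and the second sample value are assumptions.

```python
def char_ngrams(text, n):
    """Return every contiguous substring of length n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(values, n=3):
    """Map each lowercase n-gram to the set of values that contain it."""
    index = {}
    for value in values:
        for gram in set(char_ngrams(value.lower(), n)):
            index.setdefault(gram, set()).add(value)
    return index


def substring_search(index, fragment, n=3):
    """Find values containing `fragment` (at least n characters long)."""
    grams = char_ngrams(fragment.lower(), n)
    if not grams:
        return set()
    # Candidates must contain every n-gram of the fragment.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Confirm the grams appear contiguously, i.e. as a real substring.
    return {v for v in candidates if fragment.lower() in v.lower()}


if __name__ == "__main__":
    values = [
        "tag:mytest:Test::ABCTest-6099d3e1-474a-49e2-bc14-ae816cf719ac",
        "tag:mytest:Test::XYZTest-1111aaaa",  # hypothetical second document
    ]
    index = build_index(values)
    print(substring_search(index, "6099d3e1"))
```

The trade-off is a larger index in exchange for cheap lookups at query time, which is why this suits queries that run often.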

Query string queries with wildcards are not analyzed. You can set
analyze_wildcard:true, but I am not sure whether that is possible via a
search URL.
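
One way to avoid the URL question is to send the options in a request body. The Python sketch below only builds that JSON; it does not contact a server. `analyze_wildcard` is the option mentioned above, and `allow_leading_wildcard` is another query_string option which, when set to false, should reject queries that start with a wildcard. The helper name and the chosen defaults are assumptions, so check the query_string documentation for the version in use.

```python
import json


def query_string_body(query, analyze_wildcard=True, allow_leading_wildcard=False):
    """Build a search request body around a query_string query."""
    return {
        "query": {
            "query_string": {
                "query": query,
                # Ask for wildcard terms to be analyzed as well.
                "analyze_wildcard": analyze_wildcard,
                # False is meant to reject patterns such as "*abc".
                "allow_leading_wildcard": allow_leading_wildcard,
            }
        }
    }


if __name__ == "__main__":
    body = query_string_body("Identifier:ABC6099d3e1*")
    # This JSON would be sent as the body of a request to /test/_search.
    print(json.dumps(body, indent=2))
```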

http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html

However, your field values are not in English (or any other natural
language), so it makes sense to leave them unanalyzed and to use a
WildcardQuery.
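
A minimal sketch of what that could look like as request bodies, built in Python without contacting a server. The mapping fragment uses the `"index": "not_analyzed"` setting on a string field, which is the convention of ElasticSearch releases from this period (later versions use a different field type), and the wildcard pattern is a hypothetical example rather than one taken from this thread.

```python
import json


def not_analyzed_field_mapping(field):
    """Mapping fragment that stores `field` as a single, unanalyzed term."""
    return {"properties": {field: {"type": "string", "index": "not_analyzed"}}}


def wildcard_query_body(field, pattern):
    """Search body for a wildcard query against one field."""
    return {"query": {"wildcard": {field: pattern}}}


if __name__ == "__main__":
    print(json.dumps(not_analyzed_field_mapping("Identifier"), indent=2))
    # Hypothetical pattern; it still starts with a wildcard, so it is slow.
    print(json.dumps(wildcard_query_body("Identifier", "*6099d3e1*"), indent=2))
```

With the field unanalyzed, the pattern is matched against the whole stored value, including its colons, hyphens, and original casing.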

--
Ivan

On Wed, May 9, 2012 at 8:32 PM, ajan jan.afzal@gmail.com wrote:

