Is there a way to search terms lower cased?

sezgin_kucukkaraasla · June 30, 2010, 11:57am

Hi,
With the default configuration and no mapping, fields are analyzed with
the lowercase token filter. So when I index a field with a value such as
"ABC", it is tokenized as "abc". When I search for it exactly as I
inserted it, using the following query, I get no results:

{
"query":{"term":"ABC"}
}

It seems that only query strings support analyzers during search. Is there
a plan to add this feature to Elasticsearch?

Thanks in advance,
Sezgin Kucukkaraaslan
www.ifountain.com

Lukas_Vlcek1 · June 30, 2010, 12:19pm

Hi,
I have not tried it myself, but it is possible to specify an analyzer as a
query parameter:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/search/#Request_Parameters
So it should definitely work in JSON too. Also, did you check
http://www.elasticsearch.com/docs/elasticsearch/index_modules/analysis/analyzer/#Default_Analyzers
?
Lukas

Lukas_Vlcek1 · June 30, 2010, 12:22pm

Well... the term query is not analyzed, see:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/term_query/

kimchy · June 30, 2010, 4:44pm

If you want the text you pass in to be analyzed, use the field query
(a convenient field-level wrapper for the query_string query):
http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/field_query/

Note that the analysis process can be as simple as lowercasing, or it can
be more complex and generate several terms from a single analyzed term.

-shay.banon

Clinton_Gormley · June 30, 2010, 4:48pm

For context - lukasvlcek had the conversation below in IRC, then left.

I'm answering him here

lukasvlcek:
kimchy: I haven't been thinking about it before... what is the
rationale of not allowing analyzer setup for term query when
Query DSL is used? See
http://elasticsearch-users.115913.n3.nabble.com/Is-there-a-way-to-search-terms-lower-cased-tp932996.html

    I am just curious why user has to search -exact- terms (Lower vs
    Upper case)

sam_:
the default analyzer if nothing is specified is standard isn't
it?

lukasvlcek:
I did not try this particular example but I am confused by the
term query doc which explicitly says "not analyzed" (so even the
default analyzer is not used?)

sam_:
if it is not analyzed then I would suspect you need to provide
case
an exact match
the standard analyzer would result in it being converted

lukasvlcek:
wouldn't it be useful to have ability to specify analyzer?

sam_:
you can
well
at least when you define the mappings
the analyzer is used as part of the indexing
as an alternative I would think you could provide your own
parser implementation to which is what I'm trying to do
but have been unsuccessful

lukasvlcek:
but the point is if it is possible to specify analyzer when
querying via URL parameters then why can not specify analyzer
while using Query DSL
Gotta go now... but I would appreciate if anybody (kimchy?) can
follow up on that mail thread above (want to check that later)

----------------------------------------------

Answer:

(Note - this is as I understand the situation - I'm open to correction)

All data stored in ElasticSearch/Lucene is stored as a 'term' which is
atomic - it can't be broken down further.

So if you index {"text": "The quick brown fox jumped over the LAZY dog"}
then the default analyzer would:

- remove stopwords
- lowercase all text
- split on whitespace and punctuation

and result in these terms:
'quick', 'brown', 'fox', 'jumped', 'over', 'lazy', 'dog'

If you then do this search:
{ "query_string": { "query": "QUICK dOg"}}

Then the default analyzer would analyze your query string and return the
following terms: "quick", "dog"

It then does a 'term' query for each of those terms and combines the
results.
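
To make the difference between the two kinds of query concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not Elasticsearch or Lucene code: the stopword list, the function names, and the in-memory index are all assumptions made for illustration, and combining results with a union is just one possible reading of "combines the results".

```python
import re

# Assumed, deliberately tiny stopword list; a real analyzer's list is longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of"}

def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.split(r"[^a-z0-9]+", text.lower())
    return [t for t in tokens if t and t not in STOPWORDS]

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, term):
    """Exact lookup: the input is NOT analyzed."""
    return index.get(term, set())

def analyzed_query(index, text):
    """Analyze the input first, then union one term lookup per token."""
    hits = set()
    for term in analyze(text):
        hits |= term_query(index, term)
    return hits

docs = {1: "The quick brown fox jumped over the LAZY dog", 2: "ABC"}
index = build_index(docs)

print(sorted(index))                       # the terms actually stored
print(term_query(index, "ABC"))            # empty: only "abc" was indexed
print(term_query(index, "abc"))            # finds document 2
print(analyzed_query(index, "QUICK dOg"))  # finds document 1
```

In this model the only way for a term lookup to match is to pass the term exactly as it was stored, which is why the un-analyzed "ABC" finds nothing.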

If you did this search:
{ "wildcard": {"text": "*o*"}}

Then it would first look at all terms, and find only those terms that
match that pattern, ie: 'brown', 'fox', 'over', 'dog'.

It then does a 'term' query for each of those terms and combines the
results.
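
The wildcard expansion step can be sketched the same way. This is only an illustration in plain Python, assuming the pattern in the example is `*o*`; `expand_wildcard` is a hypothetical helper, and `fnmatchcase` from the standard library stands in for the real matching logic (it also accepts `[...]` character classes, which may not correspond to what a wildcard query supports).

```python
from fnmatch import fnmatchcase

# Terms produced for the example sentence, as listed in the post.
terms = ["quick", "brown", "fox", "jumped", "over", "lazy", "dog"]

def expand_wildcard(pattern, indexed_terms):
    """Return the indexed terms that match a shell-style wildcard pattern.

    The pattern is compared against the terms as stored; it is not analyzed,
    so the comparison is case-sensitive.
    """
    return [t for t in indexed_terms if fnmatchcase(t, pattern)]

print(expand_wildcard("*o*", terms))  # ['brown', 'fox', 'over', 'dog']
print(expand_wildcard("*O*", terms))  # []
```

The second call shows the same case problem as the term query: an upper-case pattern matches nothing because every stored term is lower case.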

So it doesn't make sense to analyze a 'term'. Terms are the result of
analysis. If you need to analyse a search "phrase" then you should use a
"query_string" or "field" query.

For the same reason, you can't sort on an analyzed field because the
original data doesn't exist. It is tokenised and stored as
terms. (unless the field is also stored? - not sure)

The analyzer used to analyze a search phrase is selected in this order:

1. "analyzer" specified in the query DSL, eg:
   { "query_string": { "query": "foo bar", "analyzer": "keyword"}}
2. "search_analyzer" specified in the mapping
3. "analyzer" specified in the mapping
4. the default_search analyzer specified in the index configuration
5. the default analyzer specified in the index configuration
6. the default_search analyzer specified in the node configuration
7. the default analyzer specified in the node configuration
8. the "standard" analyzer

(I think that's right - I may have added a couple in there that don't
actually exist)
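
Read as a fallback chain, the order above could be sketched like this in Python. This only mirrors the list as written, including any levels that may not actually exist; the function, its arguments, and the dictionary keys are hypothetical and are not Elasticsearch settings names or APIs.

```python
def pick_search_analyzer(query=None, mapping=None, index_cfg=None, node_cfg=None):
    """Return the first analyzer name that is set, in the order listed above.

    All four arguments are plain dicts with hypothetical keys; they stand in
    for the query body, the field mapping, and the index and node settings.
    """
    query = query or {}
    mapping = mapping or {}
    index_cfg = index_cfg or {}
    node_cfg = node_cfg or {}

    candidates = [
        query.get("analyzer"),
        mapping.get("search_analyzer"),
        mapping.get("analyzer"),
        index_cfg.get("default_search"),
        index_cfg.get("default"),
        node_cfg.get("default_search"),
        node_cfg.get("default"),
    ]
    for name in candidates:
        if name is not None:
            return name
    return "standard"  # final fallback when nothing else is configured


print(pick_search_analyzer())
print(pick_search_analyzer(mapping={"analyzer": "simple"},
                           index_cfg={"default": "whitespace"}))
print(pick_search_analyzer(query={"analyzer": "keyword"},
                           mapping={"search_analyzer": "simple"}))
```

The three calls return "standard", "simple", and "keyword" respectively: the most specific setting that is present wins.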

Typically, it doesn't make sense to use a different analyzer at index
and search time, because you may end up searching for terms that don't
actually exist.

If a field is set to be 'not_analyzed', then the whole value is treated
as a term, so "ABC" and "abc" are different, and "abc" will not match
"abc def".

hope this helps

Clint

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.

Lukas_Vlcek1 · June 30, 2010, 11:01pm

Hey guys, thanks for keeping this conversation going. Appreciate this!
Lukas

sezgin_kucukkaraasla · July 1, 2010, 8:54am

Thanks for the replies.
I think I should explain what I'm trying to do. I'm working on an IT
event management application and want to store my data in Elasticsearch to
leverage its clustering and redundancy features. The requirement is to
index data and give operators the flexibility to search events
case-insensitively from the UI. For performance reasons I don't want all
fields in my event model to be analyzed. For example, I want to keep fields
like "identifier", which I know will consist of one word, as
"not_analyzed". The problem is that I can't search these kinds of fields
with the field query (it returns zero results), so I can't use it all the
time. After some thought, I decided to use two kinds of analyzers for my
fields:

myAnalyzer1:
    filter: [lowercase]
    tokenizer: keyword

for fields like "identifier", and:

myAnalyzer2:
    filter: [lowercase]
    tokenizer: whitespace

for fields like "description", which can consist of multiple words.
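
As a rough illustration of what these two analyzers would produce, here is a small Python sketch. It only imitates a keyword tokenizer and a whitespace tokenizer followed by a lowercase filter; the function names and sample values are made up for this example and nothing here calls Elasticsearch.

```python
def my_analyzer1(value):
    """keyword tokenizer + lowercase filter: the whole value is one term."""
    return [value.lower()] if value else []

def my_analyzer2(value):
    """whitespace tokenizer + lowercase filter: one term per word."""
    return [token.lower() for token in value.split()]

print(my_analyzer1("ABC"))                     # ['abc']
print(my_analyzer1("Router-01 Down"))          # ['router-01 down']
print(my_analyzer2("Link DOWN on Router-01"))  # ['link', 'down', 'on', 'router-01']
```

With the first analyzer a value is still stored as a single term, as it would be with "not_analyzed", but in lower case. A search then matches regardless of the case the user typed, provided the search text goes through the same analyzer or is lowercased before being used in a term query.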

I would welcome any advice here: am I on the right path? Is there any
performance penalty in using the first analyzer instead of keeping the
field as "not_analyzed"?
Thank you very much again.

Sezgin Kucukkaraaslan
www.ifountain.com

kimchy · July 2, 2010, 1:34pm

It's a good way to solve what you are trying to do. You shouldn't notice
a performance difference at indexing time with this compared to
not_analyzed.

-shay.banon

sezgin_kucukkaraasla · July 7, 2010, 11:25am

Thanks...
