Newbie question about analyzed vs not analyzed


(Bob Ngu) #1

I am just learning ES and would appreciate a quick explanation of analyzed
vs. not analyzed searches. My basic understanding is that, unless indicated
otherwise, fields are analyzed at index time, but at search time a term
query does not analyze the search term, hence it must match exactly. I am
not sure what that means exactly. For one, the fields are already analyzed
and tokenized at index time. Secondly, when I execute a term query, it only
matches in lowercase, whereas match and query_string queries are case
insensitive. Any insights will be much appreciated.

Thanks,
Bob

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/65638a99-eda4-4d90-8913-fb89380e63ae%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Bob Ngu) #2

Oh, I think I get it now: the analyzed value in the index is all lowercase,
so the search value must also be lowercase for a term query to match
exactly. With a match query, the search text is analyzed first, which
lowercases it before the search, hence case insensitive. Am I right?
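
The behaviour described above can be sketched with a toy inverted index. This is only an illustration, not how Elasticsearch or Lucene is implemented: `analyze`, `build_index`, `term_query` and `match_query` are hypothetical helpers, and `analyze` is a crude stand-in for the standard analyzer (split into words, lowercase).

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy stand-in for the standard analyzer: split into words, lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def build_index(docs):
    """Map each analyzed term to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def term_query(index, value):
    """Look the value up exactly as given, with no analysis."""
    return set(index.get(value, set()))


def match_query(index, text):
    """Analyze the query text first, then look up each resulting term (OR)."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


docs = {1: "Quick Brown Fox", 2: "lazy dog"}
index = build_index(docs)

print(term_query(index, "Quick"))   # set(): the index only holds "quick"
print(term_query(index, "quick"))   # {1}
print(match_query(index, "Quick"))  # {1}: "Quick" is lowercased before lookup
```

The only difference between the two query functions is whether the search text passes through `analyze` before the lookup, which is why the term query looks case sensitive on an analyzed field.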

On Thursday, January 16, 2014, Bob bobngu@gmail.com wrote:



(Ivan Brusic) #3

Correct. A term query does not analyze the terms, while a match query does.
Generally, you should use text queries on non-analyzed fields, and match
queries on analyzed ones. Analysis does not always mean lowercasing terms,
but that is what the default (standard) analyzer does. All these concepts
derive from Lucene.
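
To illustrate that analysis does not always mean lowercasing, here is a small sketch comparing three analyzers. The functions are rough, hypothetical stand-ins named after the built-in standard, whitespace and keyword analyzers; the real ones handle far more cases (Unicode word boundaries, token filters and so on).

```python
import re


def standard_like(text):
    """Rough stand-in for the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def whitespace_like(text):
    """Rough stand-in for the whitespace analyzer: split on whitespace, keep case."""
    return text.split()


def keyword_like(text):
    """Rough stand-in for the keyword analyzer: the whole input is one token."""
    return [text]


sample = "Quick Brown-Fox"

print(standard_like(sample))    # ['quick', 'brown', 'fox']
print(whitespace_like(sample))  # ['Quick', 'Brown-Fox']
print(keyword_like(sample))     # ['Quick Brown-Fox']
```

Whatever tokens the analyzer produces at index time are what a term query has to match exactly, so the "lowercase only" behaviour is a property of the analyzer in use, not of term queries themselves.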

Cheers,

Ivan

On Thu, Jan 16, 2014 at 6:07 PM, Bob Ngu bobngu@gmail.com wrote:



(Ivan Brusic) #4

Correction: meant to say use term queries on non-analyzed fields, not
text queries.

On Thu, Jan 16, 2014 at 6:22 PM, Ivan Brusic ivan@brusic.com wrote:



(Bob Ngu) #5

Yep, thanks for confirming my understanding. In the case of term queries on
non-analyzed fields, the search value must match the exact case of the
original value, if I understand this correctly.
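
A minimal sketch of that last point, assuming a non-analyzed field is indexed as one verbatim term per value (in Elasticsearch versions of that era this was requested with `"index": "not_analyzed"` in the string field mapping). The helper names and sample data are hypothetical.

```python
from collections import defaultdict


def index_not_analyzed(docs):
    """Index each value verbatim as a single term: no tokenizing, no lowercasing."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        index[value].add(doc_id)
    return index


def term_query(index, value):
    """Exact lookup of the value, with no analysis."""
    return set(index.get(value, set()))


cities = {1: "New York", 2: "new york", 3: "Boston"}
index = index_not_analyzed(cities)

print(term_query(index, "New York"))  # {1}: exact case and whole value match
print(term_query(index, "new york"))  # {2}: a different term from "New York"
print(term_query(index, "New"))       # set(): partial values are not indexed
```

Note that the whole original value is the term here, so a term query has to supply the full string, not just one word of it.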

Bob
On Jan 16, 2014, at 6:23 PM, Ivan Brusic <ivan@brusic.com> wrote:


