Search query / analyzer issue dealing with spaces

Hello, I am trying to run a query that distinguishes between spaces in
values. Let's say I have a field called 'color' in my index. Record 1 has
"color" : "metallic red" whereas Record 2 has "color": "metallic"

I want to search for 'metallic' but NOT retrieve 'metallic red', and a
search for 'metallic red' should not return 'red'.

The query below works for 'metallic red' but entering 'red' returns both
records. The query also appears to be bypassing Analyzers specified in the
mappings (such as keyword) as they have no effect. What should I change it
to instead?

//Query
GET /myindex/_search
{
  "query": {
    "match_phrase": {
      "color": "metallic red"
    }
  }
}

//Data
{ "index" : { "_index" : "myindex", "_type" : "car", "_id" : "1" } }
{ "color" : "metallic red" }
{ "index" : { "_index" : "myindex", "_type" : "car", "_id" : "2" } }
{ "color" : "Metallic RED"}
{ "index" : { "_index" : "myindex", "_type" : "car", "_id" : "3" } }
{ "color" : "rEd" }

//Mapping (no effect for query)
curl -XPUT 'http://localhost:9200/myindex/' -d '{
"settings" : {
"analysis": {
"analyzer": {
"my_analyzer":{
"type": "custom",
"tokenizer" : "keyword",
"lowercase" : true
}
}
},
"mappings" : {
"episode" : {
"_source" : { "enabled" : false },
"properties" : {
"color" : { "type" : "string", "analyzer" : "not_analyzed" }
}
}
}
}
}'

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c18b793b-3288-4935-afb7-28e51fd590c9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

There is no "analyzer" : "not_analyzed"; it must be "index" :
"not_analyzed".

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-intro.html#_literal_index_literal

Or you can set the analyzer "keyword" in the mapping.

Like this:
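
A minimal sketch of such a mapping, assuming the 1.x "string" field type used elsewhere in this thread and the "car" type from the bulk data (both are assumptions, not taken from this reply). It would be sent as the body of PUT /myindex when creating the index. The built-in "keyword" analyzer emits the whole field value as a single token, so "metallic red" is indexed as one term rather than as "metallic" and "red":

```json
{
  "mappings": {
    "car": {
      "properties": {
        "color": {
          "type": "string",
          "analyzer": "keyword"
        }
      }
    }
  }
}
```

Note that the keyword analyzer does not lowercase, so matching stays case-sensitive with this mapping.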

Jörg

Actually, there are two problems here. First, change the analyzer to the
name of your custom analyzer. Second, you are missing a curly brace to
close out the "settings" property. Not sure why it doesn't cause an error,
but it definitely doesn't create a mapping. You can check whether there is
a mapping by looking at: http://localhost:9200/myindex/_mapping

Here is how it should be:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "lowercase": true
        }
      }
    }
  },
  "mappings": {
    "episode": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "color": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

Actually, change it to "index": "not_analyzed" as shown in the JSON.

Thanks for the replies. Unfortunately the analyzer portion is not the
problem (I pasted the original text in the midst of experimentation). When
I had "analyzer" : "my_analyzer" in the mapping it didn't make a
difference. I get results from the analysis query below so I assume it was
configured properly:
GET /myindex/_analyze?analyzer=my_analyzer

However, it does not seem to make a difference between using my custom
"my_analyzer" or using "keyword", or even using "index" : "not_analyzed".
In each case, if I search for "red" I get back all results when in fact I
only want 1.

Perhaps my query is the problem?

Jarrod,

I understand that you think the analyzer is not the problem. However, the
original mapping wasn't correctly formatted, so the color field was being
analyzed with the default analyzer, which also causes the query to use the
default analyzer. If you fix the syntax and then change color to use your
analyzer, it will work.

One note: your mapping is also incorrect in that it references the
"episode" type when you're actually adding data to the "car" type. Using
your analyzer, the value would be indexed as one lowercase string.

Your query does make a difference, but if you have the analyzer set
correctly, it will analyze the input string using the same analyzer that
you set in the mapping. You would be better off just doing a term query or
filter.
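
For illustration, a term query against the color field might look like the sketch below, sent to GET /myindex/car/_search. It assumes color is indexed as a single token (via "index": "not_analyzed" or a keyword-based analyzer). A term query does not analyze the search text, so the value has to match the indexed token exactly, including case:

```json
{
  "query": {
    "term": {
      "color": "metallic red"
    }
  }
}
```

With not_analyzed, this should match only the document whose color is exactly "metallic red"; "Metallic RED" and "rEd" are stored as different terms.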

Mike

Thanks Mike, it appears that referencing 'episode' instead of 'car' from a
previous example was the problem. That seems to have gotten me further;
however, my queries are still case-sensitive despite lowercase being true.
Allow me to repost what I have for clarity. Thanks.

// mapping
curl -XPUT 'http://localhost:9200/myindex/' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "lowercase": true
        }
      }
    }
  },
  "mappings": {
    "car": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "color": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}'

//query matches 'Metallic RED' but not 'Metallic Red'
GET /myindex/car/_search
{
  "query": {
    "match": {
      "color": "Metallic RED"
    }
  }
}

Jarrod,

The format of your analyzer is wrong. Note that you have to set the filter
property. Use this:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      }
    }
  },
  "mappings": {
    "car": {
      "properties": {
        "color": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
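
The custom analyzer type is configured through "tokenizer", "filter" and "char_filter"; a "lowercase": true entry is not among those settings, which would explain why it had no effect. With the mapping above in place, a match query analyzes the search text with my_analyzer as well, so the comparison becomes case-insensitive on the whole value. A sketch of a query to try against GET /myindex/car/_search, assuming the three sample documents from the first message have been re-indexed after recreating the index:

```json
{
  "query": {
    "match": {
      "color": "METALLIC red"
    }
  }
}
```

The search text is reduced to the single token "metallic red", which should match documents 1 and 2 but not document 3, whose only token is "red". The analyzer output can be inspected with GET /myindex/_analyze?analyzer=my_analyzer, passing the text to analyze in the request body.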

Excellent! That works perfectly. Thank you very much Mike.
