Question about reindexing everything due to incorrect Mapping

Hi,

I will start with what happened first before listing out the questions.

I started by indexing a bunch of data. Each document (a fish) has a name
and a type.

curl -XPUT 'http://localhost:9200/animal/fish/1' -d '{"name" : "channel cat fish", "type" : "cat-fish"}'

curl -XPUT 'http://localhost:9200/animal/fish/2' -d '{"name" : "blue cat fish", "type" : "cat-fish"}'

curl -XPUT 'http://localhost:9200/animal/fish/3' -d '{"name" : "atlantic salmon", "type" : "salmon"}'

curl -XPUT 'http://localhost:9200/animal/fish/4' -d '{"name" : "blue salmon", "type" : "salmon"}'

curl -XPUT 'http://localhost:9200/animal/fish/5' -d '{"name" : "blue carp", "type" : "carp"}'

Now I would like to search for fish that contain "blue" but must be either
a "cat-fish" or a "salmon". So I followed the guide and built myself a
filtered query. I want it to return "blue cat fish" and "blue salmon".

curl -XGET 'http://localhost:9200/animal/fish/_search?pretty=true' -d '{
  "query": {
    "filtered" : {
      "query" : {
        "match" : { "_all" : "blue" }
      },
      "filter" : {
        "bool" : {
          "should" : [
            { "term": {"type" : "salmon"} },
            { "term": {"type" : "cat-fish"} }
          ]
        }
      }
    }
  }
}'

However, it only returns "blue salmon"; it fails to return "blue cat
fish". I checked the documentation and found that "cat-fish" was processed
by the standard analyzer and tokenized before being indexed, so the index
holds the separate tokens "cat" and "fish" rather than "cat-fish". Searching
for "cat-fish" directly with a term filter therefore doesn't work.
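
The tokenization can be checked with the analyze API. The following is only a sketch: it assumes a node on localhost:9200 and the `analyzer` query parameter form of `_analyze` that Elasticsearch accepted at the time of this thread (later releases expect a JSON body instead). `show_tokens` and `ES_URL` are hypothetical names used for illustration.

```shell
# Assumption: a node is reachable at localhost:9200 (override with ES_URL).
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: ask the standard analyzer which tokens it emits.
show_tokens() {
  # $1 = text to analyze; the request body is the raw text itself.
  curl -s -XGET "$ES_URL/_analyze?analyzer=standard&pretty=true" -d "$1"
}
```

Calling `show_tokens 'cat-fish'` should list two tokens, "cat" and "fish", which is why a term filter on the exact string "cat-fish" finds nothing.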

I realized that I had to change the mapping, specifically by adding "index":
"not_analyzed" to "type". So I deleted everything that was indexed, set the
mapping, and inserted everything back. Now the search works great.

curl -XDELETE 'http://localhost:9200/animal/fish/'

curl -XPUT 'http://localhost:9200/animal/fish/_mapping' -d '{
  "fish" : {
    "properties" : {
      "name" : { "type" : "string" },
      "type" : { "type" : "string", "index" : "not_analyzed" }
    }
  }
}'

My questions

  • Would it be possible to perform the same filtering without changing
    the mapping? Say ... using a different query.
  • If I must change the mapping, is it possible to change the mapping
    without deleting everything first? e.g. update a field to "not_analyzed"
    without deleting and reindexing everything.
  • If I maintain mapping, would it be better to use something like
    spring-elasticsearch or just client.admin().indices().preparePutMapping?

Thanks! Any help would be appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,
To answer your questions:

  • Would it be possible to perform the same filtering without changing
    the mapping? Say ... using a different query.

Not in any sane way. If your data set is extremely small you could wrap all
the terms in wildcards (*cat* *fish*), but this will match substrings and is
extremely discouraged. I hope people don't think badly of me for even
mentioning it.

  • If I must change the mapping, is it possible to change the mapping
    without deleting everything first? e.g. update a field to "not_analyzed"
    without deleting and reindexing everything.

The "index" setting of an existing field cannot be changed in place, so
re-indexing is required. Do not fear re-indexing; as your data model changes
it will be a necessity. It also generates a clean, defragmented index. You
should use index aliasing to avoid having to delete, re-create, and re-index
in place. Instead, re-index into a new index with the new mappings, swap the
alias, and delete the old index. There are plugins to help with the re-index
process: https://github.com/karussell/elasticsearch-reindex
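
The alias workflow described above might look roughly like this with curl. This is only a sketch under some assumptions: the index names `animal_v1` and `animal_v2` and the alias `animal` are made up for illustration, the application is assumed to query the alias rather than a concrete index (an alias cannot share a name with an existing index), and the mapping uses the same syntax as the original post.

```shell
# Assumption: a node is reachable at localhost:9200 (override with ES_URL).
ES_URL="${ES_URL:-http://localhost:9200}"

# Step 1: create the new index with the corrected mapping.
create_new_index() {
  # $1 = new index name (hypothetical, e.g. animal_v2)
  curl -s -XPUT "$ES_URL/$1" -d '{
    "mappings": {
      "fish": {
        "properties": {
          "name": { "type": "string" },
          "type": { "type": "string", "index": "not_analyzed" }
        }
      }
    }
  }'
}

# Step 2 (not shown): copy the documents into the new index, either by
# replaying the source data or with a re-index plugin.

# Step 3: move the alias from the old index to the new one in one request.
swap_alias() {
  # $1 = alias, $2 = old index, $3 = new index
  curl -s -XPOST "$ES_URL/_aliases" -d "{
    \"actions\": [
      { \"remove\": { \"index\": \"$2\", \"alias\": \"$1\" } },
      { \"add\": { \"index\": \"$3\", \"alias\": \"$1\" } }
    ]
  }"
}

# Step 4: drop the old index once nothing reads from it any more.
delete_index() {
  # $1 = index to delete
  curl -s -XDELETE "$ES_URL/$1"
}
```

A typical order would be `create_new_index animal_v2`, then the re-index step, then `swap_alias animal animal_v1 animal_v2`, and finally `delete_index animal_v1`. Because both alias actions go in a single `_aliases` request, searches against the alias never see a moment where it points at nothing.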

  • If I maintain mapping, would it be better to use something like
    spring-elasticsearch or just client.admin().indices().preparePutMapping?

Sorry, I can't really give a valid comparison. I have a Java app that talks
to elasticsearch; it pulls index settings and applies them with the Java
API (preparePutMapping). I haven't used the spring integration.

Best Regards,
Paul

On Thursday, March 28, 2013 7:17:04 PM UTC-6, Fish Tastic wrote:

Hi Paul, that answers all my questions! Thank you very much.
