Changing the analyzer used on a specific index type mapping

Hi there,

I have 3 different types ("general", "action", "language") on an index
("data").

Both the "action" and "language" types have a field called "locale", which I
use (together with a mapping analyzer:
http://www.elasticsearch.org/guide/reference/mapping/analyzer-field.html)
to deal with different language inputs, e.g.:

{"action":{"_analyzer":{"path":"locale"},"properties":{..."locale":{"type":"string","index":"not_analyzed"}}}}

I already have the analyzer settings set up on my index ("data") like
in http://www.elasticsearch.org/guide/reference/index-modules/analysis/
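
For reference, a minimal sketch of what such settings could look like, written as Python dicts and serialized to JSON. The analyzer names "en" and "es" and the choice of the snowball analyzer type are assumptions for illustration only; the actual settings on the "data" index are not shown in this thread. The idea is that each analyzer is named after a value the "locale" field can take, so the mapping's "path" can resolve it.

```python
import json

# Hypothetical analysis settings: analyzer names mirror the values that
# the "locale" field is assumed to hold ("en", "es").
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "en": {"type": "snowball", "language": "English"},
                "es": {"type": "snowball", "language": "Spanish"},
            }
        }
    }
}

# Mapping for the "action" type as quoted above. Only the "locale" property
# is listed because the other properties are elided in the original post.
action_mapping = {
    "action": {
        "_analyzer": {"path": "locale"},
        "properties": {
            "locale": {"type": "string", "index": "not_analyzed"},
        },
    }
}

print(json.dumps(settings, indent=2))
print(json.dumps(action_mapping, indent=2))
```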

Due to a requirements change, I now want the same behaviour on the
"general" type: the ability to store analyzed text in certain fields based
on the locale. The index/type already has data in it, but that data all went
in using the default English analyzer; now I want to use other languages
based on the locale field (i.e. there will now be multiple languages
rather than just English).

I've been trying to update the mapping for this and it does not update
correctly. Is there any way to do this without a) reindexing and/or b)
deleting the mapping (and data) and recreating it?

I understand that in general you do not want to completely change an
analyzer, but here I just want to modify the existing one to support more
languages, and it seems to me that the existing and new tokenised data should be
OK (e.g. a change from just English to English and Spanish)?

Thanks in advance,

Derry

--

Unfortunately, some mapping parameters, including the analyzer field path,
cannot be changed once they are set. When you created the mapping for the
"general" field, the analyzer field was set to "_analyzer", and now
elasticsearch will not allow you to change it to "locale" without
recreating the mapping. It might be inconvenient, but I think the best
thing you can do here is to just use the field "_analyzer" for the new
type.
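
One way to read this suggestion, as a rough sketch: leave the mapping alone and copy each document's locale into an "_analyzer" field before indexing. The helper name `with_analyzer_field` is hypothetical, and the sketch assumes the index defines analyzers named after the locale values, which this thread does not confirm.

```python
def with_analyzer_field(doc, locale_field="locale", analyzer_field="_analyzer"):
    """Return a copy of doc whose analyzer field repeats the locale value.

    Documents without a locale are returned unchanged, so they would keep
    using whatever default analyzer the index applies.
    """
    tagged = dict(doc)
    if locale_field in doc:
        tagged[analyzer_field] = doc[locale_field]
    return tagged


# Hypothetical document; the field names other than "locale" are made up.
doc = {"title": "Hola mundo", "locale": "es"}
print(with_analyzer_field(doc))
```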

The only other alternative that I can think of is to create a new index
with the same name but a new mapping and then, while the cluster is down, merge
the metadata from the new index with the data from the old index. But because it's
quite dangerous and cumbersome, I cannot really recommend doing this.

(In reply to Derry O' Sullivan's post of Thursday, January 17, 2013 11:13:23 AM UTC-5.)

--

Hi Igor,

Thanks for the response.

Just to clarify - "general" is not a field, it's an index type. I want to
use a 'new' field called "locale" to allow language-specific
analysis.

So can I not change the analyzer at all once it is already created, or is
it because I am creating a 'dynamic' analyzer based on another field?

Derry

(In reply to Igor Motov's message of Friday, 18 January 2013 19:26:53 UTC.)

--

Sorry, I meant to write "When you created the mapping for the "general"
type...". Currently the "general" type is set to use the path "_analyzer"
to look up a document-specific analyzer. Every time you index a document with
the "general" type, elasticsearch checks the field "_analyzer" in case
something is specified there to be used as the analyzer for that document.
Obviously, you never used this field, but elasticsearch doesn't
know that. So when you try to change this path from "_analyzer" to
"locale", elasticsearch doesn't allow it, in order to
avoid inconsistency (old documents indexed using the analyzer specified by the
field "_analyzer" and new documents indexed using the analyzer specified by the
field "locale"). Does that make sense?
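
The inconsistency can be illustrated with a toy model. This is not elasticsearch's actual code; `analyzer_for`, the "default" fallback name and the sample documents are all hypothetical, and the lookup is simplified to a dictionary access.

```python
def analyzer_for(doc, path, default="default"):
    """Toy model: the analyzer name is whatever the document holds at `path`."""
    return doc.get(path, default)


old_doc = {"body": "hello world"}                  # indexed before the change
new_doc = {"body": "hola mundo", "locale": "es"}   # indexed after the change

# With the existing path neither document carries "_analyzer",
# so both fall back to the default analyzer.
before = [analyzer_for(d, "_analyzer") for d in (old_doc, new_doc)]

# If the path were switched in place, new documents would be analyzed by
# locale while the terms already stored for old documents stay as they were.
after = [analyzer_for(d, "locale") for d in (old_doc, new_doc)]

print(before, after)
```

In this model the same type ends up holding terms produced under two different lookup rules, which is the situation the mapping check is described as preventing.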

(In reply to Derry O' Sullivan's message of Monday, January 21, 2013 9:28:07 AM UTC-5.)

--

Yup, thanks for the clarification. I have the raw data, so I think it makes
sense to reindex ;-(
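
A minimal sketch of preparing such a reindex from raw data, assuming the documents are sent through the bulk API into a freshly created index whose "general" mapping already has the "locale" path. The index name "data_v2", the ids and the "text" field are hypothetical, and the HTTP call itself is left out.

```python
import json


def bulk_body(docs, index, doc_type):
    """Build a newline-delimited bulk request body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line first, then the document source on its own line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical raw data; every document now carries a "locale" value.
raw = [
    ("1", {"text": "hello world", "locale": "en"}),
    ("2", {"text": "hola mundo", "locale": "es"}),
]
body = bulk_body(raw, "data_v2", "general")
print(body)
```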

(In reply to Igor Motov's message of Monday, 21 January 2013 14:52:00 UTC.)

--