Disable merging to avoid index corruption

Hi All,

We are facing index corruption due to field conflicts.

For example:
I have a field "Number" in types 1 and 2 of the same index.
In type 1 it is defined as "integer".
In type 2 it is defined as "string".

Elasticsearch does not support this scenario, and as a result the indices are corrupted during merge.

My question is:
Can we avoid this issue by stopping segment merges?

This issue is not caused by segment merging. Types in the same index cannot have incompatible mappings for the same field. This is enforced at the mapping level, before any document is indexed. For the purposes of this explanation I will assume you are relying on Elasticsearch's out-of-the-box dynamic mapping.

Let's imagine you have a new index with no documents. When you index the first document:

PUT index/type1/1
{
  "number": 12
}

Elasticsearch recognises that the JSON type of the number field is an integer, so it maps the number field in type1 to the integer type and indexes the document. Note that when Lucene indexes the document, the field is simply number (i.e. Lucene has no concept of the document type).

Now you index a second document:

PUT index/type2/2
{
  "number": "40"
}

This time Elasticsearch recognises that the JSON type of the number field is a string and tries to map the number field for type2 to the string type. The mapping code then throws an error because there is already a mapping for the number field (provided by type1) that specifies number as an integer field, not a string field.
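
To make the sequence above concrete, here is a toy model in Python. It is not Elasticsearch code: ToyIndex, infer_type and MappingConflictError are hypothetical names invented for this sketch, and the type names simply follow the wording used in this thread. The only point it illustrates is that the field mapping is keyed by field name alone, so both types share one entry for number.

```python
class MappingConflictError(Exception):
    """Raised when a field would be remapped to a different type."""


def infer_type(value):
    """Guess a field type from a JSON value (heavily simplified)."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported value: {value!r}")


class ToyIndex:
    """Hypothetical stand-in for one index holding several document types."""

    def __init__(self):
        # One entry per field name, shared by every document type.
        self.field_types = {}

    def index(self, doc_type, doc_id, document):
        # Check every field before changing anything, so that a rejected
        # document leaves the existing mapping untouched.
        inferred = {name: infer_type(value) for name, value in document.items()}
        for name, new_type in inferred.items():
            existing = self.field_types.get(name)
            if existing is not None and existing != new_type:
                raise MappingConflictError(
                    f"field '{name}' is already mapped as {existing}, "
                    f"cannot map it as {new_type} for {doc_type}"
                )
        self.field_types.update(inferred)


toy = ToyIndex()
toy.index("type1", "1", {"number": 12})          # maps number as integer
try:
    toy.index("type2", "2", {"number": "40"})    # string clashes with integer
except MappingConflictError as err:
    print(err)
```

The second call fails at the mapping check, before anything is written, which is why stopping segment merges would not change the outcome.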

In order to fix this issue, you will either need to normalise your data before you index it into Elasticsearch (using for example Logstash), or explicitly define your mapping using the mapping API.
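
As a rough illustration of the first option, the sketch below normalises a document on the client side before it is sent for indexing. It assumes the field should always be a whole number; normalise_number is a hypothetical helper written for this example and is not part of Elasticsearch, Logstash or any client library.

```python
def normalise_number(document, field="number"):
    """Return a copy of `document` with `field` converted to an int.

    Assumes the field is meant to hold a whole number. Values that cannot
    be converted raise an error instead of being indexed as strings.
    """
    cleaned = dict(document)  # shallow copy, the caller's dict is not modified
    if field not in cleaned:
        return cleaned
    value = cleaned[field]
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError(f"{field} must be a whole number, got a boolean")
    if isinstance(value, int):
        return cleaned
    if isinstance(value, str):
        # int() raises ValueError for text such as "abc" or "4.5".
        cleaned[field] = int(value.strip())
        return cleaned
    raise TypeError(f"{field} must be an int or a numeric string, got {value!r}")


doc_for_type1 = normalise_number({"number": 12})
doc_for_type2 = normalise_number({"number": "40"})
print(doc_for_type1, doc_for_type2)
```

With both documents carrying the same JSON type for number, dynamic mapping infers the same field type for type1 and type2, so the conflict described above does not arise.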

For more information on Mappings see here: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/mapping.html

For information on how to set the mapping using the mapping API see here: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/indices-put-mapping.html

Hope this helps

Hi @colings86

Thanks for your update.
We are using Elasticsearch 1.3.7.
Yes, whenever we try to update the mapping with a different field type for the same field name, we get an exception like "Failed to merge".

But if we delete and then update, the mapping update for the second type succeeds. :cry: [this is the problem]

We have had these kinds of mapping collisions in staging and in prod as well.
Everyone says that upgrading our version will help.

But even if we upgrade, I still need to solve this mapping issue. Any ideas on how to solve it?
Also, please let me know what the issues are with the copy_to method.

As I said before, you need to ensure that fields which are shared between types have the same mapping. Also, you should definitely upgrade to the latest version, as there are many mapping improvements in 2.x that make mapping conflicts less likely: https://www.elastic.co/blog/great-mapping-refactoring
