Detecting entries causing mapping conflict

Hi,

In my index pattern, I am getting a warning that one of my fields is defined as multiple types (text, float, long), which is causing a mapping conflict.

I do know how to solve it by re-indexing.

I am wondering if there is a way to find which entries (logs) are flowing through as text. If I can do this, I am hoping to fix the issue at the source (i.e. find which project generates logs with text in this field), but at the moment I am blind.

Thanks in advance,
Amy

Hi Amy,

Could you describe what's happening with the data upon a mapping conflict? Does the document with the text field get discarded or partially indexed?

Please also share your ES version number, your index mapping and highlight the relevant field where the clash happens.

By "field defined as multiple types", do you mean sub-fields? Or that the field was mapped as a numeric type and thus cannot store non-numeric text?

Which version are you using? You can see which index has the same field with a different mapping in the Data View settings.

Go into the data view, and it will show a banner telling you how many fields have a conflict. Click View Conflicts to filter by those fields; then, in the Type column of the list, click conflict to show the type of the field in each index.

The data does appear in Discover with the below warning attached. I am unable to use it in visualisation charts.

We are using v7.16.2. We don't define the mapping, which is part of the problem.

We are using v7.16.2. Yeah, seeing the problematic indices is helpful for re-indexing. However, I am hoping you can advise me on how to inspect the index for more information on the source of the problematic log so we can identify which project it is coming from :pray:

You need to go to Stack Management, Index Patterns, and look for the index pattern where you have the conflicts. There you will see more information about the conflicts.

Sadly, this only gives me the name of the index and the type mapped to it (dynamically).

Yeah, this is exactly what I told you to look for.

With the index name, you need to find out what in your system is writing to that index and then fix the mapping. There is not much else you can do: conflicts are mapping issues, and they have to be fixed in the mapping.

What are you looking for in this case?

Are you saying that this cannot be done from the ELK Stack?

I'm not sure what exactly you are trying to do.

A mapping conflict means that your index pattern/data view matches two or more indices where the same field has different data types.

In Kibana you can check the index pattern and see the data type for the conflicting field in each index.
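If you prefer to check this outside the Kibana UI, the field capabilities API (GET <pattern>/_field_caps?fields=<field>) returns the same per-index breakdown. Below is a minimal Python sketch that groups indices by the type they report for a field. The `sample` response and the index names are made up for illustration; only the general response shape is taken from the API.

```python
def indices_by_type(field_caps: dict, field: str) -> dict:
    """Map each type reported for `field` to the indices that use that type."""
    result = {}
    for type_name, caps in field_caps.get("fields", {}).get(field, {}).items():
        # "indices" is listed per type when the field's type differs across indices
        result[type_name] = sorted(caps.get("indices") or [])
    return result


# Hypothetical response for: GET /logs-*/_field_caps?fields=durationInSeconds
sample = {
    "indices": ["logs-2022.01.01", "logs-2022.01.02", "logs-2022.01.03"],
    "fields": {
        "durationInSeconds": {
            "text": {"type": "text", "searchable": True, "aggregatable": False,
                     "indices": ["logs-2022.01.02"]},
            "float": {"type": "float", "searchable": True, "aggregatable": True,
                      "indices": ["logs-2022.01.01"]},
            "long": {"type": "long", "searchable": True, "aggregatable": True,
                     "indices": ["logs-2022.01.03"]},
        }
    },
}

conflicts = indices_by_type(sample, "durationInSeconds")
for type_name, indices in conflicts.items():
    print(f"{type_name}: {indices}")
```

This only tells you which index holds which type, the same as the Kibana view; it does not say which document triggered the mapping.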

To fix this conflict you need to create a template/mapping with the correct data type and reindex your data.

But you won't be able to find the source of the documents that led Elasticsearch to interpret the field as a different data type, unless you are tagging or adding custom fields to each source.

We do have fields which can be used to differentiate which project code-set produced which logs. My problem is that, from Kibana, I cannot see which project sent a text value to the durationInSeconds field, causing the conflict. So I was wondering if there was a way to find this out.

Our plan is to introduce a mapping template to avoid the problems with the dynamic mapping. However, in the short term, I was hoping to be able to identify the project so a code change can be made to fix this.

The thing here is that you are using dynamic mapping, and with dynamic mapping Elasticsearch infers the mapping of a field from the value in the first document it receives that contains that field.

In your example, durationInSeconds, you may have a document where the value was "10.5", and because of the double quotes it may be interpreted as text, while 10.5 may be interpreted as a float.
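To make the quoting point concrete, here is a small Python sketch that mimics the default inference for a few JSON values. It is a simplified model, not Elasticsearch's actual code: it ignores date detection, the optional numeric detection setting, objects and arrays, and the function name is made up.

```python
import json


def infer_dynamic_type(value) -> str:
    """Rough model of the type dynamic mapping would pick for a JSON value."""
    # bool must be tested before int, because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    return "other"


docs = [
    '{"durationInSeconds": "10.5"}',  # quoted, so a JSON string
    '{"durationInSeconds": 10.5}',    # JSON number with a fraction
    '{"durationInSeconds": 10}',      # JSON integer
]
inferred = [infer_dynamic_type(json.loads(d)["durationInSeconds"]) for d in docs]
print(inferred)  # ['text', 'float', 'long']
```

The three results line up with the three types in the conflict warning (text, float, long), which is consistent with different indices having seen a differently shaped first value.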

You have the name of the index where this field is mapped as text. You can filter by that index name and look at all the documents in that index; since you have some fields that can identify the source, maybe this will be enough.
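One way to try this is sketched below in Python. It builds a search body for the text-mapped index and then counts, per project, the hits whose original `_source` value for the field is a JSON string. It relies on `_source` keeping the value as it was sent. The field name `project`, the sample hits and the helper names are assumptions for illustration, and the hits would have to come from a real search response against your index.

```python
from collections import Counter


def build_search_body(field: str, project_field: str, size: int = 1000) -> dict:
    """Body for POST /<index>/_search: documents that contain `field`."""
    return {
        "size": size,
        "_source": [field, project_field],
        "query": {"exists": {"field": field}},
    }


def projects_sending_strings(hits: list, field: str, project_field: str) -> Counter:
    """Count hits per project where the original value of `field` was a string."""
    counts = Counter()
    for hit in hits:
        source = hit.get("_source", {})
        if isinstance(source.get(field), str):
            counts[source.get(project_field, "<unknown>")] += 1
    return counts


body = build_search_body("durationInSeconds", "project")

# Hypothetical hits, shaped like response["hits"]["hits"] from a search
sample_hits = [
    {"_source": {"project": "billing", "durationInSeconds": "10.5"}},
    {"_source": {"project": "checkout", "durationInSeconds": 3}},
    {"_source": {"project": "billing", "durationInSeconds": "n/a"}},
]
suspects = projects_sending_strings(sample_hits, "durationInSeconds", "project")
print(suspects.most_common())
```

Whether this pins down the culprit depends on the offending documents still being in that index and on every source actually setting the project field.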

You need to fix this in the mapping; changing the source is not a fix, as the mapping is created by Elasticsearch.
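A mapping fix of that kind could look roughly like the following. This Python sketch only builds the request bodies for a composable index template (PUT _index_template/<name>) and for a reindex (POST _reindex). The pattern `logs-*`, the index names and the choice of float are assumptions, so adjust them to your setup.

```python
import json


def build_index_template(pattern: str, field: str, field_type: str) -> dict:
    """Body for PUT _index_template/<name> that pins `field` to one type."""
    return {
        "index_patterns": [pattern],
        "template": {
            "mappings": {
                "properties": {field: {"type": field_type}}
            }
        },
    }


def build_reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for POST _reindex copying documents into a correctly mapped index."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


# Hypothetical names; the destination index must match the template's pattern
template = build_index_template("logs-*", "durationInSeconds", "float")
reindex = build_reindex_body("logs-2022.01.02", "logs-2022.01.02-fixed")

print(json.dumps(template, indent=2))
print(json.dumps(reindex, indent=2))
```

The template only applies to indices created after it exists, which is why the already conflicting indices need the reindex step. With the default coercion on numeric fields, a string such as "10.5" should be accepted by a float field, while a non-numeric string would be rejected.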

Exactly. I won't be able to identify the source this way. So a mapping template must be the only way. It's just that we have been running for years with no problem until now, so I wanted to avoid it if at all possible.