I have an open source project that has used Elasticsearch and Kibana to visualize DMARC data for almost a year now.
Recently, a user found a bug where I had mapped a field as long when it should be text (it went unnoticed for so long because the value is almost always "0").
So now I need to find a way to correct this field type in every user's index and Kibana index pattern, preferably in a fully automated way.
My attempt at doing this failed, but hopefully it gives you a good idea of what I'm going for:
from elasticsearch_dsl import Index


def migrate_indexes(aggregate_indexes=None, forensic_indexes=None):
    """
    Updates index mappings

    Args:
        aggregate_indexes (list): A list of aggregate index names
        forensic_indexes (list): A list of forensic index names
    """
    if aggregate_indexes is None:
        aggregate_indexes = []
    if forensic_indexes is None:
        forensic_indexes = []
    for aggregate_index_name in aggregate_indexes:
        aggregate_index = Index(aggregate_index_name)
        body = {
            "properties": {
                "published_policy.fo": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
        doc = "doc"
        fo_field = "published_policy.fo"
        fo = "fo"
        fo_mapping = aggregate_index.get_field_mapping(fields=[fo_field])[
            aggregate_index_name]["mappings"][doc][fo_field]["mapping"][fo]
        fo_type = fo_mapping["type"]
        if fo_type == "long":
            aggregate_index.put_mapping(doc_type=doc, body=body)
    for forensic_index in forensic_indexes:
        pass
The put_mapping call fails with:

elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'mapper [published_policy.fo] of different type, current_type [long], merged_type [text]')
What am I doing wrong, and how can I fix this mess?
You cannot change the type of a field once its mapping is in place; Elasticsearch refuses to merge the conflicting types, which is exactly the error you're seeing.
Your best option is to define an index template with the corrected mapping, so that future indices are created in the right format, and then reindex each existing index into a new one that picks up that template.
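A minimal sketch of that approach with the low-level elasticsearch-py client, assuming a typeless (7.x) cluster and an illustrative index pattern "dmarc_aggregate*" — the template name, pattern, and function names here are placeholders, not your project's actual ones. Reindexing coerces the existing long values (like 0) into the new text field:

```python
# Corrected multi-field mapping for published_policy.fo
FIXED_FO_MAPPING = {
    "properties": {
        "published_policy": {
            "properties": {
                "fo": {
                    "type": "text",
                    "fields": {
                        "keyword": {"type": "keyword", "ignore_above": 256}
                    },
                }
            }
        }
    }
}


def fix_fo_field(es, index_pattern="dmarc_aggregate*"):
    """Fix published_policy.fo by templating new indices and reindexing old ones."""
    # 1. Template so every future index matching the pattern gets the
    #    corrected mapping (legacy template API; name is illustrative)
    es.indices.put_template(
        name="dmarc_aggregate",
        body={"index_patterns": [index_pattern],
              "mappings": FIXED_FO_MAPPING},
    )
    # 2. Existing indices cannot be changed in place, so copy each one
    #    into a new index created with the fixed mapping
    for index in es.indices.get(index=index_pattern):
        fixed = index + "-fixed"
        es.indices.create(index=fixed, body={"mappings": FIXED_FO_MAPPING})
        es.reindex(
            body={"source": {"index": index}, "dest": {"index": fixed}},
            wait_for_completion=True,
        )
        # Only remove the old index once the copy has completed
        es.indices.delete(index=index)
```

After the reindex you would still need to refresh (or recreate) the Kibana index pattern so it picks up the new field type.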