Change field type from string to integer? and how to re-index?


(Michael Zoet) #1

Hi all Elasticsearch users,

In a newly created ELK setup I made the mistake of not explicitly setting the type to integer (or float) on some fields.

What do I need to do in Elasticsearch (and/or Kibana) after I have set everything up correctly in Logstash with the grok or mutate filter? As far as I understand I need to completely erase the existing index and create it new. Is this correct? Or is there another way to do this? Just deleting only the fields I need to change?
What will happen if I change the type in Logstash without changing anything on the Elasticsearch index?

And which is better to use grok or mutate?

Thx in advance,
Michael


(Magnus Bäck) #2

As far as I understand I need to completely erase the existing index and create it new. Is this correct? Or is there another way to do this? Just deleting only the fields I need to change?

Yes, you have to reindex. You can do that with Logstash (example configs have been posted in the past) or third-party tools like es-reindex. After reindexing to a new name (e.g. the original name with an underscore appended) you can delete the original index and create an alias named like the original index that points to the new index. Thereby everything will work as before.

What will happen if I change the type in Logstash without changing anything on the Elasticsearch index?

Then the next index that's created (the next day if you use daily indexes) will have the correct mappings.

And which is better to use grok or mutate?

Use for what, making a field an integer or float? Using the grok filter is typically easier, but not all fields are created by that filter.
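
The rename-then-alias step described above can be sketched in Python as the two requests it boils down to. This is a minimal sketch, not a tested migration: the helper name alias_swap_requests and the trailing-underscore index name are illustrative, the code only builds the requests and sends nothing, and it assumes the data has already been copied into the new index. The delete comes first because an alias cannot share a name with an existing index.

```python
import json


def alias_swap_requests(old_index, new_index):
    """Return (method, path, body) tuples for the delete-then-alias step.

    Hypothetical helper: it does not talk to Elasticsearch, it only
    describes the two requests so they can be reviewed before use.
    """
    add_alias = {"add": {"index": new_index, "alias": old_index}}
    return [
        # 1. Drop the original index (its data must already be in new_index).
        ("DELETE", "/" + old_index, None),
        # 2. Point an alias carrying the old name at the reindexed copy.
        ("POST", "/_aliases", {"actions": [add_alias]}),
    ]


if __name__ == "__main__":
    for method, path, body in alias_swap_requests(
        "logstash-2016.05.01", "logstash-2016.05.01_"
    ):
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Printing the requests first makes it easy to paste them into Sense and run them one at a time.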


(Michael Zoet) #3

Thanks again Magnus for the answer. You really helped me a lot with this setup! At least I have a plan now :-).

One last question: What problems might arise if I do not change anything in Elasticsearch and do not reindex, and only change the values in Logstash with grok and mutate? I have daily indexes (as recommended), and at the moment I do not care about the old data. The setup will go into production sometime next week, and old data is not of any interest at the moment.

Michael


(Magnus Bäck) #4

As long as you don't attempt aggregations or range queries that span days with both string values and integer/float values, you're going to be okay.
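
To see which daily indices would clash in such a query, one option is to compare the field types index by index. The sketch below assumes the mappings have already been fetched and flattened by hand into a plain {index: {field: type}} dictionary; that input shape and the function name find_type_conflicts are illustrative and are not the format Elasticsearch returns.

```python
from collections import defaultdict


def find_type_conflicts(field_types_by_index):
    """Return {field: {type: [indices]}} for fields mapped with more than one type.

    field_types_by_index is an assumed, hand-flattened structure:
    {"index-name": {"field-name": "type-name", ...}, ...}
    """
    seen = defaultdict(lambda: defaultdict(list))
    for index, fields in field_types_by_index.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(index)
    # Keep only fields that appear with at least two different types.
    return {
        field: {t: sorted(indices) for t, indices in by_type.items()}
        for field, by_type in seen.items()
        if len(by_type) > 1
    }


if __name__ == "__main__":
    mappings = {
        "logstash-2016.05.01": {"bytes": "string", "verb": "string"},
        "logstash-2016.05.02": {"bytes": "integer", "verb": "string"},
    }
    print(find_type_conflicts(mappings))
```

Any field this reports is one where a query or visualization spanning the listed indices mixes types.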


(Michael Zoet) #5

Thanks a lot! First I will go for the least effort and see where this goes.


(Florin Andrei) #6

So, what happened? Did the type magically change next day?


(Michael Zoet) #7

Yes and no ;-).

The type changed from the date on which I made the change. But as
expected, the older data kept its data type. The problem with this is
that you cannot correctly visualize any data for this index
field.
As I wanted to avoid reindexing all the data, I just changed the name
of the index field. But this only worked because the older data was not
important. Otherwise reindexing is a must.


(Andrii Cherkasov) #8

Sorry for bumping an old thread, but could you be so kind as to explain exactly what you did to get it solved?
I need to change one field from 'string' to 'integer' (the bytes field from Apache logs) but for some reason can't. I have added 'mutate { convert => { "bytes" => "integer" } }' to the appropriate Logstash conf but see no result. I also tried to create a new index with the correct values, but it seems that I have to somehow change the default index (logstash-*)...

I am fine with reindexing and/or removing all the data.
Feeling lost here :frowning:

Thank you very much for any info.


(Jeremy Colton) #9

Hi, I also have used mutate to change an existing, indexed field from a String to a Float.

I followed this article https://www.elastic.co/guide/en/elasticsearch/guide/current/reindex.html and ran the following in the Kibana Sense plugin:

GET /logstash-2016.05.*/_search?scroll=1m
{
    "query": {
        "range": {
            "date": {
                "gte":  "2016-05-01",
                "lt":   "2016-05-16"
            }
        }
    },
    "sort": ["_doc"],
    "size":  1000
}

I saw the following output:

{
  "_scroll_id": "cXVlcnlUaGVuRmV0Y2g7ODA7NjE2Nzg3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODAxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzg4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzg5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2NzkwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2NzkxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2NzkyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2NzkzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2Nzk5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODAwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODAyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODAzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODA5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODEwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODExOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODEyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODEzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODE5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODIwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODIxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODIyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODIzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODI5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODMwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODMxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODMyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODMzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODM0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODM1OnkzNU55V1Z2U015X2JERHdmVVZqN3c
7NjE2ODM2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODM3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODM4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODM5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODQ5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODUwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODUxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODUyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODUzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU3OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU4OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODU5OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODYwOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODYxOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODYyOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODYzOnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODY0OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODY1OnkzNU55V1Z2U015X2JERHdmVVZqN3c7NjE2ODY2OnkzNU55V1Z2U015X2JERHdmVVZqN3c7MDs=",
  "took": 26,
  "timed_out": false,
  "_shards": {
    "total": 80,
    "successful": 80,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

But the Settings link in Kibana still shows a conflict for the field which I changed from a String to a Float via Logstash's mutate filter.

Any ideas what I have done wrong/missed? Many thanks...


(Chris Earle) #10

@Jeremy_Colton

Probably worth creating your own thread, but offhand it appears that you searched the wrong field? Logstash creates a @timestamp field by default, not a date field. Perhaps you created the date field though.


(Jeremy Colton) #11

Why a new thread - the title is about re-indexing?

I took the example JSON from the Elasticsearch article... I took your advice and changed the search field from "date" to "@timestamp", but it still made no difference to my conflicted field in the Settings tab.

So I removed the search field altogether and ran:

GET /logstash-2016.05.*/_search?scroll=1m
{
    "sort": ["_doc"],
    "size":  1000
}
No difference. Any ideas? Many thanks...
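
A scroll search on its own only reads documents; the reindex approach in the linked guide pairs it with the bulk API, which writes each page of hits into a new index that has the corrected mapping. The sketch below shows only that missing write half as a pure function. The name hits_to_bulk and the target index name are illustrative, and sending the body to POST /_bulk, following the scroll ID, and creating the target index are all left out.

```python
import json


def hits_to_bulk(hits, target_index):
    """Turn one page of search hits into a newline-delimited _bulk body.

    Each hit becomes two lines: an "index" action line aimed at
    target_index, then the document source itself.
    """
    lines = []
    for hit in hits:
        action = {"index": {"_index": target_index, "_id": hit["_id"]}}
        if "_type" in hit:
            # Keep the mapping type when the hit carries one.
            action["index"]["_type"] = hit["_type"]
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    page = [
        {"_index": "logstash-2016.05.01", "_type": "logs", "_id": "1",
         "_source": {"bytes": "512"}},
    ]
    print(hits_to_bulk(page, "logstash-2016.05.01_"), end="")
```

An empty page, like the "hits": [] response shown earlier in the thread, produces an empty body, so there would be nothing to write.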

(Magnus Bäck) #12

Why a new thread - the title is about re-indexing?

That doesn't mean that this thread is suitable for all questions about reindexing.


(Jeremy Colton) #13

My apologies, I've created a new thread for my re-indexing issue:


#14

I am running into the same problem now. How can I solve it?


(Hanan) #15

I have posted the same question in another thread, "Number format Exception", but after reading this thread I thought it is related as well.

I have had the same issue recently. I have a production index mapping where I defined "auid" as "string"; later on I found it should be "integer", so I created a new index with "auid" mapped as integer and then used the reindex API to re-index data from the old index to newindex, but I am getting the exception below:

"failures" : [
  {
    "index" : "newindex",
    "type" : "author",
    "id" : "123",
    "cause" : {
      "type" : "mapper_parsing_exception",
      "reason" : "failed to parse [auid]",
      "caused_by" : {
        "type" : "number_format_exception",
        "reason" : "For input string: \"1111\""
      }
    },
    "status" : 400
  },
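
The exception does not say what is wrong with the value, so one way to narrow it down is to scan the source documents for "auid" values that would not fit an integer field before reindexing. The sketch below is an approximation under stated assumptions: the function name find_unparseable and the (id, source) input shape are illustrative, and the check (optional sign, ASCII digits only, 32-bit range) only roughly mirrors how Elasticsearch parses integers. Stray whitespace and out-of-range numbers are guesses at typical culprits, not a diagnosis of this particular failure.

```python
import re

_INT_RE = re.compile(r"[+-]?[0-9]+")
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def _fits_integer(value):
    """Rough check that a value would fit a 32-bit integer field."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return INT_MIN <= value <= INT_MAX
    if isinstance(value, str) and _INT_RE.fullmatch(value):
        return INT_MIN <= int(value) <= INT_MAX
    return False


def find_unparseable(docs, field):
    """Return (doc id, raw value) pairs whose field fails the rough check.

    docs is an assumed iterable of (id, source_dict) pairs; documents
    that do not contain the field at all are skipped.
    """
    return [
        (doc_id, source[field])
        for doc_id, source in docs
        if field in source and not _fits_integer(source[field])
    ]


if __name__ == "__main__":
    sample = [("1", {"auid": "1111"}), ("2", {"auid": " 1111"})]
    print(find_unparseable(sample, "auid"))
```

Documents flagged this way can be cleaned or excluded before the reindex is retried.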

