I'm running into what I assume is a common problem. Apologies in advance for the long post; reviewing my steps, I realize there are a lot of them. Hopefully fleshing out this process will result in something simpler for myself and others.
I'm trying to change my field types to make better use of my existing data. I've seen a lot of posts about this but I'm still struggling to reach a resolution, so I figured I'd document my steps where others can reuse the information.
I have an index alias httpd that refers to the last 35 days' worth of indices. I have corrected the type in my Logstash filter so that new documents come in with the proper field type, using the grok "int" suffix on the processing_time field, like so: %{INT:processing_time:int}. I had previously tested mutate convert to the same effect, for those following along at home who aren't using grok, but do note that the convert statement should be in a separate mutate block from any existing mutate blocks. This appears to be working, as I now have a mapping conflict on the httpd-* index pattern.
Now of course, I need to convert my previous indices so that I can properly aggregate on this new field type. I found a post which outlined the steps:
- Define an ingest pipeline that uses the convert processor.
- Use this ingest pipeline in the reindex operation.
After some more research to tailor this to my needs, I ended up with the following, partially working API requests:
PUT _ingest/pipeline/make-integer
{
  "description": "converts the field type of the defined field to an integer",
  "processors": [
    {
      "convert": {
        "field": "processing_time",
        "type": "integer",
        "ignore_failure": true
      }
    }
  ]
}
POST _reindex
{
  "source": {
    "index": "httpd-2019.10.01-000002"
  },
  "dest": {
    "index": "httpd-2019.10.01-000002-convert-integer-processing_time_1",
    "pipeline": "make-integer"
  },
  "conflicts": "proceed"
}
Please note that you should not need the "ignore_failure": true or "conflicts": "proceed" statements if your data is more standardized than mine. I only added them because a very small percentage of my log entries do not contain this value, which caused the job to fail partway through its batches.
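Since the reindex takes hours, it may be worth dry-running the pipeline first. Below is a minimal sketch using the simulate pipeline API. The two sample documents are made up for illustration: one carries the value as a string, the other lacks the field entirely.

```
# sample documents are hypothetical, not taken from the real index
POST _ingest/pipeline/make-integer/_simulate
{
  "docs": [
    { "_source": { "processing_time": "123" } },
    { "_source": { "message": "sample line without the field" } }
  ]
}
```

The first document should come back with processing_time as a number rather than a string, and the second should pass through unchanged because of ignore_failure. As far as I can tell, it is ignore_failure that lets documents without the field through, while "conflicts": "proceed" only affects version conflicts on the destination index. The convert processor also accepts "ignore_missing": true, which may be a narrower alternative since it only skips documents that lack the field.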
Two requests and a couple of hours later, I have a copy of the index with what I assume is a corrected field type. I checked the mapping with GET /httpd-2019.10.01-000002-convert-integer-processing_time_4/_mapping/field/processing_time, which shows the mapping type that I expect. The problem now is that the new index is throwing ILM errors. Initially the error looked like index [httpd] is not the write index for alias [httpd-2019.10.11-000012], but I found a guide that described how to create a new policy to age out old indices and apply it without any rollover or active policy. This appears to fit my needs, as I had already rolled the index over manually once the type change was implemented. However, my index now has the error illegal_argument_exception: index.lifecycle.rollover_alias [httpd] does not point to index [httpd-2019.10.01-000002-convert-integer-processing_time_4]. I see that others have corrected this with the update index alias API, but my request doesn't seem to do the trick: the error does not appear to change.
POST /_aliases
{
  "actions": [
    { "add": { "index": "httpd-2019.10.01-000002-convert-integer-processing_time_4", "alias": "httpd" } }
  ]
}
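One possible reason the error does not change: as far as I understand, an index that ILM has moved into an error step can stay there until the step is retried (this may depend on the Elasticsearch version), so fixing the alias alone might not clear the message. A sketch of inspecting the ILM state and retrying, assuming the alias update was accepted:

```
# show the current phase, action, step and failure reason for the index
GET httpd-2019.10.01-000002-convert-integer-processing_time_4/_ilm/explain

# ask ILM to retry the failed step
POST httpd-2019.10.01-000002-convert-integer-processing_time_4/_ilm/retry
```

If the policy attached to this index still contains a rollover action, setting index.lifecycle.indexing_complete to true on the index is documented as a way to have ILM skip rollover for it. That part is untested against this cluster.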
So for now, that's about where I'm stuck. I'd also like to get this index name back to its original httpd-2019.10.01-000002, which I assume is a simple fix, but I'd like to validate the type conversion before taking this step.
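One way to validate the conversion without touching the original index is to run a numeric aggregation against the copy. My understanding is that a stats aggregation is rejected on a text or keyword field, so a successful response is a reasonable sign that the field is numeric. The aggregation name processing_time_stats is arbitrary.

```
# "size": 0 suppresses the hits so only the aggregation result is returned
GET httpd-2019.10.01-000002-convert-integer-processing_time_4/_search
{
  "size": 0,
  "aggs": {
    "processing_time_stats": {
      "stats": { "field": "processing_time" }
    }
  }
}
```

The count inside the stats result covers only documents that have the field, so it can be lower than the index's total document count if some log lines lacked processing_time.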
Edit: I mistook block quotes for multiline preformatted text.
After some more research, it appears that renaming an index is actually NOT simple. Every approach amounts to copying the index and deleting the old one. Is it really not possible to make this field conversion happen without deleting the original data in one way or another? I understand why it must be done in the long run, but is there some way to test that everything is healthy before committing to a delete? At this point it seems like my best next step is to delete the original index and copy the converted index back over to the original name while hoping for the best.
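A sketch of one way to check things before committing to the delete, and then to get the original name back without a second reindex. It assumes the converted copy is the _4 index mentioned above, and it relies on an alias being usable in searches in place of an index name.

```
# compare document counts between the original and the converted copy
GET httpd-2019.10.01-000002/_count
GET httpd-2019.10.01-000002-convert-integer-processing_time_4/_count

# only once the counts and mapping look right: delete the original and,
# in the same atomic request, add an alias that carries its old name
POST /_aliases
{
  "actions": [
    { "remove_index": { "index": "httpd-2019.10.01-000002" } },
    { "add": { "index": "httpd-2019.10.01-000002-convert-integer-processing_time_4", "alias": "httpd-2019.10.01-000002" } }
  ]
}
```

This still deletes the original data, so taking a snapshot beforehand would be the safer route. The alias is also not a true rename: the index keeps its long name underneath.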