I have to modify an index mapping, and in the past this has felt more difficult and error-prone than it should be. Specifically, I have to modify a geo_point field. I'm going to use a JSON text editor, then delete the existing mapping and update to the new mapping using the console in Kibana. But editing mappings really requires a lot of care because of the JSON nesting... I guess I should copy the existing mapping, then modify it and post it back to the server? Are there any tools that facilitate this process? Thanks
> then delete the existing mapping
Unfortunately, while we can delete an index template, we can't actually delete the mappings of an existing index [1] (nor, in general, change the type of an existing field in place). The mapping is the thing that binds all of our data together; without it, Elasticsearch wouldn't know how to apply a new mapping to the pieces that are distributed across our cluster, so deleting a mapping is effectively the same as deleting all of the data in an index.
There are two paths forward, depending on whether or not downtime is acceptable.
- If you can stop your input stream, the Elasticsearch `_reindex` API may be a great option. It copies all documents from one index to another index that you have already created with the new mapping, optionally transforming each document with a script. After deleting the old index, you can add an index alias if clients still need to reach the data under the old name.
- Otherwise, you may need to do a phased migration. Since we can add fields to an existing index, we can dual-write the new representation to a new field alongside the existing old field, migrate the existing data, then update our application to use only the new field.
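For the first path, here is a minimal sketch of the requests involved, written as a small Python script that only prints them in the form the Kibana console accepts (it does not talk to a cluster). The index names `places-v1` and `places-v2`, the alias `places`, and the field `location` are hypothetical, and the typeless mapping body assumes Elasticsearch 7 or later.

```python
import json

# Hypothetical names for illustration; substitute your own.
OLD_INDEX = "places-v1"
NEW_INDEX = "places-v2"
ALIAS = "places"


def console(method, path, body=None):
    """Render a request as 'METHOD path' plus an optional JSON body."""
    lines = [f"{method} {path}"]
    if body is not None:
        lines.append(json.dumps(body, indent=2))
    return "\n".join(lines)


# 1. Create the destination index with the corrected mapping.
create_index = {"mappings": {"properties": {"location": {"type": "geo_point"}}}}

# 2. Copy every document across. A "script" entry could be added here
#    if each document needs to be reshaped on the way.
reindex = {"source": {"index": OLD_INDEX}, "dest": {"index": NEW_INDEX}}

# 3. Once the old index is deleted, point an alias at the new one.
alias = {"actions": [{"add": {"index": NEW_INDEX, "alias": ALIAS}}]}

steps = [
    console("PUT", NEW_INDEX, create_index),
    console("POST", "_reindex", reindex),
    console("DELETE", OLD_INDEX),
    console("POST", "_aliases", alias),
]
print("\n\n".join(steps))
```

Generating the bodies from code rather than hand-editing nested JSON is one way to reduce the bracket-matching mistakes the question mentions; the printed output can be pasted into the console and reviewed before running each step.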
For the phased migration, the phases look something like this:
starting state

we start in a state where applications are using field `foo`, but the mappings are insufficient for future use

prep phase

- index template updated to add new mapping `bar`
- existing indexes updated to add new mapping `bar`
- applications configured to dual-write to `foo` and new mapping `bar`

at this point, our applications are still reading from `foo`, but new data is being persisted to both `foo` and `bar`
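As a rough sketch of the prep phase, the script below prints the request that adds `bar` to an existing index and shows what a dual-written document could look like. The index name `places` and the sample document are hypothetical, and typing `bar` as `geo_point` with `foo` holding a `"lat,lon"` string is an assumption based on the question, not something the thread specifies.

```python
import json

INDEX = "places"  # hypothetical index name

# Adding a brand-new field to an existing index is allowed,
# so `bar` can be introduced while `foo` stays untouched.
add_bar_mapping = {"properties": {"bar": {"type": "geo_point"}}}


def dual_write(doc):
    """Return a copy of doc carrying the value in both the old and the new field."""
    out = dict(doc)
    out["bar"] = doc["foo"]  # apply any conversion the new mapping needs here
    return out


print(f"PUT {INDEX}/_mapping")
print(json.dumps(add_bar_mapping, indent=2))

# What the application would send to the index while both fields are live.
print(json.dumps(dual_write({"name": "cafe", "foo": "41.12,-71.34"})))
```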

migration phase

- Logstash, Elasticsearch `_update_by_query` API, or some other tool iterates over all documents, reading from `foo`, transforming and persisting field `bar` that has our new mapping

at this point, our applications are still reading from `foo`, but data has also been fully migrated to `bar`
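If `_update_by_query` is the tool of choice for the migration phase, the backfill might look roughly like the request printed below. The index name `places` is hypothetical, and the Painless script is a plain copy; a real migration would put whatever transformation the new mapping needs in its place.

```python
import json

INDEX = "places"  # hypothetical index name

backfill = {
    # Only touch documents that have the old field but not yet the new one,
    # so the request can be re-run if it is interrupted.
    "query": {
        "bool": {
            "filter": [{"exists": {"field": "foo"}}],
            "must_not": [{"exists": {"field": "bar"}}],
        }
    },
    # Painless script: copy (and, if needed, transform) foo into bar.
    "script": {"lang": "painless", "source": "ctx._source.bar = ctx._source.foo"},
}

# conflicts=proceed continues past documents that changed while the update ran.
print(f"POST {INDEX}/_update_by_query?conflicts=proceed")
print(json.dumps(backfill, indent=2))
```

Because dual-writing is already on, any document the application touches during the run gets `bar` from the application itself, which is why skipping version conflicts is a reasonable choice here.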

end phase

- applications updated to read from `bar`
- applications updated to stop dual-writing to `foo`, and only write to `bar`

at this point, we're fully using the new field `bar`, but there's a little cleanup left

post-migration cleanup

- index templates updated to remove mapping `foo` (this means new indices will never create the `foo` field)
- optionally, sweep through existing indices to unset `foo` since it is no longer used
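The optional sweep could also be done with `_update_by_query`. This is only a sketch with a hypothetical index name `places`; note that it removes the value of `foo` from each document's source, while the `foo` entry in an existing index's mapping stays in place.

```python
import json

INDEX = "places"  # hypothetical index name

unset_foo = {
    # Only documents that still carry the old field need to be rewritten.
    "query": {"exists": {"field": "foo"}},
    # Painless script: drop the old field from the stored document.
    "script": {"lang": "painless", "source": "ctx._source.remove('foo')"},
}

print(f"POST {INDEX}/_update_by_query?conflicts=proceed")
print(json.dumps(unset_foo, indent=2))
```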
Thanks very much, this helps a lot.