I would indeed use ingest pipelines for a case like this. For your example case, you can use the date processor with ignore_failure: true to extract a date and skip that step if parsing fails, then the set processor with a condition to copy the field a to a_string only if the field a_date does not exist, and finally a remove processor to remove the field a from the document.
Here's an example pipeline:
PUT _ingest/pipeline/maybe-dates
{
  "processors": [
    {
      "date": {
        "field": "a",
        "target_field": "a_date",
        "formats": ["yyyy-MM-dd"],
        "ignore_failure": true
      }
    },
    {
      "set": {
        "if": "ctx.a_date == null",
        "field": "a_string",
        "value": "{{a}}"
      }
    },
    {
      "remove": {
        "field": "a"
      }
    }
  ]
}
And here's a call to the simulate pipeline API with a couple of example docs demonstrating that it works.
POST _ingest/pipeline/maybe-dates/_simulate
{
  "docs": [
    {
      "_source": {
        "a": "asdf"
      }
    },
    {
      "_source": {
        "a": "2019-02-15"
      }
    }
  ]
}
Which should return something like:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "a_string" : "asdf"
        },
        "_ingest" : {
          "timestamp" : "2019-02-15T23:20:21.918Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "a_date" : "2019-02-15T00:00:00.000Z"
        },
        "_ingest" : {
          "timestamp" : "2019-02-15T23:20:21.918Z"
        }
      }
    }
  ]
}
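Once the simulation looks right, the pipeline still has to be applied when documents are actually indexed. A minimal sketch, assuming a hypothetical index called my-index (that name and the document ID are mine, not from your question), is to pass the pipeline name as a request parameter:

```
# my-index and the ID 1 are hypothetical placeholders
PUT my-index/_doc/1?pipeline=maybe-dates
{
  "a": "2019-02-15"
}

# Fetch the stored document to check which field was populated
GET my-index/_doc/1
```

If every document written to the index should go through the pipeline, it may be more convenient to set it as the index's default pipeline (the index.default_pipeline setting, if your version supports it) rather than naming it on each request.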
If you need other types, you can do something similar with the convert processor instead of date.
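As a rough sketch of that variation, here is the same pattern with convert targeting an integer. The pipeline name maybe-ints and the target field a_int are names I made up for illustration; the structure otherwise mirrors the date pipeline above, so treat it as a starting point rather than a tested configuration for your data.

```
# maybe-ints and a_int are hypothetical names chosen for this example
PUT _ingest/pipeline/maybe-ints
{
  "processors": [
    {
      "convert": {
        "field": "a",
        "target_field": "a_int",
        "type": "integer",
        "ignore_failure": true
      }
    },
    {
      "set": {
        "if": "ctx.a_int == null",
        "field": "a_string",
        "value": "{{a}}"
      }
    },
    {
      "remove": {
        "field": "a"
      }
    }
  ]
}
```

You can run it through the simulate API in the same way as before to confirm that numeric strings end up in a_int and everything else in a_string.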