Hi Miguel, by default every node has the ingest role enabled, so you can define ingest pipelines that process documents before they are indexed.
In your case, you can define an ingest pipeline that uses a script processor to extract the substring and store it in a new field on the document. In the example below, the pipeline is called extract-substring.
PUT _ingest/pipeline/extract-substring
{
  "description": "Extract a substring from a serial number",
  "processors": [
    {
      "script": {
        "source": """
          ctx.extractedSubstring = ctx.sourceField.substring(9, 13);
        """
      }
    }
  ]
}
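Before indexing anything, you may want to check the pipeline with the simulate API. The request below is only a sketch: it assumes the pipeline was created under the name extract-substring as above, and it reuses the sample sourceField value from this answer as test input.

```
POST _ingest/pipeline/extract-substring/_simulate
{
  "docs": [
    {
      "_source": {
        "sourceField": "20010525d18811888m y0pory01030103ba"
      }
    }
  ]
}
```

The response lists each test document as it would look after the pipeline has run, without writing anything to an index. Note that substring(9, 13) will throw an error for values shorter than 13 characters, so it is worth simulating a short value as well if your data can contain one.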
If you were to ingest a document and assign this pipeline like this:
PUT test/_doc/test-document?pipeline=extract-substring
{
  "sourceField": "20010525d18811888m y0pory01030103ba"
}
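Then the indexed document will keep the original sourceField and gain an extractedSubstring field containing "1881", which is the four characters at zero-based positions 9 through 12 (the end index 13 is exclusive).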