Partition and Index by Text Field Contents

I have an index with the following records.

text
The triangle is blue.
The square is red.
There are two red circles.
The triangle is green.

I want to partition them by the color mentioned in the text field. In other words, I want to add a color column like so.

text                          color
The triangle is blue.         blue
The square is red.            red
There are two red circles.    red
The triangle is green.        green

The color is determined by running a regular expression. Assume for simplicity's sake that exactly one color is mentioned in every text field. I want to perform a transform with color as the pivot field.

What is the best way to do this?

What do you think of this?

POST _ingest/pipeline/_simulate
{
  "docs": [
    { "_index": "foo", "_source": { "text": "The triangle is blue." } },
    { "_index": "foo", "_source": { "text": "The square is red." } },
    { "_index": "foo", "_source": { "text": "There are two red circles." } },
    { "_index": "foo", "_source": { "text": "The triangle is green." } }
  ],
  "pipeline": {
    "processors": [
      {
        "lowercase": {
          "field": "text",
          "target_field": "tmp"
        }
      },
      {
        "grok": {
          "field": "tmp",
          "patterns": [
            "%{COLOR:color}"
          ],
          "pattern_definitions": {
            "COLOR": "red|green|blue"
          }
        }
      },
      {
        "remove": {
          "field": "tmp"
        }
      }
    ]
  }
}

This gives:

{
  "docs": [
    {
      "doc": {
        "_index": "foo",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "color": "blue",
          "text": "The triangle is blue."
        },
        "_ingest": {
          "timestamp": "2025-03-19T20:32:02.568078692Z"
        }
      }
    },
    {
      "doc": {
        "_index": "foo",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "color": "red",
          "text": "The square is red."
        },
        "_ingest": {
          "timestamp": "2025-03-19T20:32:02.568106186Z"
        }
      }
    },
    {
      "doc": {
        "_index": "foo",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "color": "red",
          "text": "There are two red circles."
        },
        "_ingest": {
          "timestamp": "2025-03-19T20:32:02.568112315Z"
        }
      }
    },
    {
      "doc": {
        "_index": "foo",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "color": "green",
          "text": "The triangle is green."
        },
        "_ingest": {
          "timestamp": "2025-03-19T20:32:02.568117517Z"
        }
      }
    }
  ]
}

Yes, that's what I'm looking for. I'm still inexperienced with Grok.

How would I make this a runtime mapping instead? (I'm playing around but can't quite get the syntax.)
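
For reference, a runtime field along these lines might work. Treat it as a sketch only: it swaps grok for a plain Painless regex, it reads the text from _source because a text field has no doc values by default, and it assumes the index is named foo as in the simulate example above.

```console
# Sketch: search-time runtime field "color" (keyword), computed per document.
GET foo/_search
{
  "runtime_mappings": {
    "color": {
      "type": "keyword",
      "script": {
        "source": """
          String text = params._source['text'];
          if (text != null) {
            def m = /red|green|blue/.matcher(text.toLowerCase());
            if (m.find()) {
              emit(m.group());
            }
          }
        """
      }
    }
  },
  "fields": ["color"],
  "_source": false
}
```

Each hit should then carry a color entry under fields. Documents whose text mentions none of the three colors simply get no value.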

It's definitely better to do that at index time. I wouldn't use runtime fields for this, because a runtime field re-runs its script against every document at query time.
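
To do it at index time, the simulated pipeline can be stored and attached to the index, roughly as sketched here. The pipeline name extract-color, the transform id foo-by-color, the destination index name, and the mentions aggregation are made up for illustration. The sketch also assumes the index foo already exists and has no color field mapped yet.

```console
# Store the pipeline from the simulate call under a hypothetical name.
PUT _ingest/pipeline/extract-color
{
  "processors": [
    { "lowercase": { "field": "text", "target_field": "tmp" } },
    {
      "grok": {
        "field": "tmp",
        "patterns": ["%{COLOR:color}"],
        "pattern_definitions": { "COLOR": "red|green|blue" }
      }
    },
    { "remove": { "field": "tmp" } }
  ]
}

# Map color as keyword so it can be used in a terms group_by.
PUT foo/_mapping
{
  "properties": { "color": { "type": "keyword" } }
}

# Run the pipeline on every newly indexed document.
PUT foo/_settings
{
  "index.default_pipeline": "extract-color"
}

# Backfill documents that were indexed before the pipeline existed.
POST foo/_update_by_query?pipeline=extract-color

# Pivot on color; "mentions" counts the documents per color.
PUT _transform/foo-by-color
{
  "source": { "index": "foo" },
  "dest": { "index": "foo-by-color" },
  "pivot": {
    "group_by": {
      "color": { "terms": { "field": "color" } }
    },
    "aggregations": {
      "mentions": { "value_count": { "field": "color" } }
    }
  }
}

POST _transform/foo-by-color/_start
```

Note that the grok processor fails when none of its patterns match. If some documents might not mention a color, the processor would need ignore_failure or an on_failure handler.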