Foreach Ingest processor + conditional append processor

I have documents with fields that look like this (from Jaeger):

{
	"something": "some value",
	...
	"logs": [
		{
			"fields": [
				{
					"key": "error",
					"value": "This is an error message."
				},
				{
					"key": "info",
					"value": "Arbitrary info"
				}
			]
		}
	]
}

I am trying to add an ingest processor to create a new field log_values containing just the value of each field. So far I have this working perfectly with two foreach processors:

{
	"foreach": {
		"field": "logs",
		"processor": {
			"foreach": {
				"field": "_ingest._value.fields",
				"processor": {
					"append": {
						"field": "log_values",
						"value": ["{{_ingest._value.value}}"]
					}
				}
			}
		}
	}
}

This results in a new field being created in the indexed document:

{
	"something": "some value",
	...
	"logs": [...],
	"log_values": ["This is an error message.", "Arbitrary info"]
}

However, I only want to append logs where the key field is error. So I tried the obvious:

"append": {
	"field": "log_values",
	"if": "_ingest._value.key == 'error'",
	"value": ["{{_ingest._value.value}}"]
}

Elasticsearch doesn't like this, though: I now get errors such as Variable [_ingest] is not defined. I've made a few more attempts using ctx, but nothing has worked, and I'm not sure where to go next.

Is what I'm trying to do possible? Thank you for your time.

Try a script processor...

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": """
ctx.error_values = ctx.logs.stream().map(log -> 
  log.fields.stream()
    .filter(field -> field.key == "error")
    .map(field -> field.value)
    .collect(Collectors.toList())
  )
  .flatMap(l -> l.stream())
  .collect(Collectors.toList())"""
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "something": "some value",
        "logs": [
          {
            "fields": [
              {
                "key": "error",
                "value": "This is an error message."
              },
              {
                "key": "info",
                "value": "Arbitrary info"
              },
              {
                "key": "error",
                "value": "This is a second error message."
              }
            ]
          }
        ]
      }
    },
    {
      "_source": {
        "something": "some value",
        "logs": [
          {
            "fields": [
              {
                "key": "error",
                "value": "This is a third error message."
              },
              {
                "key": "info",
                "value": "Arbitrary info"
              }
            ]
          }
        ]
      }
    }
  ]
}
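For anyone less familiar with Java-style streams, the Painless script above just flattens logs into fields, keeps the entries whose key is "error", and collects their values. A rough Python sketch of the same logic (for illustration only, not part of the pipeline):

```python
# Sample document shaped like the Jaeger output above.
doc = {
    "something": "some value",
    "logs": [
        {
            "fields": [
                {"key": "error", "value": "This is an error message."},
                {"key": "info", "value": "Arbitrary info"},
                {"key": "error", "value": "This is a second error message."},
            ]
        }
    ],
}

# Flatten logs -> fields, keep only entries whose key is "error",
# and collect their values -- the same shape the Painless script builds.
doc["error_values"] = [
    field["value"]
    for log in doc["logs"]
    for field in log["fields"]
    if field["key"] == "error"
]

print(doc["error_values"])
```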

Note that the triple-double-quote syntax is from Kibana's console and might need to be replaced when used with curl.
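For curl, one option is to collapse the script into a single-line JSON string; Painless also accepts single-quoted string literals, which avoids escaping the inner double quotes. A sketch of the same script processor in that form (untested, same logic as above):

```json
"script": {
  "lang": "painless",
  "source": "ctx.error_values = ctx.logs.stream().map(log -> log.fields.stream().filter(field -> field.key == 'error').map(field -> field.value).collect(Collectors.toList())).flatMap(l -> l.stream()).collect(Collectors.toList())"
}
```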

Wow, thank you! I was trying to get it done with the built-in filters, but this works perfectly and is easier to maintain as well. I really need to learn how to use scripts in ES.

Thank you again!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.