(Ingest Pipeline) Grok on an Array using Foreach -> Pipeline processors

Hello,
My document structure looks like the example below, and the pipeline definitions come just after it.

I tried building a pipeline that uses a foreach processor to call a sub-pipeline in which I run a grok processor. The first problem was that there was no way to directly return the grok result, so at the end of the sub-pipeline I added a step that combines the extracted fields into a single entry and pushes it back into the _ingest._value field (the set/remove pair in SubPipeline below). That worked for the first item (my field value was now a JSON string stored on the first item of the array), but I still got the error below for every subsequent entry.

I then figured that removing those fields at the end of each SubPipeline run would stop grok from complaining, but the error didn't change.

Is there any way of doing something like this without having to enable Painless regex?
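
(For context, the only alternative I can see is a script processor along these lines, which requires setting script.painless.regex.enabled: true in elasticsearch.yml; that setting is exactly what I'd like to avoid. This is only a rough sketch of the idea, not something I'm running.)

{
  "script": {
    "if": "ctx.Detail?.AdditionalInfo?.UnityContainer != null",
    "source": """
      // Only works with script.painless.regex.enabled: true
      def parsed = [];
      for (def line : ctx.Detail.AdditionalInfo.UnityContainer) {
        def m = /\+ (.+?)(?: -> (.+?))?  '(.+?)'  (.+)$/.matcher(line);
        if (m.matches()) {
          parsed.add(['Interface': m.group(1), 'Mapped': m.group(2),
                      'Type': m.group(3), 'LifeTimeManager': m.group(4)]);
        }
      }
      ctx.Detail.AdditionalInfo.UnityContainer = parsed;
    """
  }
}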

The error is

(grok [Unity]) cannot set [0] with parent object of type [java.lang.String] as part of path [_ingest._grok_match_index.0]

{
  "doc" : {
    "_index" : "_index",
    "_type" : "_doc",
    "_id" : "_id",
    "_source" : {
      "pipeline_version" : 0.3,
      "event.provider" : "ProductA",
      "message" : "I love burmese cats!",
      "Detail" : {
        "ApplicationName" : "/UrlA/UrlB",
        "AdditionalInfo" : {
          "UnityContainer" : [
            "+ IUnityContainer  '[default]'  Container",
            "+ IStorage -> StorageSession  '[default]'  Singleton",
            "+ IRequestManagement -> RequestManagement  '[default]'  Transient"
          ]
        },
        "_pk_id" : {
          "20" : {
            "ef04" : "00aaa0aa00000a00.0000000000.0.0000000000.0000000000."
          },
          "3" : {
            "ef04" : "a00a0a00a0000a00.0000000000.0.0000000000.0000000000."
          }
        },
        "MVC_action" : "ActionA",
        "RolesClient" : "Non disponible",
        "ConteneurUnityCount" : "3",
        "MVC_area" : "AreaA"
      }
    }
  },
  "_ingest" : {
    "timestamp" : "2020-08-17T19:20:06.231195Z"
  }
}

"processors": [
  {
    "foreach": {
      "if": "ctx.Detail?.AdditionalInfo?.UnityContainer != null && ctx.Detail.AdditionalInfo.containsKey('UnityContainer')",
      "field": "Detail.AdditionalInfo.UnityContainer",
      "processor" : {"pipeline" : {"name": "SubPipeline"}}
    }
  }
]
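
(The processors array above is registered as its own pipeline; I'll call it MainPipeline here just as a placeholder name. The sample document at the top is one entry of the response I get from a simulate call along these lines, once the SubPipeline below has been created.)

POST _ingest/pipeline/MainPipeline/_simulate
{
  "docs": [
    {
      "_index": "_index",
      "_id": "_id",
      "_source": {
        "event.provider": "ProductA",
        "message": "I love burmese cats!",
        "Detail": {
          "ApplicationName": "/UrlA/UrlB",
          "AdditionalInfo": {
            "UnityContainer": [
              "+ IUnityContainer  '[default]'  Container",
              "+ IStorage -> StorageSession  '[default]'  Singleton",
              "+ IRequestManagement -> RequestManagement  '[default]'  Transient"
            ]
          }
        }
      }
    }
  ]
}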

PUT _ingest/pipeline/SubPipeline
{
    "processors": [
      {
        "grok": {
          "field": "_ingest._value",
          "patterns" : [
            "\\+ %{DATA:_ingest.Unity.Interface} -> %{DATA:_ingest.Unity.Mapped}  '%{DATA:_ingest.Unity.Type}'  %{DATA:_ingest.Unity.LifeTimeManager}$",
            "\\+ %{DATA:_ingest.Unity.Interface}  '%{DATA:_ingest.Unity.Type}'  %{DATA:_ingest.Unity.LifeTimeManager}$"
          ],
          "trace_match" : true,
          "on_failure" : [{"set" : {"field" : "error", "value" : "{{error}} || (grok [Unity]) {{ _ingest.on_failure_message }}"}}]
        }
      },
      {"set":{"field": "_ingest._value", "value":"{{_ingest.Unity}}"}},
      {"remove":{"field": "_ingest.Unity", "ignore_missing":true}}
    ]
  }

After looking more closely at similar errors, I realised that the problem was not with my processing but with "_ingest._grok_match_index", which is automatically set when "trace_match": true is used on the grok processor. Setting "trace_match" to false, or removing that field before grok runs on the next array entry, fixes the situation.
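
For anyone hitting the same error, this is roughly what the fixed SubPipeline looks like for me: the same processors as above, plus one extra remove so that _ingest._grok_match_index is gone before grok runs on the next array entry (simply dropping "trace_match" works just as well).

PUT _ingest/pipeline/SubPipeline
{
  "processors": [
    {
      "grok": {
        "field": "_ingest._value",
        "patterns" : [
          "\\+ %{DATA:_ingest.Unity.Interface} -> %{DATA:_ingest.Unity.Mapped}  '%{DATA:_ingest.Unity.Type}'  %{DATA:_ingest.Unity.LifeTimeManager}$",
          "\\+ %{DATA:_ingest.Unity.Interface}  '%{DATA:_ingest.Unity.Type}'  %{DATA:_ingest.Unity.LifeTimeManager}$"
        ],
        "trace_match" : true,
        "on_failure" : [{"set" : {"field" : "error", "value" : "{{error}} || (grok [Unity]) {{ _ingest.on_failure_message }}"}}]
      }
    },
    {"set":{"field": "_ingest._value", "value":"{{_ingest.Unity}}"}},
    {"remove":{"field": "_ingest.Unity", "ignore_missing":true}},
    {"remove":{"field": "_ingest._grok_match_index", "ignore_missing":true}}
  ]
}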
