Flatten JSON Array in Logstash Filter

I have a REST API call that returns the structure below. When I send this structure to Elasticsearch, all of the records within the "data" array are combined into one Elasticsearch document.

{
    "data": [
        {
            "updateDate": "2017/12/13 19:53",
            "id": "1234-ABCDE",
            "title": "Foo",
            "status": "ACTIVE",
            "key": {
                "number": 100,
                "version": 2,
                "year": 2017
            }
        },
        {
            "updateDate": "2017/12/14 10:22",
            "id": "4567-EFGHI",
            "title": "Bar",
            "status": "INACTIVE",
            "key": {
                "number": 200,
                "version": 5,
                "year": 2018
            }
        }
    ]
}

My goal is to manipulate the data to the following output so each "data" element is put into its own Elasticsearch document.

{
    "updateDate": "2017/12/13 19:53",
    "id": "1234-ABCDE",
    "title": "Foo",
    "status": "ACTIVE",
    "key": {
        "number": 100,
        "version": 2,
        "year": 2017
    }
},
{
    "updateDate": "2017/12/14 10:22",
    "id": "4567-EFGHI",
    "title": "Bar",
    "status": "INACTIVE",
    "key": {
        "number": 200,
        "version": 5,
        "year": 2018
    }
}

Is this possible in a filter? I've used the SPLIT plugin to break out each "data" element into its own document, but that results in fields that are still nested under a "data" element like this:

{
  "_index": "FOOBAR",
  "_type": "doc",
  "_id": "STdrPmIBPlj91gLKwYBv",
  "_score": 1,
  "_source": {
    "data": {
      "updateDate": "2017/12/13 19:53",
      "id": "1234-ABCDE",
      "title": "Foo",
      "status": "ACTIVE",
      "key": {
        "number": 100,
        "version": 2,
        "year": 2017
      }
    },
    "@version": "1",
    "@timestamp": "2018-03-19T13:21:11.794Z"
  }
}

You can use a mutate filter to move the fields to the top level of the document. If the field names aren't known beforehand, you can use a ruby filter to iterate over them and move them all.
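For the ruby-filter approach, a sketch along those lines (using Logstash's `event.get`/`event.set`/`event.remove` event API, available in Logstash 5.x and later) could look like this:

```
filter {
  split {
    field => "[data]"
  }

  ruby {
    # Move every key under [data] to the top level, then drop [data].
    code => "
      event.get('data').each { |k, v| event.set(k, v) }
      event.remove('data')
    "
  }
}
```

This version doesn't need the field names listed out, so it keeps working if the API adds fields later.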


Thank you for the guidance! For the benefit of future readers, this is my working filter:

filter {

  split {
    field => "[data]"
  }

  mutate {
    add_field => {
      "id" => "%{[data][id]}"
      "updateDate" => "%{[data][updateDate]}"
      "title" => "%{[data][title]}"
      "status" => "%{[data][status]}"
    }

    remove_field => [ "[data]" ]
  }

}

It results in the structure below:

{
  "_index": "FOOBAR",
  "_type": "doc",
  "_id": "lDexPmIBPlj91gLKC7N7",
  "_score": 1,
  "_source": {
    "@timestamp": "2018-03-19T14:36:52.762Z",
    "updateDate": "2017/12/13 19:53",
    "title": "Foo",
    "status": "ACTIVE",
    "id": "1234-ABCDE",
    "@version": "1"
  }
}
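One caveat with the add_field approach above: it copies only the four scalar fields, so the nested "key" object is dropped (as the result shows). If you also want to keep "key", a mutate rename should move each field intact, nested objects included (an untested sketch of the same filter):

```
filter {
  split {
    field => "[data]"
  }

  mutate {
    rename => {
      "[data][id]" => "id"
      "[data][updateDate]" => "updateDate"
      "[data][title]" => "title"
      "[data][status]" => "status"
      "[data][key]" => "key"
    }
    remove_field => [ "[data]" ]
  }
}
```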

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.