Logstash json input file only required fields to output

I'm new to ELK and I have logs in JSON format. Below is a sample JSON log file. I only want the item-level array; the other fields are not required.
Sample logs

{
  "source": "mdm/pim",
  "topic": "pim-record-globalfields",
  "subject": "record",
  "eventType": "PIM.Export.FileCreated.Incremental",
  "eventTime": "2023-08-23T05:48:19Z",
  "data": {
    "item": [
      {
        "id": "373892",
        "crud": "Update",
        "fields": {
          "partid": "0004400000",
          "displaypartid": "0004400000"
        }
      },
      {
        "id": "373895",
        "crud": "Update",
        "fields": {
          "partid": "0006200000",
          "displaypartid": "0006200000"
        }
      }
    ]
  }
}

Configuration file:

input {
  file {
    path => "C:/narayan regal work/sample json files/sample2.json"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => "json"
  }
}

filter {
  json {
    source => "message"
  }
  mutate {
    remove_field => [ "@version", "source", "host", "log", "@timestamp", "event" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "samplejsonforlogstash9"
    document_type => "json"
    document_id => "%{id}"
  }
  stdout {}
}

Instead of removing fields one by one, I want to keep only the required fields.

I only want the item-level array; the upper-level fields (source, topic, through data) are not required.

You could try

    mutate { rename => { "[data][item]" => "item" } }
    prune { whitelist_names => [ "item" ] }
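In plain Ruby, the combined effect of those two filters on a parsed event can be sketched like this (the `event` hash is a stand-in for the Logstash event; note that `prune` actually matches field names as regexes, while this sketch treats `whitelist_names` as an exact match):

```ruby
require 'json'

# Parsed event as the json codec would produce it, plus a
# metadata field added by the pipeline.
event = {
  "source"   => "mdm/pim",
  "topic"    => "pim-record-globalfields",
  "data"     => { "item" => [{ "id" => "373892", "crud" => "Update" }] },
  "@version" => "1"
}

# mutate { rename => { "[data][item]" => "item" } }
event["item"] = event["data"].delete("item")

# prune { whitelist_names => [ "item" ] } -- drop everything else
event.select! { |k, _| k == "item" }

puts JSON.pretty_generate(event)
```

After this, only the `item` array remains on the event, which matches the output shown in the next post.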

Thanks for the quick response, it is working. I am getting JSON output like this:

[
    {
        "item": [
            {
                "fields": {
                    "partid": "0004400000",
                    "displaypartid": "0004400000"
                },
                "id": "373892",
                "crud": "Update"
            },
            {
                "fields": {
                    "partid": "0006200000",
                    "displaypartid": "0006200000"
                },
                "id": "373895",
                "crud": "Update"
            }
        ]
    }
]

But now I need the output JSON to look like the example below. Please advise how to achieve this.

[
  {
    "fields": {
      "partid": "0004400000",
      "displaypartid": "0004400000"
    },
    "id": "373892",
    "crud": "Update"
  },
  {
    "fields": {
      "partid": "0006200000",
      "displaypartid": "0006200000"
    },
    "id": "373895",
    "crud": "Update"
  }
]

If you have an array in which the first entry is a hash that has a key called "item" that is an array of hashes and you want to move that to overwrite the outer array you could use

mutate { rename => { "[someField][0][item]" => "someField" } }

Can you please send me the exact code? I was not able to understand. Thank you.

I do not understand what your events look like. Can you show one using

output { stdout { codec => rubydebug } }

This is the output:

{
    "item" => [
        [0] {
              "crud" => "Update",
                "id" => "373892",
            "fields" => {
                "displaypartid" => "0004400000",
                       "partid" => "0004400000"
            }
        },
        [1] {
              "crud" => "Update",
                "id" => "373895",
            "fields" => {
                "displaypartid" => "0006200000",
                       "partid" => "0006200000"
            }
        }
    ]
}

But I need it in this format:

[
  {
    "fields": {
      "partid": "0004400000",
      "displaypartid": "0004400000"
    },
    "id": "373892",
    "crud": "Update"
  },
  {
    "fields": {
      "partid": "0006200000",
      "displaypartid": "0006200000"
    },
    "id": "373895",
    "crud": "Update"
  }
]

This is the config file I am using after your suggestion:

input {
  file {
    path => "C:/narayan regal work/sample json files/sample3.json"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => "json"
  }
}

filter {
  mutate {
    rename => { "[data][item]" => "item" }
  }
  prune {
    whitelist_names => [ "item" ]
  }
}

output { stdout { codec => rubydebug } }

I do not understand what you mean by that. It appears to be an array of hashes, but the array has to have a name, which is currently [item]. The field has to have a name.

I want to insert only the elements inside the "item" field into Elasticsearch, each as an individual doc, using their own id as the Elasticsearch doc id.
The docs I want to insert into Elasticsearch are as follows:

[
  {
    "id": "373892",
    "crud": "Update",
    "fields": {
      "partid": "0004400000",
      "displaypartid": "0004400000"
    }
  },
  {
    "id": "373895",
    "crud": "Update",
    "fields": {
      "partid": "0006200000",
      "displaypartid": "0006200000"
    }
  }
]

This is the JSON file I have right now:

{
  "source": "mdm/pim",
  "topic": "pim-record-globalfields",
  "subject": "record",
  "eventType": "PIM.Export.FileCreated.Incremental",
  "eventTime": "2023-08-23T05:48:19Z",
  "data": {
    "item": [
      {
        "id": "373892",
        "crud": "Update",
        "fields": {
          "partid": "0004400000",
          "displaypartid": "0004400000"
        }
      },
      {
        "id": "373895",
        "crud": "Update",
        "fields": {
          "partid": "0006200000",
          "displaypartid": "0006200000"
        }
      }
    ]
  }
}

Thanks @Badger, I was able to resolve the issue with your guidance when the JSON file is on a single line.

input {
  file {
    path => "C:/narayan regal work/ELK data/pim-data/*.json"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => json
  }
}

filter {
  mutate {
    rename => { "[data][item]" => "item" }
  }
  split {
    field => "[item]"
  }
  prune {
    whitelist_names => [ "item" ]
  }
  ruby {
    code => '
      event.get("item").each { |k, v|
        event.set(k, v)
      }
      event.remove("item")
    '
  }
}

output {
  if "_rubyexception" not in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "samplejsonforlogstash5"
      document_type => "json"
      document_id => "%{id}"
      doc_as_upsert => true
      action => "update"
    }
    stdout {}
  }
}
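The split-then-flatten step in that pipeline can be sketched in plain Ruby (the hashes are hypothetical stand-ins for Logstash events; the `ruby` filter in the config above performs the same per-event flattening):

```ruby
# Event after rename: one "item" field holding the whole array.
event = { "item" => [
  { "id" => "373892", "crud" => "Update" },
  { "id" => "373895", "crud" => "Update" }
] }

# split { field => "[item]" }: one event per array element.
events = event["item"].map { |elem| { "item" => elem } }

# ruby filter: hoist the keys of "item" to the top level,
# then drop the "item" wrapper itself.
docs = events.map do |e|
  e["item"].each { |k, v| e[k] = v }
  e.delete("item")
  e
end

docs.each { |d| puts d.inspect }
```

Each resulting doc carries its own `id`, which is what `document_id => "%{id}"` uses in the elasticsearch output.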

@Badger How can I convert a multi-line JSON file to a single-line JSON file?

{
  "source": "logs",
  "data": {
    "item": [
      {
        "id": "101",
        "crud": "Create",
        "fields": {
          "partid": "101"
        },
        "logistics": {
          "weight": "6.10",
          "height": "4"
        }
      },
      {
        "id": "102",
        "crud": "Create",
        "fields": {
          "partid": "102"
        },
        "logistics": {
          "weight": "0.62",
          "height": "3"
        }
      },
      {
        "id": "103",
        "crud": "Create",
        "fields": {
          "partid": "103"
        },
        "logistics": {
          "weight": "4.88",
          "height": "2"
        }
      }
    ]
  }
}
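The thread closed before this was answered; one way to collapse a pretty-printed JSON document to a single line, so the file input with `codec => json` sees one event per line, is to parse and re-serialize it. A minimal sketch using Ruby's standard JSON library (not from the thread):

```ruby
require 'json'

# Parse a multi-line JSON string and emit it as compact,
# single-line JSON.
def to_single_line(json_text)
  JSON.generate(JSON.parse(json_text))
end

multiline = <<~JSON
  {
    "data": {
      "item": [ { "id": "101" } ]
    }
  }
JSON

puts to_single_line(multiline)
```

To apply this to files, read with `File.read` and write back with `File.write`. From the command line, `jq -c . in.json > out.json` achieves the same result.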

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.