Parse json Array input


(Mehdi AOUADI) #1

I am retrieving some JSON data from a REST API using the http_poller input plugin:

http_poller {
  
  urls => {
     "myurl" => "https://myAPI"
  }
  interval => 30
  type => "myType"
  add_field => {
     "tag" => "myTag"
  }

}

This returns JSON-formatted data:
{"data_from_cache": false, "logs": [{"protocol": "PESIT", "processed": false }]}

I need this data to be indexed in Elasticsearch, and I have already prepared a mapping:

{
    "my_mapping" : {
      "properties" : {
         "protocol"  : { "type": "string" },
         "processed" : { "type": "boolean" },
         "tag"       : { "type": "string" }
      }
    }
}

The indexed data is not structured like the mapping. With this configuration, the data ends up in Elasticsearch as follows:

data_from_cache: "false"
logs: "{ "protocol": "PESIT", "processed": false} "

I need "protocol" and "processed" to be indexed as separate fields, as described in the mapping. I do not need the data_from_cache field, only the data inside logs. How can I do that? Should I use a json filter or a json codec?


(Magnus Bäck) #2

It's unclear exactly how the data is stored in Elasticsearch (use a stdout { codec => rubydebug } output to make things unambiguous), but you may have to add codec => json to your http_poller input. Additionally, you need a mutate filter that moves the protocol and processed subfields to the top level and deletes the undesired data_from_cache field.
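
A minimal sketch of that debugging setup, assuming the http_poller input and placeholder URL from the first post. It adds the json codec to the input and prints every event with the rubydebug codec, so the actual field structure can be checked before anything is sent to Elasticsearch:

```
input {
  http_poller {
    urls => {
      "myurl" => "https://myAPI"
    }
    interval => 30
    # Decode the response body as JSON instead of keeping it as one string
    codec => "json"
  }
}

output {
  # Print each event in full so nested fields such as [logs] can be inspected
  stdout { codec => rubydebug }
}
```

The printed event shows whether logs arrives as a plain string or as a nested structure, which determines which filters are needed next.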


(Mehdi AOUADI) #3

I had already added a json codec. In the end I used a split filter to turn each element of the logs array into its own event, and a mutate filter to delete the unused fields and rename the others so that the field names lose their parent.child form (logs.protocol and logs.processed). Here is my final config:

input {
  http_poller {
    urls => {
      "myurl" => "https://myRestAPIurl"
    }
    interval => 30
    type => "mytype"
    add_field => {
      "tag" => "myTag"
    }
    codec => "json"
  }
}
filter {
  if [tag] == "myTag" {
    # Emit one event per element of the logs array
    split {
      field => "logs"
    }
    mutate {
      remove_field => [ "data_from_cache" ]
    }
    # Move the nested fields to the top level
    mutate {
      rename => { "[logs][protocol]" => "protocol" }
      rename => { "[logs][processed]" => "processed" }
    }
  }
}
output {
  if [tag] == "myTag" {
    elasticsearch {
      hosts => [ "localhost:9200" ]
      # Elasticsearch index names must be lowercase
      index => "myindex"
      document_type => "myType"
    }
  }
}
