Merge JSON from two URLs into one doc

Hi,

I use the Logstash http_poller input, but I have two URLs and this creates two docs...

I want to create only one doc with all the fields from these URLs. Is that possible?

input {
	http_poller {
		urls => {
			station_information => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_information.json"
			station_status => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_status.json"
		}
		request_timeout => 60
		schedule => { every => "10s" }
		codec => "json"
	}
}

filter {
	split {
		field => "[data][stations]"
	}
}

output {
	elasticsearch {
		hosts => "x.x.x.x:9200"
		user => "xxxx"
		password => "xxxx"
		document_type => "logs"
		index => "name-%{+YYYY.MM}"
	}
}

Yes, you can do that.
You want to have something like:

  {
    "station_id": "11249439",
    "num_bikes_available": 5,
    "num_docks_available": 0,
    "is_installed": 1,
    "is_renting": 1,
    "is_returning": 1,
    "last_reported": 1559300580,
    "name": "Durstexpress",
    "short_name": "4108",
    "lat": 51.384999453022,
    "lon": 12.39079819381,
    "region_id": "1"
  }

right?

That will be one document, in my understanding.
Will other stations each create their own document?

Yes!! How can I do this??

And yes, one document per station...

Hold on, writing the answer and testing.

There you go:
The easiest solution is to index the documents with the same ID into Elasticsearch and let it handle the merge there.

input {
	http_poller {
	urls => {
		station_information => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_information.json"
		station_status => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_status.json"
		}
	request_timeout => 60
	schedule => { every => "10s"}
	codec => "json"
	}
}

filter {
	if [data][stations][1]{
		split {
			field => "[data][stations]"
		}
	}

	fingerprint {
		method => "MD5"
		concatenate_sources => true
		source => ["[stations][station_id]"]
		target => ["fingerprint"]
	}
}

output {
	elasticsearch {
		hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
		index => "stations"
		action => "update"
		document_id => "%{fingerprint}"
	}
}
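The reason the merge works is that both feeds produce the same fingerprint for the same station, so the station_status event and the station_information event target the same document ID. A rough sketch in Python of what the fingerprint filter computes (an approximation for illustration; the exact string Logstash hashes with concatenate_sources may differ):

```python
import hashlib

def fingerprint(station_id: str) -> str:
    # MD5 over the station_id, mimicking the fingerprint filter above.
    return hashlib.md5(station_id.encode("utf-8")).hexdigest()

# Events from both feeds carry the same station_id, so they hash to
# the same document ID and Elasticsearch updates one document
# instead of creating two.
status_event_id = fingerprint("11249439")
info_event_id = fingerprint("11249439")
assert status_event_id == info_event_id
```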


Good luck with your endeavor!

With your .conf I get this error:

[WARN ] 2019-06-03 08:48:46.784 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"9894f753eabb697a9578eedd6a749d29", :_index=>"stations", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0x2c15d3e>], :response=>{"update"=>{"_index"=>"stations", "_type"=>"doc", "_id"=>"9894f753eabb697a9578eedd6a749d29", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][9894f753eabb697a9578eedd6a749d29]: document missing", "index_uuid"=>"E4B5M576Syi8RQ2ht9_a0g", "shard"=>"0", "index"=>"stations"}}}}

[WARN ] 2019-06-03 08:48:46.785 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"9894f753eabb697a9578eedd6a749d29", :_index=>"stations", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0x9b6f80e>], :response=>{"update"=>{"_index"=>"stations", "_type"=>"doc", "_id"=>"9894f753eabb697a9578eedd6a749d29", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][9894f753eabb697a9578eedd6a749d29]: document missing", "index_uuid"=>"E4B5M576Syi8RQ2ht9_a0g", "shard"=>"0", "index"=>"stations"}}}}

I need to create one doc with all the information per scheduled run (every 60s, for example) because I need to build a histogram or pie chart, etc...

Hey.
Apologies, I pasted a different version of the config.

The working version is:

input {
        http_poller {
        urls => {
                station_information => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_information.json"
                station_status => "https://gbfs.nextbike.net/maps/gbfs/v1/nextbike_le/de/station_status.json"
                }
        request_timeout => 60
        schedule => { every => "10s"}
        codec => "json"
        }
}

filter {
        if [data][stations][1]{
                split {
                        field => "[data][stations]"
                }
        }

        fingerprint {
                method => "MD5"
                concatenate_sources => true
                source => ["[data][stations][station_id]"]
                target => ["fingerprint"]
        }
}

output {
        elasticsearch {
                hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
                index => "stations"
                document_id => "%{fingerprint}"
                action => "update"
                doc_as_upsert => "true"
        }
}

With this configuration I do not get any of the warnings or errors you pasted above.
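For anyone wondering why doc_as_upsert fixes the 404: with action => "update", Elasticsearch merges the partial document into whatever already exists under that ID, and doc_as_upsert creates the document on first sight instead of failing with document_missing_exception. Conceptually it behaves something like this plain-Python sketch (a toy model, not the actual Elasticsearch implementation; the IDs and field values are just illustrative):

```python
# Toy model of update + doc_as_upsert: partial docs arriving under
# the same _id are created on first write, then shallow-merged.
index = {}

def upsert(doc_id, partial):
    # Create the document if missing (doc_as_upsert), else merge fields.
    index.setdefault(doc_id, {}).update(partial)

# One poll delivers the static station_information fields ...
upsert("9894f753", {"station_id": "11249439", "name": "Durstexpress"})
# ... and the station_status fields merge into the same document.
upsert("9894f753", {"station_id": "11249439", "num_bikes_available": 5})

# The index now holds one document with fields from both feeds.
```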

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.