The target directive for http_poller is a little... weird. I would avoid it and rely on the codec putting the message in the message field; if you need it to end up in data, you can rename it in a filter immediately after the input.
Without the target directive, the Event is created by the codec; most simple codecs (like multiline) capture the message and create an Event with the captured text in the message field, alongside the usual @timestamp, @version, and @metadata.
When the target is set, the codec still creates that same Event, but then http_poller converts the Event to a Hash (which throws out the @metadata) and puts the result in a new event under the target field.
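A minimal sketch of that rename, assuming the codec leaves the payload in message and you want it in data. The URL, the schedule, and the plain codec are placeholders of mine, not details from your setup, and the scheduling option may differ depending on your http_poller version:

```
input {
  http_poller {
    # Hypothetical feed URL and polling schedule, for illustration only
    urls => { feed => "http://example.com/feed.txt" }
    schedule => { every => "1h" }
    codec => "plain"
    # No `target` set: the codec's output stays in [message]
  }
}

filter {
  mutate {
    # Move the codec's output from [message] to [data]
    rename => { "message" => "data" }
  }
}
```

Doing the rename in a filter keeps the event the codec built, including its @metadata, instead of having the input rebuild it.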
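If you want to see the difference for yourself, one way (a sketch, not the only way) is to dump events to stdout with the rubydebug codec and its metadata option switched on, then run the pipeline once with target and once without:

```
output {
  stdout {
    # metadata => true also prints the [@metadata] field,
    # which rubydebug hides by default
    codec => rubydebug { metadata => true }
  }
}
```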
Good stuff yaauie, thanks for the info. I BELIEVE I started without the target option and was encountering the same issue. Unfortunately, after I requested information from the link provider, they took the feed down for maintenance; I guess I notified them of an issue they weren't aware of. So right now I have neither anything to test against nor the time to go out and find another feed. I might be able to later tonight, though.
Another question: if the page is only updated, say, once a day, but I poll it every hour, will I end up with duplicate entries, or does http_poller (or some other input/codec/filter) have the ability to track and process only changed data?
No, but if you save the data in an ES document with a fixed ID, you'll overwrite the same document again and again.
Use the elasticsearch output's document_id option. Not sure I understand why you want to create a new document at fixed intervals (regardless of whether the source has changed), but you could do e.g.
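One possible shape for this, as a sketch only: the host, index name, and ID pattern below are my own assumptions. Building document_id from the event timestamp gives you one document per time bucket, and every poll inside that bucket overwrites the same document:

```
output {
  elasticsearch {
    # Hypothetical host and index name
    hosts => ["localhost:9200"]
    index => "feed"
    # %{+YYYY.MM.dd} is formatted from the event's @timestamp (UTC),
    # so hourly polls on the same day all write to the same document
    document_id => "feed-%{+YYYY.MM.dd}"
  }
}
```

With a constant string as document_id instead (for example "feed-latest"), you would keep a single document that always holds the most recent poll.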