I don't want to post my company data, but the above query (with the trailing ">" removed) returns valid JSON. The thing is, I have a couchapp hosted on the CouchDB instance, so some of the returned keys and values look like HTML. Maybe that's what chokes it?
I found the offending key-value pair: "selectors":{"a[href=#logout]":{"click":["doLogout"]}}
(There are a few more that say login, signup, etc.)
So, is this key invalid? I thought a JSON key could be any string. Or is logstash trying to interpret the key in some way? If so, I need to disable that. How can I do that?
That is a known issue. The key is valid JSON, but the code uses event.set, and there is no way to stop event.set from parsing field references, so the square brackets in a key like a[href=#logout] are read as field reference syntax.
If the set of offending keys is fixed, you could rewrite them with mutate+gsub. If it is dynamic, you would have to use a ruby filter (reusing code from the json filter) and iterate over the resulting keys, mutating them where needed before making the event.set call.
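As a rough illustration of the "iterate over the resulting keys" idea, here is a minimal plain-Ruby sketch. It assumes the raw JSON is available as a string (which is not the case with the couchdb_changes input) and the helper name `sanitize_keys` and the choice of replacing brackets with parentheses are made up for this example. It is not the json filter's code.

```ruby
require 'json'

# Hypothetical helper: recursively rewrite hash keys so that '[' and ']'
# become '(' and ')'. The replacement characters are an arbitrary choice.
def sanitize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), out|
      out[key.to_s.tr('[]', '()')] = sanitize_keys(child)
    end
  when Array
    # Only keys are rewritten; array elements are walked but strings are kept.
    value.map { |item| sanitize_keys(item) }
  else
    value
  end
end

raw = '{"selectors":{"a[href=#logout]":{"click":["doLogout"]}}}'
doc = sanitize_keys(JSON.parse(raw))
puts doc.to_json
```

Inside a ruby filter, the sanitized hash could then be handed to event.set one top-level key at a time, since none of the keys would contain brackets any more.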
Same result. So it seems I need to filter before the input. Changing the order of the input block and the filter block didn't help. It would be great if I could tell the input plugin to skip that one document, but there doesn't seem to be such an option.
Oh, I see. The couchdb_changes input is parsing the JSON. No way to avoid the exception with that input. Maybe switch to an http_poller input and do the mutate before parsing it with a json filter.
The http_poller hasn't fired even once in the 15 minutes since I restarted logstash, even though I've set "every" => "1m" (I hope "1m" doesn't mean 1 month.) What did I do wrong?
Also, if I use http_poller, then I need to somehow keep track of the seq_no of the couchdb changes; otherwise it will read the whole DB every time. That doesn't seem trivial to me.
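To make the bookkeeping concrete, here is a small plain-Ruby sketch of tracking the sequence outside of logstash. It relies on CouchDB's `_changes` feed accepting a `since` parameter and returning `last_seq` in its response. The checkpoint file name and the helper names are hypothetical, and http_poller does not do any of this by itself.

```ruby
require 'json'
require 'uri'

# Hypothetical checkpoint location; any writable path would do.
SEQ_FILE = 'couchdb_seq.txt'

# Return the last stored sequence, or "0" to start from the beginning.
def load_seq(path = SEQ_FILE)
  File.exist?(path) ? File.read(path).strip : '0'
end

# Build a _changes URL that resumes after the given sequence.
def changes_url(base, db, since)
  query = URI.encode_www_form(since: since, include_docs: true)
  "#{base}/#{db}/_changes?#{query}"
end

# Store last_seq from a _changes response body and return the changed rows.
def record_changes(body, path = SEQ_FILE)
  parsed = JSON.parse(body)
  File.write(path, parsed['last_seq'].to_s)
  parsed['results']
end
```

Each polling cycle would call `load_seq`, fetch `changes_url(...)`, and pass the response body to `record_changes`, so the next cycle resumes where this one stopped.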
OK. Let me summarize the potential solutions I have tried without success:
Turning off field dereferencing. Impossible. No way to get around event.set.
Skipping the offending field. Impossible. Because filters kick in after input.
Skipping the offending document. Impossible. Because the input plug-in doesn't contain internal filters and it doesn't recognize filters (or views) in couchdb, so there is no way to pre-filter.