The data structure I have given there is a copy of _source taken from ES. I need to check whether any objects are missing the "availability" array. Could that be causing the error?
Once they are split, do I refer to them as root-level attributes? i.e. should I use "price" instead of "[availability][price]"?
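For what it's worth, the split filter replaces the array with a single element on each resulting event, so the nested reference still works after the split. A minimal sketch, assuming the field names from this thread:

```
filter {
  split { field => "availability" }
  # After the split, each event's [availability] holds one element,
  # so subfields are still referenced as [availability][price] unless
  # you move them to the root yourself.
}
```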
That did help me get what I need. I had to move the IF condition into the filter block; apart from that, it does the job. I also used Ruby code to move the attributes to the root of the event.
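In case it helps anyone else, a hypothetical sketch of such a ruby filter (field names assumed from this thread) could look like:

```
ruby {
  # Sketch: copy each key of the (now single-object) "availability"
  # field to the root of the event, then drop the original field.
  code => '
    av = event.get("availability")
    if av.is_a?(Hash)
      av.each { |k, v| event.set(k, v) }
      event.remove("availability")
    end
  '
}
```

After this, "price" can be referenced at the root instead of "[availability][price]".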
How do I copy "availableFrom" into timestamp type column in DB?
There is a JDBC output plugin written by a third party. It is not supported by Elastic, and the person who wrote it has not modified it in over a year, so it is not clear that they are still supporting it either.
```
date {
  match => ["availableFrom", "yyyy-MM-dd'T'HH:mm:ss"]
  target => "start_date"
}
```
Afterwards I had to cast it to a date type in the INSERT statement, as otherwise the DB treated the start_date attribute as a string even though it is a date type (the date filter does convert it).
```
statement => ["INSERT INTO products (asset_id, asset_type, available_from) values (?, ?, CAST (? AS timestamp))", "assetId", "type", "start_date"]
```
This allows inserting the date attribute into a timestamptz (timestamp with time zone) column in the database table.
Since the "split" filter clones events, I have another problem: the root-level entries get duplicated as well. I'm thinking of using two separate INSERT statements with two tables, one for the root and another for the nested array items. To avoid duplicates, is it a must to run them as two separate pipelines with two separate configuration files?
It may not be a "must", but the forked-path pattern for pipeline-to-pipeline communication might be useful to you.
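A minimal forked-path sketch in pipelines.yml might look like the following; the pipeline ids, addresses, and inputs here are hypothetical placeholders, not taken from this thread:

```
# pipelines.yml (hypothetical ids and addresses)
- pipeline.id: intake
  config.string: |
    input { stdin {} }   # placeholder for the real input
    output {
      pipeline { send_to => ["products-root", "availability-items"] }
    }
- pipeline.id: products-root
  config.string: |
    input { pipeline { address => "products-root" } }
    # keep only root-level fields; INSERT into the root table here
- pipeline.id: availability-items
  config.string: |
    input { pipeline { address => "availability-items" } }
    # split the availability array; INSERT into the items table here
```

Each downstream pipeline sees its own copy of every event, so each can shape and insert its rows without duplicating the other's work.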
To do it in a single pipeline you would use a clone filter to duplicate the event, then process one copy with the split filter and, in the other, just retain the fields you want in the root.
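A sketch of that single-pipeline approach, assuming field names from this thread. Note that in legacy (non-ECS) mode the clone filter sets [type] on the copy to the clone name, which would overwrite an existing [type] field on the copy (the original event keeps its own), so check that against your data:

```
filter {
  # Duplicate the event; the copy gets [type] set to "availability_item"
  # (the clone name here is hypothetical).
  clone { clones => ["availability_item"] }

  if [type] == "availability_item" {
    # The copy: one event per availability element.
    split { field => "availability" }
  } else {
    # The original: keep only root-level fields.
    mutate { remove_field => ["availability"] }
  }
}
output {
  if [type] == "availability_item" {
    # jdbc output with the per-item INSERT statement
  } else {
    # jdbc output with the root-level INSERT statement
  }
}
```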
Thanks @Badger. That is an awesome idea. If I clone the event and split the cloned copy, how do I get Logstash to loop through the split items while processing the original event as a single event in the output plugin?