I'm attempting my first pivot transform, with the goal of calculating the duration between two timestamps. The group_by clause produces two documents, each with a timestamp. I've been able to calculate the duration using min and max aggregations together with a bucket_script like this:
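(For context, a minimal sketch of that kind of pivot; the index names and the `session_id` and `@timestamp` field names are assumptions, not from my actual mapping:)

```json
PUT _transform/duration-transform
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" },
  "pivot": {
    "group_by": {
      "session_id": { "terms": { "field": "session_id" } }
    },
    "aggregations": {
      "start": { "min": { "field": "@timestamp" } },
      "end": { "max": { "field": "@timestamp" } },
      "duration_ms": {
        "bucket_script": {
          "buckets_path": { "start": "start.value", "end": "end.value" },
          "script": "params.end - params.start"
        }
      }
    }
  }
}
```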
There are a number of other fields nested within a sub-object, metadata: for example foo (a string), bar (an integer), and several others. How can I include them in the result? Note that they are guaranteed to be identical between the two documents; I just need a copy of the values from either one.
Do I need a second transform, joining this destination index with the original source? Or are there aggregations I can use? Ideally it would copy the whole metadata object without enumerating the individual fields. Or must I enumerate those fields in the group_by clause?
For a pivot transform, the destination index output fields must be specified either in the group_by section or in the aggregations section. It's best to experiment against your own data to see which is most appropriate.
I suppose adding a scripted_metric aggregation to the transform is one possible way. You can create a custom metric that simply keeps the first document it sees and discards the rest.
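A rough sketch of such an aggregation, assuming the sub-object is called `metadata` (adjust the field name to your mapping):

```json
"aggregations": {
  "metadata": {
    "scripted_metric": {
      "init_script": "state.doc = null",
      "map_script": "if (state.doc == null) { state.doc = params['_source']['metadata'] }",
      "combine_script": "return state.doc",
      "reduce_script": "for (s in states) { if (s != null) { return s } } return null"
    }
  }
}
```

Since you say the values are guaranteed identical across the documents in a bucket, keeping the first non-null copy per shard and returning the first non-null shard result should be sufficient.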
If your fields are all numeric, the top_metrics aggregation could be another option, though you do need to list every field you want.
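For example, something along these lines (the `metadata.foo` / `metadata.bar` field names and the `@timestamp` sort field are assumptions; top_metrics requires a sort):

```json
"aggregations": {
  "metadata.foo": {
    "top_metrics": {
      "metrics": { "field": "metadata.foo" },
      "sort": { "@timestamp": "asc" }
    }
  },
  "metadata.bar": {
    "top_metrics": {
      "metrics": { "field": "metadata.bar" },
      "sort": { "@timestamp": "asc" }
    }
  }
}
```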
top_metrics does actually support keyword fields (naming is hard), and I suspect it is more performant than a scripted metric -- but that depends on your data, and both should work.
I attempted top_metrics, but it gave me null results. Not sure if this is related, but the context-sensitive autocomplete did not suggest top_metrics, only top_hits. I don't know whether that's because I'm inside a pivot transform or not.