Including the max and min aggregations in a query

Hi all,

I have a set of hierarchical jobs which are executed every night as part of a batch. I have log files where each line contains the name of the job prefixed with the names of all its parent jobs up to the highest level, time take by the job and the date on which the batch was executed.
I parsed the log files through logstash and fed the output to elasticsearch, by creating a custom index. Each document in the index consists the following fields:
parent1, parent2, ..., parentn, job_time and batch_date.

Each job may have zero or more parents depending on its position in the hierarchy. There is an additional document that represents the batch and the time taken by the batch as a whole.

I add new documents to the same index every day.

I have created a Kibana bar chart visualization with max aggregation for job time on y axis and date histogram aggregation for the batch date on x axis.

Now, I need to compare the dates on which the batch took the maximum and the minimum time. I then need to figure out which job caused the most difference in the batch times of the two dates.

Is there any way to query the dates on which the job took the max and the min times? If so, is there a way to use its result to compute the difference between the job times of each job for both dates and find out the job that causes the max time difference.

I'm using elasticsearch 5.2.0 and kibana 5.2.0 on windows 7.

Thanks in advance

Hi Gautam,

I don't quite understand why you need to "compute the difference between the job times of each job..." to find out the job that cause the max time difference?

Don't you just need to find the job with the max time? Isn't it the one that cause the max time difference?

Maybe I'm not understanding your data. Could you paste an example couple of docs here?

Regards,
Lee

@LeeDr Thanks for the response.

Here is a screenshot showing the fields in my docs.

l1_job_name represents the highest level, i.e., level 1 jobs and the entire batch as well.
l2_job_name represents the level 2 jobs, i.e., the children of level 1 jobs.
l3_job_name represents the level 3 jobs, i.e., the children of the level 2 jobs.
job_time represents the time taken by a job/batch.
batch_date represents the date on which the batch was executed.

As it can be seen, the batch was executed on 3 different dates, which are 16, 25 and 26.

I would like to have a query where I can obtain the dates on which the batch took the max and the min time, which are 16 and 25 respectively.

After obtaining the two dates, I would like to know which job caused the most difference in the times of the two dates. I would also like to know the duration by the which the particular job caused the difference, hence I need to compute the difference between the job times of both dates.

Please let me know if you need additional information to help understand my problem.

Thanks & regards,
Gautam

Thanks for the additional info Gautam. I'm afraid you're trying to do something Kibana wasn't designed for.

On Discover tab you can see individual docs and sort by a field to get min or max. But you can't take the difference between times or things like that.

Visualize, on the other hand, only deals with aggregations of data.

So you could create a Data Table visualization to find the job with min job_time (as long as there's only one job per batch_date with the max job_time).
And you could create another Data Table Visualization to find the max job_time.
And you could combine those together on a dashboard.

But you can't select individual docs based on some other derived values from aggregations. (at least not any way I know of)

Regards,
Lee

Thanks for your input @LeeDr

I'll have to look into other options to fulfill my requirement.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.