We just started using Logstash to parse logs and add the data to
Elasticsearch. Using Kibana, we can get some good visualizations of our
process runs.
I have the records in ES as follows (these are basically log entries from
Logstash):
Timestamp                         TxnId                                 Caller  EventName  Duration  Other Data..
2013-12-20T22:35:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7cf45  Worker  Start                ..
2013-12-20T22:38:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7abcd  Worker  Start                ..
2013-12-20T22:40:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7cf45  Worker  End        20        ..
2013-12-20T22:42:17.109365+00:01  9f3c264b-8ee3-4b2a-ac16-162290c7abcd  Worker  End        40        ..
2013-12-20T22:45:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7efgh  Worker  Start                ..
Basically, this is a log of worker processes and their start and finish
times. For a given time period, we would like to know which worker processes
have started and ended, and which worker processes have not ended. For
example, the result for the data above would be:
Timestamp                         TxnId                                 Caller  EventName  Other  Duration  Other Data..
2013-12-20T22:35:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7cf45  Worker  Start      Ended  20        ..
2013-12-20T22:38:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7abcd  Worker  Start      Ended  40        ..
2013-12-20T22:45:17.109365+00:00  9f3c264b-8ee3-4b2a-ac16-162290c7efgh  Worker  Start                       ..
In SQL this can be achieved by grouping on TxnId. Ideally we could filter
this further with a HAVING clause to find workers that have not ended (and
how long they have been working, to detect timeouts). I think this is
basically a correlation on TxnId and EventName.
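To make the grouping described above concrete, here is a rough Python sketch of the correlation I have in mind, run over an in-memory copy of the sample entries. The `events` list and the `correlate` helper are made up for illustration and are not part of Logstash or Elasticsearch; the sketch only shows the result shape I am after, not how ES would compute it.

```python
# Hypothetical in-memory copy of the sample log entries (not read from ES).
events = [
    {"Timestamp": "2013-12-20T22:35:17.109365+00:00",
     "TxnId": "9f3c264b-8ee3-4b2a-ac16-162290c7cf45",
     "Caller": "Worker", "EventName": "Start"},
    {"Timestamp": "2013-12-20T22:38:17.109365+00:00",
     "TxnId": "9f3c264b-8ee3-4b2a-ac16-162290c7abcd",
     "Caller": "Worker", "EventName": "Start"},
    {"Timestamp": "2013-12-20T22:40:17.109365+00:00",
     "TxnId": "9f3c264b-8ee3-4b2a-ac16-162290c7cf45",
     "Caller": "Worker", "EventName": "End", "Duration": 20},
    {"Timestamp": "2013-12-20T22:42:17.109365+00:01",
     "TxnId": "9f3c264b-8ee3-4b2a-ac16-162290c7abcd",
     "Caller": "Worker", "EventName": "End", "Duration": 40},
    {"Timestamp": "2013-12-20T22:45:17.109365+00:00",
     "TxnId": "9f3c264b-8ee3-4b2a-ac16-162290c7efgh",
     "Caller": "Worker", "EventName": "Start"},
]


def correlate(entries):
    """Group entries by TxnId and merge the Start and End events into one row."""
    by_txn = {}
    for entry in entries:
        row = by_txn.setdefault(entry["TxnId"], {
            "TxnId": entry["TxnId"],
            "Caller": entry["Caller"],
            "Start": None,      # timestamp of the Start event, if seen
            "Ended": False,     # True once an End event is seen
            "Duration": None,   # taken from the End event
        })
        if entry["EventName"] == "Start":
            row["Start"] = entry["Timestamp"]
        elif entry["EventName"] == "End":
            row["Ended"] = True
            row["Duration"] = entry.get("Duration")
    return list(by_txn.values())


rows = correlate(events)
# Equivalent of the SQL HAVING step: keep only workers with no End event.
not_ended = [row for row in rows if not row["Ended"]]
```

With the sample data, `rows` holds one entry per TxnId and `not_ended` holds only the worker whose TxnId ends in `efgh`.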
I have looked at facets (and aggregations), nested documents, and
parent/child objects. I did not find much documentation or many examples for
the new way of faceting (aggregations). It seems to me that it would be
ideal if I could build another index that uses TxnId as the correlation key,
and then I could get the results I want. I guess this is pretty close to
what facets can do. I am still not sure whether, if I facet these documents
by TxnId, I would be able to run a query across all faceted entries asking
for the start time and end time. Can I build an alternate index of nested
documents based on TxnId? If so, I can pretty much query what I want. The
other alternative is to update the document with the appropriate data while
reading the log entries.
It would be great if someone could point me in the right direction here.
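For reference, this is roughly the kind of aggregation request I imagine might work, written as a Python dict so it can be dumped to JSON. It is an untested sketch: `build_txn_aggregation` is a made-up helper, the field names `Timestamp`, `TxnId`, `EventName` and `Duration` are assumed to match the index mapping, and the string fields are assumed to be not_analyzed so that exact terms match. The idea is a terms aggregation on TxnId with one filter sub-aggregation per event type.

```python
import json


def build_txn_aggregation(start, end, max_txns=1000):
    """Build a search body that buckets entries by TxnId within a time range.

    Hypothetical helper: field names are assumptions about the mapping.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"range": {"Timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "by_txn": {
                "terms": {"field": "TxnId", "size": max_txns},
                "aggs": {
                    # Start events in this bucket and their earliest timestamp.
                    "started": {
                        "filter": {"term": {"EventName": "Start"}},
                        "aggs": {"at": {"min": {"field": "Timestamp"}}},
                    },
                    # End events in this bucket and the reported duration.
                    "ended": {
                        "filter": {"term": {"EventName": "End"}},
                        "aggs": {"duration": {"max": {"field": "Duration"}}},
                    },
                },
            }
        },
    }


body = build_txn_aggregation("2013-12-20T22:00:00+00:00",
                             "2013-12-20T23:00:00+00:00")
request_json = json.dumps(body, indent=2)
```

If this works as I hope, a bucket whose `ended` sub-aggregation has a doc count of 0 would be a worker that has not ended. One thing I am unsure about is the time range: a worker whose Start falls outside the window would show up with an End but no Start.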
Thanks,
Vijay