Dynamic pattern matching over a sequence of events


(Andr'e) #1

I'm new to ES and I'm trying to figure out whether a certain scenario is
possible before deciding to use it. Here is the scenario: I have
millions of events that are tied to employees. Each event also contains the
employee id, so there is no need for a join; data duplication and flat tables
are fine. I'd like to store these events in ES in such a way that I can run
queries across multiple events and employees that match a specific pattern of
events in sequence.

Here is an example: I'd like to be able to run a query that returns
all employees that did X, then did Y, and then did Z, in that order,
between 1/2/2014 and 1/31/2014. It should return a distinct list/group of
employees or employee ids.

The key point is the "then". It's OK if the employee did other things in
between; it would be nice to be able to enforce that nothing happens in
between, but for now I just need the basics working. In the non-search world,
the solution would most likely be iterating over a series of queries: give me
all users that did X and store the result somewhere, then run another query to
give me the subset of the result from query 1 that did Y, and so on. But I
figured there has to be a better way.

Any ideas or suggestions will be greatly appreciated.
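
For reference, the iterative narrowing described above can be sketched in plain Python, outside of ES. This is only an illustration under stated assumptions: the in-memory `EVENTS` list, the employee ids, and the function name `employees_matching` are all hypothetical. Each pass keeps only the employees who performed the next action after the time they completed the previous step, which is what makes the order matter.

```python
from datetime import datetime

# Hypothetical in-memory events: (employee_id, action, timestamp).
EVENTS = [
    ("e1", "X", datetime(2014, 1, 3)),
    ("e1", "Y", datetime(2014, 1, 5)),
    ("e1", "Z", datetime(2014, 1, 9)),
    ("e2", "Y", datetime(2014, 1, 4)),
    ("e2", "X", datetime(2014, 1, 6)),
    ("e2", "Z", datetime(2014, 1, 8)),
    ("e3", "X", datetime(2014, 1, 7)),
    ("e3", "W", datetime(2014, 1, 8)),
    ("e3", "Y", datetime(2014, 1, 10)),
    ("e3", "Z", datetime(2014, 1, 20)),
]


def employees_matching(events, pattern, start, end):
    """Return sorted ids of employees who did the pattern's actions in order."""
    # progress maps employee id -> earliest time the steps so far were completed.
    progress = None
    for action in pattern:
        step = {}
        for emp, act, ts in events:
            if act != action or not (start <= ts <= end):
                continue
            if progress is None:
                qualifies = True  # first step: anyone who did the action
            else:
                # later steps: must have finished the previous step earlier
                qualifies = emp in progress and ts > progress[emp]
            if qualifies and (emp not in step or ts < step[emp]):
                step[emp] = ts
        progress = step
    return sorted(progress) if progress is not None else []


result = employees_matching(
    EVENTS, ["X", "Y", "Z"], datetime(2014, 1, 2), datetime(2014, 1, 31)
)
print(result)
```

Here `e2` is dropped because its Y happened before its X, while `e3` still qualifies even though an unrelated action W sits between X and Y.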

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1fde2820-6f99-45c8-9003-7b1d10139ca9%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

For what you want to achieve, the aggregations feature looks perfect.

Jörg



(Andr'e) #3

Thanks,

However, apart from the fact that 1.0, which includes aggregations, is not
out yet, I'm not sure it would give me what I need even if it were. Here is a
different scenario that might help.

GA has a similar concept; here is a screenshot of what it looks like.

https://lh4.googleusercontent.com/-nUpgymxd3hk/UtA-o42IU0I/AAAAAAAAACw/6nRFYwXcbbo/s1600/Untitled.png

So let's change the example and use visitors and pages viewed on a website.

Visitor1 - viewed (Page1.html,Page2.html,Page3.html,Page4.html,Page5.html)
Visitor2 - viewed (Page15.html,Page12.html,Page1.html,Page4.html,Page15.html)
Visitor3 - viewed (Page9.html,Page2.html,Page3.html,Page6.html,Page4.html)

If I run a query such as "give me all visitors who viewed the following":
(Page2.html,Page3.html,Page4.html)

the results will be Visitor1 and Visitor3.
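
To make the intended matching rule concrete, here is a minimal Python sketch of an ordered, gap-tolerant check over the three visitors listed above. It runs outside of ES and is not an ES query; the names `VISITS`, `viewed_in_order`, and `matching_visitors` are hypothetical.

```python
# The three example visitors and the pages they viewed, in viewing order.
VISITS = {
    "Visitor1": ["Page1.html", "Page2.html", "Page3.html", "Page4.html", "Page5.html"],
    "Visitor2": ["Page15.html", "Page12.html", "Page1.html", "Page4.html", "Page15.html"],
    "Visitor3": ["Page9.html", "Page2.html", "Page3.html", "Page6.html", "Page4.html"],
}


def viewed_in_order(pages, pattern):
    """True if pattern occurs in pages in order, with other pages allowed in between."""
    remaining = iter(pages)
    # "page in remaining" consumes the iterator up to the first match, so each
    # pattern element has to be found after the previous one.
    return all(page in remaining for page in pattern)


def matching_visitors(visits, pattern):
    """Return the visitors whose page sequence contains the pattern in order."""
    return [v for v, pages in visits.items() if viewed_in_order(pages, pattern)]


matches = matching_visitors(VISITS, ["Page2.html", "Page3.html", "Page4.html"])
print(matches)
```

With gaps allowed, this selects Visitor1 and Visitor3: Visitor3 has Page6.html between Page3.html and Page4.html, and Visitor2 never viewed Page2.html or Page3.html.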

Thanks for all your help so far.
Andr'e

On Friday, January 10, 2014 10:46:12 AM UTC-5, Jörg Prante wrote:

For what you want to achieve, the aggregations feature looks perfect.

https://www.youtube.com/watch?v=yZu4jQtBUPg#t=885

Jörg



(Stein Kåre Skytteren) #4

If you are able to put everything into one document, you might try the span
near query with in_order set to true.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-span-near-query.html
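
As a rough sketch of what such a request body could look like, the Python below builds a span_near query with one span_term clause per page. Several things here are assumptions and not from this thread: one document per visitor, a hypothetical field named `pages` that holds the viewed pages in viewing order, an analyzer that keeps each page name as a single term (span_term is not analyzed, so the values must match the indexed terms exactly), and an arbitrary slop of 10 as the number of intervening positions tolerated.

```python
import json


def span_near_query(field, terms, slop):
    """Build a span_near body that matches the terms in the given order."""
    return {
        "query": {
            "span_near": {
                # One span_term clause per page, in the order they must occur.
                "clauses": [{"span_term": {field: term}} for term in terms],
                # How many unmatched positions may sit between the clauses
                # (arbitrary value for illustration).
                "slop": slop,
                # Require the clauses to match in the order listed.
                "in_order": True,
            }
        }
    }


# "pages" is a hypothetical field name; adjust to the actual mapping.
body = span_near_query("pages", ["Page2.html", "Page3.html", "Page4.html"], slop=10)
print(json.dumps(body, indent=2))
```

A slop of 0 would require the pages to be adjacent, which corresponds to the stricter "nothing in between" variant.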

Stein Kåre



(Vinay Khandelwal) #5

@Andr_e did you find a solution for this problem?