Finding in-process events by comparing time ranges


#1

Hi!

We're getting all events that happen (=start and end) during a certain time span:

{
   "filter": {
      "range": {
         "start": {
            "gte": "2016-10-21T05:00:00+02:00",
            "lt": "2016-10-22T05:00:00+02:00"
         }
      }
   },
   "sort": {
      "start": {
         "order": "asc"
      }
   },
   "size": 1000,
   "query": {
      "match_all": {}
   }
}

Of course this does not return events that started before the specified point. How should we approach this?


(Mark Harwood) #2

Query for less than your choice of start date, sorted by date descending with size=1


#3

Thanks for your reply! Unfortunately we don't know precisely how long the previous event is and when it started. Could be from a minute up to a couple hours.


(Mark Harwood) #4

Sorry. I presumed the "start" and "end" you referred to in your first post were the query's time span and the events were only a single point in time. If I understand correctly you're saying the start and end are properties of event docs and you want to find events whose times spans intersect with that of the query time span?


#5

Correct! We have events with a start and an end and want all events that are "running" during a specific time range (i.e. intersect with it).

Currently I'm thinking to combine the above query with two where I look for the last event to start before the start of the time range and one with the first event to end after the end of the range but I'm not sure if that's the easiest way.


(Mark Harwood) #6

It ends up being 3 query clauses ORed together in a bool should.

So with the aid of some ascii art to represent time on the x-axis :

  1. The case where event end falls between the query span:

    qry s=============e
    ev s==========e

  2. or the case where event start falls between the query span:

    qry s=============e
    ev s==========e

  3. or the case where event start and end spans the query span:

    qry s============e
    ev s=====================e

Let me know if you have trouble figuring out the query syntax for these clauses.


#7

I am indeed struggling :blush: Would you mind giving me some pointers?


#8

Actually... what about this? Am I missing anything important or have a flawed understanding of ES?

GET _search
{
  "size": 1000,
  "query": {
    "bool": {
      "should": [
        { "range": { "start": {"lte": "2016-10-21T06:00:00+02:00"} }},
        { "range": { "end": {"gte": "2016-10-21T05:00:00+02:00"} }}
      ],
      "minimum_should_match": 2
    }
  }
}

(Mark Harwood) #9

Actually that's more succinct than what I originally thought was required :slight_smile: You can drop the "should" /minimum_should_match:2 and replace with a "must" (or "filter" if on a new version of es).

It's worth knowing that our geo guy has been busy adding extra support to represent dimensions into Lucene and we can expect efficient implementation of this sort of range intersection query logic in future versions of elasticsearch.

Here's my test rig for your query on elasticsearch 5.0:

DELETE test
POST test/event
{
	"start":1,
	"end":5,
	"text":"This should match a 3-7 query"
}
POST test/event
{
	"start":1,
	"end":2,
	"text":"This should NOT match a 3-7 query"
}
POST test/event
{
	"start":8,
	"end":10,
	"text":"This should NOT match a 3-7 query"
}

POST test/event
{
	"start":5,
	"end":10,
	"text":"This should match a 3-7 query"
}
POST test/event
{
	"start":1,
	"end":10,
	"text":"This should match a 3-7 query"
}

GET _search
{
  "query": {
	"bool": {
	  "filter": [
		{ "range": { "start": {"lte": 7} }},
		{ "range": { "end": {"gte": 3} }}
	  ]
	}
  }
}

#10

It worked! Had some troubles with the FOS Elastica QueryBuilder but now it works perfectly. Thanks a bunch for the help!


(Mark Harwood) #11

No problem. I took the liberty of renaming the question title to help future people looking for answers to the same problem


(system) #12