Thoughts on replacing a date numeric range filter with a script filter

Hi folks,

I want to be able to do some scripting in a numeric range of dates within a
filtered query. So I thought I would show you what I have done and ask you
for your advice, as I'm left with the suspicion that I could have done
things differently or more elegantly!

I'm starting with something simple like this numeric_range filter:

{
  "numeric_range" : {
    "start_date" : {
      "gte" : "2010-01-01",
      "lte" : "2013-01-01"
    }
  }
}

And going to something like this:

{
  "script": {
    "script": "doc['start_date'].value / 1000 >= start_param && doc['start_date'].value / 1000 <= end_param",
    "params": {
      "start_param": 1262304000,
      "end_param": 1356998400
    }
  }
}

The main difference is that the value of a date is in milliseconds, and I
am passing in unix/epoch timestamps in seconds, hence the division by 1000.
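
For anyone checking the numbers, here is a small Python sketch of the conversion. It assumes the dates are interpreted as midnight UTC; the helper names are made up for illustration and are not part of any Elasticsearch API.

```python
from datetime import datetime, timezone


def to_epoch_seconds(date_str):
    """Parse a YYYY-MM-DD date as midnight UTC and return epoch seconds."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def ms_to_seconds(epoch_ms):
    """Convert an epoch value in milliseconds to whole seconds."""
    return epoch_ms // 1000


start_param = to_epoch_seconds("2010-01-01")
end_param = to_epoch_seconds("2013-01-01")
print(start_param, end_param)  # 1262304000 1356998400
```

These match the start_param and end_param values passed to the script filter.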

Now, my actual use for this is a script filter where "start_range" is also
stored in the doc and represents a number of days:
"script": "(doc['start_date'].value / 1000) + (doc['start_range'].value * 86400) <= param1"
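
To make the arithmetic concrete, here is a rough Python sketch that mirrors the script's comparison outside Elasticsearch and builds the filter body as a dict. The function names are hypothetical, and the integer division is an assumption about how the millisecond value gets truncated to seconds.

```python
SECONDS_PER_DAY = 86400

SCRIPT = ("(doc['start_date'].value / 1000) + "
          "(doc['start_range'].value * 86400) <= param1")


def matches(start_date_ms, start_range_days, param1):
    """Mirror the script: start (in seconds) plus range (in days) <= param1."""
    end_seconds = start_date_ms // 1000 + start_range_days * SECONDS_PER_DAY
    return end_seconds <= param1


def build_script_filter(param1):
    """Build the script filter body with param1 given in epoch seconds."""
    return {"script": {"script": SCRIPT, "params": {"param1": param1}}}


# A doc starting 2010-01-01 with a 10 day range ends exactly at this cutoff.
cutoff = 1262304000 + 10 * SECONDS_PER_DAY
print(matches(1262304000000, 10, cutoff))  # True
print(matches(1262304000000, 11, cutoff))  # False
```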

So, it works, and performs fine. My questions are: does this make sense?
Could I have done it easier or cleaner? Is my solution worse for
performance / caching than the numeric_range? Have I missed something
glaringly obvious? Perhaps mvel and elasticsearch expose some date values
that I could use?

Cheers!

Tim

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

hey,

On Wednesday, May 15, 2013 10:59:41 PM UTC+2, Tim Waters wrote:

So, it works, and performs fine. My questions are: does this make sense?
Could I have done it easier or cleaner? Is my solution worse for
performance / caching than the numeric_range? Have I missed something
glaringly obvious? Perhaps mvel and elasticsearch expose some date values
that I could use?

The numeric range is actually executed on the inverted index and has very
low memory requirements while still being fast. You can also index the
timestamp in seconds as a long, if you want, and use numeric ranges on that,
so you get kind of the 'best of both worlds'. The script is likely to be way
slower if you match lots of docs, and it needs to hold all dates in
memory, which costs 64 bits of heap space per document.
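
A minimal Python sketch of the "index the timestamp in seconds" idea, assuming the document carries start_date as a YYYY-MM-DD string in UTC. The start_date_secs field name and both helpers are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone


def add_seconds_field(doc):
    """Return a copy of doc with a start_date_secs long (hypothetical field)."""
    dt = datetime.strptime(doc["start_date"], "%Y-%m-%d")
    enriched = dict(doc)
    enriched["start_date_secs"] = int(dt.replace(tzinfo=timezone.utc).timestamp())
    return enriched


def build_seconds_range_filter(start_secs, end_secs):
    """Build a numeric_range filter body on the hypothetical seconds field."""
    return {"numeric_range": {"start_date_secs": {"gte": start_secs,
                                                  "lte": end_secs}}}


doc = add_seconds_field({"start_date": "2010-01-01", "start_range": 10})
print(doc["start_date_secs"])  # 1262304000
print(build_seconds_range_filter(1262304000, 1356998400))
```

With the extra field in place at index time, the query can take epoch seconds directly without any script.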

hope that helps....

Heya Simon

As I understand it, the numeric_range query uses fielddata, not the
inverted index. The range query uses the inverted index.
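
For comparison, a short Python sketch that builds both filter bodies for the same date bounds. As far as I know the two share the same gte/lte shape and differ only in the filter name; the helper itself is hypothetical.

```python
import json


def build_date_filter(filter_type, field, gte, lte):
    """Build a range-style filter body; filter_type picks the implementation."""
    if filter_type not in ("numeric_range", "range"):
        raise ValueError("unsupported filter type: " + filter_type)
    return {filter_type: {field: {"gte": gte, "lte": lte}}}


for filter_type in ("numeric_range", "range"):
    body = build_date_filter(filter_type, "start_date", "2010-01-01", "2013-01-01")
    print(json.dumps(body))
```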

clint

On 15 May 2013 23:27, simonw simon.willnauer@elasticsearch.com wrote:

The numeric range is actually executed on the inverted index and has very
low memory requirements while still being fast. You can also index the
timestamp in seconds as a long, if you want, and use numeric ranges on that,
so you get kind of the 'best of both worlds'. The script is likely to be way
slower if you match lots of docs, and it needs to hold all dates in
memory, which costs 64 bits of heap space per document.
