Accessing field data faster in script


(avacados-2) #1

How can I access field data faster from a native (Java) script? Should I
enable 'doc values'?

I am already using doc().getField() and casting the value to long. The field
is a date type. Whenever the arguments to my script change, the search query
performs poorly. Subsequent calls with the same arguments perform well
(possibly because _cache is true for that script filter).

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/afd89e62-0773-4684-904d-53805d9d7358%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Adrien Grand) #2

Script filters are inherently slow due to the fact that they cannot
leverage the inverted index in order to skip efficiently over non-matching
documents. Even if they were written in assembly, this would likely still
be slow.
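
To make the point above concrete, here is a rough sketch in plain Java. It is only an analogy and not how Lucene or Elasticsearch is implemented: the class and method names are hypothetical, and a sorted array stands in for an index structure. An opaque predicate (the "script") has to be tested against every document, while a lookup over sorted values can jump straight to the matching block.

```java
import java.util.function.LongPredicate;

public class ScanVersusSeek {
    /** Script-style filtering: the predicate is opaque, so every document is tested. */
    public static int countByScan(long[] valuesByDoc, LongPredicate script) {
        int matches = 0;
        for (long value : valuesByDoc) {
            if (script.test(value)) {
                matches++;
            }
        }
        return matches;
    }

    /** Index-style filtering: values are sorted, so both range ends are found by binary search. */
    public static int countBySeek(long[] sortedValues, long min, long max) {
        int from = firstIndexAtLeast(sortedValues, min);
        int to = firstIndexGreaterThan(sortedValues, max);
        return Math.max(0, to - from);
    }

    /** First position whose value is >= key. */
    static int firstIndexAtLeast(long[] sorted, long key) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** First position whose value is > key. */
    static int firstIndexGreaterThan(long[] sorted, long key) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        long[] values = new long[1_000_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i * 10L; // already sorted
        }
        // Both calls return the same count; only the amount of work differs.
        System.out.println(countByScan(values, v -> v >= 5_000 && v <= 6_000));
        System.out.println(countBySeek(values, 5_000, 6_000));
    }
}
```

The scan touches all one million values, while the seek needs only a few dozen comparisons, regardless of how fast each individual predicate call is.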

What kind of filtering are you trying to do with scripts?

On Thu, Aug 14, 2014 at 8:42 AM, avacados kotadia.akash@gmail.com wrote:

How can I access field data faster from a native (Java) script? Should I
enable 'doc values'?

I am already using doc().getField() and casting the value to long. The field
is a date type. Whenever the arguments to my script change, the search query
performs poorly. Subsequent calls with the same arguments perform well
(possibly because _cache is true for that script filter).

Thanks.

--
Adrien Grand


(avacados-2) #3

Thanks for the reply, Adrien.

My script filter was:

    {
      "script": {
        "script": "xyz",
        "params": {
          "startRange": 1407939675,  // Timestamp in milliseconds; changes on every query
          "endRange": 1410531675     // Timestamp in milliseconds; changes on every query
        },
        "lang": "native",
        "_cache": true  // I removed this caching and saw a significant performance improvement. Do you know why? :)
      }
    },

My native (Java) script code, which returns true if the date ranges overlap:

    // startRange and endRange are the script parameters shown above.
    // A missing start_time or end_time is treated as 0.
    ScriptDocValues.Longs startValues = (ScriptDocValues.Longs) doc().get("start_time");
    long xsLong = 0L;
    if (startValues != null && !startValues.isEmpty()) {
        xsLong = startValues.getValue();
    }

    ScriptDocValues.Longs endValues = (ScriptDocValues.Longs) doc().get("end_time");
    long xeLong = 0L;
    if (endValues != null && !endValues.isEmpty()) {
        xeLong = endValues.getValue();
    }

    // Two ranges overlap when each one starts no later than the other ends.
    return (endRange >= xsLong) && (startRange <= xeLong);

On Monday, August 18, 2014 1:50:17 PM UTC+5:30, Adrien Grand wrote:

Script filters are inherently slow due to the fact that they cannot
leverage the inverted index in order to skip efficiently over non-matching
documents. Even if they were written in assembly, this would likely still
be slow.

What kind of filtering are you trying to do with scripts?


(Adrien Grand) #4

Your filter would be faster if you used range filters on the start/end
dates instead of using a script.
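
As a sketch of what that could look like for the overlap test in the script: the condition `endRange >= start_time && startRange <= end_time` splits into two independent range conditions, one per field. The helper below is hypothetical (the class and method names are not from this thread) and simply builds the JSON for a `bool` filter with two `range` clauses as a string. One assumed difference to keep in mind: the script treats a missing field as 0, whereas a range filter will not match documents that lack the field.

```java
public class OverlapRangeFilter {
    /**
     * Builds a bool filter whose two range clauses together express the overlap test:
     * start_time <= endRange AND end_time >= startRange.
     */
    public static String build(long startRange, long endRange) {
        return "{\n"
            + "  \"bool\": {\n"
            + "    \"must\": [\n"
            + "      { \"range\": { \"start_time\": { \"lte\": " + endRange + " } } },\n"
            + "      { \"range\": { \"end_time\": { \"gte\": " + startRange + " } } }\n"
            + "    ]\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        // Same parameter values as in the script filter posted earlier in the thread.
        System.out.println(build(1407939675L, 1410531675L));
    }
}
```

Because each clause is an ordinary range filter, it can be answered from the index rather than by running code against every document.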

On Mon, Aug 18, 2014 at 10:52 AM, avacados kotadia.akash@gmail.com wrote:

    "_cache": true  // I removed this caching and saw a significant performance improvement. Do you know why? :)

Yes: when caching a filter, it needs to be evaluated over all documents of
your index in order to be loaded into a bit set. On the other hand, when a
script filter is not cached it will typically only be evaluated on
documents that match the query.
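
The difference can be illustrated with a small simulation. This is a simplified model with hypothetical names, not Elasticsearch code: it only counts how many times a script predicate runs when it fills a bit set for the whole index versus when it is applied to the documents a query has already matched.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class CachedVersusUncached {
    /** Wraps a script predicate and counts how often it is invoked. */
    public static final class CountingScript implements IntPredicate {
        private final IntPredicate delegate;
        public int calls = 0;

        public CountingScript(IntPredicate delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean test(int docId) {
            calls++;
            return delegate.test(docId);
        }
    }

    /** Cached path: the script runs over every document in order to fill a bit set. */
    public static BitSet buildCachedBitSet(int maxDoc, IntPredicate script) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc++) {
            if (script.test(doc)) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Uncached path: the script runs only on documents the query already matched. */
    public static int countUncached(int[] queryHits, IntPredicate script) {
        int matches = 0;
        for (int doc : queryHits) {
            if (script.test(doc)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        int[] queryHits = new int[100];
        for (int i = 0; i < queryHits.length; i++) {
            queryHits[i] = i * 1_000;
        }

        CountingScript cached = new CountingScript(doc -> doc % 2 == 0);
        buildCachedBitSet(maxDoc, cached);

        CountingScript uncached = new CountingScript(doc -> doc % 2 == 0);
        countUncached(queryHits, uncached);

        System.out.println("cached script calls:   " + cached.calls);
        System.out.println("uncached script calls: " + uncached.calls);
    }
}
```

In this model the cached path pays for one script call per document in the index, which is only worthwhile if the resulting bit set is reused. When `startRange` and `endRange` change on every query, as described earlier in the thread, a bit set built for one set of parameters presumably cannot be reused for the next, so that cost would be paid again each time.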

--
Adrien Grand

