Native script caching

Sergey_Novikov · March 16, 2015, 4:32pm

Hi,

I'm trying to cache script results using

    cache = CacheBuilder.newBuilder()
            .maximumSize(CACHE_MAX_SIZE)
            .recordStats()
            .build();

then in the script I have

    @Override
    public Integer run() {
        try {
            return cache.get(getCacheKey(), callable);
        } catch (ExecutionException e) {
            throw new ScriptException(e.getMessage(), e);
        }
    }

and the callable is:

    new Callable<Integer>() {
        @Override
        public Integer call() throws Exception {
            return getCalculatedResult();
        }
    };

Could you please help me create a proper cache key? I want to keep a
unique result for each document and index. As I understand it, the cache is
shared between multiple indices, so I need to put the index in the cache key.

Questions:

What should I use to identify the document? Can I use
indexLookup().getDocId()? Or should I use docFieldLongs("id").getValue() (I
have this field in my documents)? Can I access the "_id" property?

Can I get the index/type during script execution?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/474ad09b-3800-4bd0-a50a-97bfd6d9086e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jpountz · March 16, 2015, 6:20pm

indexLookup().getDocId() will not work, since these ids change when there is
a merge. Using a document property is a better idea if the output of your
computation depends solely on this value. The default configuration does
not give you access to _id, but you do have access to _uid. Beware that you
might also want to take the index name into account if your cluster is
serving several indices. But before adding caching, I think it would help
to figure out whether you could avoid the need for caching altogether, e.g.
by modeling the data differently.
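
To make the point about key composition concrete, here is a minimal sketch of a key that combines the index name with the document's _uid instead of the Lucene doc id. The class name DocCacheKey and its fields are hypothetical, not part of Elasticsearch or Guava; a hash-based cache only needs equals() and hashCode() to agree.

```java
import java.util.Objects;

/** Hypothetical cache key: index name plus the document's _uid. */
final class DocCacheKey {
    private final String index;
    private final String uid;

    DocCacheKey(String index, String uid) {
        this.index = Objects.requireNonNull(index, "index");
        this.uid = Objects.requireNonNull(uid, "uid");
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof DocCacheKey)) {
            return false;
        }
        DocCacheKey other = (DocCacheKey) o;
        // Two keys match only if both the index and the _uid match.
        return index.equals(other.index) && uid.equals(other.uid);
    }

    @Override
    public int hashCode() {
        return Objects.hash(index, uid);
    }

    @Override
    public String toString() {
        return index + "/" + uid;
    }
}
```

With a key like this, the same _uid in two different indices maps to two separate cache entries.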


Sergey_Novikov · March 16, 2015, 7:42pm

Hi Adrien,

Thank you for the answer.

The output of the computation depends on document data and script parameters.
It already works fine without caching, but with caching it seems to be
several times faster.

Do you know if it's possible to get the index name from within the script?
I understand I can pass it with the script parameters, but is there a
better solution? Maybe it's already available to the script?
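
Since the result depends on script parameters as well as on the document, the parameters presumably need to be part of the key too; otherwise two requests with different parameters would share one cached value. A rough sketch, using a plain ConcurrentHashMap as a stand-in for the Guava cache (so no size limit and no stats). ParamAwareCache and its methods are made-up names, and the sketch assumes that parameter values implement equals() and hashCode() sensibly.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical memoizer keyed by index, _uid and script parameters. */
final class ParamAwareCache {
    // Stand-in for the Guava cache from the question: unbounded, no statistics.
    private final Map<List<Object>, Integer> cache = new ConcurrentHashMap<>();

    static List<Object> key(String index, String uid, Map<String, Object> params) {
        // Copy the params so that later mutation by the caller cannot change the key.
        return Arrays.asList(index, uid, new HashMap<>(params));
    }

    Integer get(String index, String uid, Map<String, Object> params,
                Supplier<Integer> compute) {
        // compute runs only when no value is cached for this exact combination.
        return cache.computeIfAbsent(key(index, uid, params), k -> compute.get());
    }
}
```

The same document with the same parameters is computed once; changing any parameter value produces a different key and a fresh computation.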


jpountz · March 16, 2015, 8:41pm

I haven't tried, but getting the value of the _index field should work.


Sergey_Novikov · March 17, 2015, 1:22pm

Hi Adrien,

It works fine with docFieldStrings("_index") and docFieldStrings("_uid").

Thanks for your help.
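
For anyone finding this later, a sketch of how getCacheKey() could be built from those two fields. CacheKeys and forDocument are hypothetical names, and the Function argument is only a stand-in for the field lookup so that the snippet runs outside Elasticsearch. It joins the values with "/", on the assumption that index names cannot contain that character.

```java
import java.util.function.Function;

/** Hypothetical helper that builds a string cache key from metadata fields. */
final class CacheKeys {
    private CacheKeys() {
    }

    /**
     * fieldValue maps a field name to its single string value for the
     * current document; here it is an assumption standing in for the
     * script's own field access.
     */
    static String forDocument(Function<String, String> fieldValue) {
        String index = fieldValue.apply("_index");
        String uid = fieldValue.apply("_uid");
        // Index first, so keys from different indices never collide.
        return index + "/" + uid;
    }
}
```

Inside the script, this might be called as CacheKeys.forDocument(f -> docFieldStrings(f).getValue()), assuming getValue() returns the single string value in the same way as the docFieldLongs("id").getValue() call from the first post.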

