Context in Native Scripts

Hello everyone,

I've been playing with native scripts and have a few questions:

Is there any notion of context for native scripts?

For example, is there a way to know that a method such as "runAsDouble"
is being called for the last time? I might, for instance, want to send
some statistics after a search is done.

Is there any way to know beforehand how many documents the search
produced? I might want to do some precalculations based on this number
before the actual scoring begins.

Is there any way to get all the documents (or their ids) so that I can
process (score) them in bulk? My scoring might depend on the search
result: I might want to calculate the average of a field over the
search results and base my scores on that number.

I apologize in advance if some of my questions are uninformed. I'm new
to ES and am trying to switch from Solr.

Thank you,

ZS

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/48643754-67cc-497c-8c84-c1565dfcb867%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hello,

Can you give a more detailed explanation of the scoring behavior you
want? I don't see any direct way to achieve this.

Re-scoring might also interest you:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-rescore.html
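
For readers unfamiliar with it, here is a minimal sketch of what a rescore request body looks like, built as a Python dict. The field name "title", the weights, and the helper name rescore_request are assumptions made for illustration. Note that rescoring only re-evaluates the top window_size hits on each shard, so by itself it does not expose the whole result set.

```python
import json


def rescore_request(user_query, rescore_query, window_size=50):
    """Build a search body that re-scores the top hits of user_query.

    The weights below are arbitrary example values, not recommendations.
    """
    return {
        "query": user_query,
        "rescore": {
            # Number of top documents per shard that get re-scored.
            "window_size": window_size,
            "query": {
                "rescore_query": rescore_query,
                "query_weight": 0.7,
                "rescore_query_weight": 1.2,
            },
        },
    }


# Hypothetical queries: a cheap match first, a stricter phrase match on top.
body = rescore_request(
    {"match": {"title": "native scripts"}},
    {"match_phrase": {"title": "native scripts"}},
)
print(json.dumps(body, indent=2))
```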

Thanks
Vineeth

Hi,

Thank you for the reply. Here is an example of the scoring behavior I'm
talking about:

a) Given a user query, a set of documents is produced. Let's call this
   set S.
b) Suppose each document has a numeric field called "F". The average of
   this field's values over the documents in S is calculated. Let's
   call this average A. So A = sum(F) / N, where sum(F) is the sum of
   the values of field F over the documents in S, and N is |S|, the
   number of documents in S.
c) The final score for each document is its deviation from the average:
   score = F - A.

So, in order to calculate the score for each document, I need to know
"A", which depends on all documents produced by the query. This is a
simplified example; the actual score calculation is more involved.
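
To make the dependency concrete, steps a) to c) can be sketched in plain Python on an in-memory result set. This is only an illustration of the arithmetic, not Elasticsearch code; the document ids and the values of F are made up.

```python
def deviation_scores(field_values):
    """Score each document as F - A, where A is the average of F over S.

    field_values maps a document id to its value of field F. Two passes are
    needed: one over all of S to get A, then one to score each document.
    """
    if not field_values:
        return {}
    average = sum(field_values.values()) / len(field_values)  # A = sum(F) / N
    return {doc_id: f - average for doc_id, f in field_values.items()}


# Hypothetical result set S: document id -> value of field F.
matched = {"doc1": 10.0, "doc2": 20.0, "doc3": 60.0}
scores = deviation_scores(matched)
print(scores)
```

The point of the sketch is that no single score can be produced until every value of F in S has been seen.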

Here is a different case where I might like to know all the documents
produced by a query in order to score them. I have an external server
that handles the actual score calculation for each document, and
communicating millions of documents to this server one document at a
time is expensive. I would prefer to first get all the documents
selected by a query, then send all of this info to the server and get
one reply containing custom scores for all the documents at once.
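
A rough client-side sketch of that batching idea follows, assuming the matching ids have already been collected somehow (for example with a scroll search). The remote_score callable stands in for the external server, and its name, its signature, and the batch size are all hypothetical.

```python
def chunked(ids, size):
    """Yield consecutive lists of at most `size` ids."""
    batch = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def score_in_bulk(doc_ids, remote_score, batch_size=1000):
    """Send ids to the scorer in batches instead of one call per document."""
    scores = {}
    for batch in chunked(doc_ids, batch_size):
        scores.update(remote_score(batch))
    return scores


# Stand-in for the external scoring server; it records each batch size.
calls = []


def fake_remote_score(batch):
    calls.append(len(batch))
    return {doc_id: float(len(doc_id)) for doc_id in batch}


ids = ["doc%d" % i for i in range(2500)]
result = score_in_bulk(ids, fake_remote_score, batch_size=1000)
print(calls)
```

With 2500 ids and a batch size of 1000, the stand-in server is contacted three times rather than 2500 times.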

While I'm at it, a quick additional question: the rescore query and
post_filter look interesting. Is there a native (Java) API for
implementing a custom re-scorer or a custom post_filter? Just a link to
the API would be very helpful.

Thank you again,
ZS

Hello Zeev,

The only way I can think of is to use two queries:

  1. Find the sum of all scores:
    {
      "aggs": {
        "sum": {
          "sum": {
            "script": "doc.score"
          }
        }
      }
    }
  2. In the second request, use scripting in a function score query to
    compute the deviation.
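
For the F - A example from earlier in the thread, the two requests might look roughly like the sketch below, built as Python dicts. It uses an avg aggregation on field F rather than the sum of scores shown above. The field "title", the aggregation name "avg_f", and the helper names are assumptions, and the script_score syntax shown is the older string-plus-params form. The script syntax differs between Elasticsearch versions, and as far as I know newer releases reject negative scores, so check the documentation for your version.

```python
import json


def average_request(user_query, field):
    """First request: no hits, only the average of `field` over the matches."""
    return {
        "size": 0,
        "query": user_query,
        "aggs": {"avg_f": {"avg": {"field": field}}},
    }


def extract_average(response):
    """Pull the aggregated average out of a search response body."""
    return response["aggregations"]["avg_f"]["value"]


def deviation_request(user_query, field, average):
    """Second request: score every match as field value minus the average."""
    return {
        "query": {
            "function_score": {
                "query": user_query,
                "script_score": {
                    # Older-style script syntax; an assumption for this sketch.
                    "script": "doc['%s'].value - avg" % field,
                    "params": {"avg": average},
                },
                # Replace the query score instead of multiplying with it.
                "boost_mode": "replace",
            }
        }
    }


user_query = {"match": {"title": "elasticsearch"}}  # hypothetical user query
first = average_request(user_query, "F")
fake_response = {"aggregations": {"avg_f": {"value": 30.0}}}  # canned reply
second = deviation_request(user_query, "F", extract_average(fake_response))
print(json.dumps(second, indent=2))
```

The cost of this approach is the extra round trip: the query runs once to compute A and again to score.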

Thanks
Vineeth

I was hoping to be able to intercept the information somewhere between
searching and scoring to avoid extra round trips, but I guess having
two queries works as well, although it might be slow.

Thank you for your help!
