Elasticsearch's ability to reuse the generated score


(Spyrospph) #1

Hi,

I would like to ask whether Elasticsearch can help us with some operations we need.

Assume we have a number of documents with the following structure:

documentx {
tagx: score
tagy: score
}

where each tag name (tagx, tagy) is a string and each score is an int.

For example:

document 1 {
tag1: 15
tag2: 18
}
document 2 {
tag2: 24
}
document 3 {
tag1: 43
tag2: 23
tag3: 10
}
document 4 {
tag3: 14
}

...

Then, given a search query of the form "show me documents with tag1 or tag2", the algorithm should:

  1. Search all documents and find the ones that contain either of these tags.

  2. For each matching document, sum the scores of the matching tags.

  3. Order the results by this sum, i.e. ORDER BY SUM(score) DESC.

For example the above search would return:

document 3: 66
document 1: 33
document 2: 24
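
To make the expected ranking concrete, here is a minimal sketch of steps 1 to 3 in plain Python, outside of Elasticsearch. The `documents` dict and the `rank_by_tag_sum` helper are hypothetical names used only for illustration; this is not how ES computes anything internally.

```python
# Example documents from the post: each maps tag name -> integer score.
documents = {
    "document 1": {"tag1": 15, "tag2": 18},
    "document 2": {"tag2": 24},
    "document 3": {"tag1": 43, "tag2": 23, "tag3": 10},
    "document 4": {"tag3": 14},
}


def rank_by_tag_sum(docs, wanted_tags):
    """Return (doc_id, summed_score) pairs, highest sum first.

    Hypothetical helper: only documents containing at least one of
    the wanted tags are kept, and only those tags are summed.
    """
    totals = {}
    for doc_id, tags in docs.items():
        matched = [score for tag, score in tags.items() if tag in wanted_tags]
        if matched:
            totals[doc_id] = sum(matched)
    # Equivalent of ORDER BY SUM(score) DESC.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_tag_sum(documents, {"tag1", "tag2"})
for doc_id, total in ranked:
    print(f"{doc_id}: {total}")
```

Document 4 is dropped because it has neither tag1 nor tag2, and tag3 in document 3 does not contribute to its sum.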

  4. From the result set above, we need to compute the AVG of the scores and then create 3 buckets.
    The first bucket will contain the documents whose score is higher than the AVG.
    The second bucket will contain the documents whose score is between 0.5 * AVG and AVG.
    The third bucket will contain all documents whose score is less than 0.5 * AVG.

So our example, where AVG = (66 + 33 + 24) / 3 = 41 and 0.5 * AVG = 20.5, would return:
Bucket 1 -> document 3
Bucket 2 -> document 1 and document 2
Bucket 3 -> null
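
The bucketing step could be sketched in plain Python as follows. The `bucket_by_average` helper is a hypothetical name, and treating the boundaries of the second bucket as inclusive is an assumption, since the post does not say which bucket a score exactly equal to AVG or 0.5 * AVG belongs to.

```python
# Ranked (doc_id, summed_score) pairs from the example above.
ranked = [("document 3", 66), ("document 1", 33), ("document 2", 24)]


def bucket_by_average(pairs):
    """Split scored documents into 3 buckets relative to the average score.

    Assumption: bucket 2 is inclusive on both ends (0.5 * AVG <= score <= AVG).
    """
    buckets = {1: [], 2: [], 3: []}
    if not pairs:
        return 0.0, buckets
    avg = sum(score for _, score in pairs) / len(pairs)
    half = 0.5 * avg
    for doc_id, score in pairs:
        if score > avg:
            buckets[1].append(doc_id)   # above the average
        elif score >= half:
            buckets[2].append(doc_id)   # between 0.5 * AVG and AVG
        else:
            buckets[3].append(doc_id)   # below 0.5 * AVG
    return avg, buckets


avg, buckets = bucket_by_average(ranked)
print("AVG:", avg)
for number, doc_ids in buckets.items():
    print(f"Bucket {number} ->", doc_ids)
```

Note that the average depends on the whole result set, so the buckets can only be assigned after every matching document has been scored.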

As a further requirement we would like to have facets for these results.

Is this something ES can do? If yes:

  1. Can this be done using built-in aggregate functions such as SUM and AVG?
  2. Or should we build custom functions to perform this?

Thank you for your time.

-S.


(Mark Walkom) #2

You can use a sum aggregation to get your tag totals.
But going back afterwards to fetch the list of documents will require a separate query.
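
As a rough illustration of the first half of this suggestion, the sketch below builds a search request body with one `sum` aggregation per tag. It assumes each tag is mapped as a numeric field named after the tag (`tag1`, `tag2`), which the thread does not confirm, and `build_tag_totals_request` is a hypothetical helper. Only the request body is built here; sending it to a cluster is left out.

```python
import json


def build_tag_totals_request(tags):
    """Build a search body with one `sum` aggregation per tag field.

    Assumption: every tag is stored as a numeric field of the same name.
    """
    return {
        "size": 0,  # return aggregation results only, no hits
        "query": {
            "bool": {
                # Match documents that contain at least one of the tags.
                "should": [{"exists": {"field": tag}} for tag in tags],
                "minimum_should_match": 1,
            }
        },
        "aggs": {f"{tag}_total": {"sum": {"field": tag}} for tag in tags},
    }


body = build_tag_totals_request(["tag1", "tag2"])
print(json.dumps(body, indent=2))
```

These aggregations return one total per tag across all matching documents, not a per-document sum, which is why the ranked list of documents still needs the separate query mentioned above.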

