Greater than based on value from another field

Dave_Clissold · April 23, 2018, 11:40am

Hi, I am trying to build a query that will match when an value of a field is greater than another nested field, but am stuck as to where to begin. Can anyone help or suggest somewhere to begin reading. Example data:

 {
  "id":"1",
  "contains":[
      {
        "amount":20,
        "name":"com1"
      },
      {
        "amount":40,
        "name":"com2"
      }
  ]
},
{
  "id":"2",
  "contains":[
      {
        "amount":40,
        "name":"com1"
      },
      {
        "amount":30,
        "name":"com2"
      }
  ]
}

So in the above, my query would be where the amount of com1 is great than the amount of com2, which would return "2"

Mark_Harwood · April 23, 2018, 12:11pm

If it were possible it would probably take the form of some custom script retrieving the value pairs and comparing them for every doc. This would be linear to the number of docs and sounds expensive so perhaps it's worth considering another approach. Perhaps you could compute this "on the way in" using an ingest-pipeline to compute the difference between the amounts. You could then query and aggregate on this numeric "diff" value held in the index much more efficiently.

Dave_Clissold · April 23, 2018, 12:36pm

That is an idea, simple enough to build a simple string or array with an order and then just additionally match that at query time.

Do you think something similar to this would be efficient?

Where I use the principal of the 'smooshed_arrays' to create com1_com2. the reality of my searches are likely to be 2 or 3 objects to work with, so can easily build these on the way in for example, A_B and A_B_C / A_C_B (when 3 items to query)

Mark_Harwood · April 23, 2018, 1:00pm

Where I use the principal of the 'smooshed_arrays' to create com1_com2. the reality of my searches are

I'm not sure what you intend these arrays to represent in this example.
If they are your name fields, sorted by their amount then, yes that or a concatenated com1_com2 type string might help you find records more easily. I'm not clear on these points:

how many com* type names can exist in each doc?
are you only interested in com1 and com2 or testing arbitrary choices of com* names?
do you only care about pairs or maybe sequences of more than 2 names when sorted by amount?

Maybe sharing the actual business problem you're trying to solve would help?

Dave_Clissold · April 23, 2018, 1:39pm

Sorry yes, the com* are names and the amounts are not necessary for this section of the query, it is purely com1 value is greater than com2 so strings are fine. To answer your questions

1- There can be 5-6 com*, but when above 3 this ordering of amounts becomes a useless query and the results are not really worth it.

2- yes only interested in the names. the amounts are not really of value, it's just com1 is greater than com2.

yes it is only the sequence. Just had a thought that I can add a numerical value for the index position in the sequence and then query that, unless there is a way of assigning this on ingest?

The problem is amount of ingredients in cocktails, so the user can say they want a cocktail with more gin than vodka

Mark_Harwood · April 23, 2018, 2:01pm

The applications of elasticsearch never cease to amaze me

You might want to take a look at SpanQuery family of queries - specifically the SpanNear query which can help you test the position and order of elements.

Dave_Clissold · April 23, 2018, 2:05pm

Crazy right? Come up with something amazing like ES and we'll use it for almost everything we can possibly conceive, including something as trivial as wanting more gin!!!!!

Mark_Harwood · April 23, 2018, 2:32pm

In terms of mappings required to support span queries - you'll obviously need to ensure ingredients are sorted by amount. As for how you pass these values

a single JSON string value would require a delimiter in there e.g. commas and a choice of Analyzer that chops into tokens based on the delimiter OR
a JSON array of the ingredients where you might want to dial back the position_increment to 1 to make the ingredient terms appear closer to each other and so wouldn't need big "slop" values in your SpanNear query

Dave_Clissold · April 23, 2018, 10:17pm

I think I'll use the first option, easier method for getting our data in that format, than trying to implement a position increment.

Thanks for your help Mark. Much appreciated

Topic		Replies	Views
Compare two fields in a query Elasticsearch	3	10297	December 21, 2016
Comparing two numeric fields of a Document Elasticsearch	2	1305	November 23, 2012
Comparison between two objects with in the nested array of each document in elastic search using script Elasticsearch	0	1092	April 6, 2019
How to compare 2 array entries in nested doc when value doesn't matter Elasticsearch	0	828	July 28, 2017
How to compare two field values using query_string in Elasticsearch Elasticsearch elastic-stack-monitoring	1	2319	May 11, 2020

Greater than based on value from another field

Related topics