Hi, I am trying to build a query that matches when the value of one field is greater than the value of another nested field, but I am stuck as to where to begin. Can anyone help, or suggest somewhere to start reading? Example data:
If it were possible, it would probably take the form of a custom script that retrieves the value pairs and compares them for every doc. That would be linear in the number of docs and sounds expensive, so perhaps it's worth considering another approach. You could compute this "on the way in", using an ingest pipeline to compute the difference between the amounts. You could then query and aggregate on this numeric "diff" value held in the index much more efficiently.
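To make the "on the way in" idea concrete, here is a minimal sketch of the same computation done client-side in Python before indexing. The document shape (an "ingredients" list of objects with "name" and "amount") and the "diff_<a>_<b>" field name are assumptions for illustration only, since the example data isn't shown; an ingest pipeline script would do the equivalent work inside Elasticsearch.

```python
def add_amount_diff(doc, first, second, field="ingredients"):
    """Return a copy of doc with a precomputed amount difference.

    Assumes doc[field] is a list of {"name": ..., "amount": ...} objects
    (a hypothetical shape, not taken from the original example data).
    """
    amounts = {item["name"]: item["amount"] for item in doc.get(field, [])}
    enriched = dict(doc)  # shallow copy so the caller's doc is not modified
    # Only add the diff when both names are present in this document.
    if first in amounts and second in amounts:
        enriched[f"diff_{first}_{second}"] = amounts[first] - amounts[second]
    return enriched


doc = {"ingredients": [{"name": "com1", "amount": 7},
                       {"name": "com2", "amount": 3}]}
enriched = add_amount_diff(doc, "com1", "com2")
```

With the value stored, "com1 greater than com2" becomes an ordinary range query on the numeric field, e.g. {"range": {"diff_com1_com2": {"gt": 0}}}, rather than a per-document script.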
That is an idea. It's simple enough to build a string or array in a fixed order and then just match that at query time.
Do you think something similar to this would be efficient?
Where I use the principle of the 'smooshed_arrays' to create com1_com2. The reality of my searches is likely to be 2 or 3 objects to work with, so I can easily build these on the way in, for example A_B, and A_B_C / A_C_B (when there are 3 items to query).
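For illustration, a rough sketch of how those order keys could be built at index time. It assumes each object has "name" and "amount" fields and that "A_B" means A has the larger amount; both are guesses about the data, and the function name is made up for this example.

```python
from itertools import combinations


def ordered_keys(items, max_len=3):
    """Build keys like "A_B" and "A_B_C" with names ordered by descending amount.

    Assumes items is a list of {"name": ..., "amount": ...} objects.
    Names with equal amounts keep their original relative order.
    """
    ranked = sorted(items, key=lambda item: item["amount"], reverse=True)
    names = [item["name"] for item in ranked]
    keys = []
    for size in range(2, max_len + 1):
        # combinations() preserves input order, so every key stays amount-sorted.
        for combo in combinations(names, size):
            keys.append("_".join(combo))
    return keys


keys = ordered_keys([{"name": "B", "amount": 3},
                     {"name": "A", "amount": 5},
                     {"name": "C", "amount": 1}])
```

Indexing these keys into a keyword field would let a plain term query on "A_B" find docs where A outranks B. The number of keys grows quickly with the number of names per doc, so this only seems practical for small lists.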
I'm not sure what you intend these arrays to represent in this example.
If they are your name fields sorted by their amount, then yes, that or a concatenated com1_com2 type string might help you find records more easily. I'm not clear on these points:
how many com* type names can exist in each doc?
are you only interested in com1 and com2 or testing arbitrary choices of com* names?
do you only care about pairs or maybe sequences of more than 2 names when sorted by amount?
Maybe sharing the actual business problem you're trying to solve would help?
In terms of mappings required to support span queries - you'll obviously need to ensure ingredients are sorted by amount. As for how you pass these values, either:
a single JSON string value, which would require a delimiter in there (e.g. commas) and a choice of Analyzer that chops it into tokens based on that delimiter, OR
a JSON array of the ingredients, where you might want to dial the position_increment_gap mapping setting back from its default of 100 to 1, so that the ingredient terms appear closer to each other and you don't need big "slop" values in your SpanNear query
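A rough sketch of the array option, written as Python dicts that could be sent as request bodies. The field name "ingredients_sorted", the use of the built-in "keyword" analyzer (so each array element stays a single token) and the slop value are assumptions for illustration, not something tested against your data.

```python
def ingredients_mapping(gap=1, field="ingredients_sorted"):
    """Index-creation body: a text field with a small position_increment_gap."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "keyword",  # assumption: one token per array element
                    "position_increment_gap": gap,
                }
            }
        }
    }


def ordered_span_query(names, slop, field="ingredients_sorted"):
    """Search body matching docs where names occur in the given order."""
    clauses = [{"span_term": {field: name}} for name in names]
    return {
        "query": {
            "span_near": {
                "clauses": clauses,
                "slop": slop,      # how many positions may separate the terms
                "in_order": True,  # com1 must come before com2 in the sorted list
            }
        }
    }


mapping = ingredients_mapping()
query = ordered_span_query(["com1", "com2"], slop=10)
```

The slop has to be large enough to cover the longest ingredient list (including the gaps between array elements), which is why keeping position_increment_gap small helps.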