I am working on Elasticsearch aggregations and have the following scenario: I have indexed customer data along with orders. Here "orders" are stored as a nested type. Example document:
{"company":"ABC","orders":[{"order_no":"OL1", "prod_type" : "OLP",
"price":20}, {"order_no":"OL2", "prod_type" : "OLP", "price":50},
{"order_no":"OL3", "prod_type" : "GLP", "price":100} ]}
I want to query the customer documents whose total price (the sum of "price" across all of a customer's orders) is greater than 1000 dollars.
Note: I want the matching documents themselves, not aggregated results.
There are two possibilities here, though really only one is advisable. I mention the first only to show that it also works at query time, but please don't do that!
1. Calculate the total amount of all order items at query time using a Script Field.
2. Precalculate the order total before indexing the document into Elasticsearch. This would be a client-side operation.
So why should you go with the second solution when it seems to be more work? A scripted field is exceptionally expensive, because the field calculation has to be executed for each hit returned by the query (imagine millions of hits being returned), whereas the second solution requires just one range filter and that's it.
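
As a rough sketch of the second solution, the order total could be computed on the client before indexing and then filtered with a plain range query. The field name "order_total" and the helper with_order_total are assumptions made up for this example; they are not part of the original mapping. The sketch only builds the document and the query body, so sending them to Elasticsearch is left to whatever client you already use.

```python
import json


def with_order_total(doc):
    """Return a copy of the customer document with a precomputed total.

    "order_total" is a hypothetical field name chosen for this sketch.
    """
    enriched = dict(doc)
    # Sum the price of every nested order; a missing price counts as 0.
    enriched["order_total"] = sum(
        order.get("price", 0) for order in doc.get("orders", [])
    )
    return enriched


# The example document from the question.
doc = {
    "company": "ABC",
    "orders": [
        {"order_no": "OL1", "prod_type": "OLP", "price": 20},
        {"order_no": "OL2", "prod_type": "OLP", "price": 50},
        {"order_no": "OL3", "prod_type": "GLP", "price": 100},
    ],
}

# Index this enriched document instead of the original one.
enriched = with_order_total(doc)

# Query body: a single range filter on the precomputed field.
query = {
    "query": {
        "bool": {
            "filter": [
                {"range": {"order_total": {"gt": 1000}}}
            ]
        }
    }
}

print(json.dumps(enriched, indent=2))
print(json.dumps(query, indent=2))
```

The trade-off is that the total has to be recalculated and the document reindexed whenever a customer's orders change, but the query itself stays a cheap range filter that returns the documents directly.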