How to query for "max" value?

Jurgen_kartnaller · June 4, 2011, 8:08am

Hi all,
I'm about to use ES for a large set of data and could not figure out
how to do a query I need.

Having this set of data:

curl -XDELETE localhost:9200/data/
curl -XPUT localhost:9200/data
curl -XPUT localhost:9200/data/values/1 -d '{"name":"1","ts":"2011-06-01","count":10}'
curl -XPUT localhost:9200/data/values/2 -d '{"name":"1","ts":"2011-06-02","count":20}'
curl -XPUT localhost:9200/data/values/3 -d '{"name":"2","ts":"2011-06-02","count":25}'
curl -XPUT localhost:9200/data/values/4 -d '{"name":"2","ts":"2011-06-04","count":15}'

For every name I want to have the latest document (newest timestamp).

In this case the result should be:
'{"name":"1","ts":"2011-06-02","count":20}'
'{"name":"2","ts":"2011-06-04","count":15}'
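To pin down the expected result, here is a minimal Python sketch of the "latest document per name" semantics, run over the four sample documents in memory. It is not an Elasticsearch query; the function name `latest_per_name` is hypothetical and only illustrates what the query should return.

```python
from typing import Dict, List

Doc = Dict[str, object]


def latest_per_name(docs: List[Doc]) -> List[Doc]:
    """Return, for each name, the document with the newest timestamp."""
    latest: Dict[str, Doc] = {}
    for doc in docs:
        current = latest.get(doc["name"])
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if current is None or doc["ts"] > current["ts"]:
            latest[doc["name"]] = doc
    # Sort by name so the output order is deterministic.
    return [latest[name] for name in sorted(latest)]


docs = [
    {"name": "1", "ts": "2011-06-01", "count": 10},
    {"name": "1", "ts": "2011-06-02", "count": 20},
    {"name": "2", "ts": "2011-06-02", "count": 25},
    {"name": "2", "ts": "2011-06-04", "count": 15},
]

result = latest_per_name(docs)
for doc in result:
    print(doc)
```

This keeps one entry per name in memory, which is exactly the cost that becomes significant with about 50M distinct names.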

I can't figure out how to express this as an Elasticsearch query.

I would also note that this query will be performed on a large dataset
with much more than 1G documents.
There will be about 50M different names.

To simplify this, it may be necessary to store the "latest" documents
under a different index or type, so that queries can run only
on "latest" documents.
In this case I need to duplicate all "latest" documents, but the names
there would be unique.

Is this an option, also for performance?

Jürgen

kimchy · June 4, 2011, 9:04am

What you are asking for, if I understand correctly, is basically grouping on the name, and it's not implemented. Even when implemented, it's going to come with memory and performance costs. An index holding the latest docs is a good solution.
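The "index holding the latest docs" idea can be sketched as follows. This is a plain in-memory Python stand-in, not Elasticsearch client code: the class `LatestIndex` and its methods are hypothetical, and the assumption is that the separate index is keyed by `name`, so a newer document for the same name replaces the older one.

```python
from typing import Dict, Optional

Doc = Dict[str, object]


class LatestIndex:
    """In-memory stand-in for a separate index keyed by document name."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def index(self, doc: Doc) -> bool:
        """Store doc unless an equally new or newer one exists for its name.

        Returns True if the document was stored.
        """
        current = self._docs.get(doc["name"])
        # Guard against out-of-order arrival: never replace newer with older.
        if current is not None and current["ts"] >= doc["ts"]:
            return False
        self._docs[doc["name"]] = doc
        return True

    def get(self, name: str) -> Optional[Doc]:
        return self._docs.get(name)


latest = LatestIndex()
incoming = [
    {"name": "1", "ts": "2011-06-02", "count": 20},
    {"name": "1", "ts": "2011-06-01", "count": 10},  # arrives late, is skipped
    {"name": "2", "ts": "2011-06-02", "count": 25},
    {"name": "2", "ts": "2011-06-04", "count": 15},
]
for doc in incoming:
    latest.index(doc)

print(latest.get("1"))
print(latest.get("2"))
```

With this layout, reading the latest document for a name is a single lookup by key, and the full history can stay in the original index.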

Jurgen_kartnaller · June 4, 2011, 9:32am

On Sat, Jun 4, 2011 at 11:04 AM, Shay Banon shay.banon@elasticsearch.com wrote:

What you are asking for, if I understand correctly, is basically grouping
on the name, and it's not implemented. Even when implemented, it's going to
come with memory and performance costs. An index holding the latest docs is
a good solution.

Thanks for the answer, that's what I thought. I will use a separate index.

--
http://www.sfgdornbirn.at
http://www.mcb-bregenz.at

prasad · February 10, 2016, 3:53pm

Hi Shay,

I see that your reply was more than 4 years ago. Has anything changed in ES since then?

Is there any way to implement a "point in time" query? Even though my data is continuously updated, I need to query it and get results back as they were, say, 3 or 6 hours ago. Is there any way to implement that?

Thanks,
P.
