For every name I want to have the latest document (newest timestamp).
In this case the result should be:
'{"name":"1","ts":"2011-06-02","count":20}'
'{"name":"2","ts":"2011-06-04","count":15}'
I can't figure out how to express this as an Elasticsearch query.
I should also note that this query will run on a large dataset
with far more than 1G documents.
There will be about 50M different names.
To simplify this, it may be necessary to store the "latest" documents
under a different index or type, so that queries can run only
on "latest" documents.
In that case I would need to duplicate every "latest" document, keeping
exactly one per unique name.
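To pin down the result being asked for, here is a minimal plain-Python sketch of "newest document per name". It is not an Elasticsearch query. The function name `latest_per_name` is hypothetical, and the two older input documents are invented for illustration; only the two expected results come from the post.

```python
def latest_per_name(docs):
    """Return the newest document for each name, ordered by name."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["name"])
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if current is None or doc["ts"] > current["ts"]:
            latest[doc["name"]] = doc
    return [latest[name] for name in sorted(latest)]


# Assumed input: the two older documents are made up for this example.
docs = [
    {"name": "1", "ts": "2011-06-01", "count": 10},
    {"name": "1", "ts": "2011-06-02", "count": 20},
    {"name": "2", "ts": "2011-06-03", "count": 5},
    {"name": "2", "ts": "2011-06-04", "count": 15},
]
result = latest_per_name(docs)
```

With this input, `result` holds the two documents listed as the desired output. The dictionary grows with the number of distinct names, which hints at the memory cost of doing the same grouping at query time over 50M names.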
What you are asking for, if I understand correctly, is basically grouping on the name, and it's not implemented. Even when implemented, it's going to come with memory and performance costs. An index holding the latest docs is a good solution.
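One way to picture the separate "latest" index is to use the name as the document ID, so that a newer document for a name replaces the older one. The sketch below is an in-memory stand-in, not Elasticsearch client code. The class `LatestIndex` and its methods are hypothetical, and the timestamp guard is an assumption for the case where documents arrive out of order. On a real cluster, Elasticsearch's external versioning may be worth a look for the same purpose.

```python
class LatestIndex:
    """In-memory stand-in for a "latest" index: one document per name."""

    def __init__(self):
        self._docs = {}  # document id (the name) -> document

    def index(self, doc):
        """Store doc under its name unless an equally new or newer one exists."""
        doc_id = doc["name"]
        stored = self._docs.get(doc_id)
        if stored is not None and stored["ts"] >= doc["ts"]:
            return False  # stale or duplicate write: keep the stored document
        self._docs[doc_id] = doc
        return True

    def get(self, name):
        """Return the latest document for a name, or None if unknown."""
        return self._docs.get(name)

    def __len__(self):
        return len(self._docs)


# Usage with made-up documents; the second write arrives out of order.
latest = LatestIndex()
latest.index({"name": "1", "ts": "2011-06-02", "count": 20})
accepted = latest.index({"name": "1", "ts": "2011-06-01", "count": 10})
latest.index({"name": "2", "ts": "2011-06-04", "count": 15})
```

Every incoming document is written twice, once to the full index and once here, but the "latest" index never holds more than one document per name (about 50M in the case described).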
On Saturday, June 4, 2011 at 11:08 AM, jukart wrote:
Hi all,
I'm about to use ES for a large set of data and could not figure out
how to do a query I need.
Thanks for the answer, that's what I thought. I will use a separate index.
I see that your reply was more than 4 years ago. Has anything changed in ES since then?
Is there any way to implement a "point in time" query? Say my data is continuously updated, and I need to query it and get the results back as they were, say, 3 or 6 hours ago. Is there any way to implement that?
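For context on the grouping question: later Elasticsearch releases added a `top_hits` aggregation that can be nested under a `terms` aggregation. The sketch below only builds such a request body as a Python dictionary. The function name and its defaults are hypothetical, and the field names `name` and `ts` are taken from the example documents in the original post. Whether this is practical for 50M names is a separate matter, since a `terms` aggregation returns only a limited number of buckets and the memory cost mentioned in the earlier reply still applies.

```python
import json


def latest_per_name_query(name_field="name", ts_field="ts", max_names=10):
    """Build a search body asking for the newest document per name."""
    return {
        "size": 0,  # no top-level hits, only the aggregation result
        "aggs": {
            "by_name": {
                # One bucket per distinct name, limited to max_names buckets.
                "terms": {"field": name_field, "size": max_names},
                "aggs": {
                    "latest": {
                        # Keep only the newest document in each bucket.
                        "top_hits": {
                            "size": 1,
                            "sort": [{ts_field: {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_name_query()
body_json = json.dumps(body, indent=2)  # what would be sent as the search body
```

A separate "latest" index keyed by name avoids this per-query grouping work altogether, which is why it remains a reasonable design for a dataset of this size.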