I'm using Logstash to manage logs, with Elasticsearch as part of the setup.
When querying the logs I use a scan search, and each hit has the following
fields: @message, @source, @source_host, @source_path, @tags, @timestamp,
@type.
My problem is that the returned hits are unsorted. I want to sort them by @timestamp. How can I do this in Java?
This is what I tried:
Ivan,
Slight correction: Elasticsearch cannot sort scan searches. But I can!
My sorting of response documents is external to Elasticsearch, bolted on
after the query has returned. I use a TreeSet, which keeps its members
sorted, and I enforce a limit on its size. Once the limit is reached:
  - If the document sorts beyond the upper bound, it is ignored.
  - Otherwise, the document is added and the last document (the one at the
    upper bound) is removed, preserving the specified limit.
As I run a scan query, I push each document into this TreeSet-based
sorter and always end up with the first N documents in sort order. This
isn't something I use a lot, but when I do need it, this method is very
useful. The additional time spent on sorting (including creating the sort
keys) is minimal.
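The limit-and-evict scheme described above can be sketched roughly as follows. This is not Brian's actual code, only a minimal illustration under some assumptions: the class name `BoundedSorter` and its methods are hypothetical, the documents are stand-in strings, and the comparator is assumed to be a total order (a TreeSet silently drops elements that compare as equal, so real hits would need a tie-breaker such as the document id).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper: keeps only the first `limit` items in comparator order. */
final class BoundedSorter<T> {
    private final Comparator<? super T> order;
    private final TreeSet<T> items;
    private final int limit;

    BoundedSorter(Comparator<? super T> order, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.order = order;
        this.items = new TreeSet<>(order);
        this.limit = limit;
    }

    /** Offers one document; returns true if it was kept. */
    boolean offer(T item) {
        if (items.size() < limit) {
            return items.add(item);
        }
        // At the limit: anything sorting at or beyond the last kept item is ignored.
        if (order.compare(item, items.last()) >= 0) {
            return false;
        }
        // Otherwise add it, then evict the item now sitting past the upper bound.
        if (!items.add(item)) {
            return false; // compares equal to an item already kept
        }
        items.pollLast();
        return true;
    }

    /** Returns the kept items in sort order. */
    List<T> sorted() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        // Made-up @timestamp values; ISO-8601 UTC strings in one format sort chronologically as text.
        BoundedSorter<String> earliest = new BoundedSorter<>(Comparator.<String>naturalOrder(), 2);
        String[] scanned = {"2013-08-12T15:55:48Z", "2013-08-12T09:00:00Z", "2013-08-13T00:01:02Z"};
        for (String ts : scanned) {
            earliest.offer(ts); // in practice, called once per hit as the scan pages arrive
        }
        System.out.println(earliest.sorted());
    }
}
```

Memory stays bounded by the limit rather than by the total number of hits, which is presumably why this works well alongside a scan. To keep the latest N instead of the earliest N, a reversed comparator should do.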
Of course, there's lots of code to create the sort keys (probably
duplicates a lot of what ES is doing, if I could see its internal comments
But I do have a side question: Is there a way to query (via Java, of
course) the type of each explicitly mapped field? I can create mappings,
but querying them for this kind of information eludes me. For now, I've
created my own "schema" definition, which I use (1) to generate the ES
mappings with all the niggly add-ons (such as Finnish character
equivalencies), and (2) to drive the collation key generation for this
post-query sorting. If I could ask Elasticsearch directly at run-time,
this process would become much simpler and more robust.
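For what it's worth, Elasticsearch does expose mappings through its get-mapping API, and once a type's mapping is in hand as a nested map, pulling out the declared field types is a short recursive walk. The sketch below covers only that second step. How the map is fetched from the Java client depends on the client version and is not shown, and the class name `MappingFieldTypes` is hypothetical. It assumes the usual mapping shape, where each object level carries a "properties" map and each leaf field carries a "type" entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: flattens a mapping into "dotted.field.path" -> declared type. */
final class MappingFieldTypes {

    /** Expects a map shaped like {"properties": {"field": {"type": "..."}, ...}}. */
    static Map<String, String> fieldTypes(Map<String, ?> mapping) {
        Map<String, String> out = new LinkedHashMap<>();
        collect("", mapping, out);
        return out;
    }

    private static void collect(String prefix, Map<?, ?> node, Map<String, String> out) {
        Object props = node.get("properties");
        if (!(props instanceof Map)) {
            return; // leaf field, or nothing explicitly mapped at this level
        }
        for (Map.Entry<?, ?> e : ((Map<?, ?>) props).entrySet()) {
            if (!(e.getValue() instanceof Map)) {
                continue; // skip anything that is not a field definition
            }
            String path = prefix + e.getKey();
            Map<?, ?> field = (Map<?, ?>) e.getValue();
            Object type = field.get("type");
            if (type != null) {
                out.put(path, type.toString());
            }
            // Object fields nest their children under their own "properties" key.
            collect(path + ".", field, out);
        }
    }
}
```

A map produced this way could then drive the collation key generation directly, instead of a separately maintained schema, though analyzer settings and other add-ons would still have to be read from the same field definitions.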
Brian
On Monday, August 12, 2013 11:55:48 AM UTC-4, Ivan Brusic wrote:
Thanks for the replies.
Brian, I don't have an answer for you, but I have a follow-up question to
my original post in light of the replies:
So I can't sort a scan search, but is there a different kind of search that
is sortable AND returns more than the first 10 hits?
I only used a scan search because it returns all the hits, not just the
first 10.
It seems like a trivial request: return all hits, sorted.
Is there a trivial way to do it?
On Tuesday, 13 August 2013 00:01:02 UTC+3, InquiringMind wrote: