Choosing shard vs alias in Elasticsearch


(Chetana) #1

I am planning to use Elasticsearch (ES) for storing event logs. Per day,
the application should store 3000+ events, and the size will be around
30-50K.
I need to compute statistics monthly, half-yearly and yearly. Data more than
a year old can sometimes be ignored, but it should be retained for many years.
I want to know the best practices for this scenario.

  1. Is it a good idea to create shards based on size/period or create one
    shard with multiple alices based on filter condition?
  2. Does ES merge the search results coming from multiple shards? If yes,
    how fast or how good is it compared to Lucene's ranking based on the
    vector space model?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cafc1166-549e-41b2-bf64-e8966f2e610a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Chetana) #2

Sorry it is alias not alice

On Friday, March 21, 2014 10:11:53 AM UTC+5:30, Chetana wrote:

I am planning to use Elasticsearch (ES) for storing event logs. Per day,
the application should store 3000+ events, and the size will be around
30-50K.
I need to compute statistics monthly, half-yearly and yearly. Data more than
a year old can sometimes be ignored, but it should be retained for many years.
I want to know the best practices for this scenario.

  1. Is it a good idea to create shards based on size/period or create one
    shard with multiple aliases based on filter condition?
  2. Does ES merge the search results coming from multiple shards? If yes,
    how fast or how good is it compared to Lucene's ranking based on the
    vector space model?



(Adrien Grand) #3

On Fri, Mar 21, 2014 at 5:41 AM, Chetana ambha.career@gmail.com wrote:

  1. Is it a good idea to create shards based on size/period or create one
    shard with multiple alices based on filter condition?

I would recommend using time-based indices; you can hear the rationale at
http://www.elasticsearch.org/videos/big-data-search-and-analytics/ (the
part you are interested in starts at 21:15, but I would recommend watching
the whole video, which gives interesting ideas about how to model data with
Elasticsearch). For example, you could build monthly indices. Tools like
curator can then help you manage old indices: force-merging (optimizing)
the read-only ones and deleting the old ones.
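To make the monthly-index idea concrete, here is a minimal sketch in plain Python. It only computes index names and does not talk to a cluster. The `events-YYYY-MM` naming scheme, the `events` prefix and all function names are assumptions made for illustration, not something prescribed by Elasticsearch or curator.

```python
from datetime import date
from typing import List


def monthly_index_name(day: date, prefix: str = "events") -> str:
    """Name of the monthly index an event from `day` would be written to."""
    return f"{prefix}-{day.year:04d}-{day.month:02d}"


def indices_for_range(start: date, end: date, prefix: str = "events") -> List[str]:
    """Monthly indices covering start..end (inclusive), oldest first."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}-{year:04d}-{month:02d}")
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return names


def read_only_indices(names: List[str], today: date, prefix: str = "events") -> List[str]:
    """Indices from months before the current one, which no longer receive writes.

    Zero-padded names sort chronologically, so a string comparison is enough.
    """
    current = monthly_index_name(today, prefix)
    return [name for name in names if name < current]
```

A yearly report would then search only the indices returned by `indices_for_range` for that year (a search can target several indices at once), while the names returned by `read_only_indices` are the candidates for force-merging.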

  2. Does ES merge the search results coming from multiple shards? If yes,
    how fast or how good is it compared to Lucene's ranking based on the
    vector space model?

Indeed, Elasticsearch merges search results: each shard returns its top
hits (via Lucene), and the node that coordinates search execution merges
these per-shard top hits using a priority queue.
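The merge step can be illustrated with a small sketch. This is not Elasticsearch's actual code, only the general technique: a bounded min-heap keeps the best `size` hits seen so far across all shards. The `(score, doc_id)` hit representation and the function name are assumptions for the example, and ties are broken by comparing document ids, which is a simplification.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, str]  # (score, document id); hypothetical representation


def merge_top_hits(per_shard_hits: List[List[Hit]], size: int) -> List[Hit]:
    """Merge each shard's top hits into a global top `size`, best score first."""
    if size <= 0:
        return []
    heap: List[Hit] = []  # min-heap: heap[0] is the weakest hit currently kept
    for hits in per_shard_hits:
        for hit in hits:
            if len(heap) < size:
                heapq.heappush(heap, hit)
            elif hit > heap[0]:
                # The new hit beats the weakest kept hit, so swap it in.
                heapq.heapreplace(heap, hit)
    return sorted(heap, reverse=True)
```

Because each shard only sends its own top hits, the coordinating node handles at most `size` hits per shard, regardless of how many documents matched.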

--
Adrien Grand


