Hi David,
Thanks for the answer. I agree that rolling indexes would improve
the overall architecture, but I believe they cover only part of the
problem: how to handle a lot of data over time. That is indeed one
part of my problem.
My intention with the initial question was in fact to address the second
problem: write performance. That's why I wanted to improve things at the
index level. At the index level, three things affect performance: indexing
data, segment merging, and queries.
My solution tries to keep indexing and query costs to a minimum, but I
still haven't found a good solution for the segment merging policy.
To make my point more focused: I'm hoping to get write performance to the
Lucene segments as close as possible to a native FS file write. That's why
I want to eliminate any segment merging inside the index, as merging would
hurt write performance.
So I hope that somebody might know a way to configure the segment merge
policy like this: no segment merging, and all segments have the same size.
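For illustration, the closest I could get with the tiered merge policy settings looks like this (setting names are taken from the merge policy docs; the values are only a sketch of the idea, not something I've verified in production):

```yaml
# Sketch: make the tiered merge policy (the default) almost never merge,
# by allowing a huge number of segments per tier.
# Setting names follow the ES merge policy docs; values are illustrative.
index.merge.policy.type: tiered
index.merge.policy.segments_per_tier: 1000000
# Keep merged segments small so any merge that does happen stays cheap:
index.merge.policy.max_merged_segment: 100mb
```

But even with this, merges are only discouraged, not disabled, which is why I'm asking whether a true no-merge configuration exists.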
Cheers
Paul.
On Monday, June 24, 2013 6:59:46 PM UTC+2, David Pilato wrote:
About 2), you can use rolling indexes with an alias on top of them.
So create a new index every day, modify the alias, and remove (or close)
the oldest index.
A closed index does not use resources anymore (only disk space). If you
remove it, you will get your disk space back.
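As a sketch, assuming daily indexes named by date and an alias called "logs" (all names here are only examples):

```shell
# Sketch only: create today's index, then atomically move the alias
# from yesterday's index to it via the _aliases API.
curl -XPUT 'localhost:9200/logs-2013-06-25'
curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions": [
    { "remove": { "index": "logs-2013-06-24", "alias": "logs" } },
    { "add":    { "index": "logs-2013-06-25", "alias": "logs" } }
  ]
}'
# Later, close (or delete) the oldest index to free its resources:
curl -XPOST 'localhost:9200/logs-2013-06-18/_close'
```

Your consumers keep querying the "logs" alias and never need to know which physical index is current.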
Does that answer your needs?
See:
Elasticsearch Platform — Find real-time answers at scale | Elastic
My 0.01 cents
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs
On June 24, 2013, at 17:21, Paul Sabou <paul....@gmail.com> wrote:
Hi,
I'm trying to investigate whether Elasticsearch would be an acceptable
replacement for a specific type of message queue.
I know that this is not its intended use case, but I believe it makes
sense to explore the possibility and to see it as a special case of
segment merge policy.
The use case I have is an append-only index:
- lots of data comes in and CREATEs many ES documents (i.e. time series)
- there is no document UPDATE and no document DELETE
- each document has a timestamp that will be indexed, plus other fields
that will just be stored (the timestamp is the only field that will be
searched)
- there is only one type of consumer (search query): all documents
with a timestamp more recent than X
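Concretely, that consumer query would look something like this (the index and field names are only illustrative):

```shell
# Sketch: fetch everything newer than X, oldest first.
# Index name "myindex" and field name "timestamp" are illustrative.
curl -XGET 'localhost:9200/myindex/_search' -d '{
  "query": {
    "range": { "timestamp": { "gt": "2013-06-24T00:00:00" } }
  },
  "sort": [ { "timestamp": "asc" } ]
}'
```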
The reasons I believe ES might be an acceptable (even if not perfect)
fit for this:
- the underlying Lucene segments are append-only (as is the message
queue)
- the search is very simple, so indexing documents would carry a
minimal performance penalty
From what I can see, the challenge is:
- to configure the ES segment merge policy in such a way as to only
create fixed-size Lucene index segments (so no segment merges, only
new segments)
- to configure ES to keep only the latest X Lucene index segments
open, and thus avoid having too many open file descriptors
So, does this make sense, and if so, how could it be done? I went through
the existing merge policies (
Elasticsearch Platform — Find real-time answers at scale | Elastic) and
none seems a good fit.
Cheers
Paul.
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.