Hello, I'm a newcomer to Elasticsearch and I have a few conceptual
questions. I hope someone can help me.
I know Elasticsearch is basically a search engine, and that it is often
thought of as a time series data store with a limited data retention
policy. I've read in a thread on this list:
"Set expiration dates on the documents you are indexing in ElasticSearch:
This is generally around 90 to 120 days in certain production environments
and 15 to 30 days in development and lower environments."
My question is: can I use ES as a time series data store with a one-year
data retention policy? The stored data will probably be about 1 PiB of logs.
Firstly, ES is not a time series data store. It can definitely be leveraged
as one, but it does a lot, lot more!
1 - If that is referring to document-level TTLs, then you may want to
avoid them for 1 PB of data. TTLs can be resource intensive, and at that
scale you might find them very difficult to manage. It is better to use
time-based indexes instead, e.g. daily ones as per the ELK stack, and then
use Elasticsearch Curator to handle retention by deleting whole indexes
once they are older than your retention period.
2 - No, not at this stage, unfortunately.
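
To make point 1 a bit more concrete, here is a rough sketch of the
selection logic behind time-based retention. It is not Curator's code and
it does not talk to a cluster. It assumes daily indexes named
prefix-YYYY.MM.DD (the "logstash" prefix and the helper names are just
placeholders for illustration) and only works out which index names have
fallen outside a one-year window. Those names are what you would then hand
to Curator or to the delete index API.

```python
from datetime import date, datetime, timedelta

RETENTION_DAYS = 365  # one-year retention, as in the question


def daily_index_name(prefix, day):
    """Build a daily index name such as 'logstash-2015.01.31' (assumed naming scheme)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indices(index_names, prefix, today, retention_days=RETENTION_DAYS):
    """Return the daily indexes whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix + "-"):
            continue  # belongs to some other index family
        suffix = name[len(prefix) + 1:]
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not a date-suffixed index, leave it alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)
```

The point of the design is that dropping a whole daily index is a single
cheap operation, whereas a TTL has to find and delete expired documents one
by one inside live indexes.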