It seems my replicas were taking up the space.
So I start with a couple of GB of data and end up with about 3x that, which seems expensive.
Does anyone have any info on compressing the indices or using some sort of archiving setup? In other words, what's the best way to save space when using ES?
Does anyone have any recommendations on configuring ES so that the DB size is similar to that of the original data that was parsed into it?
My CIO won't take me seriously when I tell him that for every TB of data we need 3 TB in ES.
On Oct 3, 2012, at 2:16 AM, Otis Gospodnetic wrote:
ES stores the original JSON in the special _source field. That right there means your index will be at least 10GB in size. Additionally, it is possible your fields are also stored and not just indexed; the store part would be another 10GB. On top of that is the inverted index. But I'm not sure how you get to 80GB. Maybe you are using ngrams somewhere? Maybe you can check with the Skywalker plugin to see what's inside your index?
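For reference, a minimal sketch of what that looks like in practice (the index, type, and field names here are hypothetical; adjust to your own setup). This disables _source and makes sure the field is indexed but not also stored:

```shell
# Hypothetical index "logs" with a single "message" field.
# "_source": {"enabled": false} stops ES from keeping the original JSON,
# and "store": "no" avoids storing the field value separately from the
# inverted index. Caveat: without _source you cannot see the original
# document in search hits or reindex from ES later.
curl -XPUT 'http://localhost:9200/logs/' -d '{
  "mappings": {
    "event": {
      "_source": { "enabled": false },
      "properties": {
        "message": { "type": "string", "store": "no" }
      }
    }
  }
}'
```

Whether dropping _source is worth it depends on your use case; for log search where the raw files are kept elsewhere it can be a reasonable trade-off.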
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Tuesday, October 2, 2012 2:11:56 PM UTC-4, Dylan Johnson wrote:
I have a directory that has 10GB of data, and I used Logstash to parse the data into Elasticsearch, which works fine. I have 2 indices and 0 replicas; however, after Logstash has finished parsing the data into ES, the ES /data directory is 80GB.
This is unworkable. What's the reason for this?