My configuration is:
Heap: 30 GB
Cores: 24
ES version: 6
We have approximately 100 crore (about 1 billion) documents covering 3 months in a single index. There is a date field with the format 'yyyymmdd'. Data is pushed into this index in real time. I want to keep deleting data older than 3 months (for example, where date < 20180501), and I am using the 'delete_by_query' API for this. Because writes continue after 'delete_by_query' takes its snapshot of the index, I am getting version conflict errors. How can I handle this situation without affecting my write process? Thanks
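For reference, a minimal sketch of the request described above, assuming the index is called `my-index` (a hypothetical name) and the field is called `date`. It only builds the request path and body, it does not contact a cluster. The `conflicts=proceed` parameter is one way to have `_delete_by_query` count version conflicts and carry on instead of aborting; whether skipping conflicted documents is acceptable depends on the use case.

```python
import json


def build_delete_by_query(index, date_field, cutoff):
    """Return (path, body) for a _delete_by_query request.

    conflicts=proceed asks Elasticsearch to count version conflicts
    and continue rather than abort the whole request.
    """
    path = "/{}/_delete_by_query?conflicts=proceed".format(index)
    # Range query: match every document whose date is before the cutoff.
    body = {"query": {"range": {date_field: {"lt": cutoff}}}}
    return path, body


# "my-index" is a hypothetical index name; the cutoff comes from the question.
path, body = build_delete_by_query("my-index", "date", "20180501")
print("POST", path)
print(json.dumps(body))
```

Documents skipped because of a conflict were modified after the delete started, so they can be picked up by the next run.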
Have you thought about using date-based indices? The reason I ask is that delete by query is much more expensive than simply deleting the whole index that holds the data from four months ago.
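A rough sketch of that idea, assuming one index per month named like `index-2018.04` (the prefix and naming pattern are hypothetical, not something from this thread). It works out which monthly index has fallen out of a three-month retention window, so the whole index can be dropped with a single `DELETE /<index>` request.

```python
from datetime import date


def monthly_index(prefix, day):
    """Name of the monthly index a document dated `day` would be written to."""
    return "{}-{:04d}.{:02d}".format(prefix, day.year, day.month)


def months_back(day, n):
    """First day of the month that lies n months before `day`."""
    total = day.year * 12 + (day.month - 1) - n
    return date(total // 12, total % 12 + 1, 1)


def expired_index(prefix, today, keep_months=3):
    """Newest monthly index outside the window.

    Keeps the current month plus `keep_months` earlier months.
    """
    return monthly_index(prefix, months_back(today, keep_months + 1))


# On 15 Aug 2018, May to August are kept and April becomes deletable.
name = expired_index("index", date(2018, 8, 15))
print("DELETE /" + name)
```

Dropping an index does not touch individual documents, so it cannot run into version conflicts with ongoing writes to the current month's index.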
@spinscale thanks for the reply. I agree with you. If I create indices like index-jan, index-feb and index-mar, then whenever I want to delete data I can simply delete the index for that month. But what about my search queries? Will they be affected when I want to extract data from Jan 01 to Feb 10? Will such a query run against both indices, and will this affect my scroll queries?
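On the search side, Elasticsearch accepts a comma-separated list of index names (or a wildcard such as `index-*`) in the search path, and scroll requests can be started against several indices in the same way. A small sketch, again assuming hypothetical monthly names like `index-2018.01`, that lists the indices covering a date range and builds the search path:

```python
from datetime import date


def indices_for_range(prefix, start, end):
    """Monthly index names that cover every day from start to end inclusive."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append("{}-{:04d}.{:02d}".format(prefix, year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


# Jan 01 to Feb 10 touches two monthly indices.
names = indices_for_range("index", date(2018, 1, 1), date(2018, 2, 10))
path = "/" + ",".join(names) + "/_search?scroll=1m"
# The range query still narrows the result to the exact days requested.
body = {"query": {"range": {"date": {"gte": "20180101", "lte": "20180210"}}}}
print("POST", path)
```

The existing range query on the date field stays the same; only the list of target indices in the path changes.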