Today I received an alert that there was no free space left on the 100 GB
EBS partition that I use as the "work" directory for a single-node ES
installation. The gateway was using 98 GB. The data for ES comes from
MongoDB, where all of it, including indexes and other data that is not
indexed in ES, takes only 11 GB.
I moved all of the gateway data to the /mnt partition (it has 500 GB) and
started ES again, but shut it down after it had spent more than 30 minutes
in the recovery state.
For comparison, a full reindex takes only 25 minutes, and the gateway now
occupies only 22 GB (21 GB after optimizing the indexes).
Is there a way to clean up the gateway? And is there a way to decrease
index recovery time?
We noticed that our work/gateway sizes were ballooning and it appeared
that the "flush" command was not getting executed often enough. From
the ES docs, this gets executed based on memory heuristics. Not sure
what type of memory (disk or ram) or what the thresholds are, though.
We ended up calling flush after every 10K docs.
By default, flush is called every 5000 operations - this is on a per
shard basis.
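A minimal sketch of the "flush after every 10K docs" approach, in case it helps. It assumes the flush is triggered over HTTP by POSTing to the index's `_flush` endpoint; the node address, the index name, and the helper names `flush_index` and `index_with_periodic_flush` are hypothetical and not taken from this thread.

```python
import urllib.request
from typing import Any, Callable, Iterable


def flush_index(base_url: str, index: str) -> None:
    """Ask ES to flush one index (assumes a POST to /<index>/_flush)."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_flush", data=b"", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def index_with_periodic_flush(
    docs: Iterable[Any],
    index_doc: Callable[[Any], None],
    flush: Callable[[], None],
    every: int = 10_000,
) -> int:
    """Index each doc and call flush() after every `every` docs.

    Returns the number of flushes issued, including a final flush
    for any leftover docs.
    """
    flushes = 0
    pending = 0
    for doc in docs:
        index_doc(doc)
        pending += 1
        if pending == every:
            flush()
            flushes += 1
            pending = 0
    if pending:
        # Flush whatever is left after the last full batch.
        flush()
        flushes += 1
    return flushes
```

It could be wired up with something like `index_with_periodic_flush(docs, send_doc, lambda: flush_index("http://localhost:9200", "myindex"))`, where `send_doc` is whatever function you already use to index a single document.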
There is a problem in 0.10 where the translog in the gateway was not being
appended correctly and data was getting accumulated each time instead of
adding just the diff. This does not affect the correctness of the gateway,
but does imply more storage needs. I have fixed it in 0.11.
The gateway should get cleaned up automatically. Are you storing both the
gateway and the work dir in the same location?
Sorry for the delayed reply.
Yes, we store both the gateway and the work dir in the same location. After
moving the work dir (with the gateway) to the 500 GB partition, ES filled it
in 3 days. So I was forced to disable the gateway (in my case a full reindex
is 3x faster than recovering from the gateway).