Updating one column of an indexed document causes disk usage to temporarily double

Hi There,

We notice that when we update just one column of a document that is already indexed in ES, the disk space used on the node for that document temporarily doubles. Roughly 10 minutes after the update is successfully indexed, the disk usage goes back down to its original size.

We just want to know: is this normal behaviour?
And is the time it takes for the disk usage to go back down configurable?

We just need this information to plan the disk space allocation for each node so that it works for our use cases when we go live.

Cheers,

Kiet Tran

Yes, take a read of https://www.elastic.co/guide/en/elasticsearch/guide/2.x/update-doc.html
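
In short: stored documents are immutable, so an update marks the old copy as deleted and writes a complete new copy. The old copy keeps taking up disk space until a background segment merge rewrites the data without it. Here is a toy Python model of that idea. Note that `ToyIndex` and its methods are made up for illustration and are not Elasticsearch APIs, and real merge scheduling is considerably more involved than a single `merge()` call.

```python
class ToyIndex:
    """Toy append-only index: updates never modify stored bytes in place."""

    def __init__(self):
        # Every stored copy, live or deleted: [doc_id, size_bytes, is_live]
        self.entries = []

    def index(self, doc_id, size_bytes):
        # Mark any existing live copy as deleted; its bytes stay on disk.
        for entry in self.entries:
            if entry[0] == doc_id and entry[2]:
                entry[2] = False
        # Always write a complete new copy, even if only one field changed.
        self.entries.append([doc_id, size_bytes, True])

    def disk_usage(self):
        # Deleted copies still count until a merge drops them.
        return sum(size for _, size, _ in self.entries)

    def merge(self):
        # A merge rewrites only the live copies; this is when space is reclaimed.
        self.entries = [e for e in self.entries if e[2]]


idx = ToyIndex()
idx.index("doc-1", 1000)
before = idx.disk_usage()
idx.index("doc-1", 1000)   # "update one field" still writes a whole new copy
during = idx.disk_usage()
idx.merge()
after = idx.disk_usage()
print(before, during, after)  # 1000 2000 1000
```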

Thanks Mark,

I also notice the same thing when indexing one new document: the disk space temporarily doubles in size and then some time later goes back to the correct size.

Is this also the correct behaviour when indexing a new document?

And is it possible to control when Elasticsearch removes the unwanted space?

Cheers,

Kiet Tran

That may be because of the translog(?).

Why is it such a problem?

Thanks again Mark,

For go-live we just want to gauge how much disk space we need to allocate to Elasticsearch.

We are doing some load testing at the moment, so we have let indexing of new documents run for the last 8 hours.
We noticed that the size after the load test finished is sitting at around 100GB. We expect this to go down to roughly 80GB, but we don't know when this will happen.
In our first run, what made it go down to 80GB was restarting the test machine, which seems very odd, but that is what we found.

For our second run, we have left the machine on without restarting for the last 2 hours, but the size is still not dropping. We are hoping the system is able to reclaim these 20GB without restarting the machine, so we plan to leave it on overnight to see whether this is the case.

Indexing 100GB of data is very normal in our go-live use cases, so we just need to gauge how much space is needed for different customers.
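
A back-of-the-envelope sketch of that sizing question, using the numbers from this load test (100GB at peak, 80GB expected once settled). It assumes the peak-to-settled ratio seen in the test carries over to other data volumes, which is only an assumption and not something Elasticsearch guarantees. The helper name `disk_to_allocate` and the 20% safety margin are made up for illustration.

```python
def disk_to_allocate(settled_gb, peak_gb_observed, settled_gb_observed,
                     safety_margin=0.2):
    """Estimate disk to provision so that temporary peaks still fit.

    settled_gb:           expected steady-state index size for a customer
    peak_gb_observed:     peak disk usage seen during the load test
    settled_gb_observed:  disk usage after the space was reclaimed
    safety_margin:        extra headroom (hypothetical default of 20%)
    """
    if settled_gb_observed <= 0:
        raise ValueError("settled_gb_observed must be positive")
    peak_ratio = peak_gb_observed / settled_gb_observed
    return settled_gb * peak_ratio * (1 + safety_margin)


# Load-test figures from this thread: 100GB at peak, 80GB expected when settled.
estimate_gb = disk_to_allocate(80, peak_gb_observed=100, settled_gb_observed=80)
print(f"provision roughly {estimate_gb:.0f} GB")
```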

Thanks again,

Kiet Tran

Hi there,

Please assist with this issue. To continue from above: we left the ES service running overnight (16 hours) and did not see ES release the space. We then restarted the ES service, and disk usage immediately dropped from 100GB to 60GB.

We just want to know why this is the case.

Also, does Elasticsearch have any setting that we could use to increase the rate at which ES releases space?

Thanks in advance,

Kiet Tran
