Data directory is still growing after initial index build


(4b3l) #1

Hi,

I have a 3-node ES cluster and have built an index in it; the build took a while to complete.

I've noticed that since the index was built, the data directory is still growing, and I'm not sure why or where to start troubleshooting. Our application is idle, so there aren't any requests going through.

Anyone have any ideas or point me in the right direction?


(David Pilato) #2

Might a segment merge be happening behind the scenes?
Monitoring (available with the free X-Pack Basic license) could help you see whether that is the case.
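For example, the merge section of the index stats API reports how many merges are currently running (a standard API, shown here only as a sketch):

GET _stats/merge

A non-zero merges.current in the response means a merge is in progress at that moment; GET _nodes/stats/indices/merge gives the same information broken down per node.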


(4b3l) #3

Thanks for your reply. I'm not sure which specific monitoring to use.

I used _cat/segments?v and _segments, but neither tells me whether any segment merging is currently going on.

I don't see anything in the X-Pack API documentation.


(David Pilato) #4

X-Pack monitoring collects information about segment counts, along with a lot of other metrics.
You should give it a try.


(4b3l) #5

Thanks. From reading the documentation, segment merging is something that happens constantly behind the scenes, but it is triggered when new data comes in.

Our system has been idle for over a week, i.e. no new data going in.

It seems strange for something like this to run for over a week, or is this just how Lucene behaves?


(David Pilato) #6

Our system has been idle for over a week, i.e. no new data going in.

Then it looks strange indeed. Could you run:

GET _cat/indices?v

(4b3l) #7

The list is too big to copy and paste. Is there anything specific to look for?

Health is green and status is open for all indices.
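If it helps, the _cat/indices output can be sorted by size to surface the likely culprits without pasting the whole list (s and h are standard _cat parameters; the column selection here is just an example):

GET _cat/indices?v&s=store.size:desc&h=index,docs.count,store.size

Running that twice a few minutes apart and comparing the top entries is usually enough to spot what is growing.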


(David Pilato) #8

You can share it on gist.github.com and just add the link here.


(4b3l) #9

Thanks. Here is the URL:


(David Pilato) #10

Can you explain what all those indices are?
Can you run the same command again and compare, to see which ones are still growing?

As you are using X-Pack, the number of documents in indices like .monitoring-es-* is certainly still increasing, but it should not be by that much.
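For instance, to watch just the monitoring indices, the same cat API can be restricted to that index pattern (a sketch; the pattern matches the standard monitoring index names):

GET _cat/indices/.monitoring-*?v&s=store.size:desc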


(4b3l) #11

I just ran it again and it seems the same; I will run it once more in an hour to confirm.

The indices hold our own data; we have created our own naming convention.


(David Pilato) #12

OK, but that's 600 indices on 3 nodes, which is starting to be a lot.
It is probably also a waste of resources, as you have at most a few hundred MB per shard.

Also notice that the biggest indices you have are the monitoring ones (several GB).
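To see where the space actually goes per shard, the shards cat API can be sorted by store size (a standard API; sketch only):

GET _cat/shards?v&s=store:desc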


(4b3l) #13

I just did a comparison, and the .monitoring-es-* indices seem to grow to a maximum of 5.7-5.8 GB before a new one is created each day. Reading some background, monitoring data is kept for up to 7 days on a Basic license. What is the purpose of these indices?


(David Pilato) #14

What is the purpose of these indices?

They are used by X-Pack Monitoring.
You can change xpack.monitoring.collection.interval (which defaults to 10s). See https://www.elastic.co/guide/en/elasticsearch/reference/6.1/monitoring-settings.html#monitoring-collection-settings

You can also change xpack.monitoring.history.duration to 1d, so that only one day of monitoring data is kept online.

In any case, that is why your disk usage is increasing.
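As a sketch, both settings can go into elasticsearch.yml on each node (the 30s interval below is only an illustrative value, not a recommendation):

# elasticsearch.yml
xpack.monitoring.collection.interval: 30s
xpack.monitoring.history.duration: 1d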


(4b3l) #15

Thank you for your help and patience :grinning:


(David Pilato) #16

You're very welcome.

Just a note. Let me highlight what I said previously, as you might run into trouble in the future:


(4b3l) #17

Is there a guideline for how many indices we should have, or an article on this subject?


(David Pilato) #18

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.