Storage footprint not the same as cumulative index size, why is that?

jacobot · November 30, 2018, 9:17am

Hi,

I am issuing this query:

GET /_cat/allocation?v&h=disk.indices,disk.avail,disk.used,disk.total,disk.percent

response

disk.indices disk.avail disk.used disk.total disk.percent
      10.2gb     15.2gb    34.6gb     49.9gb           69

So disk.indices is 10.2gb and disk.used is 34.6gb? Where is the difference coming from?
Looking at the data directory, it is indeed 35G:
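To put a number on the gap, here is a minimal Python sketch that parses the `_cat/allocation` text shown above and subtracts disk.indices from disk.used. The helper names (`parse_size`, `unexplained_gb`) are made up for illustration, and it assumes the size suffixes are binary units.

```python
import re

# Multipliers for the size suffixes printed by the _cat APIs (assumed binary).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a _cat-style size such as '10.2gb' to bytes."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb|tb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def unexplained_gb(cat_output):
    """Return disk.used minus disk.indices, in GB, for each data row."""
    lines = [ln.split() for ln in cat_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    used = header.index("disk.used")
    indices = header.index("disk.indices")
    return [(parse_size(r[used]) - parse_size(r[indices])) / 1024**3 for r in rows]


allocation_sample = """\
disk.indices disk.avail disk.used disk.total disk.percent
      10.2gb     15.2gb    34.6gb     49.9gb           69
"""
gap = unexplained_gb(allocation_sample)
print([round(g, 1) for g in gap])  # [24.4]
```

So roughly 24gb on this node is used by something other than what disk.indices counts.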

du -h -d 5
44K	./elasticsearch/nodes/0/_state
184K	./elasticsearch/nodes/0/indices/dBvqitmZRt-Fmuc0m27ngQ
695M	./elasticsearch/nodes/0/indices/LT-pbZUmQUWTVj_eZ6ueoQ
211M	./elasticsearch/nodes/0/indices/UrkzhOUPT2mGq5CIxjE2jw
2.2M	./elasticsearch/nodes/0/indices/aNoeXSc1R5-Yv5mi5kAEVw
2.0M	./elasticsearch/nodes/0/indices/FTyJMumRRQ6KlDMkJPpg0g
1.4M	./elasticsearch/nodes/0/indices/FaHslxLmRPmt_epBjRl4oA
84K	./elasticsearch/nodes/0/indices/sZrzFZaUQCq2lntAfOVG1A
6.9M	./elasticsearch/nodes/0/indices/OMgjPF0kSdacpMlDFbhV9A
214M	./elasticsearch/nodes/0/indices/l9nus3Q_SKu5G6lFHsmshw
9.8M	./elasticsearch/nodes/0/indices/bh-yRTBQSOWejk2JaPGGQw
1.9G	./elasticsearch/nodes/0/indices/mxEqic8MTw2W1sDbDsWwYQ
2.0G	./elasticsearch/nodes/0/indices/aw9QKXJbRMinedvKUHeAmw
2.2M	./elasticsearch/nodes/0/indices/L7pYWKHjQVaO8R21hi_AmA
328M	./elasticsearch/nodes/0/indices/NClre8MZSomW4-Hxab3aow
13M	./elasticsearch/nodes/0/indices/7mc_Q_ziTvelFrAX-d_umg
3.0M	./elasticsearch/nodes/0/indices/PN5v9z7dSoip7uMXwSglqg
812K	./elasticsearch/nodes/0/indices/cwesDBMVTOOIYD50690Rhw
186M	./elasticsearch/nodes/0/indices/r1G0zHTJRTaEUMSrjcwj8Q
1.9G	./elasticsearch/nodes/0/indices/yk0y91gNQZOdhMKftSVa1g
1.9G	./elasticsearch/nodes/0/indices/wfu1bw9oR3W3iDZS8x_6OQ
1.9G	./elasticsearch/nodes/0/indices/x_qsGMdqQZi8Es_KKvWuCA
1.9G	./elasticsearch/nodes/0/indices/-CI6VqaYR8KMaHyesiDrdA
2.2M	./elasticsearch/nodes/0/indices/MwUf5GHcSxmmNwRr80szLw
5.6M	./elasticsearch/nodes/0/indices/9efquwiUTOu0Uyba2m3wIQ
1.9G	./elasticsearch/nodes/0/indices/jZdwR2BuSfWK_wR0k_Jx8w
1.9G	./elasticsearch/nodes/0/indices/K5aLBsDjRPGWEuK3AdWgVA
1.9G	./elasticsearch/nodes/0/indices/rqzC5P0KQqeVlPOiaMRJ-g
226M	./elasticsearch/nodes/0/indices/EgLHTHW0S6i6eJT1HOJi7w
1.9G	./elasticsearch/nodes/0/indices/oNRlGK97S7iewBzr2o0dyA
1.9G	./elasticsearch/nodes/0/indices/LBWkHnAeTvaBSSOldBO8Tg
156M	./elasticsearch/nodes/0/indices/MXiwXUudSY2yY5cm27mFcQ
5.9M	./elasticsearch/nodes/0/indices/YhcziY5OTLiLyl6pTAV0WA
310M	./elasticsearch/nodes/0/indices/SFbJg2oYRQCPGbdsdb2LZQ
303M	./elasticsearch/nodes/0/indices/4r3_DX-NTzaK-6IieDkEFA
1.9M	./elasticsearch/nodes/0/indices/8l4-SA5PR6-4MxGXWG7fgA
193M	./elasticsearch/nodes/0/indices/XmiRCtGKRuqIWtLx2J_sbA
25M	./elasticsearch/nodes/0/indices/z2R2msq_QaS6OGF9JBKU7w
1.9G	./elasticsearch/nodes/0/indices/2A6_bEZ6RKe16dkXT2Xa_Q
1.9G	./elasticsearch/nodes/0/indices/Fpj3EBzYRke83n--8Jz0ow
1.9G	./elasticsearch/nodes/0/indices/LsrLw3n2QoKTcxBwQ7A_Bg
285M	./elasticsearch/nodes/0/indices/zWlP4T0gQ5KvOc9Q0zQkkw
1.9G	./elasticsearch/nodes/0/indices/y15InfVUTEWNW_pj4EjW7w
1.9G	./elasticsearch/nodes/0/indices/1JLAYcPiRGO8zaE8v_1iIw
207M	./elasticsearch/nodes/0/indices/P4BK8a0ERwy3Rvn4vf2K4Q
1.9G	./elasticsearch/nodes/0/indices/9sV3vtiNRhKI05YC1VIwoQ
5.0M	./elasticsearch/nodes/0/indices/6uPdPVNeTBGbvdxp0Cc0XQ
35G	./elasticsearch/nodes/0/indices
35G	./elasticsearch/nodes/0
35G	./elasticsearch/nodes
35G	./elasticsearch
0	./logstash
35G	.

Why is there a difference?

Note: we did drop a lot of indices recently. Are these somehow lingering files?

Cheers,

dadoonet · November 30, 2018, 10:01am

Can this be caused by the transaction log?
What is your version?

jacobot · November 30, 2018, 12:43pm

Version info is below. How do I find out whether the translog is the issue, and will it be removed under space pressure?

GET /
{
  "name" : "43aC6EG",
  "cluster_name" : "eds-cluster-elk",
  "cluster_uuid" : "z61nh_jpRoaGaYQpPFSJLg",
  "version" : {
    "number" : "6.4.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "04711c2",
    "build_date" : "2018-09-26T13:34:09.098244Z",
    "build_snapshot" : false,
    "lucene_version" : "7.4.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

I am not sure how to fetch the transaction log size. How do I find it?
Here are our cluster settings:

{
  "persistent": {
    "action": {
      "auto_create_index": "+.kibana*, +.monitoring*, +.watcher*, +.prod_eds_messages*, +.dev_eds_messages*"
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "50mb"
      }
    },
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {}
}
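One way to see the translog size per index is the index stats API with the `translog` metric (`GET /_stats/translog`). Below is a rough Python sketch that pulls `size_in_bytes` out of such a response. The base URL, the index names, and the byte counts in the sample are assumptions for illustration, not values from this cluster.

```python
import json
import urllib.request


def translog_bytes_by_index(stats):
    """Map index name -> translog size in bytes from a GET /_stats/translog body."""
    return {
        name: body["total"]["translog"]["size_in_bytes"]
        for name, body in stats.get("indices", {}).items()
    }


def fetch_translog_stats(base_url="http://localhost:9200"):
    """Fetch the stats from a node; base_url is an assumption, adjust as needed."""
    with urllib.request.urlopen(f"{base_url}/_stats/translog") as resp:
        return json.load(resp)


# Hypothetical response fragment, trimmed to the fields used above.
example_stats = {
    "indices": {
        "logs-a": {"total": {"translog": {"size_in_bytes": 4218880}}},
        "logs-b": {"total": {"translog": {"size_in_bytes": 55}}},
    }
}
translog_sizes = translog_bytes_by_index(example_stats)
print(translog_sizes)  # {'logs-a': 4218880, 'logs-b': 55}
```

Against a live node you would call `translog_bytes_by_index(fetch_translog_stats())` and sort by value to see which indices hold the most translog.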

jacobot · November 30, 2018, 4:23pm

Hi, we will soon reach the 85% disk usage threshold :(

GET /_cat/allocation?v&h=disk.indices,disk.avail,disk.used,disk.total,disk.percent

disk.indices disk.avail disk.used disk.total disk.percent
      15.8gb      8.1gb    41.8gb     49.9gb           83

15.8gb of index data, 49.9gb total disk,

41.8gb disk used. Is it going to flush soon (I hope)?

dadoonet · November 30, 2018, 4:42pm

Maybe you can try to run a flush and see if it helps? https://www.elastic.co/guide/en/elasticsearch/reference/6.5/indices-flush.html

Christian_Dahlqvist · November 30, 2018, 4:50pm

In recent versions the translog is kept around for a while, as it allows recovery based on sequence numbers, which can speed up recovery significantly. You can tune this if you want to.

jacobot · November 30, 2018, 4:51pm

I did

POST _all/_flush?wait_if_ongoing=true&force=true

response

{
  "_shards": {
    "total": 388,
    "successful": 208,
    "failed": 0
  }
}

nothing changed

disk.indices disk.avail disk.used disk.total disk.percent
      16.3gb      7.9gb      42gb     49.9gb           84

jacobot · November 30, 2018, 5:01pm

We do have a high insert rate (bulk insert).
Sorry, I don't find the docs particularly clear:

index.translog.retention.size

The total size of translog files to keep. Keeping more translog files increases the chance of performing an operation based sync when recovering replicas. If the translog files are not sufficient, replica recovery will fall back to a file based sync. Defaults to 512mb

index.translog.retention.age

The maximum duration for which translog files will be kept. Defaults to 12h .

Must I change these params to keep the translog disk footprint to a certain size?
We do _bulk API inserts of 500-1500/sec.

Note: it's not critical data at the moment, it's still an MVP, so it's no problem to change params and experiment.
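For a sense of scale, here is a back-of-the-envelope calculation in Python. It assumes the retention size applies to each shard separately (each shard has its own translog) and that the age limit does not trim files first, so it is a ceiling rather than a prediction. The shard count of 208 is the number of successful shards from the flush response earlier in the thread.

```python
def retained_translog_upper_bound_gb(shard_count, retention_size_mb):
    """Rough ceiling on retained translog if every shard fills its allowance."""
    return shard_count * retention_size_mb / 1024


# Default retention size (512mb) across 208 shards.
print(retained_translog_upper_bound_gb(208, 512))  # 104.0
# A smaller, hypothetical retention size of 100mb.
print(retained_translog_upper_bound_gb(208, 100))  # 20.3125
```

With the default, the ceiling is well above a 49.9gb disk, so on a node with many actively written shards the translog can plausibly outweigh the segment data.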

jacobot · November 30, 2018, 5:56pm

It is flushing and keeping it below 85%,
nice.

disk.indices disk.avail disk.used disk.total disk.percent
      16.9gb      8.6gb    41.3gb     49.9gb           82

Thanks for your help, it seems Elasticsearch will clean up as needed.

dadoonet · November 30, 2018, 6:14pm

If your index is not "stable" because you are doing a lot of indexing, then many things can happen, for example segment merges.

This means that at some point in time you might have 2 segments, each containing, let's say, 100k docs.
Elasticsearch will create a new segment containing 200k docs and then, once done, will remove the 2 old segments. During this operation you might therefore use more disk space than needed.

My 2 cents.
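The merge example can be written as a tiny Python model. It is only a sketch: it assumes the merged segment is as large as its inputs combined (no deleted docs are dropped), and the 100 MB segment sizes are hypothetical.

```python
def merge_disk_usage(segment_sizes_mb):
    """Return (before, peak, after) disk usage in MB when merging all segments.

    The merged segment is written in full before the old segments are removed,
    so the peak is the old segments plus the new one.
    """
    before = sum(segment_sizes_mb)
    merged = before  # assumption: nothing is reclaimed by the merge
    peak = before + merged
    after = merged
    return before, peak, after


# Two hypothetical 100 MB segments merged into one.
print(merge_disk_usage([100, 100]))  # (200, 400, 200)
```

In this model the footprint briefly doubles and then drops back, which is a transient effect and separate from the translog retention discussed in this thread.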

DavidTurner · December 1, 2018, 8:19am

Yes, the disk.indices statistic only counts the store, i.e. segment files, and not the translog. Here is a cluster containing only one shard with ~1MB of segment data and ~4MB of translog:

$ curl 'http://localhost:9200/_cat/allocation?v&h=disk.indices'
disk.indices
    1015.7kb
$ du -k elasticsearch-6.5.1/data-0
...
4120  elasticsearch-6.5.1/data-0/nodes/0/indices/Pw0fY116RE2EuySrE1zLyQ/0/translog
1064  elasticsearch-6.5.1/data-0/nodes/0/indices/Pw0fY116RE2EuySrE1zLyQ/0/index
...
5220  elasticsearch-6.5.1/data-0
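The same comparison can be scripted across all shards. This Python sketch walks the `indices/<uuid>/<shard>/` layout shown in the `du` output and totals the `index` and `translog` directories separately. It sums apparent file sizes, so the numbers may differ slightly from `du`, and the demo directory and file names are made up.

```python
import os
import tempfile


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def store_vs_translog(indices_dir):
    """Sum 'index' (segment) and 'translog' bytes across <uuid>/<shard>/ dirs."""
    totals = {"index": 0, "translog": 0}
    for uuid in os.listdir(indices_dir):
        index_path = os.path.join(indices_dir, uuid)
        if not os.path.isdir(index_path):
            continue
        for shard in os.listdir(index_path):
            for kind in totals:
                sub = os.path.join(index_path, shard, kind)
                if os.path.isdir(sub):
                    totals[kind] += dir_size(sub)
    return totals


# Demo on a throwaway directory that mimics <uuid>/<shard>/{index,translog}.
with tempfile.TemporaryDirectory() as tmp:
    for kind, size in (("index", 1024), ("translog", 4096)):
        shard_dir = os.path.join(tmp, "hypothetical-uuid", "0", kind)
        os.makedirs(shard_dir)
        with open(os.path.join(shard_dir, "data.bin"), "wb") as fh:
            fh.write(b"\0" * size)
    demo_totals = store_vs_translog(tmp)

print(demo_totals)  # {'index': 1024, 'translog': 4096}
```

On a real node you would point `store_vs_translog` at the `nodes/0/indices` directory under your data path and run it as a user that can read those files.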

jacobot · December 4, 2018, 3:55pm

Thank you @dadoonet and @DavidTurner

Is there a way to look at the internals of how this works? Would that be the unallocated segments you speak of?

@DavidTurner, in your example file listing, is the Pw0fY116RE2EuySrE1zLyQ part of the path the UUID of your index?
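As far as I know, since 5.x the directory names under `nodes/0/indices` are index UUIDs, and `GET /_cat/indices?h=uuid,index` lists them next to the index names. Here is a small Python sketch that turns that output into a lookup table. The index name `my-index` in the sample is hypothetical.

```python
def uuid_to_index(cat_indices_output):
    """Parse 'GET /_cat/indices?h=uuid,index' output into {uuid: index_name}."""
    mapping = {}
    for line in cat_indices_output.strip().splitlines():
        uuid, name = line.split()
        mapping[uuid] = name
    return mapping


# Sample output without the 'v' flag, so there is no header row.
cat_indices_sample = """\
Pw0fY116RE2EuySrE1zLyQ my-index
"""
uuid_lookup = uuid_to_index(cat_indices_sample)
print(uuid_lookup["Pw0fY116RE2EuySrE1zLyQ"])  # my-index
```

With that table you can match each directory from the `du` listing to the index it belongs to.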

jacobot · December 11, 2018, 4:18pm

The problem is occurring again.

With only 10gb of index data (total disk size is 50G) it stopped inserting into indices (it hit the 85% threshold).

How can I correct this situation? I cannot imagine that 10gb of index data (single node) will eventually need a 50gb disk (mounted on /usr/share/elasticsearch/data).

Thank you

jacobot · December 11, 2018, 5:43pm

Feedback:
I tried this

PUT /_all/_settings
{
   "index":{
      "translog.retention.age":"30m",
      "translog.retention.size":"100mb"
   }
}

and this was the final result

disk.indices disk.avail disk.used disk.total disk.percent
      10.7gb     37.3gb    12.6gb     49.9gb           25

All the numbers seem to add up now. disk.indices is now very close to disk.used, which is excellent.
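For anyone scripting the same change, here is a minimal Python sketch that builds the settings request used above with the standard library. The base URL and the function name are assumptions, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def translog_retention_request(age, size, base_url="http://localhost:9200", target="_all"):
    """Build a PUT <target>/_settings request for the translog retention settings."""
    body = {"index": {"translog.retention.age": age, "translog.retention.size": size}}
    return urllib.request.Request(
        url=f"{base_url}/{target}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = translog_retention_request("30m", "100mb")
print(req.get_method(), req.full_url)  # PUT http://localhost:9200/_all/_settings
# To actually apply it: urllib.request.urlopen(req)
```

Two caveats: `_all` only touches indices that exist at the time of the call, so newly created indices would need the same settings through an index template, and, per the documentation quoted earlier, keeping less translog makes it more likely that replica recovery falls back to a file based sync.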

system · January 8, 2019, 5:44pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
