Storage footprint not the same as cumulative index size, why is that?


#1

Hi,

I am issuing this query:

GET /_cat/allocation?v&h=disk.indices,disk.avail,disk.used,disk.total,disk.percent

response

disk.indices disk.avail disk.used disk.total disk.percent
      10.2gb     15.2gb    34.6gb     49.9gb           69

So indices is 10.2gb and disk used is 34.6gb? Where is the difference coming from?
Looking at the data directory, it is indeed 35 GB:

du -h -d 5
44K	./elasticsearch/nodes/0/_state
184K	./elasticsearch/nodes/0/indices/dBvqitmZRt-Fmuc0m27ngQ
695M	./elasticsearch/nodes/0/indices/LT-pbZUmQUWTVj_eZ6ueoQ
211M	./elasticsearch/nodes/0/indices/UrkzhOUPT2mGq5CIxjE2jw
2.2M	./elasticsearch/nodes/0/indices/aNoeXSc1R5-Yv5mi5kAEVw
2.0M	./elasticsearch/nodes/0/indices/FTyJMumRRQ6KlDMkJPpg0g
1.4M	./elasticsearch/nodes/0/indices/FaHslxLmRPmt_epBjRl4oA
84K	./elasticsearch/nodes/0/indices/sZrzFZaUQCq2lntAfOVG1A
6.9M	./elasticsearch/nodes/0/indices/OMgjPF0kSdacpMlDFbhV9A
214M	./elasticsearch/nodes/0/indices/l9nus3Q_SKu5G6lFHsmshw
9.8M	./elasticsearch/nodes/0/indices/bh-yRTBQSOWejk2JaPGGQw
1.9G	./elasticsearch/nodes/0/indices/mxEqic8MTw2W1sDbDsWwYQ
2.0G	./elasticsearch/nodes/0/indices/aw9QKXJbRMinedvKUHeAmw
2.2M	./elasticsearch/nodes/0/indices/L7pYWKHjQVaO8R21hi_AmA
328M	./elasticsearch/nodes/0/indices/NClre8MZSomW4-Hxab3aow
13M	./elasticsearch/nodes/0/indices/7mc_Q_ziTvelFrAX-d_umg
3.0M	./elasticsearch/nodes/0/indices/PN5v9z7dSoip7uMXwSglqg
812K	./elasticsearch/nodes/0/indices/cwesDBMVTOOIYD50690Rhw
186M	./elasticsearch/nodes/0/indices/r1G0zHTJRTaEUMSrjcwj8Q
1.9G	./elasticsearch/nodes/0/indices/yk0y91gNQZOdhMKftSVa1g
1.9G	./elasticsearch/nodes/0/indices/wfu1bw9oR3W3iDZS8x_6OQ
1.9G	./elasticsearch/nodes/0/indices/x_qsGMdqQZi8Es_KKvWuCA
1.9G	./elasticsearch/nodes/0/indices/-CI6VqaYR8KMaHyesiDrdA
2.2M	./elasticsearch/nodes/0/indices/MwUf5GHcSxmmNwRr80szLw
5.6M	./elasticsearch/nodes/0/indices/9efquwiUTOu0Uyba2m3wIQ
1.9G	./elasticsearch/nodes/0/indices/jZdwR2BuSfWK_wR0k_Jx8w
1.9G	./elasticsearch/nodes/0/indices/K5aLBsDjRPGWEuK3AdWgVA
1.9G	./elasticsearch/nodes/0/indices/rqzC5P0KQqeVlPOiaMRJ-g
226M	./elasticsearch/nodes/0/indices/EgLHTHW0S6i6eJT1HOJi7w
1.9G	./elasticsearch/nodes/0/indices/oNRlGK97S7iewBzr2o0dyA
1.9G	./elasticsearch/nodes/0/indices/LBWkHnAeTvaBSSOldBO8Tg
156M	./elasticsearch/nodes/0/indices/MXiwXUudSY2yY5cm27mFcQ
5.9M	./elasticsearch/nodes/0/indices/YhcziY5OTLiLyl6pTAV0WA
310M	./elasticsearch/nodes/0/indices/SFbJg2oYRQCPGbdsdb2LZQ
303M	./elasticsearch/nodes/0/indices/4r3_DX-NTzaK-6IieDkEFA
1.9M	./elasticsearch/nodes/0/indices/8l4-SA5PR6-4MxGXWG7fgA
193M	./elasticsearch/nodes/0/indices/XmiRCtGKRuqIWtLx2J_sbA
25M	./elasticsearch/nodes/0/indices/z2R2msq_QaS6OGF9JBKU7w
1.9G	./elasticsearch/nodes/0/indices/2A6_bEZ6RKe16dkXT2Xa_Q
1.9G	./elasticsearch/nodes/0/indices/Fpj3EBzYRke83n--8Jz0ow
1.9G	./elasticsearch/nodes/0/indices/LsrLw3n2QoKTcxBwQ7A_Bg
285M	./elasticsearch/nodes/0/indices/zWlP4T0gQ5KvOc9Q0zQkkw
1.9G	./elasticsearch/nodes/0/indices/y15InfVUTEWNW_pj4EjW7w
1.9G	./elasticsearch/nodes/0/indices/1JLAYcPiRGO8zaE8v_1iIw
207M	./elasticsearch/nodes/0/indices/P4BK8a0ERwy3Rvn4vf2K4Q
1.9G	./elasticsearch/nodes/0/indices/9sV3vtiNRhKI05YC1VIwoQ
5.0M	./elasticsearch/nodes/0/indices/6uPdPVNeTBGbvdxp0Cc0XQ
35G	./elasticsearch/nodes/0/indices
35G	./elasticsearch/nodes/0
35G	./elasticsearch/nodes
35G	./elasticsearch
0	./logstash
35G	.
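
To put a number on the gap, here is a minimal sketch that parses the `_cat/allocation` row shown earlier and subtracts `disk.indices` from `disk.used`. The helper names (`parse_size`, `unexplained_gb`) are made up for illustration, and the sketch assumes the `_cat` size suffixes are binary units.

```python
import re

# Multipliers for the size suffixes printed by the _cat APIs (assumed binary).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a _cat-style size such as '10.2gb' into bytes."""
    match = re.fullmatch(r"([\d.]+)([a-z]+)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def unexplained_gb(cat_output):
    """Return disk.used minus disk.indices, in GB, for the first data row."""
    header, row = [line.split() for line in cat_output.strip().splitlines()[:2]]
    columns = dict(zip(header, row))
    gap = parse_size(columns["disk.used"]) - parse_size(columns["disk.indices"])
    return gap / _UNITS["gb"]


# The response copied from the query above.
sample = """\
disk.indices disk.avail disk.used disk.total disk.percent
      10.2gb     15.2gb    34.6gb     49.9gb           69
"""
print(round(unexplained_gb(sample), 1))  # 24.4
```

So roughly 24 GB on disk is not accounted for by `disk.indices`.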

Why is there a difference?

Note: we did drop a lot of indices recently. Are these somehow lingering files?

Cheers,


(David Pilato) #2

Can this be caused by the transaction log?
What is your version?


#3

Version info is below. How do I find out whether the translog is the issue, and will it be removed under disk space pressure?

GET /
{
  "name" : "43aC6EG",
  "cluster_name" : "eds-cluster-elk",
  "cluster_uuid" : "z61nh_jpRoaGaYQpPFSJLg",
  "version" : {
    "number" : "6.4.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "04711c2",
    "build_date" : "2018-09-26T13:34:09.098244Z",
    "build_snapshot" : false,
    "lucene_version" : "7.4.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

I am not sure how to fetch the transaction log size.
Here are our cluster settings:

{
  "persistent": {
    "action": {
      "auto_create_index": "+.kibana*, +.monitoring*, +.watcher*, +.prod_eds_messages*, +.dev_eds_messages*"
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "50mb"
      }
    },
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {}
}

#4

Hi, we will soon reach the 85% disk usage threshold :(

GET /_cat/allocation?v&h=disk.indices,disk.avail,disk.used,disk.total,disk.percent

disk.indices disk.avail disk.used disk.total disk.percent
      15.8gb      8.1gb    41.8gb     49.9gb           83

15.8gb of data on a 49.9gb disk, but 41.8gb of disk used. Is it going to flush soon (I hope)?


(David Pilato) #5

Maybe you can try to run a flush and see if it helps? https://www.elastic.co/guide/en/elasticsearch/reference/6.5/indices-flush.html
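
One way to check whether a flush makes a difference is to compare translog sizes before and after using the index stats API (`GET /_stats/translog`). The sketch below only summarises an already-fetched response. The sample dict is made up, and the field names (`size_in_bytes`, `uncommitted_size_in_bytes`) are assumed to match the 6.x response shape, so treat it as a starting point rather than a reference.

```python
def translog_sizes(stats):
    """Return [(index_name, size_bytes, uncommitted_bytes)], largest first.

    `stats` is the parsed JSON body of GET /_stats/translog.
    """
    rows = []
    for name, entry in stats.get("indices", {}).items():
        translog = entry["total"]["translog"]
        rows.append((
            name,
            translog["size_in_bytes"],
            # Fall back to 0 if this field is not reported.
            translog.get("uncommitted_size_in_bytes", 0),
        ))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up numbers and index names, only to show the input and output shape.
sample_stats = {
    "indices": {
        "logs-b": {"total": {"translog": {"size_in_bytes": 300000000,
                                          "uncommitted_size_in_bytes": 0}}},
        "logs-a": {"total": {"translog": {"size_in_bytes": 1500000000,
                                          "uncommitted_size_in_bytes": 200000}}},
    }
}

for name, size, uncommitted in translog_sizes(sample_stats):
    print(f"{name}: translog={size} bytes, uncommitted={uncommitted} bytes")
```

If the total translog size stays large while the uncommitted size drops after a flush, the remaining files are probably being retained on purpose rather than waiting for a flush.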


(Christian Dahlqvist) #6

In recent versions the translog is kept around for a while as it allows recovery based on sequence numbers which can speed up recovery significantly. You can tune this if you want to.
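
As a rough sketch of what such tuning could look like, the snippet below prepares (but does not send) an update to the `index.translog.retention.size` and `index.translog.retention.age` index settings. The URL, the index name `my-index`, and the values `128mb` and `1h` are purely illustrative assumptions. Lower limits free disk space sooner but make it more likely that replica recovery falls back to a file-based sync.

```python
import json
import urllib.request


def retention_settings(size, age):
    """Build an index settings body that caps how much translog is retained."""
    return {
        "index.translog.retention.size": size,
        "index.translog.retention.age": age,
    }


def build_request(base_url, index, size, age):
    """Prepare (but do not send) PUT /<index>/_settings."""
    body = json.dumps(retention_settings(size, age)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values: adjust the URL, index name, and limits to your cluster.
request = build_request("http://localhost:9200", "my-index", "128mb", "1h")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the body before touching a live cluster.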


#7

I did

POST _all/_flush?wait_if_ongoing=true&force=true

Response:

{
  "_shards": {
    "total": 388,
    "successful": 208,
    "failed": 0
  }
}

Nothing changed:

disk.indices disk.avail disk.used disk.total disk.percent
      16.3gb      7.9gb      42gb     49.9gb           84

#8

We do have a high insert rate (bulk inserts).
Sorry, I don't find the docs particularly clear:

index.translog.retention.size

The total size of translog files to keep. Keeping more translog files increases the chance of performing an operation based sync when recovering replicas. If the translog files are not sufficient, replica recovery will fall back to a file based sync. Defaults to 512mb

index.translog.retention.age

The maximum duration for which translog files will be kept. Defaults to 12h .

Must I change these params to keep the translog disk footprint to a certain size?
We do _bulk API inserts of 500-1500/sec.

Note: it's not critical data at the moment, it's still an MVP, so it's no problem to change params and experiment.


#9

It is flushing and keeping it below 85%.
Nice!

disk.indices disk.avail disk.used disk.total disk.percent
      16.9gb      8.6gb    41.3gb     49.9gb           82

Thanks for your help, it seems Elasticsearch will clean up as needed.


(David Pilato) #10

If your index is not "stable" because you are doing a lot of indexing, then many things can happen, for example segment merges.

This means that at some point in time you might have 2 segments, each containing, let's say, 100k docs.
Elasticsearch will create a new segment containing 200k docs and then, once done, will remove the 2 old segments. During this operation you might therefore use more disk space than needed.
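
The arithmetic behind that can be sketched as follows. This is a simplified model, not how Elasticsearch itself accounts for space: it assumes the merged segment is as large as its inputs combined, whereas in practice it is often smaller because deleted documents are dropped. The function name and the 1 GB segment sizes are hypothetical.

```python
def peak_bytes_during_merge(segment_sizes):
    """Rough upper bound on disk used while merging the given segments."""
    # Assumption: the merged segment is as large as all inputs together.
    merged = sum(segment_sizes)
    # The old segments and the new one coexist until the merge completes.
    return sum(segment_sizes) + merged


# Two hypothetical 1 GB segments.
one_gb = 1024**3
print(peak_bytes_during_merge([one_gb, one_gb]) / one_gb)  # 4.0
```
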

My 2 cents.


(David Turner) #11

Yes, the disk.indices statistic only counts the store, i.e. segment files, and not the translog. Here is a cluster containing only one shard with ~1MB of segment data and ~4MB of translog:

$ curl 'http://localhost:9200/_cat/allocation?v&h=disk.indices'
disk.indices
    1015.7kb
$ du -k elasticsearch-6.5.1/data-0
...
4120  elasticsearch-6.5.1/data-0/nodes/0/indices/Pw0fY116RE2EuySrE1zLyQ/0/translog
1064  elasticsearch-6.5.1/data-0/nodes/0/indices/Pw0fY116RE2EuySrE1zLyQ/0/index
...
5220  elasticsearch-6.5.1/data-0
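
To get the same breakdown across a whole data directory, something like the sketch below could be used. It assumes the `nodes/<n>/indices/<index uuid>/<shard>/<kind>` layout visible in the listing above and sums apparent file sizes, so the totals will not match `du` exactly (which counts allocated blocks). The function name and the throwaway directory tree are illustrative only.

```python
import os
import tempfile
from collections import Counter


def usage_by_kind(data_path):
    """Sum file sizes per shard subdirectory name ('index', 'translog', ...)."""
    totals = Counter()
    for root, _dirs, files in os.walk(data_path):
        parts = os.path.relpath(root, data_path).split(os.sep)
        if "indices" not in parts:
            continue
        # Path components after 'indices': <index uuid>/<shard>/<kind>/...
        after = parts[parts.index("indices") + 1:]
        if len(after) < 3:
            continue
        kind = after[2]
        for name in files:
            totals[kind] += os.path.getsize(os.path.join(root, name))
    return totals


# Demo on a throwaway tree that mimics the layout (sizes are made up).
with tempfile.TemporaryDirectory() as tmp:
    shard = os.path.join(tmp, "nodes", "0", "indices", "example-uuid", "0")
    for kind, size in (("index", 1000), ("translog", 4000)):
        os.makedirs(os.path.join(shard, kind))
        with open(os.path.join(shard, kind, "file.bin"), "wb") as handle:
            handle.write(b"\0" * size)
    print(dict(usage_by_kind(tmp)))
```

Pointing `usage_by_kind` at the real data path would show how much of the footprint sits in `translog` directories versus `index` directories.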

#12

Thank you @dadoonet and @DavidTurner

Is there a way to look at the internals of how this works? Would that be the unallocated segments you speak of?

@DavidTurner, in your example file listing, is the Pw0fY116RE2EuySrE1zLyQ part of the path the UUID of your index?