Does anybody know why segments memory (column 'sm' in the _cat API) increases after searches?
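For example, these are the stats before the searches:

| n | hp | hm | fm | fe | fcm | sm | so | sqc |
|---|---|---|---|---|---|---|---|---|
| Thunderstrike | 58 | 86gb | 0b | 26575 | 1.7gb | 29.9gb | 0 | 0 |
| Burstarr | 57 | 86gb | 492.4mb | 20461 | 1.4gb | 24.9gb | 0 | 0 |
| Microchip | 2 | 16gb | 0b | 0 | 0b | 0b | 0 | 0 |

And after ~30 wide searches:

| n | hp | hm | fm | fe | fcm | sm | so | sqc |
|---|---|---|---|---|---|---|---|---|
| Thunderstrike | 72 | 86gb | 5.1gb | 139468 | 1.7gb | 44.9gb | 0 | 0 |
| Burstarr | 78 | 86gb | 2.8gb | 78926 | 1.4gb | 37.4gb | 0 | 0 |
| Microchip | 10 | 16gb | 0b | 0 | 0b | 0b | 0 | 0 |
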
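To put a number on the growth, here is a minimal Python sketch that converts the sm strings from the two stat listings above into bytes and prints the per-node difference. The helper names (`parse_size`, `sm_growth_gb`) are made up for illustration, and the sketch assumes the suffixes are binary units (1gb = 1024^3 bytes).

```python
import re

# Multipliers for the size suffixes seen in _cat output (assumed binary units).
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a _cat size string such as '29.9gb' into a number of bytes."""
    match = re.fullmatch(r"([\d.]+)(kb|mb|gb|tb|b)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def sm_growth_gb(before, after):
    """Return the sm increase per node, in gb."""
    return {
        node: (parse_size(after[node]) - parse_size(before[node])) / UNITS["gb"]
        for node in before
    }


# sm values copied from the stats before and after the searches.
before = {"Thunderstrike": "29.9gb", "Burstarr": "24.9gb", "Microchip": "0b"}
after = {"Thunderstrike": "44.9gb", "Burstarr": "37.4gb", "Microchip": "0b"}

growth = sm_growth_gb(before, after)
for node, gb in growth.items():
    print(f"{node}: +{gb:.1f}gb")
```

With the values above, the two data nodes gain roughly 15gb and 12.5gb of segments memory, while Microchip stays at zero.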
If I restart the cluster, sm goes back to the initial values.
Is this on the Lucene side?
P.S. Yes, I have a LOT of data (~10TB of indices).
Here is some additional information from a recent investigation.
elasticsearch-1.7.3-1.noarch
```
[localhost-"192.168.238.168" index]# service elasticsearch restart
Stopping elasticsearch: [ OK ]
Starting elasticsearch: [ OK ]
[localhost-"192.168.238.168" index]# curl '127.0.0.1:9200/_cat/nodes?h=sm&v'
sm
34mb
[localhost-"192.168.238.168" index]# curl localhost:9200/_search?q=hello
{result}
[localhost-"192.168.238.168" index]# curl '127.0.0.1:9200/_cat/nodes?h=sm&v'
sm
44.8mb
[localhost-"192.168.238.168" index]# service elasticsearch restart
Stopping elasticsearch: [ OK ]
Starting elasticsearch: [ OK ]
[localhost-"192.168.238.168" index]# curl '127.0.0.1:9200/_cat/nodes?h=sm&v'
sm
34mb
```
But!
```
[localhost-"192.168.238.168" index]# service elasticsearch restart
Stopping elasticsearch: [ OK ]
Starting elasticsearch: [ OK ]
[localhost-"192.168.238.168" index]# curl '127.0.0.1:9200/_cat/nodes?h=sm&v'
sm
34mb
[localhost-"192.168.238.168" index]# curl localhost:9200/_search?q=h*
{result}
[localhost-"192.168.238.168" index]# curl '127.0.0.1:9200/_cat/nodes?h=sm&v'
sm
34mb
```
So sm increases after a full-term search (q=hello) but not after a wildcard search (q=h*).
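The before/after check from the transcripts can be scripted. This is only a sketch: it assumes a node reachable at 127.0.0.1:9200 as in the curl commands, and `BASE`, `http_get`, `read_sm` and `sm_around_query` are illustrative names, not part of any Elasticsearch client. It uses only the two endpoints already shown (`_cat/nodes?h=sm&v` and `_search?q=...`).

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "http://127.0.0.1:9200"  # assumption: same local node as in the transcript


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def read_sm(fetch=http_get):
    """Return the sm column of _cat/nodes, one string per node."""
    lines = fetch(BASE + "/_cat/nodes?h=sm&v").splitlines()
    # The first line is the 'sm' header produced by the v flag.
    return [line.strip() for line in lines[1:] if line.strip()]


def sm_around_query(query, fetch=http_get):
    """Read sm, run one URI search, then read sm again."""
    before = read_sm(fetch)
    fetch(BASE + "/_search?q=" + quote(query))
    after = read_sm(fetch)
    return before, after
```

Calling `sm_around_query("hello")` and `sm_around_query("h*")` would repeat the two experiments. The node has to be restarted between runs, as in the transcripts, otherwise the second measurement starts from the already increased value.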
I looked into the Lucene code a bit and found that sm is made up of FieldsProducer, DocValuesProducer, StoredFieldsReader and TermVectorsReader. Which one is the culprit? TermVectorsReader?
Is it important? Very important (at least for me), because this is my typical situation:
1) After ES starts, everything is OK: GC runs rarely and frees a lot of memory.
```
[2015-12-13 14:00:09,446][INFO ][monitor.jvm ] [Sluk] [gc][young][37410][451] duration [863ms], collections [1]/[1s], total [863ms]/[1.1m], memory [76.8gb]->[53.5gb]/[86gb], all_pools {[young] [26.2gb]->[32mb]/[0b]}{[survivor] [352mb]->[3.2gb]/[0b]}{[old] [50.2gb]->[50.2gb]/[86gb]}
```
2) I run 100 searches or more.
3) GC runs every minute and frees almost nothing.
```
[2015-12-15 14:02:20,000][WARN ][monitor.jvm ] [War V] [gc][old][15560][682] duration [25.9s], collections [1]/[30.2s], total [25.9s]/[5.1h], memory [80.2gb]->[77.8gb]/[86gb], all_pools {[young] [2gb]->[128mb]/[0b]}{[survivor] [192mb]->[0b]/[0b]}{[old] [78gb]->[77.7gb]/[86gb]}
```
4) ES becomes unresponsive.
But the index size is the same!
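The difference between the two GC log lines is easier to see as numbers. Below is a rough Python sketch that pulls the `memory [before]->[after]/[max]` part out of a monitor.jvm line and reports how much heap the collection freed. The function names are illustrative, and the pattern is only checked against the two lines quoted in this post; it tolerates a line wrap inside the brackets.

```python
import re

# Conversion factors to gb for the suffixes seen in the log (assumed binary units).
TO_GB = {"b": 1 / 1024**3, "kb": 1 / 1024**2, "mb": 1 / 1024, "gb": 1.0}

# Matches "memory [76.8gb]->[53.5gb]/[86gb]", allowing whitespace from line wraps.
HEAP = re.compile(r"memory \[([^\]]+)\]->\[([^\]]+)\]\s*/\s*\[([^\]]+)\]")


def to_gb(text):
    """Convert a size such as '352mb' or '76.8gb' into gb."""
    match = re.fullmatch(r"([\d.]+)(kb|mb|gb|b)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * TO_GB[match.group(2)]


def heap_reclaimed(line):
    """Return (gb freed by the collection, fraction of max heap still used)."""
    match = HEAP.search(line)
    if match is None:
        raise ValueError("no heap summary found in line")
    before, after, limit = (to_gb(value) for value in match.groups())
    return before - after, after / limit


# Heap summaries copied from the two log lines in this post.
for summary in ("memory [76.8gb]->[53.5gb]/[86gb]",
                "memory [80.2gb]->[77.8gb]/[86gb]"):
    freed, used = heap_reclaimed(summary)
    print(f"{freed:.1f}gb freed, heap {used:.0%} full after GC")
```

For the first line the young collection frees about 23.3gb; for the second, a 25.9s old-generation collection frees only about 2.4gb and leaves the heap roughly 90% full.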
Sorry! Yeah, norms are a thing. They live in memory in some versions, but there was an effort to move them to disk like doc values. That work is done, but I don't remember which version it landed in. Probably 2.0.