Production es help

Phani_Nadiminti · November 14, 2016, 5:49am

Hi All,
I have a 4-node prod cluster, all Red Hat based servers, and all nodes are master- and data-eligible. Each machine has 24 GB of memory; I have given 12 GB to the heap and left the other 12 GB for the server and Lucene, as recommended by Elasticsearch. Now I am seeing the heap fill up quickly, and Java does not seem able to run GC either.
Our search load is somewhat heavy, but I would expect GC to recover the heap space, and that is not happening.

Now my questions are (using ES 1.5.2):

Do we need to set any parameters to recover heap space quickly?
Can I increase the ES heap size to 15 GB and leave the remaining 9 GB to the server?
Please suggest your recommendations on this.

Thanks ....

phani

warkolm · November 14, 2016, 6:24am

The first thing you should do is upgrade.

Phani_Nadiminti · November 14, 2016, 9:17am

Thank you Mark, I can upgrade, but do you have any clue why my ES heap usage is increasing heavily even though we are not performing any operations? Please advise.

I am seeing this in hot_threads on one of my nodes:

:: [node3][8BZti1iGTf68soGQR78BWw][prod-nosql03][inet[/172.16.100.28:9300]]{master=true}
Hot threads at 2016-11-14T09:23:08.979Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

1.7% (8.5ms out of 500ms) cpu usage by thread 'RMI TCP Connection(39)-172.16.100.6'
 10/10 snapshots sharing following 16 elements
   java.net.SocketInputStream.socketRead0(Native Method)
   java.net.SocketInputStream.read(SocketInputStream.java:152)
   java.net.SocketInputStream.read(SocketInputStream.java:122)
   java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   java.io.BufferedInputStream.read(BufferedInputStream.java:254)
   java.io.FilterInputStream.read(FilterInputStream.java:83)
   sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:549)
   sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
   sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619)
   sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684)
   sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681)
   java.security.AccessController.doPrivileged(Native Method)
   sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:681)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   java.lang.Thread.run(Thread.java:745)

Thanks
phani

Phani_Nadiminti · November 15, 2016, 5:38pm

Hi @warkolm,

Can I increase the node heap size to 16 GB? I have 24 GB of RAM on my machine and have allocated 12 GB to the heap, but heap consumption is still up to 99% and the garbage collector does not seem to be running. May I know the reason, please?
I left half of the RAM to the machine, but it is still not reclaiming the old gen. Can I increase the heap to 16 GB on just one of my nodes (node 4), or do I need to change it on all four nodes in my prod cluster?

Please advise.

thanks,
phani
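
A rough sketch, not from the original posters, of how the "heap at 99% and old gen not recovering" symptom could be checked per node. It assumes the node stats API (`/_nodes/stats/jvm`) returns `jvm.mem.heap_used_percent` and `jvm.gc.collectors.old.collection_count` for each node, which should be checked against the installed version. The URL `http://localhost:9200` and both function names are placeholders.

```python
import json
from urllib.request import urlopen


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes. base_url is a placeholder for one of the nodes."""
    with urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.load(resp)


def heap_report(stats):
    """Return (node name, heap used %, old-gen collection count), highest heap first.

    Assumes the response layout described above; adjust the keys if the
    installed version reports them differently.
    """
    rows = []
    for node in stats["nodes"].values():
        jvm = node["jvm"]
        old_gc_count = jvm["gc"]["collectors"]["old"]["collection_count"]
        rows.append((node["name"], jvm["mem"]["heap_used_percent"], old_gc_count))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, heap_pct, old_gc_count in heap_report(fetch_jvm_stats()):
        print(f"{name}: heap {heap_pct}% used, old-gen collections so far: {old_gc_count}")
```

Running it twice a few minutes apart shows whether the old-gen collection count is rising while heap usage stays high (collections run but free little) or is not rising at all.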

Christian_Dahlqvist · November 15, 2016, 6:21pm

How much data do you have in the cluster? How many indices/shards? What kind of load, querying and indexing, is the cluster under?

Phani_Nadiminti · November 16, 2016, 6:40am

Hi @Christian_Dahlqvist

Thank you. I have 1.5 billion docs on the prod cluster, about 180 GB in total. There are 52 indices: they are typically about 4 GB each, some are 7 to 8 GB, and the rest are smaller. We only query ES to fetch some data and do not run heavy aggregations. For indexing, we push about 1 million documents per day into various indices. All nodes are master-eligible data nodes.

Thanks
phani.
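
For a sense of scale, here is a small back-of-the-envelope sketch of the shard count these numbers imply. It assumes every one of the 52 indices uses 5 primary shards and 1 replica (the figures given elsewhere in this thread) and that shards are spread evenly over the 4 nodes. The function name is made up for illustration.

```python
def shards_per_node(indices, primaries_per_index, replicas, nodes):
    """Return (total shards, average shards per node) for a uniform layout."""
    total = indices * primaries_per_index * (1 + replicas)
    return total, total / nodes


total, per_node = shards_per_node(indices=52, primaries_per_index=5, replicas=1, nodes=4)
print(total, per_node)  # 520 130.0
```

Each shard carries some fixed heap overhead, so the per-node count is worth keeping in mind alongside the data volume.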

Phani_Nadiminti · November 16, 2016, 7:02am

Hi @Christian_Dahlqvist

Thank you . I have 1.5 billion docs on prod cluster it is of size 180GB . indices are 52 indices and size of each index is 4GB and some of them are 7 to 8 GB and remaining all are smaller indices. just we are querying es to get some data not performing heavy aggregation operations . Indexing we are pushing daily 1 million documents to it to various indices and i have nodes with all master eligible and all data nodes.

Thanks
phani.

Christian_Dahlqvist · November 16, 2016, 7:04am

That sounds reasonable. Do you have any non-default node configuration that could drive heap usage?

Phani_Nadiminti · November 16, 2016, 7:08am

I have 5 shards and 1 replica, and I set the heap based on the ES recommendation of half of the machine's RAM, i.e. 12 GB each. I ran swapoff and did not enable bootstrap.mlockall: true because we had already disabled swapping. I integrated JMX and defined the following property for the cache size:

indices.fielddata.cache.size: 5gb

I was also thinking of adding the following properties. Would that be advisable?

indices.breaker.fielddata.limit: 60%
indices.breaker.request.limit: 40%
indices.breaker.total.limit: 70%

Apart from these properties, everything is at the default, and I have enabled CORS for my search apps.
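
To make the percentages above concrete, here is a minimal sketch that converts them to absolute sizes for a 12 GB heap. It treats the heap as exactly 12 GB (the JVM usually reports slightly less), and the function name is made up for illustration.

```python
def breaker_limits_gb(heap_gb, percents):
    """Convert circuit-breaker limits given as a percentage of heap into GB."""
    return {name: round(heap_gb * pct / 100, 2) for name, pct in percents.items()}


limits = breaker_limits_gb(12, {"fielddata": 60, "request": 40, "total": 70})
print(limits)  # {'fielddata': 7.2, 'request': 4.8, 'total': 8.4}

# The fielddata cache size (5gb here) should stay below the fielddata breaker limit.
fielddata_cache_gb = 5
print(fielddata_cache_gb < limits["fielddata"])  # True
```

With these values the 5 GB fielddata cache sits below the 7.2 GB fielddata breaker, so the two settings are at least consistent with each other. As far as I recall, 60%, 40%, and 70% are also the defaults in the 1.x line, in which case setting them explicitly would change nothing.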
