Simple 3-node cluster crashing frequently


(Maykell Frometa) #1

Hi,
I have a simple cluster of 3 nodes: 1 master-only node and 2 data nodes, all running Windows Server 2008 R2 Standard with 8 GB of RAM. The problem I keep running into is that when either data node fails because of memory consumption, the whole cluster fails.
I have followed all the guidelines and articles about how to set up an Elasticsearch cluster, but the problems remain.
Basically I have 2 problems.
1- The data nodes consume all the available server memory and the OS crashes.
- Here I configured Windows to stop virtual memory swapping and set ES_HEAP_SIZE, ES_MAX_MEM, and ES_MIN_MEM to half of the total physical server memory, as indicated in the documentation.

2- When any of the data nodes fails, the whole cluster fails.

  • The nodes respond to browser requests on :9200, but any request to index or read data fails.
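
The heap setting described in problem 1 is simple arithmetic, sketched below in Python for reference. The function names and the 31 GB cap are assumptions made for illustration and are not taken from this thread; the cap reflects the commonly cited advice to keep the JVM heap below roughly 32 GB.

```python
def recommended_heap_mb(total_ram_mb: int, cap_mb: int = 31 * 1024) -> int:
    """Return half of physical RAM in MB, limited to cap_mb.

    cap_mb is an assumed upper bound (hypothetical default), not a value
    from the thread.
    """
    return min(total_ram_mb // 2, cap_mb)


def heap_env_value(total_ram_mb: int) -> str:
    """Format the result the way a heap size variable is usually written, e.g. '4096m'."""
    return f"{recommended_heap_mb(total_ram_mb)}m"


if __name__ == "__main__":
    # The servers in this thread have 8 GB of RAM (8192 MB).
    print("ES_HEAP_SIZE =", heap_env_value(8192))
```

For the 8 GB servers described here, this gives a 4 GB heap and leaves the other half of memory to the operating system and its file cache.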

My final goal is a configuration that keeps the cluster up and responding even when one of the data nodes crashes for whatever reason. I need to be able to isolate a data node so that whatever problem or crash it has does not affect the entire cluster.


(Christian Dahlqvist) #2

What is in the Elasticsearch logs when this happens? Do all nodes crash or is it just the data nodes? What is the mix of indexing and querying? Is a single data node able to handle the full query and indexing workload?
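
One way to gather part of this information is the cluster health API (`GET /_cluster/health`), which reports the overall status and how many nodes and data nodes the cluster currently sees. The Python sketch below is only an illustration: the helper names and the base URL are hypothetical, and it assumes the cluster is reachable over plain HTTP without authentication.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the cluster health response from one node."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(health: dict) -> str:
    """Build a one-line summary from a parsed _cluster/health response."""
    status = health.get("status", "unknown")
    nodes = health.get("number_of_nodes", 0)
    data_nodes = health.get("number_of_data_nodes", 0)
    unassigned = health.get("unassigned_shards", 0)
    return (
        f"status={status} nodes={nodes} "
        f"data_nodes={data_nodes} unassigned_shards={unassigned}"
    )


def all_primaries_assigned(health: dict) -> bool:
    """Green or yellow means every primary shard is assigned; red means some are not."""
    return health.get("status") in ("green", "yellow")


if __name__ == "__main__":
    # Hypothetical address; replace with one of the cluster's nodes.
    print(summarize_health(fetch_health("http://localhost:9200")))
```

Running it against a surviving node right after a data node goes down would show whether the status turns red, which would mean that some primary shards have no copy on the remaining node.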


(Maykell F) #3

Hi, thank you for the quick response.
1- What is in the Elasticsearch logs when this happens?
Please review the attached logs.
2- Just the data nodes, and it affects one node at a time.
3- Many more queries than indexing. I don't know how to give a better
measurement, but as a rule there is much more querying than indexing.
4- Yes, a single data node is able to handle the full query and indexing
workload.

Note:
I have attached the elasticsearch.yml of the 3 nodes in the cluster, along with
the log files of Node1 from the time of the last crash.

I'm advocating for Elasticsearch in my organization, but at the
moment I'm stuck on the issues I mentioned in my previous mail.
Thank you in advance,
Maykell


(Maykell Frometa) #4

Hi,
Have you had the opportunity to review my request?
In the mail above I gave you the info you requested. Do you need more info?

Thanks,
Maykell


(Christian Dahlqvist) #5

I have not received any data or logs, and I generally prefer that the discussion be kept in the open whenever possible instead of over private messages, so that it can benefit other community members. It sounds to me like you are seeing a cascading failure. How did you verify that a single node can handle the full indexing and query load? Which version of Elasticsearch are you using? What does your config look like?


(Maykell Frometa) #6

Hi,
I responded to the email I received with your previous answer and included the config files and logs as attachments. I don't see any option on this page for attaching files other than images to a post; please let me know how to do that.
Anyway, I'll try changing the file extensions and including those files in this post as images; they are actually .zip files.

Regarding your questions:
1- > How did you verify that a single node can handle the full indexing and query load?

I'm sure of that because I stopped one of the data nodes several days ago and the remaining one has been handling the whole workload. The problem is still the same: every few days Elasticsearch consumes all available server memory and the server crashes.
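
To see how memory use develops over those few days, the JVM section of the node stats API (`GET /_nodes/stats/jvm`) can be sampled periodically. The Python sketch below works on an already parsed response; the function names and the 85 percent threshold are assumptions chosen for illustration. It covers JVM heap only, so memory used outside the heap would have to be observed with operating system tools.

```python
def heap_usage_by_node(stats: dict) -> dict:
    """Map node name to heap_used_percent from a parsed _nodes/stats/jvm response."""
    usage = {}
    for node in stats.get("nodes", {}).values():
        name = node.get("name", "unknown")
        usage[name] = node["jvm"]["mem"]["heap_used_percent"]
    return usage


def nodes_over_threshold(stats: dict, threshold_percent: int = 85) -> list:
    """Return the sorted names of nodes whose heap use is at or above the threshold.

    threshold_percent is a hypothetical alert level, not a recommended value.
    """
    usage = heap_usage_by_node(stats)
    return sorted(name for name, pct in usage.items() if pct >= threshold_percent)


if __name__ == "__main__":
    # Hand-written sample in the shape of the API response, not real cluster data.
    sample = {
        "nodes": {
            "id-1": {"name": "Node1", "jvm": {"mem": {"heap_used_percent": 91}}},
            "id-2": {"name": "Node2", "jvm": {"mem": {"heap_used_percent": 40}}},
        }
    }
    print(heap_usage_by_node(sample))
    print(nodes_over_threshold(sample))
```

Logging these values every few minutes, together with the process memory reported by Windows, would show whether the growth is inside the heap or outside it.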

2- > Which version of Elasticsearch are you using? What does your config look like?

Elasticsearch 2.4.1.
Regarding the config files, how can I upload files as attachments to a post on this page?

Thanks in advance


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.