River puts too much load on only one node

I'm using Elasticsearch 1.5, which uses rivers to ingest data. I have 3 nodes in my cluster. The river that ingests a large amount of data puts too much load on one specific node. I want to divert this utilization to another node. Is there any way to balance this load, or to redirect it entirely onto another node in my cluster?

This is one of the reasons why rivers were deprecated and largely replaced by Logstash and other external processes. You should really look to upgrade as 1.5 is very, very old.

Yes, you are right. But for now I don't have the option to upgrade the whole cluster and set up a new one. Can you advise any other approach?

By the way, thanks for the quick reply.

I have not touched rivers in years, so I am not sure, but I do not think there is any way to balance the load, as it is always a single node that does the processing at any given time.

Can you predict the behaviour if I simply set `node.data: false` in the config file of my desired node?
Will this solve my problem, or might it ruin my cluster, or cause some other issue?
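For reference, the setting I mean would go in `elasticsearch.yml` like this (just a sketch of what I'm considering):

```yaml
# elasticsearch.yml on the node I want to relieve (sketch only).
# With node.data set to false, the node stops holding any shards.
node.data: false
```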

I have no idea. Which river are you using? If it is the JDBC river I believe there is a stand-alone version that you could run outside the cluster in order to get better balance. You might also consider Logstash.

Yeah, I'm using the JDBC river.

Have a look at the jdbc input plugin of Logstash, then.
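As a rough sketch, a pipeline using the jdbc input could look like the following. The driver path, connection string, credentials, and query are placeholders you would adapt to your database:

```
# Sketch of a Logstash pipeline: poll a SQL table and index it.
# All connection details below are hypothetical placeholders.
input {
  jdbc {
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    schedule => "* * * * *"   # run the query every minute
    statement => "SELECT * FROM products WHERE updated_at > :sql_last_value"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "products"
  }
}
```

Because this runs as a process outside the cluster, the ingest work no longer lands on any one Elasticsearch node.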


I've used it earlier. It skips some of my records, giving an unresolved error. In any case, this is not the solution to my problem. I need to stay with rivers and just need to balance or totally divert the load.

Not being able to distribute load is an inherent problem with rivers, so as long as you continue using it I do not think you can solve your problem. I believe later versions of the JDBC river allowed execution as a separate process, and this could allow you to move the processing off the ES nodes. Logstash is another option, and I am not aware of any issues like the one you describe so it may be a matter of incorrect configuration.

It's the issue with Logstash I posted when I was using it a couple of months ago, and it went unsolved. However, I can't move back to Logstash now.

What is the use case, in the end?
You want to be able to use Elasticsearch in your application, which is using a SQL data store?

I shared most of my thoughts there: http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/

Basically, I'd recommend modifying the application layer if possible and sending data to elasticsearch in the same "transaction" as you are sending your data to the database.
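As a minimal sketch of that pattern (the `save_product` function, the model fields, and the client setup here are hypothetical, using the elasticsearch-py client):

```python
# Sketch of indexing in the same "transaction" as the SQL write.
# The product model, session, and field names are hypothetical.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

def save_product(db_session, product):
    # Persist to the SQL store first.
    db_session.add(product)
    db_session.commit()
    # Then index the same record into Elasticsearch, so search
    # stays in sync without a separate batch/river process.
    es.index(index="products", id=product.id, body={
        "name": product.name,
        "price": product.price,
    })
```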

That issue is due to the default mapping; you need to provide an index template that maps that field as float/double.
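Such a template could look like this sketch (the index pattern and the `price` field are placeholders for whichever field was mis-mapped; the syntax shown is for a recent 7.x stack):

```
PUT _template/my_app_template
{
  "index_patterns": ["products-*"],
  "mappings": {
    "properties": {
      "price": { "type": "double" }
    }
  }
}
```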

BTW I suspect you were using an old version of the stack, right?

Logstash is super stable nowadays. Try version 7.0.1.
