JDBC River - Elasticsearch crashes while indexing


(Jesse Adler) #1

Hi,

the following issue seems to be a jdbc river problem, not an elasticsearch
problem.

I am trying to get 150k records from my mysql database and push them to my
index through the jdbc river plugin. Unfortunately elasticsearch
periodically crashes.
I am not quite sure if this is a matter of load (mising ram, missing cpu)
or if it is some sort of misconfiguration. Could somebody please have a
look at my configuration and the error log?

ES crashes at 2013-10-22 09:09:36,388
When I decrease bulk size to 80, same thing happens

Thank you for your help
jesse

elasticsearch 0.90.5
jdbc river: 2.2.2

server: 1GB RAM

elasticsearch config:
cluster.name: mycluster
node.name: "myhost.myhost.net"
path.logs: /var/log/elasticsearch
network.bind_host: 127.0.0.1

every other setting is default

jdbc config:

{
"type" : "jdbc",
"jdbc" : {
"strategy" : "simple",
"driver" : "com.mysql.jdbc.Driver",
"url" : "jdbc:mysql://localhost:3306/table",
"user" : "user",
"password" : "password",
"digesting" : false,
"poll" : "86400s",
"sql" : "select _id, source_timestamp, field1, field2, field3 from
river_jdbc where field3=0"
},
"index" : {
"index" : "myindex",
"type" : "object",
"versioning" : true,
"acknowledge" : true,
"bulk_size" : 100
}
}

elasticsearch log

[2013-10-22 09:09:35,370][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [248] of [100 items], 30 outstanding bulk requests
[2013-10-22 09:09:35,879][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[219] success [100 items] [4256ms]
[2013-10-22 09:09:35,954][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [249] of [34 items], 30 outstanding bulk requests
[2013-10-22 09:09:36,008][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[220] success [100 items] [4360ms]
[2013-10-22 09:09:36,011][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [250] of [100 items], 30 outstanding bulk requests
[2013-10-22 09:09:36,107][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[221] success [100 items] [4439ms]
[2013-10-22 09:09:36,107][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [251] of [100 items], 30 outstanding bulk requests
[2013-10-22 09:09:36,344][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[222] success [100 items] [4656ms]
[2013-10-22 09:09:36,350][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [252] of [100 items], 30 outstanding bulk requests
[2013-10-22 09:09:36,386][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[223] success [100 items] [4673ms]
[2013-10-22 09:09:36,388][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [253] of [100 items], 30 outstanding bulk requests
[2013-10-22 09:10:33,208][INFO ][node ] [host.host.net]
version[0.90.5], pid[14280], build[c8714e8/2013-09-17T12:50:20Z]
[2013-10-22 09:10:33,209][INFO ][node ] [host.host.net]
initializing ...
[2013-10-22 09:10:33,220][INFO ][plugins ] [host.host.net]
loaded [jdbc-river], sites []
[2013-10-22 09:10:35,780][INFO ][node ] [host.host.net]
initialized
[2013-10-22 09:10:35,781][INFO ][node ] [host.host.net]
starting ...
[2013-10-22 09:10:35,885][INFO ][transport ] [host.host.net]
bound_address {inet[/127.0.0.1:9300]}, publish_address
{inet[/89.238.100.100:9300]}
[2013-10-22 09:10:38,920][INFO ][cluster.service ] [host.host.net]
new_master
[host.host.net][nV8Q5elYQ5qzqwcssHHFXA][inet[/89.238.100.100:9300]],
reason: zen-disco-join (elected_as_master)
[2013-10-22 09:10:38,945][WARN ][cluster.service ] [host.host.net]
failed to connect to node
[[host.host.net][nV8Q5elYQ5qzqwcssHHFXA][inet[/89.238.100.100:9300]]]
org.elasticsearch.transport.ConnectTransportException:
[host.host.net][inet[/89.238.100.100:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:675)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:576)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
at
org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:372)
at
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.net.ConnectException: Connection refused:
/89.238.100.100:9300
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
[2013-10-22 09:10:38,949][INFO ][discovery ] [host.host.net]
alconsulting/nV8Q5elYQ5qzqwcssHHFXA
[2013-10-22 09:10:38,964][INFO ][http ] [host.host.net]
bound_address {inet[/127.0.0.1:9200]}, publish_address
{inet[/89.238.100.100:9200]}
[2013-10-22 09:10:38,964][INFO ][node ] [host.host.net]
started
[2013-10-22 09:10:39,109][DEBUG][action.get ] [host.host.net]
[_river][0]: failed to execute [[_river][river_jdbc][_meta]: routing [null]]
org.elasticsearch.indices.IndexMissingException: [_river] missing
at
org.elasticsearch.indices.InternalIndicesService.indexServiceSafe(InternalIndicesService.java:334)
at
org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:96)
at
org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:42)
at
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction$1.run(TransportShardSingleOperationAction.java:161)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
[2013-10-22 09:10:44,861][INFO ][gateway ] [host.host.net]
recovered [2] indices into cluster_state
[2013-10-22 09:10:47,845][INFO
][org.xbib.elasticsearch.river.jdbc.JDBCRiver] [host.host.net]
[jdbc][river_jdbc] starting JDBC river: URL
[jdbc:mysql://localhost:3306/provisionsfrei], driver
[com.mysql.jdbc.Driver], strategy [simple], index [is24]/[object]
[2013-10-22 09:11:03,322][DEBUG][action.search.type ] [host.host.net]
All shards failed for phase: [query]
^[[B[2013-10-22 09:11:09,960][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [1] of [100 items], 1 outstanding bulk requests
[2013-10-22 09:11:10,330][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [2] of [100 items], 2 outstanding bulk requests
[2013-10-22 09:11:10,450][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [3] of [36 items], 3 outstanding bulk requests
[2013-10-22 09:11:10,852][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [4] of [100 items], 4 outstanding bulk requests
[2013-10-22 09:11:10,896][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [5] of [100 items], 5 outstanding bulk requests
[2013-10-22 09:11:11,065][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [6] of [100 items], 6 outstanding bulk requests
[2013-10-22 09:11:11,112][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [7] of [100 items], 7 outstanding bulk requests
[2013-10-22 09:11:11,185][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [8] of [100 items], 8 outstanding bulk requests
[2013-10-22 09:11:11,276][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new
bulk [9] of [100 items], 9 outstanding bulk requests
[

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

Something is restarting your node between 2013-10-22 09:09:36,388 and
2013-10-22
09:10:33,208 - or is something missing from the logs?

Was there heavy load on the cluster?

The cluster triggered restart of the river looks a little bit flaky. But
I'm impressed how the bulk indexing resumes at 2013-10-22 09:11:09,960 :slight_smile:

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jesse Adler) #3

Hi Jörg,

thank you for your very quick reply.

Something is restarting your node between 2013-10-22 09:09:36,388 and 2013-10-22

09:10:33,208 - or is something missing from the logs?

There is nothing missing. this is the seamless logfile.

Was there heavy load on the cluster?

Well, if you are talking about load in terms of elasticsearch i/o I'd say
no. The website is not under heavy user load.
If you are talking about the server itself, I have to say absolutely!
While the river is bulk indexing, CPU is 300% and RAM is being completely
consumed.

I lowered bulk size to 70, but after completion I get the following log
and the river won't restart after configured time. The "failed to reconnect
to node" error appears every 1 minute until I restart the elasticsearch
server.

[2013-10-24 11:50:20,298][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverFlow] next
run, waiting 1d
[2013-10-24 11:50:20,327][INFO
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk
[12078] success [13 items] [2668ms]
[2013-10-24 11:50:35,883][WARN ][cluster.service ]
[host.host.net] failed to reconnect to node
[host.host.net][FWgPevjsTZSlXUBMjTrt7A][inet[/89.238.100.100:9300]]
org.elasticsearch.transport.ConnectTransportException:
[host.host.net][inet[/89.238.100.100:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:675)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:576)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
at
org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:475)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.net.ConnectException: Connection refused:
/89.238.100.100:9300
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
[2013-10-24 11:51:35,914][WARN ][cluster.service ]
[host.host.net] failed to reconnect to node
[host.host.net][FWgPevjsTZSlXUBMjTrt7A][inet[/89.238.100.100:9300]]
org.elasticsearch.transport.ConnectTransportException:
[host.host.net][inet[/89.238.100.100:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:675)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:576)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
at
org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:475)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.net.ConnectException: Connection refused:
/89.238.100.100:9300
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more

The cluster triggered restart of the river looks a little bit flaky. But
I'm impressed how the bulk indexing resumes at 2013-10-22 09:11:09,960 :slight_smile:

me too :slight_smile:
Unfortunately the river seems to die after completion and this is the
major problem. I need to river to run every 24 hours.

Jesse

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(system) #4