Hi Mark,
We restarted our data upload process (with replicas set to 0 and the batch
size set to 5k). It ran for about 24 hours; sometime after that we started
to see issues with the master no longer being present in the cluster.
The log on the master node has lines like the ones below. It appears there
may have been a network issue where some nodes became unpingable, the number
of reachable nodes fell below the minimum quorum for master election, and the
master effectively demoted itself. What's not clear is why a new master did
not get elected for several hours after this. The cluster cannot be queried.
What's the best way to recover from this situation? (One idea I had is
sketched after the log below.)
[2015-01-11 19:12:12,612][WARN ][discovery.zen ] [Earthquake]
not enough master nodes, current nodes:
{[Eliminator][8_j19XFyQcupAkRUYCjI6w][CH1SCH050103339][inet[/10.47.44.70:9300]],[Dragoness][5wL9nCAESDOs3jLMBzvunQ][CH1SCH060022834][inet[/10.47.50.155:9300]],[Ludi][7zEEioZIRMigz25356K7Yg][CH1SCH050051103][inet[/10.46.196.234:9300]],[The
Angel][BbFDPq87TMiVgMeNWeJIFg][CH1SCH060040431][inet[/10.47.33.247:9300]],[Helio][5UekTGE2Qq6YcyDb-daxmw][CH1SCH050103338][inet[/10.46.238.181:9300]],[Stacy
X][_sCSZZVtTGmoowmg-1ggVQ][CH1SCH050051544][inet[/10.46.178.74:9300]],[Thunderfist][wKHKYQ9qT9GPLdt6tCQ3Rg][CH1SCH060033137][inet[/10.47.58.209:9300]],[Dominic
Petros][E85WKXuLRDC4_WXzoyaTnQ][CH1SCH060021537][inet[/10.46.234.92:9300]],[Umbo][YFysv6iRSLWSJBax5ObMHg][CH1SCH060021735][inet[/10.46.243.90:9300]],[Earthquake][-8C9YSAbQMy4THZIgV7qUw][CH1SCH060051026][inet[/10.46.153.118:9300]],[Atum][0DWVWBrZR2OFIEQTq9QBBw][CH1SCH050051545][inet[/10.47.50.232:9300]],[Baron
Von
Blitzschlag][mADN2KUYS2uPiuSl4qMHIw][CH1SCH060040630][inet[/10.47.42.139:9300]],[Foolkiller][Iw9PyArIR2OCDaCm1gW-Vw][CH1SCH060040629][inet[/10.46.149.174:9300]],[The
Grip][38K3nW5wR7aUCkR6i8BCiQ][CH1SCH050051645][inet[/10.46.203.126:9300]],[Shamrock][aObajuGRShKIA9TXF5_yQg][CH1SCH050071331][inet[/10.46.194.122:9300]],[Orphan][dCC94_8VSHqP7_hFf3r3Qg][CH1SCH060051535][inet[/10.47.46.65:9300]],[Tombstone][dq1NNI8IREuPig2qaT27Xw][CH1SCH050102843][inet[/10.47.54.208:9300]],[Logan][C5bigQQ1QomGpExc9NQq-w][CH1SCH060040333][inet[/10.46.164.129:9300]],[Rama-Tut][gLnUtgR6Q12r4oJKUiKGZQ][CH1SCH050051642][inet[/10.46.216.169:9300]],[Grotesk][Dc61zzt9QRC6mYDeswKugw][CH1SCH050051640][inet[/10.47.9.152:9300]],[Machine
Teen][O39QiBsaQ6iq1kJvOw_QDA][CH1SCH060021439][inet[/10.46.242.110:9300]],[Edwin
Jarvis][31cJWsbnRDuvWNJ8X5ybJQ][CH1SCH050063532][inet[/10.46.190.247:9300]],[Harry
Osborn][wBaDgNdJSkmW5vCAfODQXA][CH1SCH060021440][inet[/10.47.20.183:9300]],[Brain
Drain][Oy4Mo_2HSO6Y_fX3C11YBw][CH1SCH060040433][inet[/10.47.0.136:9300]],[Romany
Wisdom][20Tc5x2ATdC3o6AdD7f67w][CH1SCH060031345][inet[/10.47.3.169:9300]],[Jekyll][lS_ugcBORj-485okkVScPg][CH1SCH060033044][inet[/10.46.41.179:9300]],[Ajaxis][q4Shk4fhSWuYiZVq2oCnWw][CH1SCH060021243][inet[/10.46.221.243:9300]]{master=false},[Cheetah][GH_6-7OQQ7CH1YrObU6wVA][CH1SCH060051532][inet[/10.46.150.153:9300]],[Marduk
Kurios][Aqfc5EUIQYmvacnY6sGEpg][CH1SCH050051546][inet[/10.46.162.240:9300]],[Left
Hand][VloLsymETGmy9Yu14TkygA][CH1SCH060051521][inet[/10.47.63.194:9300]],[Washout][tHz2w8mGRmKO9W7vUljXrA][CH1SCH050063531][inet[/10.47.54.30:9300]],[Fixer][HH3gxjFoSxeHbvF4SMnVsw][CH1SCH050051643][inet[/10.46.173.163:9300]],[Tyrannus][g1znl166QaO84xJjsABsfQ][CH1SCH060033139][inet[/10.46.42.62:9300]],[Master
Pandemonium][p7ALngKoTrW9NaTKIyP1XA][CH1SCH060051421][inet[/10.47.23.193:9300]],[Digitek][z5SzLFtlTWCL0R5cJQWqUw][CH1SCH050051641][inet[/10.46.154.21:9300]],[Jane
Kincaid][UtjfYoxjQ2mqa40vX_eGhQ][CH1SCH060041325][inet[/10.47.2.144:9300]],[Baron
Samedi][a-gBYy2vTFKgdjwG6jWeSA][CH1SCH060033138][inet[/10.47.63.100:9300]],[Fan
Boy][-iehboPZTHud5kttbYlQyQ][CH1SCH060051438][inet[/10.46.153.84:9300]],[Unseen][PB4SZuwUTwaJtp43iMIArg][CH1SCH060021739][inet[/10.46.217.50:9300]],[Elektro][ZsVkXHPHRJ-VCmvtbb_PHA][CH1SCH050051542][inet[/10.46.179.1:9300]],[Vashti][K2BO32gNTru8xUPqETOL-g][CH1SCH050063239][inet[/10.46.213.4:9300]],}
[2015-01-11 19:12:12,612][INFO ][cluster.service ] [Earthquake]
removed
{[Blackheart][yVhuAFprS3SwXZoKsfbEPg][CH1SCH060040527][inet[/10.46.150.20:9300]],},
reason:
zen-disco-node_failed([Blackheart][yVhuAFprS3SwXZoKsfbEPg][CH1SCH060040527][inet[/10.46.150.20:9300]]),
reason failed to ping, tried [3] times, each with maximum [30s] timeout
[2015-01-11 19:12:12,633][ERROR][cluster.action.shard ] [Earthquake]
unexpected failure during [shard-failed ([agora_v1][23],
node[g1znl166QaO84xJjsABsfQ], relocating [F9SdmzilSHiuib4jihKstw], [P],
s[RELOCATING]), reason [Failed to perform [indices:data/write/bulk[s]] on
replica, message
[SendRequestTransportException[[Klaw][inet[/10.46.158.143:9300]][indices:data/write/bulk[s][r]]];
nested: NodeNotConnectedException[[Klaw][inet[/10.46.158.143:9300]] Node
not connected]; ]]]
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: no
longer master. source: [shard-failed ([agora_v1][23],
node[g1znl166QaO84xJjsABsfQ], relocating [F9SdmzilSHiuib4jihKstw], [P],
s[RELOCATING]), reason [Failed to perform [indices:data/write/bulk[s]] on
replica, message
[SendRequestTransportException[[Klaw][inet[/10.46.158.143:9300]][indices:data/write/bulk[s][r]]];
nested: NodeNotConnectedException[[Klaw][inet[/10.46.158.143:9300]] Node
not connected]; ]]]
at org.elasticsearch.cluster.ClusterStateUpdateTask.onNoLongerMaster(ClusterStateUpdateTask.java:53)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:324)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
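The idea mentioned above: if the quorum itself is the problem, pin it
explicitly on the master-eligible nodes. A minimal sketch, assuming all 98
nodes are master-eligible (quorum = 98/2 + 1 = 50; the value and hostname
would need adjusting to our actual topology):

  # elasticsearch.yml on each master-eligible node (50 is an assumed value)
  discovery.zen.minimum_master_nodes: 50

  # once a master is elected again, the same setting can also be applied dynamically
  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
  { "persistent": { "discovery.zen.minimum_master_nodes": 50 } }'

Does that sound like the right direction, or is a full cluster restart the
safer path here?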
On Friday, January 9, 2015 at 11:33:58 PM UTC+5:30, Darshat Shah wrote:
Hi Mark,
Thanks for the reply. I need to prototype and demonstrate at this scale
first to ensure feasibility. Once we've proven that ES works for this use
case, it's quite possible that we'd engage with support for production.
Regarding your questions:
*What version of ES, java? *
[Darshat] ES 1.4.1, JVM 1.8.0 (the latest I found from Oracle for Win64)
*What are you using to monitor your cluster? *
Not much, really. I tried installing Marvel after we ran into issues, but
mostly we have been looking at the cat APIs and the indices APIs.
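For what it's worth, the ad-hoc checks we run look roughly like this (the
hostname is just a placeholder for one of the nodes):

  curl 'http://localhost:9200/_cat/health?v'
  curl 'http://localhost:9200/_cat/nodes?v'
  curl 'http://localhost:9200/_cat/pending_tasks?v'
  curl 'http://localhost:9200/_cat/indices/agora_v1?v'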
*How many GB is that index? *
About 1 GB for every million entries. We don't have string fields, but we do
have many numeric fields on which aggregations are needed. With 4.5 billion
docs, the total index size was about 4.5 TB spread across the 98 nodes.
*Is it in one massive index? *
Yes, one hell of an index if it works! We need aggregations over this data
across multiple fields.
*How many GB in your data in total? *
It can be on the order of 30 TB.
*Why do you have 2 replicas? *
Data loss is generally considered not OK. However, for our next run, as you
suggest, we will start with 0 replicas and raise the replica count after the
bulk load is done.
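Concretely, the plan is to bump the replicas back up once the load finishes
with something like this (a sketch using the dynamic index settings API; the
host is a placeholder):

  curl -XPUT 'http://localhost:9200/agora_v1/_settings' -d '
  { "index": { "number_of_replicas": 2 } }'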
*Are you searching while indexing, or just indexing the data? If it's the
latter then you might want to try disabling replicas, setting the index
refresh rate to -1 for the index, inserting your data, and then turning
refresh back on and letting the data index. That's best practice for large
amounts of indexing. *
Just indexing, to get the historical data up there. We did set the refresh
interval to -1 before we began the upload.
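For reference, the refresh change was along these lines, and we plan to
restore the default once the load completes (a sketch; the host is a
placeholder):

  # before the bulk load: disable refresh
  curl -XPUT 'http://localhost:9200/agora_v1/_settings' -d '
  { "index": { "refresh_interval": "-1" } }'

  # after the bulk load: restore the default
  curl -XPUT 'http://localhost:9200/agora_v1/_settings' -d '
  { "index": { "refresh_interval": "1s" } }'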
*Also, consider dropping your bulk size down to 5K; that's generally
considered the upper limit for bulk API batches. *
We are going to attempt another upload with these changes. I also set the
index_buffer_size to 40% in case it helps.
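Assuming that refers to the node-level indices.memory.index_buffer_size
setting (which defaults to 10%), the change looks like this in
elasticsearch.yml on each node:

  indices.memory.index_buffer_size: 40%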
On Friday, January 9, 2015 at 3:29:48 PM UTC+5:30, Mark Walkom wrote:
Honestly, with this sort of scale you should be thinking about support
(disclaimer: I work for Elasticsearch support).
However, let's see what we can do:
What version of ES, java?
What are you using to monitor your cluster?
How many GB is that index?
Is it in one massive index?
How many GB in your data in total?
Why do you have 2 replicas?
Are you searching while indexing, or just indexing the data? If it's the
latter then you might want to try disabling replicas, setting the index
refresh rate to -1 for the index, inserting your data, and then turning
refresh back on and letting the data index. That's best practice for large
amounts of indexing.
Also, consider dropping your bulk size down to 5K; that's generally
considered the upper limit for bulk API batches.
On 9 January 2015 at 14:44, Darshat Shah dar...@gmail.com wrote:
Hi,
We have a 98-node ES cluster, with 32 GB RAM per node; 16 GB is reserved for
ES via the config file. The index has 98 shards with 2 replicas. On this
cluster we are loading a large number of documents (when done it will be
about 10 billion). About 40 million documents are generated per hour, and we
are pre-loading several days' worth of documents to prototype how ES will
scale and what its query performance will be.
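For context, the index was created with settings roughly like the following
(a sketch; mappings omitted and the host is a placeholder):

  curl -XPUT 'http://localhost:9200/agora_v1' -d '
  {
    "settings": {
      "number_of_shards": 98,
      "number_of_replicas": 2
    }
  }'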
Right now we are facing problems getting the data pre-loaded. Indexing is
turned off. We use the NEST client with a batch size of 10k. To speed up the
data load, we distribute the hourly data across the 98 nodes and insert in
parallel. This worked OK for a few hours, until we had 4.5B documents in the
cluster.
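The raw payload each worker sends to the bulk API looks roughly like this (a
sketch with hypothetical field names and only two documents instead of 10k):

  # requests.ndjson -- type and field names below are made up for illustration
  { "index": { "_type": "doc" } }
  { "metric_a": 42, "metric_b": 17, "timestamp": "2015-01-09T14:00:00Z" }
  { "index": { "_type": "doc" } }
  { "metric_a": 7, "metric_b": 99, "timestamp": "2015-01-09T14:00:01Z" }

  # --data-binary preserves the newlines the bulk API requires
  curl -XPOST 'http://localhost:9200/agora_v1/_bulk' --data-binary @requests.ndjson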
After that, the cluster state went red. The pending tasks cat API shows
errors like the ones below. CPU, disk, and memory all seem fine on the nodes.
Why are we getting these errors, and is there a best practice we should
follow? Any help is greatly appreciated, since this blocks prototyping ES for
our use case.
thanks
Darshat
Sample errors:
source : shard-failed ([agora_v1][24], node[00ihc1ToRiqMDJ1lou1Sig], [R], s[INITIALIZING]),
reason [Failed to start shard, message [RecoveryFailedException[[agora_v1][24]: Recovery failed from [Shingen Harada][RDAwqX9yRgud9f7YtZAJPg][CH1SCH060051438][inet[/10.46.153.84:9300]] into [Elfqueen][00ihc1ToRiqMDJ1lou1Sig][CH1SCH050053435][inet[/10.46.182.106:9300]]]; nested: RemoteTransportException[[Shingen Harada][inet[/10.46.153.84:9300]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[agora_v1][24] Phase[1] Execution failed]; nested: RecoverFilesRecoveryException[[agora_v1][24] Failed to transfer [0] files with total size of [0b]]; nested: NoSuchFileException[D:\app\ES.Elasticsearch_v010\elasticsearch-1.4.1\data\AP-elasticsearch\nodes\0\indices\agora_v1\24\index\segments_6r]; ]]
AND
source : shard-failed ([agora_v1][95], node[PUsHFCStRaecPA6MuvJV9g], [P], s[INITIALIZING]),
reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[agora_v1][95] failed to fetch index version after copying it over];
nested: CorruptIndexException[[agora_v1][95] Preexisting corrupted index [corrupted_1wegvS7BSKSbOYQkX9zJSw] caused by: CorruptIndexException[Read past EOF while reading segment infos]
EOFException[read past EOF: MMapIndexInput(path="D:\app\ES.Elasticsearch_v010\elasticsearch-1.4.1\data\AP-elasticsearch\nodes\0\indices\agora_v1\95\index\segments_11j")]
org.apache.lucene.index.CorruptIndexException: Read past EOF while reading segment infos
at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:127)
at org.elasticsearch.index.store.Store.access$400(Store.java:80)
at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:575)
---snip more stack trace-----