You are overloading the system. My recommendations for standard hardware:
- use smaller bulk requests. 100k docs = 400mb is far too much; around
1000 docs = 4mb per bulk is fine.
- measure the response time. It should be around 10-100 milliseconds.
As the index grows, the response time will increase, but only slowly,
to not much more than a second.
- delays in the response are due to the indexing load on the cluster
- run bulk requests in parallel, but throttle them at a fixed
concurrency limit, say 30. When you hit the limit, wait for a bulk
response before continuing (see the sketch after this list).
- an advanced option is to adapt the concurrency to the available CPU
power, but YMMV
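
For illustration, a minimal sketch of such a throttled bulk loop,
assuming the Java TransportClient API of that era; the class name, the
type name "doc", and the limit of 30 are example choices, and the
index name "s2" is just taken from your description:

import java.util.List;
import java.util.concurrent.Semaphore;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;

public class ThrottledBulkIndexer {

    // at most 30 bulk requests in flight at any time
    private final Semaphore slots = new Semaphore(30);
    private final Client client;

    public ThrottledBulkIndexer(Client client) {
        this.client = client;
    }

    // submits one bulk of ~1000 JSON docs; blocks while 30 bulks
    // are already in flight
    public void submit(List<String> jsonDocs) throws InterruptedException {
        slots.acquire(); // wait for a free slot before sending
        BulkRequestBuilder bulk = client.prepareBulk();
        for (String doc : jsonDocs) {
            bulk.add(client.prepareIndex("s2", "doc").setSource(doc));
        }
        bulk.execute(new ActionListener<BulkResponse>() {
            public void onResponse(BulkResponse response) {
                slots.release(); // free the slot when the response arrives
                if (response.hasFailures()) {
                    // log and retry the failed items here
                }
            }
            public void onFailure(Throwable t) {
                slots.release();
                // log the error here
            }
        });
    }
}

The semaphore gives you backpressure: the producing thread simply
blocks until the cluster has answered one of the 30 outstanding bulks.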
With this approach, and just 1g of heap, I can index millions and
millions of docs with the standard(!) settings, running constantly for
hours and hours. You can add more heap, but it will not change much
with regard to performance.
Split brain is only a side effect. What you see are response timeouts
from nodes because you overloaded them. The master waits at most 5s,
and a node must respond within that time.
Jörg
On 07.03.2013 12:05, Vahid wrote:
Thank you both Clinton and Jörg for your replies.
Sorry for the bad explanation. I'll try again:
All the applications are developed in Java. ES just stores the data;
the processing is done outside ES.
Let's say we have a manager application named BDM, a source index s1,
and a destination index s2.
1- On each node, BDM sends a query to ES and loads 100k records from
the s1 index (with 16 nodes, that means a total of 1.6m records loaded
across all the nodes).
2- The BDMs process the fetched data and send bulk inserts to the s2
index on ES (100k bulk size).
3- The BDMs send bulk updates (100k) to the s1 index on ES.
The memory assigned to BDM is 5g, and it has no problem loading and
processing the data.
The data size for 100k documents is about 400mb. 16g of memory is
assigned to ES on each node.
The number of records in s1 is 1.1 billion. At startup s2 was empty,
the speed was OK for us, and the memory assigned to ES was 11g.
After processing 600m records and storing them in s2, we got an OOM
error, and I increased the memory to 16g.
Now, after running the jobs for, let's say, 5 hours, a split brain
occurred. In the ES logs on some nodes I saw GC warnings, after which
the node dropped out of the cluster and I had to restart it.
In BigDesk I see that the cache size is about 100mb, and while loading
and indexing the data the old-generation heap is about 70% of the
whole assigned memory.
So I don't know why ES uses such a huge amount of memory that it
causes split brain and also performance problems.
Thanks again for your help,
Vahid
On Wednesday, March 6, 2013 8:46:32 PM UTC+1, Jörg Prante wrote:
I don't have insight into your application, how it works, or how much
memory it uses. But you should be aware that building an index of
about 1.1 billion documents while running an extra application on the
same machines will use significantly more memory than just inserting
documents without extra processing. I also don't know anything about
your method of processing: whether you process documents in batches,
whether it happens inside or outside ES, or whether it is Java or not.
All of this may add to resource consumption in many ways.
My personal impression - without knowledge of the nature of your data
processing - is that by reorganizing memory consumption you should be
able to streamline the processing. Take care of your algorithm and
reduce the memory footprint. Do not rely only on raising the heap size
parameter; it does not change the nature of your task much. For
example, reducing the result set size and scrolling over small
portions of a result set may help to reduce the memory footprint (see
the sketch below). If your algorithm is really bad, the heap could be
as large as your machine allows and it would still fail. If you ask
me, 7g of heap is so huge that it should be sufficient for even the
most demanding stream processing applications.
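
As an example, a minimal scan/scroll sketch, assuming the 0.90-era
Java client API; the index name s1 is from your description, while the
page size of 500 and the 60-second scroll timeout are arbitrary
example values:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

// open a scan/scroll cursor over s1 instead of fetching 100k docs at once
SearchResponse scroll = client.prepareSearch("s1")
        .setSearchType(SearchType.SCAN)
        .setScroll(new TimeValue(60000))
        .setQuery(QueryBuilders.matchAllQuery())
        .setSize(500) // hits per shard per round trip
        .execute().actionGet();

while (true) {
    scroll = client.prepareSearchScroll(scroll.getScrollId())
            .setScroll(new TimeValue(60000))
            .execute().actionGet();
    if (scroll.getHits().getHits().length == 0) {
        break; // cursor exhausted
    }
    for (SearchHit hit : scroll.getHits()) {
        // process one document at a time, then let it go out of
        // scope, so the heap stays flat regardless of index size
    }
}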
Jörg
On 06.03.13 16:40, Vahid wrote:
> Hi all,
>
> We have a cluster with 16 nodes.
> Total memory is 32g for each server. On each node, ES and 3 other
> Java processes are running. The memory assigned to ES is 12g (xms
> and xmx are equal), 7g is set for the other Java processes, and 9g
> is left for the OS.
> There are 2 big indexes. The application reads from one, processes
> the data, and stores it into the other index. The source index has
> about 1.1 billion records, and the final data in the destination
> index will also be almost the same (1.1 billion).
> So we have reads and writes at the same time. At first, when the
> destination index was empty, the application speed was acceptable
> (read, process and write),
> but after reaching 600 million documents in the destination index,
> ES crashed with an OOM error. Now I have increased the ES memory to
> 16g, but the speed has dropped almost by half.
> Each node simultaneously loads 100k records, processes them, and
> stores them.
> Each document is about .5k in size after ES compression.
> ES configuration:
>
> bootstrap.mlockall: true
> index.number_of_shards: 51
> index.number_of_replicas: 1
> index.store.compress.stored: true
> index.routing.allocation.total_shards_per_node: 7
> index.merge.policy.use_compound_file: false
> index.compound_format: false
> indices.memory.index_buffer_size: 20%
> index.translog.flush_threshold_period: 60s
> index.merge.policy.merge_factor: 30
> index.store.throttle.type: merge
> index.store.throttle.max_bytes_per_sec: 5mb
> index.cache.filter.max_size: 10
> index.cache.filter.expire: 1m
> index.cache.field.type: resident
> index.cache.field.expire: 1m
>
> So I want to know whether this is an ES limitation, meaning that
> after reaching such an amount of data we need to create a new index,
> or whether we have missed something.
>