Question about ES cluster topologies and frequent OutOfMemoryError

Hello, all

I'm developing a logging system at my company. It uses Elasticsearch to
index and store all log data.
The system creates one index per day. Within a daily index, documents are
separated into types by project. There are more than 1,000 projects.
The size of an index depends on the incoming logs: sometimes it is almost
500 GB (just under 1 billion documents), sometimes it is 100 GB.
(Our goal is to handle 1 TB per day.)
The ES cluster has 15 nodes: 2 master nodes, 3 search load balancers,
and 10 data nodes. Each index has 20 shards and one replica.
We plan to start using routing soon, but have not applied it yet.
Each machine has 48 GB of memory, of which 32 GB is allocated to Elasticsearch.
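
For reference, the numbers above can be turned into a rough per-shard and per-node picture. This is only a back-of-the-envelope sketch: it assumes the quoted index size covers primary shards only and that shards are spread evenly over the data nodes, neither of which is stated above. The function name `shard_layout` is made up for illustration.

```python
def shard_layout(index_size_gb, primary_shards, replicas, data_nodes):
    """Rough layout of one daily index, assuming evenly balanced shards."""
    shard_size_gb = index_size_gb / primary_shards
    shard_copies = primary_shards * (1 + replicas)   # primaries plus replicas
    total_size_gb = index_size_gb * (1 + replicas)   # assumes size is primary-only
    return {
        "shard_size_gb": shard_size_gb,
        "shard_copies": shard_copies,
        "copies_per_node": shard_copies / data_nodes,
        "disk_per_node_gb": total_size_gb / data_nodes,
    }


# Small day, large day, and the 1 TB/day goal, with 20 shards, 1 replica, 10 data nodes.
for size_gb in (100, 500, 1000):
    print(size_gb, shard_layout(size_gb, 20, 1, 10))
```

Under these assumptions, a 500 GB day means 25 GB per shard and 4 shard copies per data node for that single index.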

I have two questions.
First, is the current topology appropriate for our system? We do not have
many concurrent users, but we want to support at least 100 concurrent search
queries.
(Query size depends on the user. Some queries can be big, but most will
not be.)

Second, even though I set the cache type to soft in the configuration file
as shown below, Elasticsearch still throws OutOfMemoryError frequently when
there is a big query or many concurrent requests.

index.cache.field.type: soft
index.cache.field.max_size: 10000
index.cache.field.expire: 5m
index.merge.policy.max_merged_segment: 200gb
indices.memory.index_buffer_size: 3gb

[2012-11-06 10:21:16,272][WARN ][index.cache.field.data.soft] [xseed027.kdev] [nelo2-log-2012-11-02] loading field [body] caused out of memory failure
java.lang.OutOfMemoryError: Java heap space
    at org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:68)
    at org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    at org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    at org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    at org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    at org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:130)
    at org.elasticsearch.index.field.data.strings.StringOrdValFieldDataComparator.setNextReader(StringOrdValFieldDataComparator.java:121)
    at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:95)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:149)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:487)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:400)
    at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:176)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:242)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:529)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:518)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:268)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Can you help me resolve this problem?
Thank you,
Jaeik Lee

--

Hi Jaeik,

First, I would try allocating half of the memory (24 GB) to the JVM and
leaving the other half to the operating system. Also, have you tried
increasing index.cache.field.max_size? Which index.store.type are you using
for index storage? And could you give us some details about the versions you
are using (JVM, Elasticsearch, OS)?
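
To make the trade-off concrete, here is a minimal sketch comparing the current split with the suggested one. It takes the 48 GB per machine and the 3gb index buffer from the original post; the helper name `memory_split` is hypothetical, and it assumes that everything outside the heap is available to the OS file cache, which ignores JVM overhead and other processes.

```python
def memory_split(ram_gb, heap_gb, index_buffer_gb):
    """Split machine memory into JVM heap and what is left for the OS."""
    return {
        "heap_gb": heap_gb,
        # Assumption: all non-heap memory is usable by the OS file cache.
        "os_cache_gb": ram_gb - heap_gb,
        # Heap left for searches once the indexing buffer is fully used.
        "heap_after_index_buffer_gb": heap_gb - index_buffer_gb,
    }


current = memory_split(ram_gb=48, heap_gb=32, index_buffer_gb=3)
suggested = memory_split(ram_gb=48, heap_gb=24, index_buffer_gb=3)
print("current:  ", current)
print("suggested:", suggested)
```

A smaller heap leaves less room for field data, so this only helps if the OOM is not caused by a single very large field load.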

Best,
Jak Akdemir

--

Hi Jak

I haven't tried increasing index.cache.field.max_size, and I haven't set
index.store.type.
I would guess that increasing index.cache.field.max_size makes an OOM more
likely. Is that right?
Our system information is as follows:
java version "1.7.0_05"
CentOS release 6.3 (Final) (Linux 2.6.32-279.2.1.el6.x86_64)
elasticsearch: 0.20.0.RC1

I'll be waiting for your response.
Thank you for your suggestion.

--

Hello,

The soft type of field data cache doesn't recognize the max_size and
expiration time properties; they are only valid for the resident cache. Do
you use sorting or faceting extensively? What do your queries look
like?
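
To give a feel for why sorting or faceting on a string field can fill the heap whatever the cache type, here is a very rough estimate of what loading one field into memory might cost. This is a sketch under stated assumptions, not how Elasticsearch actually accounts for memory: it assumes a single-valued field, one 4-byte ordinal per document, and a fixed per-term object overhead. The function name and the example numbers (unique terms, term length, overhead) are illustrative only.

```python
def field_data_estimate_gb(docs, unique_terms, avg_term_bytes,
                           ordinal_bytes=4, per_term_overhead_bytes=40):
    """Rough heap estimate for loading one single-valued string field."""
    ordinals = docs * ordinal_bytes  # one ordinal slot per document
    terms = unique_terms * (avg_term_bytes + per_term_overhead_bytes)
    return (ordinals + terms) / 1024 ** 3


# Illustrative input: 100 million documents of one daily index on a node.
estimate = field_data_estimate_gb(
    docs=100_000_000,
    unique_terms=5_000_000,   # assumed
    avg_term_bytes=20,        # assumed
)
print(f"~{estimate:.2f} GB for one field of one daily index")
```

An analyzed field such as body is multi-valued and has far more unique terms, so the real cost is likely higher than this estimate, and it repeats for every daily index that a query touches.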

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Elasticsearch
