Hi all,
I need some help here. I am running a load test against Elasticsearch before
using it in a production environment. I have three EC2 instances, configured
as follows, that together form an Elasticsearch cluster.
All three machines have the same hardware configuration:
32GB RAM
160GB SSD
8-core CPU
Machine 01
Elasticsearch server (16GB heap)
Elasticsearch Java client (generates a continuous load and reports it to ES;
4GB heap)
Machine 02
Elasticsearch server (16GB heap)
Elasticsearch Java client (generates a continuous load and reports it to ES;
4GB heap)
Machine 03
Elasticsearch server (16GB heap)
Elasticsearch Java client (queries ES continuously; 1GB heap)
Note that the two indexing clients together generate around 20K records per
second, reporting them as bulk requests with an average size of 25 documents.
The third client issues only one query per second. My documents have the
following format (a simplified sketch of the indexing loop follows the
document below).
{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "7334236299916134105",
  "_score": 3.6111107,
  "_source": {
    "long_1": 96186289301793,
    "long_2": 7334236299916134000,
    "string_1": "random_string",
    "long_3": 96186289301793,
    "string_2": "random_string",
    "string_3": "random_string",
    "string_4": "random_string",
    "string_5": "random_string",
    "long_4": 5457314198948537000
  }
}
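For reference, here is a simplified sketch of what the two indexing clients
do, assuming the 1.x Java TransportClient API (the cluster name is
illustrative, the field values are randomized in the real test, and the real
clients are more involved):

    import java.util.HashMap;
    import java.util.Map;

    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.settings.ImmutableSettings;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;

    public class IndexingLoadClient {
        public static void main(String[] args) {
            // Cluster name is a placeholder; the address is one of the nodes.
            Client client = new TransportClient(ImmutableSettings.settingsBuilder()
                    .put("cluster.name", "my_cluster").build())
                    .addTransportAddress(
                            new InetSocketTransportAddress("10.167.199.140", 9300));

            while (true) {
                // One bulk of ~25 index requests, matching the average bulk
                // size mentioned above.
                BulkRequestBuilder bulk = client.prepareBulk();
                for (int i = 0; i < 25; i++) {
                    Map<String, Object> doc = new HashMap<String, Object>();
                    doc.put("long_1", 96186289301793L);   // randomized in the real test
                    doc.put("string_1", "random_string"); // likewise for the other fields
                    bulk.add(client.prepareIndex("my_index", "my_type").setSource(doc));
                }
                BulkResponse response = bulk.execute().actionGet();
                if (response.hasFailures()) {
                    System.err.println(response.buildFailureMessage());
                }
            }
        }
    }

The query client simply runs one search per second against the same index,
roughly client.prepareSearch("my_index").execute().actionGet().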
The problem is that after a few minutes, Elasticsearch reports errors like
these in its logs:
[2015-02-24 08:03:58,070][ERROR][marvel.agent.exporter ] [Gateway]
create failure (index:[.marvel-2015.02.24] type: [cluster_stats]):
RemoteTransportException[[Marvel
Girl][inet[/10.167.199.140:9300]][bulk/shard]]; nested:
EsRejectedExecutionException[rejected execution (queue capacity 50) on
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@76dbf01];
[2015-02-25 04:23:36,459][ERROR][marvel.agent.exporter ] [Wildside]
create failure (index:[.marvel-2015.02.25] type: [index_stats]):
UnavailableShardsException[[.marvel-2015.02.25][0] [2] shardIt, [0] active
: Timeout waiting for [1m], request:
org.elasticsearch.action.bulk.BulkShardRequest@2e7693b7]
Note that these errors occur for different indices and different types.
After a few more minutes, the Elasticsearch clients get
NoNodeAvailableException; I assume that is because the cluster is
malfunctioning due to the errors above. Eventually the clients fail with
"java.lang.OutOfMemoryError: GC overhead limit exceeded".
I did some profiling and found that a growing number of
org.elasticsearch.action.index.IndexRequest instances is the cause of this
OutOfMemoryError, which suggests the clients are queuing requests faster than
the cluster can accept them. I even tried "index.store.type: memory", and it
seems the Elasticsearch cluster still cannot build the indices at the
required rate.
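Would bounding the number of in-flight bulk requests on the client side be
the right fix? For example, something like BulkProcessor, sketched below
assuming the 1.x Java API (the flush and concurrency limits are guesses, not
tested values):

    import org.elasticsearch.action.bulk.BulkProcessor;
    import org.elasticsearch.action.bulk.BulkRequest;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.unit.TimeValue;

    public class ThrottledReporter {
        public static BulkProcessor build(Client client) {
            return BulkProcessor.builder(client, new BulkProcessor.Listener() {
                public void beforeBulk(long id, BulkRequest request) {
                }
                public void afterBulk(long id, BulkRequest request, BulkResponse response) {
                    if (response.hasFailures()) {
                        // Rejections (e.g. EsRejectedExecutionException) show up here.
                        System.err.println(response.buildFailureMessage());
                    }
                }
                public void afterBulk(long id, BulkRequest request, Throwable failure) {
                    failure.printStackTrace();
                }
            })
            .setBulkActions(25)                              // flush every ~25 requests
            .setFlushInterval(TimeValue.timeValueSeconds(1)) // or after 1s, whichever is first
            .setConcurrentRequests(2)                        // at most 2 bulks in flight
            .build();
        }
    }

Each client would then call
processor.add(new IndexRequest("my_index", "my_type").source(doc)) instead of
building bulks itself, and the setConcurrentRequests limit should stop
IndexRequest instances from piling up faster than the cluster drains them.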
Could you please point out any tuning parameters or methods to get rid of
these issues, or suggest a different way to report and query this amount of
load?
Thanks
Malaka