After working with Elasticsearch for 3 months, I have gained some knowledge of it,
but I still have some questions. First I will describe my system environment,
and then I will list my questions. First: my working environment
We have two Elasticsearch servers [one is master and the other is a client]
running on our server machines.
Elasticsearch version: 0.20.5
JVM parameters for all servers: -Xms5g -Xmx5g
Java version: 1.6.0
We have several million records [each record has more than 50 fields]
in fixed-length delimited files, and we have implemented some Spring 3 based
applications that load the data into Elasticsearch.
We are now also implementing applications for searching.
We have tested the applications on our servers to measure the performance of
the index operations [tested with 6 to 8 shards and 0 replicas], and we have
some statistics about them. For example, loading 1 million records with
8 fields took our application between 9 and 13 minutes.
We index asynchronously, in bulks of 10,000 records at a time:
bulkRequest.execute(new ActionListener<BulkResponse>() {
    @Override
    public void onResponse(BulkResponse bulkResponse) {
        // display the time taken for the index operation
    }

    @Override
    public void onFailure(Throwable e) {
        // display a message if the index operation failed
    }
});
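For context, a minimal sketch of how such a 10,000-record bulk could be assembled with the 0.20.x Java client. The index name "myindex", the type "mytype", and the `records` iterable are placeholders, not from the original post:

```java
// Assumes an org.elasticsearch.client.Client "client", an
// ActionListener<BulkResponse> "listener" like the one above,
// and "records" holding one JSON source string per record.
BulkRequestBuilder bulkRequest = client.prepareBulk();
int count = 0;
for (String json : records) {
    // queue one index request per record
    bulkRequest.add(client.prepareIndex("myindex", "mytype").setSource(json));
    if (++count % 10000 == 0) {
        bulkRequest.execute(listener);      // send this batch asynchronously
        bulkRequest = client.prepareBulk(); // start a fresh batch
    }
}
if (count % 10000 != 0) {
    bulkRequest.execute(listener);          // flush the final partial batch
}
```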
Two: Now my questions are:
What is the average time Elasticsearch needs to index 1 million records?
From a performance point of view, what bulk size should we use for
indexing such a huge volume of data?
What is the maximum capacity of one shard while loading data at the same speed?
As good practice, how many shards should we use in a single index,
and how many nodes per server?
Any suggestions for boosting the performance of bulk loads?
This depends on your records and your ES config. You can connect your
client to a number of powerful ES nodes with plenty of network bandwidth,
CPU, and fast disks, and ES will index faster than you can send docs.
You have to find the "sweet spot" of your current system. Start with
1000 docs per bulk and with the number of concurrent bulks equal to the
number of CPU cores in your system, and run the bulk indexing for at least
35 or 40 minutes (so that large segment merging starts). Repeat this from
scratch, but with a higher number of docs per bulk request. Increase it to
the point where the docs-per-second rate no longer improves. The typical
range I have observed on a single-node ES cluster with current PC server
hardware is anything between 1000 and 10000 docs per second, out of the box.
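For comparison, the numbers above translate into a docs-per-second rate: 1 million docs in 9 to 13 minutes is roughly 1280 to 1850 docs per second, i.e. at the low end of that range. A tiny helper for tracking the rate while searching for the sweet spot (a sketch, not from the original post):

```java
public class ThroughputCheck {
    // docs-per-second rate of a bulk run, using integer arithmetic
    static long docsPerSecond(long totalDocs, long elapsedMillis) {
        return totalDocs * 1000L / elapsedMillis;
    }

    public static void main(String[] args) {
        // the original poster's runs: 1 million docs in 9 and in 13 minutes
        System.out.println(docsPerSecond(1000000L, 9L * 60 * 1000));  // 1851 docs/s
        System.out.println(docsPerSecond(1000000L, 13L * 60 * 1000)); // 1282 docs/s
    }
}
```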
Always set up one ES node per server. The number of shards per node
should roughly not exceed the number of CPU cores.
Use a fast disk subsystem (e.g. SSD RAID0). Use the latest Java 7.
Disable refresh, and disable replicas.
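Disabling refresh and replicas for the duration of the load can be done through the update-settings API; a sketch with the 0.20.x Java client, where "myindex" is a placeholder and the values after the load are whatever defaults you want to restore:

```java
// Assumes an org.elasticsearch.client.Client "client".
// Before the bulk load: no periodic refresh, no replicas.
client.admin().indices().prepareUpdateSettings("myindex")
        .setSettings(ImmutableSettings.settingsBuilder()
                .put("index.refresh_interval", "-1")
                .put("index.number_of_replicas", 0))
        .execute().actionGet();

// After the bulk load: restore refresh and replicas.
client.admin().indices().prepareUpdateSettings("myindex")
        .setSettings(ImmutableSettings.settingsBuilder()
                .put("index.refresh_interval", "1s")
                .put("index.number_of_replicas", 1))
        .execute().actionGet();
```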
Jörg
On 09.05.13 10:09, rafi wrote: