Questions regarding bulk load size, shard capacity, and performance

Hi all,

After working with Elasticsearch for 3 months, I have gained some
knowledge of it, but I still have some questions. First I will describe
my system environment, and then I will list my questions.
First: My Working Environment
We have two Elasticsearch servers (one is the master and the other is a
client) running on our server machines.
The Elasticsearch version is 0.20.5.
The JVM parameters for all servers are:
-Xms5g -Xmx5g

The Java version is 1.6.0.

We have several million records (each record has more than 50 fields)
in fixed-length files [delimiter files], and we have implemented some
Spring 3 based applications that load the data into Elasticsearch.
We are now also implementing applications for searching.

We have tested the applications on our servers to measure the performance
of the index operations (here we tested with 6 to 8 shards and 0 replicas),
and we have some statistics about the index operations.
For example, loading 1 million records with 8 fields took our application
between 9 and 13 minutes.
We index asynchronously, in bulks of 10,000 records at a time, using the
following logic:
bulkRequest.execute(
    new ActionListener<BulkResponse>() {
        @Override
        public void onResponse(BulkResponse bulkResponse) {
            // display the time taken for indexing
            // (bulkResponse.hasFailures() reports per-item errors)
        }

        @Override
        public void onFailure(Throwable e) {
            // display a message if the index operation failed
        }
    });
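
When bulks are sent asynchronously as in the snippet above, nothing in the snippet itself limits how many bulk requests are outstanding at once. The following is a minimal sketch of one way to cap that number with a semaphore. It is not taken from the original application: `BoundedBulkSender` and `AsyncBulk` are hypothetical names, the thread pool only simulates an asynchronous client, and the limit of one in-flight bulk per CPU core is an assumed starting value. It is written for a current JDK (it uses lambdas), not the Java 1.6 mentioned above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/** Sketch only: caps the number of asynchronous bulks in flight. Names are hypothetical. */
class BoundedBulkSender {

    /** Hypothetical stand-in for an asynchronous bulk call with a completion callback. */
    interface AsyncBulk {
        /** Must call onDone exactly once, on success or on failure. */
        void execute(int bulkNumber, Runnable onDone);
    }

    private final AsyncBulk bulk;
    private final int maxInFlight;
    private final Semaphore permits;

    BoundedBulkSender(AsyncBulk bulk, int maxInFlight) {
        this.bulk = bulk;
        this.maxInFlight = maxInFlight;
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks while maxInFlight bulks are outstanding, then sends the next one. */
    void send(int bulkNumber) {
        permits.acquireUninterruptibly();
        try {
            bulk.execute(bulkNumber, permits::release);
        } catch (RuntimeException e) {
            permits.release(); // assume the bulk was not started, so free its slot
            throw e;
        }
    }

    /** Blocks until every bulk sent so far has completed. */
    void awaitAll() {
        permits.acquireUninterruptibly(maxInFlight);
        permits.release(maxInFlight);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool(); // simulates the async client
        // Assumption: one in-flight bulk per CPU core as a starting value.
        int limit = Runtime.getRuntime().availableProcessors();
        BoundedBulkSender sender = new BoundedBulkSender(
                (n, onDone) -> pool.execute(() -> {
                    System.out.println("bulk " + n + " indexed");
                    onDone.run();
                }), limit);
        for (int i = 0; i < 100; i++) { // 1,000,000 records / 10,000 per bulk
            sender.send(i);
        }
        sender.awaitAll();
        pool.shutdown();
    }
}
```

With a real client, the release would have to happen in both `onResponse` and `onFailure` of the listener, otherwise a failed bulk would hold its slot forever.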

Second: My Questions

  1. What is the average time Elasticsearch needs to index 1 million
    records?
  2. From a performance point of view, what bulk size should we use
    for indexing such a huge volume of data?
  3. What is the maximum capacity of one shard for loading data at the
    same speed?
  4. As good practice, how many shards should we use in a single
    index, and how many nodes per server?
  5. Any suggestions for boosting the performance of bulk loads?

Regards and Thanks to All,
Mohammad Rafi.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

  1. This depends on your records and your ES config. You can connect your
    client to a lot of powerful ES nodes and add network bandwidth,
    CPU, and fast disks, and ES will be faster than you can send docs :)

  2. You have to find the "sweet spot" of your current system. Start with
    1000 docs per bulk and with number of concurrent bulks = number of
    CPU cores in your system, and run the bulk indexing for at least 35 or
    40 minutes (when large segment merging starts). Repeat this from
    scratch, but with a higher number of docs per bulk request. Increase it
    to the point where the docs per second rate does not get higher. The
    typical range I have observed on a single ES node cluster with current
    PC server hardware is anywhere between 1000 and 10000 docs per second,
    out of the box.

  3. Always set up one ES node per server. As a rough rule, the number of
    shards per node should not exceed the number of CPU cores.

  4. Use a fast disk subsystem (e.g. SSD RAID0). Use the latest Java 7.
    Disable refresh, disable replicas.
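
The "sweet spot" procedure from point 2 can be sketched as a simple loop. This is only an illustration under assumptions that are not in the post: the bulk size is doubled on each round, a gain of less than 5% counts as "does not get higher", and `BulkSizeTuner` and `findSweetSpot` are hypothetical names. The `docsPerSecondFor` function stands for one complete benchmark run from scratch at the given bulk size; in `main` it is replaced by a synthetic formula so that the sketch runs on its own.

```java
import java.util.function.IntToDoubleFunction;

/** Sketch of the "find the sweet spot" loop; all names here are hypothetical. */
class BulkSizeTuner {

    /**
     * Doubles the bulk size, starting at startSize, until the measured
     * docs-per-second rate no longer improves by at least minGain
     * (0.05 means 5%). Returns the last bulk size that still gave a
     * meaningful improvement.
     */
    static int findSweetSpot(IntToDoubleFunction docsPerSecondFor,
                             int startSize, int maxSize, double minGain) {
        int bestSize = startSize;
        double bestRate = docsPerSecondFor.applyAsDouble(startSize);
        for (long size = 2L * startSize; size <= maxSize; size *= 2) {
            double rate = docsPerSecondFor.applyAsDouble((int) size);
            if (rate < bestRate * (1.0 + minGain)) {
                break; // throughput has levelled off (or dropped)
            }
            bestSize = (int) size;
            bestRate = rate;
        }
        return bestSize;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a real benchmark: the rate grows with the
        // bulk size and saturates at 6000 docs per second.
        IntToDoubleFunction fakeBenchmark = size -> Math.min(6000.0, 1.5 * size);
        System.out.println(findSweetSpot(fakeBenchmark, 1000, 64000, 0.05));
    }
}
```

In practice each measurement is expensive (a fresh index and a run of 35 to 40 minutes), so a coarse doubling schedule like this keeps the number of runs small; a finer search around the result is possible afterwards.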

Jörg

On 09.05.13 10:09, rafi wrote:

  1. What is the average time Elasticsearch needs to index 1 million
    records?
  2. From a performance point of view, what bulk size should we use
    for indexing such a huge volume of data?
  3. What is the maximum capacity of one shard for loading data at the
    same speed?
  4. As good practice, how many shards should we use in a single
    index, and how many nodes per server?
  5. Any suggestions for boosting the performance of bulk loads?

Regards and Thanks to All,
Mohammad Rafi.


Hi Jörg Prante,

Thanks for your reply and for your suggestions.

Regards
Mohammad Rafi.
