Indexing performance

We are building an ES index to archive billions of docs. The index is
distributed across 4 servers, each with 4 CPUs and 32 GB of memory, and
we split the index into 32 shards.

When we began indexing two weeks ago, throughput was about 6K docs per
second. Now that the index has grown to 3 billion docs and about 1.2 TB
of total disk space, the indexing speed has dropped to 300 docs/second.
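
For a rough sense of scale, the figures above can be broken down per
shard. This is only a back-of-the-envelope sketch: it assumes documents
and disk usage are spread evenly across the 32 shards and reads 1.2T as
1200 GB (decimal), neither of which is stated above. The variable names
are illustrative.

```shell
#!/bin/sh
# Figures taken from the post; even distribution across shards is an assumption.
total_docs=3000000000   # 3 billion docs indexed so far
total_gb=1200           # about 1.2T on disk, read here as 1200 GB
shards=32
initial_rate=6000       # docs/second two weeks ago
current_rate=300        # docs/second now

docs_per_shard=$((total_docs / shards))
gb_per_shard=$((total_gb / shards))                      # integer division, rounds down
bytes_per_doc=$((total_gb * 1000000000 / total_docs))    # average on-disk size per doc
slowdown=$((initial_rate / current_rate))

echo "docs per shard: $docs_per_shard"
echo "GB per shard:   $gb_per_shard"
echo "bytes per doc:  $bytes_per_doc"
echo "slowdown:       ${slowdown}x"
```

Under those assumptions each shard holds on the order of 90 million docs
and a few tens of GB, and throughput has fallen by a factor of 20.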

Initially we set refresh_interval to 120s, but in the past two days the
ES server has randomly dropped one shard and the cluster health status has
turned red. We had to lower refresh_interval to keep the cluster stable.
When the cluster dropped a shard, the log contained no error information.
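
For reference, refresh_interval can be changed on a live index through
the update settings API. A minimal sketch, not the exact command used
here: the host localhost:9200 and the index name boards are assumptions
for illustration, and 120s is the value mentioned above.

```shell
# Update the refresh interval on an existing index.
# Host and index name are assumptions; adjust to your cluster.
curl -XPUT 'localhost:9200/boards/_settings' -d '{
  "index": { "refresh_interval": "120s" }
}'
```

A value of -1 disables periodic refresh altogether, which is sometimes
used during bulk loads.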

Two questions:

  1. Is there any way to increase the performance?
  2. What causes the random shard drops?

We are using ES 0.90.3.

Thanks,
Mingfeng

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Do you use monitoring tools for heap/CPU/GC activity? What is the max heap
size? Is the heap exhausted? How large are the segments - is merging the
issue?

Jörg

Yes, we use BigDesk to monitor the cluster. The heap size allocated is
16 GB per node, and it was not exhausted. The largest .fdt file for each
shard is about 2 GB.

I just found that when a shard, say shard 17, gets dropped and I use manual
allocation to put it back with this command:

    curl -XPOST 'localhost:9200/_cluster/reroute' -d '{"commands": [{"allocate": {
      "index": "boards", "shard": 17, "node": "6EuzZatFRTK6q6F2boSOZw",
      "allow_primary": true}}]}'

the shard actually gets deleted. How come?

Ming

Which OS is the ES cluster running on? Do you use mmapfs, and perhaps
bootstrap.mlockall?

If you can't see much CPU / GC activity, it may be that your memory
resources have reached a limit, so the OS struggles to allocate more?

Jörg

Jörg,

It's CentOS 6.4. No kernel customization has been done. I will monitor
memory usage and see whether it's related. Thanks for the reminder.

Ming

Hi Mingfeng - Just curious to know if you had any luck identifying the
cause of the performance problem.

-Amit.
