Indexing performance

Mingfeng_Yang · September 15, 2013, 5:03pm

We are building an ES index to archive billions of docs. The index are
distributed across 4 servers, each with 4 cpus, each with 32G memory. And
we splitted the index into 32 shards.

When we just began to index two weeks ago, the performance was like 6K
docs each second. Now, as the index gets big, with 3Billion docs indexed
already, total disk space is about 1.2T. The indexing speed decreases to
300 docs/second.

Initially we set refresh_interval to 120s, but in the past two days, the
ES server randomly dropped one shard and the cluster health status became
red. We had to decrease the refresh_interval lower to make the cluster
stable. When the cluster dropped one shard, the log has no any error info.

Two questions.

Is there anyway to increase the performance ?
What's the cause of the random drop of shard?

We are using ES 0.90.3.

Thanks,
Mingfeng

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · September 15, 2013, 8:10pm

Do you use monitoring tools, for heap/CPU/GC activity? What is the max heap
size? Is heap exhausted? How large are the segments - is merging the issue?

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Mingfeng_Yang · September 15, 2013, 9:55pm

yes. we use bigdesk to monitor the cluster. Heap size allocated is 16G
each node, and it was not exhausted. Largest .fdt file for each shard is
like 2G.

I just found that when a shard, say shard 17, got dropped, if I use manual
allocation to put it back with this command
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{"commands":[{"allocate":
{"index": "boards", "shard":17, "node":"6EuzZatFRTK6q6F2boSOZw",
"allow_primary" : true}}]}'

The shard is actually get deleted. How come?

Ming

On Sun, Sep 15, 2013 at 1:10 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

Do you use monitoring tools, for heap/CPU/GC activity? What is the max
heap size? Is heap exhausted? How large are the segments - is merging the
issue?

Jörg

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · September 15, 2013, 10:35pm

What OS is being used for running the ES cluster? And do you use mmapfs,
maybe bootstrap mlockall?

If you can't see much CPU / GC activity, it may be that your memory
resources reached a limit, so OS is challenged to allocate more?

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Mingfeng_Yang · September 16, 2013, 12:25am

Jörg,

It's centos 6.4. No customization of the kernel has been done. I will
monitor the memory and see if it's related. Thanks for reminding.

Ming

On Sun, Sep 15, 2013 at 3:35 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

What OS is being used for running the ES cluster? And do you use mmapfs,
maybe bootstrap mlockall?

If you can't see much CPU / GC activity, it may be that your memory
resources reached a limit, so OS is challenged to allocate more?

Jörg

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Amit_Soni · October 8, 2013, 7:41pm

Hi Mingfeng - Just curious to know if you had any luck identifying the
cause of the performance problem.

-Amit.

On Sun, Sep 15, 2013 at 5:25 PM, Mingfeng Yang mfyang@wisewindow.comwrote:

Jörg,

It's centos 6.4. No customization of the kernel has been done. I will
monitor the memory and see if it's related. Thanks for reminding.

Ming

On Sun, Sep 15, 2013 at 3:35 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

What OS is being used for running the ES cluster? And do you use mmapfs,
maybe bootstrap mlockall?

If you can't see much CPU / GC activity, it may be that your memory
resources reached a limit, so OS is challenged to allocate more?

Jörg

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Topic		Replies	Views
Abrupt performance drop above a certain index size Elasticsearch	16	1476	July 6, 2017
Index speed degradation Elasticsearch	7	466	July 6, 2017
Rapidly Degrading Bulk Indexing Performance Elasticsearch	7	400	July 6, 2017
Question: How to gauge/improve performance Elasticsearch	17	1038	July 6, 2017
Performance problems Elasticsearch	12	629	July 6, 2017

Indexing performance

Related topics