Very slow index speeds with dynamic mapping and large volume of documents with new fields

Todd_Nine · March 11, 2015, 3:37pm

Hey all,
We're bumping up against a production problem I could use a hand with.
We're experiencing steadily decreasing index speeds. We have 12 c3.4xl
data nodes and 1 c3.8xl master node (with 2 smaller backups).
We're indexing 45 million documents into a single index, with a single shard
and no replicas. As the number of documents grows, our indexing speed
slows to a crawl. We've applied all the standard mlockall, ulimit, and SSD
merge throttling settings, so I feel our cluster is tuned pretty well.

When I inspected the data, I noticed our user is adding a new field on
every document. When I view the pending tasks on our master, the task
queue always holds 300+ tasks attempting to perform dynamic mapping. I've
also checked segment merging: we never have more than 1 merge going on, and
even then it lasts only a second or two.
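
For anyone wanting to reproduce the pending-task check described above, here is a minimal sketch in Python. It assumes you have already saved the JSON body returned by `GET /_cluster/pending_tasks` and that mapping-related tasks can be recognized by the word "mapping" in their `source` string. The sample payload and the helper name `summarize_pending_tasks` are made up for illustration and are not taken from the cluster in this thread.

```python
import json
from collections import Counter


def summarize_pending_tasks(response_body, keyword="mapping"):
    """Summarize a pending-tasks JSON response body.

    Assumption: tasks caused by mapping updates mention `keyword`
    somewhere in their "source" string.
    """
    tasks = json.loads(response_body).get("tasks", [])
    mapping_tasks = [
        t for t in tasks if keyword in t.get("source", "").lower()
    ]
    by_priority = Counter(t.get("priority", "UNKNOWN") for t in tasks)
    max_wait = max(
        (t.get("time_in_queue_millis", 0) for t in tasks), default=0
    )
    return {
        "total": len(tasks),
        "mapping_related": len(mapping_tasks),
        "by_priority": dict(by_priority),
        "max_wait_millis": max_wait,
    }


# Hypothetical sample payload; the source strings are illustrative only.
SAMPLE = json.dumps({
    "tasks": [
        {"insert_order": 101, "priority": "HIGH",
         "source": "update-mapping [myindex][mytype]",
         "time_in_queue_millis": 1200},
        {"insert_order": 102, "priority": "HIGH",
         "source": "update-mapping [myindex][mytype]",
         "time_in_queue_millis": 900},
        {"insert_order": 103, "priority": "URGENT",
         "source": "create-index [foo_9], cause [api]",
         "time_in_queue_millis": 86},
    ]
})

summary = summarize_pending_tasks(SAMPLE)
print(summary)
```

A queue that stays in the hundreds with most entries mapping-related, as described above, would show up here as a large `mapping_related` count and a growing `max_wait_millis`.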

This brings me to my question. When dynamic mapping is performed, is this
done on the master only? Obviously that would introduce a bottleneck and
explain our sudden performance drop. Otherwise, I'm at a loss to explain this issue.
Any advice would be appreciated.

Thanks,
Todd

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0611317c-d3c1-4894-8fac-8ac4b36cbf15%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

warkolm · March 13, 2015, 7:12pm

Mapping changes do need to go through the master, so check how it is
performing.
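
Since every previously unseen field means a mapping change that goes through the master, one rough way to gauge that load is to count how many new field names a batch of documents would introduce before indexing it. The sketch below is only an illustration under simplifying assumptions: it looks at top-level field names only, and the function name `count_new_fields` and the sample documents are hypothetical.

```python
def count_new_fields(docs, known_fields=None):
    """Count how many unseen top-level field names each document adds.

    Assumption: only top-level keys are considered; nested objects
    would need to be flattened first.
    Returns (per-document counts, set of all field names seen).
    """
    known = set(known_fields or ())
    per_doc = []
    for doc in docs:
        new = set(doc) - known
        per_doc.append(len(new))
        known |= new
    return per_doc, known


# Hypothetical sample batch: the first two documents each add a new field.
docs = [
    {"name": "a", "color_1": "red"},
    {"name": "b", "color_2": "blue"},
    {"name": "c", "color_2": "green"},
]

per_doc, known = count_new_fields(docs, known_fields={"name"})
print(per_doc, sorted(known))
```

If nearly every entry in `per_doc` is non-zero, each indexed document is likely to trigger its own mapping update, which matches the backlog described in the original post.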
