What is the meaning of throttle_time_in_millis?


(Isaac Hazan) #1

I created an index in a 4-node Elasticsearch cluster and added about 3.5 M
documents using the Java Elasticsearch API.
When I ask for the stats, I get a very high number in
throttle_time_in_millis, as follows:

{
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 3855540,
        "deleted": 0
      },
      "store": {
        "size_in_bytes": 1203074796,
        "throttle_time_in_millis": 980255
      },
      "indexing": {
        "index_total": 3855540,
        "index_time_in_millis": 426300,
        "index_current": 0,
        "delete_total": 0,
        "delete_time_in_millis": 0,
        "delete_current": 0
      },

  1. What is the meaning of throttle_time_in_millis?
  2. What could be the reason for this to increase?

Thx in advance

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/73633235-ae82-4cd2-a74a-3de6cff5cd47%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

When you add documents to Elasticsearch, this creates new files on disk
that form what is called a segment. Having several segments is fine, but
once there are too many of them, search gets slower. This is why
Elasticsearch has a background process that merges these segments, so that
the total number of segments stays low enough (usually on the order of ~50
per shard). However, a background merge can consume a lot of server
resources, especially I/O, which can defeat the purpose of keeping search
fast, since search operations are left with little I/O capacity. To prevent
this, merges are throttled[1], meaning that they can't write more than
X bytes of data per second. If they try to, Elasticsearch pauses them
for a while before they can resume merging.
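
To make the pausing described above concrete, here is a minimal sketch of a write-rate limiter. It is not Elasticsearch's or Lucene's actual implementation: the class name, the simulated clock, and the 20 MiB/s limit are all assumptions chosen for illustration. It only shows how a "bytes per second" cap turns into accumulated pause time.

```python
class SimulatedThrottle:
    """Toy write-rate limiter using a simulated clock (it never really sleeps)."""

    def __init__(self, max_bytes_per_sec):
        self.max_bytes_per_sec = max_bytes_per_sec
        self.clock_ms = 0          # simulated elapsed time
        self.bytes_written = 0     # total bytes written so far
        self.throttle_time_ms = 0  # total time spent paused

    def write(self, num_bytes, work_ms):
        """Record a write of num_bytes that needed work_ms of actual work."""
        self.clock_ms += work_ms
        self.bytes_written += num_bytes
        # Earliest moment at which this many bytes may have been written
        # without exceeding the configured rate.
        earliest_ms = self.bytes_written * 1000 // self.max_bytes_per_sec
        if earliest_ms > self.clock_ms:
            # The writer is ahead of the allowed rate: pause until it is not.
            self.throttle_time_ms += earliest_ms - self.clock_ms
            self.clock_ms = earliest_ms


# Hypothetical merge: ten 10 MiB writes that each take 100 ms of work,
# capped at an illustrative 20 MiB/s.
throttle = SimulatedThrottle(max_bytes_per_sec=20 * 1024 * 1024)
for _ in range(10):
    throttle.write(num_bytes=10 * 1024 * 1024, work_ms=100)

print("paused for", throttle.throttle_time_ms, "ms")
print("total elapsed", throttle.clock_ms, "ms")
```

In this toy run the writer could finish in 1 second of work, but the cap stretches it to 5 seconds, so most of the elapsed time is pause time. A large throttle counter therefore mostly says that writes were faster than the cap allows.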

The throttle_time reported by the stats API gives you the total amount of
time that merges have been paused in order to prevent them from using up
all of the server's I/O.
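
As a rough illustration of reading that counter, the sketch below parses a trimmed copy of the stats response from the original question and converts the millisecond totals to minutes. The helper name `summarize` is hypothetical, and the JSON is cut down to the two fields used here; the numbers are the ones posted in this thread.

```python
import json

# Trimmed copy of the stats response posted in the question.
stats_json = """
{
  "_all": {
    "primaries": {
      "store": {"size_in_bytes": 1203074796, "throttle_time_in_millis": 980255},
      "indexing": {"index_total": 3855540, "index_time_in_millis": 426300}
    }
  }
}
"""


def summarize(stats):
    """Return throttle and indexing totals for the primaries, in minutes."""
    primaries = stats["_all"]["primaries"]
    throttle_ms = primaries["store"]["throttle_time_in_millis"]
    index_ms = primaries["indexing"]["index_time_in_millis"]
    return {
        "throttle_minutes": throttle_ms / 60000,
        "index_minutes": index_ms / 60000,
    }


summary = summarize(json.loads(stats_json))
print(f"throttled: {summary['throttle_minutes']:.1f} min")
print(f"indexing:  {summary['index_minutes']:.1f} min")
```

With the posted numbers this comes to roughly 16.3 minutes of throttling against about 7.1 minutes of indexing time, both reported under `_all.primaries`.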

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling

--
Adrien Grand



(Isaac Hazan) #3

Thx.

Is a segment a single file with multiple documents? Or is it multiple files
that together form a segment? In other words, I don't fully understand why
the notion of a segment exists.

Does the high number in the throttling KPI mean that I have a performance
problem, and if so, is there a setting to tune it properly?

Thx



(gkwelding) #4

Isaac, this page
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html)
gives a good explanation of what segments actually are. It also lists the
merge-related settings, so you may find some optimisations there.



(Adrien Grand) #5

On Tue, Mar 4, 2014 at 4:01 PM, isaac hazan isaac.yann.hazan@gmail.com wrote:

> Thx.
>
> Is a segment a single file with multiple documents? Or is it multiple
> files that together form a segment? In other terms I don't fully understand
> why the notion of segment exists?

The simple answer is that a segment is made of several files. Typically,
there is one that stores "stored fields" (which let you retrieve the
original field values given a document ID), one for the terms dictionary
(the unique terms in your documents), one for postings lists (which, given a
term, return the list of documents that contain this term), one for
deleted documents, etc.

And an index is the union of several segments. Searching an index is
effectively searching every segment and merging results together.
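
The idea of searching every segment and merging the results can be sketched as follows. This is a toy model, not how Lucene is implemented: the segments are plain dictionaries mapping a term to `(doc_id, score)` pairs, and the data, function names, and scores are all made up for illustration.

```python
import heapq

# Each hypothetical segment maps a term to a list of (doc_id, score) postings.
segments = [
    {"merge": [(1, 0.9), (4, 0.2)], "throttle": [(2, 0.5)]},
    {"merge": [(7, 0.7)]},
    {"throttle": [(9, 0.8)], "merge": [(10, 0.4)]},
]


def search_segment(segment, term):
    """Return the hits for one term within a single segment."""
    return segment.get(term, [])


def search_index(segments, term, k):
    """Search every segment, then merge the hits and keep the k best scores."""
    hits = []
    for segment in segments:
        hits.extend(search_segment(segment, term))
    return heapq.nlargest(k, hits, key=lambda hit: hit[1])


top = search_index(segments, "merge", k=3)
print(top)
```

Every extra segment adds one more lookup per query in this model, which is the intuition behind keeping the segment count low through merging.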

For your information, there is also an optimization called "compound file",
which stores all of the logical files of one segment in a single physical
file when the segment is small. This helps save file descriptors.

> Does the fact that I have a high number in the throttling KPI mean that
> I have a problem in performance and if so is there a setting to tune it
> properly?

A high throttling time is not necessarily an issue. It just means that
merges have occasionally been paused so that search remains fast. If you
want, you can disable merge throttling by setting
index.store.throttle.max_bytes_per_sec[1] to -1.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling
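
For illustration, the sketch below only builds the request that such a settings change would use; it does not contact a cluster. The setting name and the value -1 are taken from the reply above, the index name `my_index` and the helper function are hypothetical, and the path assumes the index update-settings endpoint. It has not been checked against a live cluster, and the setting applies to the Elasticsearch release discussed in this thread, so check the documentation for your version before using it.

```python
import json


def build_throttle_settings_request(index_name, max_bytes_per_sec):
    """Build (method, path, body) for an index settings update. No I/O is done."""
    path = f"/{index_name}/_settings"
    # Setting name as quoted in the reply above; -1 is the value suggested there.
    body = {"index.store.throttle.max_bytes_per_sec": max_bytes_per_sec}
    return "PUT", path, json.dumps(body)


# "my_index" is a placeholder index name.
method, path, body = build_throttle_settings_request("my_index", -1)
print(method, path)
print(body)
```

The resulting method, path, and body could then be sent with any HTTP client.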

--
Adrien Grand


