Slow Indexing Speed


(Robert Navarro) #1

Hello,

I have a single server ES "cluster" setup right now and it's struggling to
keep up with our indexing load.

The server has 30GB of RAM, 15GB of which is locked for Elasticsearch.

Here are the ES details:

{
  "ok" : true,
  "status" : 200,
  "name" : "esls1",
  "version" : {
    "number" : "0.90.3",
    "build_hash" : "5c38d6076448b899d758f29443329571e2522410",
    "build_timestamp" : "2013-08-06T13:18:31Z",
    "build_snapshot" : false,
    "lucene_version" : "4.4"
  },
  "tagline" : "You Know, for Search"
}

Here is the java version:

root@esls1:~# java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

Here are some of the knobs I've tried to tweak for our indexes... this is
just a snapshot of one index:

https://gist.github.com/rnavarro/490196cf73ff46e33a8b

Operating System:

Ubuntu 12.04.2 LTS

There are 15 indexes on this node, rotated daily and removed after 14 days.

The incoming index requests come from logstash, and it's all logging data.

I suspect the server is IO bound, as there are bursts of 5-10s of sustained 10%+ iowait.

What other knobs can I tweak to help speed things along?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Zachary Tong) #2

What's your indexing load (docs/sec)? Are you querying at the same time?
Often, if you are bound by disk IO, there isn't much you can do except get
faster disks or more nodes. Do you have SSDs? They are a great investment
if you can afford them. And adding more nodes gives an almost linear
increase in indexing speed.

Some more things you can do:

  • If you don't need it, disable the _all field. This bloats the doc
    size (so more bytes to write) and eats up a bit of CPU.
  • I'd put index.merge.policy.segments_per_tier back to its default
    (10). By having it set so high, Lucene is going to perform big bursts of
    merging, which can easily eat up all your IO and a considerable amount of
    CPU. In general, I've spent a lot of time fiddling with the merge policy
    settings and never found a configuration better than the defaults. Mike
    McCandless (http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html) knows best :)
  • I noticed you have term vectors compressed. Are you actually using
    term vectors? They double your index size and eat up more IO.
  • You could check the indexing thread count and see if you are routinely
    queuing indexing threads. It may help to increase that somewhat (although
    it may not).

-Zach



(Robert Navarro) #3

Hey Zach,

Thanks for the response!

The indexing load isn't particularly high, but the documents being indexed
are pretty large... many of them are >1MB. Looking at my logstash indexing
machines, I'd say the docs/sec is in the realm of 200-ish? Is there a way
to see this from the ES side of things?

We do query at the same time, but we generally query indexes that are a
day+ old and have been optimized by a nightly cron... less querying happens
on the "active" daily index.

I don't have SSDs right now; this is running on a Rackspace cloud server
with 5 non-SSD block storage volumes attached in a RAID 5 config. We index
~300GB of data a day... so moving to SSDs is likely very cost-prohibitive
for us.

We have _all disabled in our index template, so that should help things.

I'll drop index.merge.policy.segments_per_tier back down to 10 to see
how that affects things.

We're using kibana to do the searching on our indexes and from what I
understand (and looking at some of the requests) there are no term vectors
being used.

I'd be more than happy to add nodes to the cluster, but I wasn't certain if
that would help indexing speed as much as it would query speed. However,
now that you mention it...if the shards were all split up one per node that
would make sense that I would have gains there.

Also, I forgot to add a copy of our indexing template...so here it is
(updated to reflect the default merge policy settings):

Thanks for your time and all the food for thought! Much appreciated! :)



(Israel Ekpo) #4

Since you are not actively searching the indices that are currently being
written to, I would recommend increasing the refresh interval for those
indexes.
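For example, the refresh interval can be raised on the live index via the update-settings API; a minimal sketch (the index name and the 30s value below are illustrative, not from the thread):

```json
{
  "index": {
    "refresh_interval": "30s"
  }
}
```

PUT this to e.g. `localhost:9200/logstash-2013.09.06/_settings`. The default is 1s, so even a modest 5s-30s interval cuts a lot of refresh churn on a write-heavy index.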

Check out some benchmarks:

Author and Instructor for the Upcoming Book and Lecture Series
Massive Log Data Aggregation, Processing, Searching and Visualization with
Open Source Software

http://massivelogdata.com



(Zachary Tong) #5

The indexing load isn't particularly high, but the documents being indexed
are pretty large....many of them >1MB. Looking at my logstash indexing
machines I'd say that the docs/sec is in the realm of 200 ish? Is there a
way to see this from the ES side of things?

You can use the Indices stats API (http://www.elasticsearch.org/guide/reference/api/admin-indices-stats/)
to see the indexing rate. You'll get an output that includes indexing
stats (from the viewpoint of the index, regardless of which shard/machine
it goes to):

curl -XGET 'localhost:9200/_stats'

[...]
"indexing": {
"index_total": 3,
"index_time_in_millis": 49,
"index_current": 0,
"delete_total": 0,
"delete_time_in_millis": 0,
"delete_current": 0
},
[...]
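There's no direct docs/sec field in that output, but you can derive one by sampling `index_total` twice and dividing the delta by the sampling interval. A sketch (the sample numbers below are hypothetical, not from the thread):

```python
# Derive an approximate indexing rate (docs/sec) from two samples of the
# "indexing" section returned by the indices stats API. In practice you
# would fetch the two samples ~interval seconds apart via curl or urllib;
# here they are hard-coded for illustration.

def indexing_rate(before, after, interval_secs):
    """Docs indexed per second between two stats samples."""
    delta = after["index_total"] - before["index_total"]
    return delta / interval_secs

# Hypothetical samples taken 10 seconds apart.
sample_1 = {"index_total": 3, "index_time_in_millis": 49}
sample_2 = {"index_total": 2003, "index_time_in_millis": 10049}

print(indexing_rate(sample_1, sample_2, 10))  # 200.0 docs/sec
```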

Another place to look is the indexing threadpool via the Cluster nodes stats API (http://www.elasticsearch.org/guide/reference/api/admin-cluster-nodes-stats/) - check to see if you are queuing a lot of threads:

curl -XGET 'http://localhost:9200/_nodes/stats?clear=true&thread_pool=true'

[...]
"index": {
"threads": 3,
"queue": 0,
"active": 0,
"rejected": 0,
"largest": 3,
"completed": 3
},
[...]

Plugins like Bigdesk (https://github.com/lukas-vlcek/bigdesk/) and Paramedic (https://github.com/karmi/elasticsearch-paramedic) are basically graphical wrappers for these APIs.

I'd be more than happy to add nodes to the cluster, but I wasn't certain if
that would help indexing speed as much as it would query speed. However,
now that you mention it...if the shards were all split up one per node that
would make sense that I would have gains there.

Yep, you'll definitely see an increase as each node adds more indexing
throughput. It isn't exactly linear, but it is fairly close (especially if
your query load is low, as in most logging environments). Realistically,
this is the easiest and fastest way to increase your indexing speed if you
can afford the cost of another node.

Hope this helps! Keep us updated if you have any more questions.
-Zach


