One of the biggest performance problems we have with ES is page cache
evictions. Index warmers are great for getting the cache primed, but it
doesn't take much for the cache to be trashed. Some common causes for us
are logging, backups, and HDFS tasks. Relying completely on the OS page
cache has always seemed odd to me, since it's almost completely out of the
application's control, very inconsistent across platforms, and has very
few configuration options available.
What are our options for ensuring that performance doesn't degrade because
of disk IO caused by other applications?
Some thoughts I've had...
Call fadvise with POSIX_FADV_WILLNEED after warmers are called
Shell out to vmtouch after warmers.
Configure ES/Lucene to use a heap-based cache (does this even exist?)
I don't know if any of these are possible, and I certainly don't know how to
do them out of the box (a rough sketch of the fadvise idea follows).
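To make the first idea concrete, here is a rough sketch of what I have in mind, assuming a small script run on each data node after the warmers finish. The data path is hypothetical, and os.posix_fadvise needs Python 3.3+ on Linux:

import os

# Hypothetical data path for this node -- adjust to wherever the shards live.
INDEX_DIR = "/var/lib/elasticsearch/data/nodes/0/indices"

def advise_willneed(path):
    # Walk every segment file and hint the kernel to read its pages ahead of time.
    # POSIX_FADV_WILLNEED is only a hint: nothing stops other IO from evicting the
    # pages again later, which is exactly the weakness I'm worried about.
    for root, _dirs, files in os.walk(path):
        for name in files:
            fd = os.open(os.path.join(root, name), os.O_RDONLY)
            try:
                os.posix_fadvise(fd, 0, os.fstat(fd).st_size, os.POSIX_FADV_WILLNEED)
            finally:
                os.close(fd)

if __name__ == "__main__":
    advise_willneed(INDEX_DIR)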
Simple: never share an Elasticsearch data node with other applications that
have intensive IO, CPU or memory consumption.
Also, Elasticsearch's recommended configuration talks about setting a heap
size of 50% of the available memory on the machine, and avoiding
swapping. So for most common use cases, if you do sizing right, you
shouldn't be hitting the disk so much (except for _source loading,
highlighting, or if you are using doc values)
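Concretely, that sizing advice comes down to something like the following (the heap value here is only an example for a 32 GB machine, and this is the 1.x way of setting it):

# /etc/default/elasticsearch (or however the service environment is set)
ES_HEAP_SIZE=16g            # about half of a 32 GB box, the rest stays for the page cache

# elasticsearch.yml
bootstrap.mlockall: true    # lock the heap in RAM so it never swaps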
Sure, dedicated hardware is a solution, but it's rather unreasonable in
certain use cases, especially when software solutions could exist. We
already have a cluster with plenty of RAM and CPU to spare, and the ES
shards + routing ensure each request is handled by very few nodes. The
setup of the cluster also allows for very fast distributed indexing and
scanning.
Trying to justify all new hardware for erratically slowed searches on
existing hardware that is already correctly sized is hard to follow,
especially when we're talking about indexing billions of documents and
running heavy aggregations over them.
So assuming we have correctly sized hardware, how do we solve this? A
software cache at the Lucene/ES level seems like it would clearly solve it;
does that exist? Is OS page cache locking possible (like vmtouch does)?
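On the locking question: as far as I can tell, vmtouch does it by mmap'ing the files, calling mlock, and staying resident to hold the mapping. Just to illustrate the mechanism (not something we actually run), a rough Python/ctypes sketch of the same trick, with Linux x86-64 constants hardcoded:

import ctypes
import ctypes.util
import os

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

PROT_READ = 0x1     # Linux x86-64 values
MAP_SHARED = 0x01
MAP_FAILED = ctypes.c_void_p(-1).value

def lock_file(path):
    # mmap the file read-only and mlock the mapping so its pages can't be
    # reclaimed; the lock only lasts while this process keeps the mapping,
    # which is why vmtouch has to stay running when you ask it to lock.
    fd = os.open(path, os.O_RDONLY)
    size = os.fstat(fd).st_size
    addr = libc.mmap(None, size, PROT_READ, MAP_SHARED, fd, 0)
    if addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    if libc.mlock(addr, size) != 0:
        raise OSError(ctypes.get_errno(), "mlock failed (check RLIMIT_MEMLOCK)")
    return addr, size, fd   # keep these alive for as long as you want the lock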
Have you looked into the filter cache or query cache in ES? You can increase
the filter cache from its 10% default and enable the query cache to see if
this helps.
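If it helps, the relevant settings on the 1.4.x line look roughly like this (the 20% figure is just an example):

# elasticsearch.yml -- node-level filter cache budget (the default is 10% of the heap)
indices.cache.filter.size: 20%

# per-index setting (1.4+) to turn on the shard query cache, which is off by default
index.cache.query.enable: true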
"For now, the query cache will only cache the results of search requests
where ?search_type=count, so it will not cache hits"
And the filter cache is just a bitset, not the expensive IO seeks related
to source fetching. The issue we have is with the fetching of the docs from
disk. From what I can see, ES simply does not cache docs.
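To make that concrete, a rough sketch with the Python requests library against a hypothetical local node (the index and field names are made up):

import json
import requests

ES = "http://localhost:9200"  # hypothetical local node

# An aggregation-only request: with search_type=count there are no hits to
# return, so per the docs quoted above its result is eligible for the query cache.
agg_body = {"aggs": {"by_tag": {"terms": {"field": "tag"}}}}
r1 = requests.get(ES + "/myindex/_search",
                  params={"search_type": "count"},
                  data=json.dumps(agg_body))

# A normal search that returns hits: every hit still means a seek into the
# segment files to load _source, served only by disk / the OS page cache.
r2 = requests.get(ES + "/myindex/_search",
                  data=json.dumps({"query": {"match_all": {}}, "size": 10}))
print(r1.status_code, r2.status_code)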