Feedback request: Configuration of ES on a single host

Hello everyone, I wrote a blog post about configuring ElasticSearch on a
single shared host by deactivating replication and limiting memory usage.
I would very much appreciate your feedback, because I had a very hard
time finding appropriate documentation on how to do this.

Thanks for your time.

--

Wow, sorry about that, typical failure to add the URL:

http://blog.lavoie.sl/2012/09/configure-elasticsearch-on-a-single-host.html

--

Oops... The first sentence is wrong:

"Elasticsearch http://www.elasticsearch.org/ is a powerful, yet easy to use,
search engine based on Solr,"

Oh my god... I don't think that ES is based on Solr :wink:

HTH
David.

On 12 September 2012 at 16:50, "Sébastien Lavoie" sebastien@lavoie.sl wrote:

Wow, sorry about that, typical failure of adding the URL:
Configure ElasticSearch on a single shared host and reduce memory usage • Websites, Hosting and Friends

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--

Thanks for the meta, but any ideas about the content? :stuck_out_tongue:

Seb

--

One thing you might want to do is change the default max file descriptors
(especially if you're running multiple servers on one machine!) to something
higher. You could do ulimit -n unlimited for that.
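
Before changing anything, it can help to see which limits actually apply. The following is a minimal sketch in Python (not from the thread), assuming a Unix-like host. It uses the standard `resource` module to read the soft and hard open-file limits of the current process; the soft value is what `ulimit -n` prints in a shell. The helper names are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def format_limit(value):
    """Render a limit the way ulimit does, with 'unlimited' for no cap."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"open files: soft={format_limit(soft)} hard={format_limit(hard)}")
```

Running it as the user that starts Elasticsearch shows the limits that user's processes would inherit.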

--

Well, I am not sure ES really needs 64k open files; 15k seems enough.
This is mostly intended for small amounts of data. What are the pitfalls of
reducing it too much? Will it spend all its time opening and closing files?

Perhaps I could suggest using roughly 10 times the value of
_all/total/docs/count after all base data has been indexed. Ideas?

Thanks for your time

Seb

--

I wouldn't imagine that ES alone would take up 64k files, not even close to
it. My implementations seem to use 700-1500. However, your post is about
running ES alongside Apache and a database. Apache alone might not take up
that many file descriptors, but the database could (depending on the
database). The ulimit -n unlimited command will set the system-wide max file
descriptors for all services, not just ES. I'm not sure whether the init
file MAX_OPEN_FILES setting you're referring to sets system-wide file
descriptors or not, but if it doesn't, you need to set it with ulimit or
some other method.
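
To check a figure like the 700-1500 above on your own machine, one option on Linux is to count the entries in /proc/<pid>/fd. This is a rough sketch in Python, assuming a Linux host with /proc mounted and permission to read the target process. `open_fd_count` is a hypothetical helper, and the Elasticsearch process id has to be looked up separately.

```python
import os


def open_fd_count(pid):
    """Count open file descriptors of a process via /proc (Linux only).

    Returns None when /proc/<pid>/fd is not available on this system.
    """
    fd_dir = f"/proc/{pid}/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    # Demonstrated on this Python process; pass the Elasticsearch pid instead.
    print(open_fd_count(os.getpid()))
```

Sampling this number while indexing and searching gives a measured baseline to compare against whatever limit is configured.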

--

A better way to handle nofile limits is per-user, of course. So you can have
an es_admin user that has a large limit, while other users, including root,
run with whatever the system configured for them. See any Linux post on
/etc/security/limits.conf
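
As an illustration of the per-user approach, entries in /etc/security/limits.conf have the form `<domain> <type> <item> <value>`. The sketch below (Python, not from the thread) parses text in that format and looks up the nofile limits for one user. The sample values are invented, es_admin is the user name from the post above, and real PAM behaviour (groups, wildcards, files under limits.d) is deliberately ignored.

```python
# Invented sample in limits.conf format; the numbers are placeholders.
SAMPLE = """\
# <domain>  <type>  <item>   <value>
es_admin    soft    nofile   32000
es_admin    hard    nofile   64000
"""


def nofile_limits_for(user, text):
    """Return {'soft': n, 'hard': n} for `user` from limits.conf-style text."""
    limits = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            continue
        domain, kind, item, value = fields
        if domain != user or item != "nofile":
            continue
        # A type of "-" sets both the soft and the hard limit.
        kinds = ("soft", "hard") if kind == "-" else (kind,)
        for k in kinds:
            limits[k] = int(value) if value.isdigit() else value
    return limits


if __name__ == "__main__":
    print(nofile_limits_for("es_admin", SAMPLE))
```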

As to behavior, even today, too many programs crash when they run out of
file descriptors; it's not a performance hit exactly ... :wink:

--

Hello Seb,

Nice post :slight_smile:

I wouldn't bother disabling replicas. If you have only one node, it won't
replicate anything. Only when you start adding nodes will it allocate
replicas.

--

@Billy, OK, I got your point concerning file descriptors. I updated the
post, thanks.
@Radu, this is the recommended way, as you can see in the comments of
elasticsearch.yml

Thanks

--

On Thu, Sep 13, 2012 at 12:53 AM, Radu Gheorghe radu0gheorghe@gmail.com wrote:

Hello Seb,

Nice post :slight_smile:

I wouldn't bother disabling replicas. If you have only one node, it won't
replicate anything. Only when you start adding nodes will it allocate
replicas.

While you are correct that the replica shards will not be allocated until
you start adding more nodes, the index will be in a constant yellow state
since the replica shards have not been allocated.
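
The yellow state follows from a simple rule: a replica is never placed on the same node as its primary copy, so a shard can have at most one copy per node. Here is a simplified model of that rule in Python. It is an illustration, not Elasticsearch code; it assumes all primaries are assigned and ignores every other allocation setting.

```python
def index_health(nodes, shards, replicas):
    """Return (status, unassigned_replicas) for one index in a toy model."""
    if nodes < 1:
        raise ValueError("the model needs at least one node")
    # One copy of a shard per node: 1 primary + at most (nodes - 1) replicas.
    assigned_per_shard = min(replicas, nodes - 1)
    unassigned = shards * (replicas - assigned_per_shard)
    status = "yellow" if unassigned else "green"
    return status, unassigned


if __name__ == "__main__":
    print(index_health(nodes=1, shards=5, replicas=1))
    print(index_health(nodes=1, shards=5, replicas=0))
    print(index_health(nodes=2, shards=5, replicas=1))
```

In this model a single node with any replicas configured stays yellow, and setting replicas to 0 or adding a second node turns it green.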

The replica setting can be changed at any time; the shard count cannot.
If you envision running two nodes in the future (more for high
availability than performance), I would set the shard count to 2 (or
higher) from the start. The number of shards does affect the number of
file descriptors used. If you add more nodes, then you can change the
replica count.
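
The difference between the two settings shows up in the REST API: the shard count is given when the index is created, while the replica count can be sent to the index's _settings endpoint later. This sketch in Python only builds the requests without sending them. The host localhost:9200 and the index name myindex are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed local node


def _put(path, body):
    """Build (but do not send) a JSON PUT request."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_index_request(index, shards, replicas):
    """Create an index; the shard count is fixed from this point on."""
    settings = {"number_of_shards": shards, "number_of_replicas": replicas}
    return _put(f"/{index}", {"settings": settings})


def set_replicas_request(index, replicas):
    """Change the replica count of an existing index."""
    body = {"index": {"number_of_replicas": replicas}}
    return _put(f"/{index}/_settings", body)


if __name__ == "__main__":
    req = set_replicas_request("myindex", 1)
    print(req.get_method(), req.full_url, req.data.decode())
    # To actually send it: urllib.request.urlopen(req)
```

Starting with 2 shards and 0 replicas on one node, then raising the replica count once a second node joins, matches the approach described above.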

Cheers,

Ivan

--