How to run elastic from memory?

Hey,

I have a read-only index of about 200 GB. My machine has about 300 GB of
RAM, and I would like to keep the index fully in memory. If I just use
index.store.type = "memory" in my index settings, the data does not
persist after a restart. Is there a way to have the index fully stored in
RAM but persisted to disk (especially for a read-only index)?

I tried reading this thread
https://groups.google.com/d/topic/elasticsearch/X2WTO2i-OMk/discussion
and that one
http://elasticsearch-users.115913.n3.nabble.com/How-to-switch-from-file-storage-to-memory-storage-td1355229.html
Both use the fs gateway, but I didn't manage to get it to work, and it
also seems that the fs gateway is deprecated.

I also found issue #82, "Index FS Store: Allow to cache (in memory)
specific files" (https://github.com/elasticsearch/elasticsearch/issues/82),
which talks about loading index files into memory. It sounds really
useful, but I couldn't work out how to use it from that text, nor did I
find anything else that points to it.

Is it possible to do this? Does it even make sense?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Elasticsearch relies on Lucene at the lowest level, which works best if
you have a large amount of memory available for the filesystem cache. With
that much memory, try giving a small amount to ES, like a couple of gigs
(not sure if you need a lot; do you sort or facet?), and leave the rest to
the FS. Lucene will take advantage of it, and your entire index ends up in
memory just by default.
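
The filesystem cache fills as the index files get read, so a cold index warms up over time. If you want it warm right after a restart, one common trick is to read every file once and discard the data. Here is a minimal sketch of that idea in Python; the function name and the example path are made up for illustration, and it assumes there is enough free RAM for the OS to keep the pages around.

```python
import os


def warm_page_cache(index_dir, chunk_size=8 * 1024 * 1024):
    """Read every file under index_dir once and return the bytes read.

    The data itself is thrown away. The point is only that the OS keeps
    the pages it just read in its filesystem cache while memory allows.
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


# Hypothetical usage (the path is an assumption, use your own path.data):
# warm_page_cache("/var/lib/elasticsearch")
```

Nothing here pins the pages; the OS is still free to evict them later under memory pressure.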

simon

Hi,

What is your current index store type? It is difficult to change the
store type after index creation: you need some hacks (I have never tried
it, so I do not recommend it).

A note about an index "fully in memory": the scenario differs for reads
and writes. For reads, the data is already cached at several layers, but
for writes, the data should mostly go straight to disk for persistence.

I think you just want read caching, since you ask that the index be
persisted to disk. Index store type "memory" means your writes do not go
to disk and are not persistent, which is useful for testing or for
"RAM only" systems which never power off. Note that this store is a
Java-based memory implementation which deals with JVM heap / NIO buffers
only, and it is not necessarily well aligned with the underlying OS
virtual memory subsystem (swap, page allocation, memory segment
organization, garbage collection). So it is better to read index store
type "memory" as index store type "jvm".

What are the layers for read caching of the ES index on your system? By
default, ES selects "niofs" or "mmapfs" as the index store type (I am
ignoring 32-bit OSes here). This is perfect for having all the available
memory used for index reads. The index files must compete with the field
cache and result sets, but that is not a problem; it's a good thing to
have a large ES field cache. In your ES settings, you should leave the OS
enough memory so it can load your index files completely into the
filesystem cache and organize performant writes (delayed, async disk
writes). This is why there is a rule of thumb of 50% of RAM for the ES
heap and 50% for the OS cache.
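
To put rough numbers on this for the machine in the question, here is a back-of-the-envelope sketch in Python. It assumes the ES heap and the filesystem cache are the only big memory consumers, which is a simplification; the function name and the 4 GB heap are made up for illustration.

```python
def page_cache_headroom_gb(total_ram_gb, heap_gb, index_gb):
    """RAM left for the OS after the ES heap, minus the index size.

    A positive result means the whole index can sit in the filesystem
    cache with that much to spare; a negative one means it cannot.
    """
    return (total_ram_gb - heap_gb) - index_gb


# Figures from the question: 300 GB RAM, 200 GB index.
# The 4 GB heap is an assumed "couple of gigs", not a recommendation.
print(page_cache_headroom_gb(300, 4, 200))    # 96

# The 50/50 rule of thumb would mean a 150 GB heap on this machine.
print(page_cache_headroom_gb(300, 150, 200))  # -50
```

So on this particular box a strict 50/50 split would leave too little room to cache the whole 200 GB index, while a small heap leaves plenty.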

On 64-bit Linux/Solaris, you can lock the ES process into memory with
bootstrap.mlockall: true. This helps in low-memory situations where the
OS would otherwise react by paging/swapping the JVM. The JVM consists of
several different memory areas, some of which may get moved to disk if
system resources are tight. mlockall ensures that the JVM's own memory
stays in RAM, so ES operations on an index are not slowed down by
swapping. mlockall() does not ensure that your index files always stay in
RAM; that is still managed by the OS virtual memory subsystem, and under
memory pressure the OS may reclaim filesystem cache memory (e.g. during
nightly cleanup runs, rsync, etc.).
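
One practical catch: locking memory only works if the user's "max locked memory" limit (RLIMIT_MEMLOCK, what ulimit -l shows) is high enough. The following is a small Python sketch for checking that limit on a Unix system before relying on the setting. The helper name and the 8 GB heap figure are assumptions for illustration, and it only looks at the soft limit.

```python
import resource


def memlock_allows(bytes_needed, soft_limit):
    """Return True if a soft RLIMIT_MEMLOCK value permits locking
    bytes_needed bytes (the limit is expressed in bytes)."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= bytes_needed


if __name__ == "__main__":
    # Limit of the current process; run this as the user that starts ES.
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    heap_bytes = 8 * 1024 ** 3  # hypothetical 8 GB heap
    print("memlock limit is sufficient:", memlock_allows(heap_bytes, soft))
```

If this reports False, the lock request would most likely fail, so the limit has to be raised for that user first.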

At the OS layer, there is a simple method to force your index into RAM:
just create a RAM filesystem and point the ES path.data to it. In modern
OSes, RAM filesystem data is not copied a second time into the filesystem
cache. You should select mount options that relax journaling (ext4:
data=writeback, which journals metadata only), avoid synchronous writes
(ext4: async), and skip file access time updates (ext4: noatime). Note
that for disk persistence you must copy the index to another partition
with a disk filesystem.
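
For a read-only index, the "persist it yourself" part can be as simple as keeping the master copy on disk and copying it into the RAM filesystem before ES starts. Here is a minimal Python sketch of that step; both paths in the usage comment are hypothetical, and it needs Python 3.8+ for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def load_index_into_ram(disk_dir, ram_dir):
    """Copy a read-only index from disk into a RAM-backed directory.

    Returns the number of files now present under ram_dir. The copy on
    disk stays the persistent master; the RAM copy is lost at power-off.
    """
    shutil.copytree(disk_dir, ram_dir, dirs_exist_ok=True)
    return sum(1 for p in Path(ram_dir).rglob("*") if p.is_file())


# Hypothetical usage, run before starting ES with path.data on the RAM side:
# load_index_into_ram("/data/es-master", "/mnt/ramdisk/es")
```

If the index ever stops being read-only, you would also need the reverse copy (RAM back to disk) on a schedule, which is the tedious part.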

However, I doubt you will get faster reads with a RAM filesystem than
with mmapfs/mlockall, since the mmapfs caching is optimal for reads. So in
my eyes it is questionable why a RAM filesystem should be used: it
disables the gateway persistence, and you must persist your index
yourself, which I find tedious.
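
To illustrate why memory-mapped reads get on so well with the filesystem cache, here is a tiny Python sketch that reads part of a file through a read-only memory map. It is only an analogy for what an mmap-based store does, not ES or Lucene code, and the function name is made up.

```python
import mmap


def read_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` using a read-only memory map.

    The file is mapped into the process address space, and the slice is
    served from filesystem cache pages once they have been loaded.
    Note: mapping an empty file raises ValueError.
    """
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]
```

Once the pages are in the cache, repeated reads of the same region do not touch the disk at all, which is the whole point of giving the OS most of the RAM.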

Jörg

Hi,

How can you change the store type from niofs to mmapfs without
reindexing? I assumed they are the same on disk and that a restart with
the new setting would be enough.

Mo

There are three file-based index stores which are close relatives:
simplefs, niofs, mmapfs.

To sum up:

  • mmapfs: for 64-bit JVMs
  • niofs: for 32/64-bit JVMs on OSes that support NIO but not mmap()
  • simplefs: for all other JVMs that support neither NIO nor mmap(), or
    if mmapfs/niofs is known to be buggy

ES selects the default store at startup time by a best-effort approach
by looking into OS and JVM info.
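
Read as a decision rule, the summary above looks roughly like the following Python sketch. This is only my paraphrase of the three bullets, not the actual ES selection code; the function and parameter names are invented, and the "known to be buggy" case is left out.

```python
def pick_file_store(is_64bit_jvm, os_has_mmap, jvm_has_nio):
    """Pick a file-based store type following the summary above."""
    if is_64bit_jvm and os_has_mmap:
        return "mmapfs"
    if jvm_has_nio:
        return "niofs"
    return "simplefs"


print(pick_file_store(True, True, True))     # mmapfs
print(pick_file_store(False, False, True))   # niofs
print(pick_file_store(False, False, False))  # simplefs
```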

The file-based stores should be compatible in the sense that on a 64-bit
JVM, you can shut down the cluster, change the index store type, and
bring the cluster back online.
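
The "change the index store type" step could look like the following Python sketch, which rewrites the index.store.type line in a config text while the node is down. It assumes the setting lives as a flat "key: value" line in the node's config file (e.g. elasticsearch.yml); if your store type is set per index instead, this does not apply. The helper name is made up.

```python
def set_store_type(config_text, store_type):
    """Return config_text with index.store.type set to store_type.

    An existing uncommented line for the key is replaced; if there is
    none, the setting is appended. Commented-out lines are left alone.
    """
    key = "index.store.type"
    out, found = [], False
    for line in config_text.splitlines():
        if line.split(":", 1)[0].strip() == key:
            out.append(f"{key}: {store_type}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}: {store_type}")
    return "\n".join(out) + "\n"


print(set_store_type("cluster.name: test\nindex.store.type: niofs\n", "mmapfs"))
```

Keep a backup of the original file, and only do this with the cluster shut down, as described above.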

I thought you were talking about switching from file-based to
memory-based stores.

Jörg

Hey Jorg

That was an excellent and insightful reply, thank you.
I will probably give both mmapfs/mlockall and a RAM fs a chance, although
your reasoning makes total sense (except maybe for pre-caching).

Shlomi

Oh, that might have been the original question. I am just curious about
switching from niofs to mmapfs and the steps for doing that :)
Thanks for your response.
