2.0 in-memory/many-update indices best-practices?


(Yehosef) #1

I see that memory is no longer an option for index.store.type. I'm not sure what the considerations were, but I have a specific problem that might benefit from putting the index into memory, and I'm looking for advice.

We have an aggregation flow where we take raw data, aggregate and enrich it, and then save it again. This can happen dozens or hundreds of times for a particular document. The problem is that this normally thrashes IO when the segments are merged. I was hoping to use memory indices so that I wouldn't have to think about the disk until the main processing is finished; then the data would be copied to a regular disk-based index.

My usage pattern can tolerate data loss (I just re-run the aggregations).

Would a RAM disk help here? Are people using a RAM disk for ES for this or similar problems?

Are there changes in my configuration when storing on a disk vs. a RAM disk? E.g., would I still use doc values?

Any other tips on improving performance with many updates?

thanks,


(Yehosef) #2

Also - we are using SSDs - we can see the IOPS go up to 20-30k/s during heavy activity.


(Mark Walkom) #3

Maybe. But the existence of them is why we removed the memory store type; right tool for the job and all :smile:

No, it's the same.

Bit hard to say really. If you are on SSDs you are already pretty well set up. But if that isn't up to scratch then perhaps you can look at your code/queries. Hardware can only solve so much after all.


(Yehosef) #4

Hi Mark - thanks for the reply.

The problem is that it's not really the same thing. There is extra overhead in simulating a disk that I assume a memory index would avoid. If there is something in Lucene that supports this - I would think that's the right tool for the job.

In general, doc values are a little slower but are great because they save heap. But if I'm already talking about memory, why would I want to pay this extra time? Maybe they are only slower because they talk to the disk, and if that is in memory, it'll be the same.

RAM is still several times faster than SSD - from 4-10 times.

But the bigger problem is that the regular Lucene flow that makes it performant and durable involves the in-memory buffer, a refresh process, the translog and a flush/fsync. Those extra steps would not be necessary here because I know it's not going to persist. It seems unavoidable that I'll be paying a significant overhead compared to a flow that's optimized for memory without pretending there is a disk in the middle.
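To make the steps I'm describing concrete, here is a toy Python model of that write path. It is only a sketch under simplifying assumptions: `ToyShard` and its methods are hypothetical names, nothing touches a real disk, and the per-request translog sync is left out.

```python
class ToyShard:
    """Toy model of the write path: buffer -> refresh -> flush (not real Lucene/ES code)."""

    def __init__(self):
        self.buffer = []     # in-memory indexing buffer, not yet searchable
        self.translog = []   # operations kept for crash recovery
        self.segments = []   # searchable segments (not necessarily durable)
        self.committed = 0   # number of segments made durable by a flush
        self.fsyncs = 0      # how many times we "paid" for durability

    def index(self, doc):
        # Every write goes to both the buffer and the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # A refresh turns the buffer into a new searchable segment.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # A flush commits all segments (the fsync) and empties the translog.
        self.refresh()
        self.committed = len(self.segments)
        self.fsyncs += 1
        self.translog.clear()

    def searchable(self):
        return [doc for segment in self.segments for doc in segment]


shard = ToyShard()
for i in range(3):
    shard.index({"id": i})

before_refresh = len(shard.searchable())   # documents visible before a refresh
shard.refresh()
after_refresh = len(shard.searchable())    # documents visible after a refresh
translog_before_flush = len(shard.translog)
shard.flush()
translog_after_flush = len(shard.translog)
```

In this model the translog and the flush exist only for durability, which is the part I would be happy to skip.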

If it's technically difficult or buggy - or Lucene doesn't support it any more, I hear that. But as far as thinking that a RAM disk is the same as an in-memory index, I wouldn't think that's actually true.

Apache Spark is built around this idea and MongoDB 3.2 will have a memory store for this very reason - where speed is far more important than durability. Why don't you just put HDFS or MongoDB on a RAM disk? I assume because it's not the same.


(Adrien Grand) #5

There is something in Lucene called RAMDirectory, but it is really inefficient, especially in terms of memory management. It has been implemented for testing purposes. If we were to support in-memory indices, I think we should rethink all the data-structures that we are using: things that are efficient in memory and on a disk are very different.


(Adrien Grand) #6

One note about your use case: Elasticsearch is quite bad at handling heavy updates. However, if in your case documents are mostly updated soon after they have been created, then things should not be too bad, as deletions will mostly be performed in tiny segments, which are cheap to merge. Updates are more costly when they target documents that have made it to the larger segments.
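A toy Python model may help illustrate why the segment size matters. This is only a sketch, not Lucene's actual data structures: `Segment` and `ToyIndex` are hypothetical names, and the "cost" of a merge is simply assumed to be the number of live documents that have to be rewritten.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy immutable segment: doc id -> version, plus a set of deleted ids."""
    docs: dict
    deleted: set = field(default_factory=set)

    def live_docs(self):
        return {k: v for k, v in self.docs.items() if k not in self.deleted}


class ToyIndex:
    def __init__(self):
        self.segments = []

    def flush(self, docs):
        """Write a batch of new documents as one new segment."""
        self.segments.append(Segment(dict(docs)))

    def update(self, doc_id, version):
        """An update marks the old copy as deleted and writes a new copy."""
        for segment in self.segments:
            if doc_id in segment.docs and doc_id not in segment.deleted:
                segment.deleted.add(doc_id)
        self.flush({doc_id: version})

    def merge(self, positions):
        """Merge the chosen segments; return how many live docs were rewritten."""
        merged = {}
        for position in positions:
            merged.update(self.segments[position].live_docs())
        self.segments = [s for i, s in enumerate(self.segments) if i not in positions]
        self.segments.append(Segment(merged))
        return len(merged)


index = ToyIndex()
index.flush({f"old-{i}": 1 for i in range(1000)})  # segment 0: large, old documents
index.flush({"new-0": 1, "new-1": 1})              # segment 1: tiny, fresh documents

index.update("new-0", 2)        # the delete lands in the tiny segment
cheap = index.merge([1, 2])     # reclaiming it rewrites only a couple of docs

index.update("old-0", 2)        # the delete lands in the large segment
expensive = index.merge([0, 2]) # reclaiming it rewrites the whole large segment
```

In this model the delete in the fresh segment is reclaimed by rewriting two documents, while the delete in the old segment forces a rewrite of a thousand.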


(Yehosef) #7

Exactly - I would think that this would have some interesting use cases and would love to see it developed further.


(Yehosef) #8

Aren't most append-only databases (HDFS/Cassandra) bad at heavy updates? But do you think this would still be true using a RAM disk, or preferably a proper in-memory index? E.g., how much is just because of the merging process and how much is because of something more specific in the Lucene index?

Is there anything I can do to make this easier? E.g., maybe I could write to a "special" index (on a RAM disk or optimized in some way) for write-heavy work and then, once things settle down, migrate to a regular index?


(Jörg Prante) #9

That's already how Lucene works, see NRT feature https://wiki.apache.org/lucene-java/NearRealtimeSearch

Meaning for updates, instead of syncing what's in RAM to disk, we perform merges and keep deletes in RAM until a maximum byte size is reached or commit is called by the user.

The main challenge with heavy updates of parts of a document is in-place updates. That is something Elasticsearch is not capable of, because of the nature of inverted indices, in contrast to MongoDB: http://blog.mongodb.org/post/248614779/fast-updates-with-mongodb-update-in-place. It means Elasticsearch must read the whole document, update it, and reindex it to make it searchable again. This operation is very costly compared to a direct index where a value can be replaced in memory.
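As a rough illustration of the read, update, reindex cycle, here is a minimal Python sketch. It is a toy inverted index, not how Elasticsearch or Lucene is implemented; `ToyInvertedIndex`, `partial_update` and the sample document are hypothetical.

```python
from collections import defaultdict


class ToyInvertedIndex:
    """Toy inverted index: a partial update has to reindex the whole document."""

    def __init__(self):
        self.source = {}                  # doc id -> stored document
        self.postings = defaultdict(set)  # (field, token) -> doc ids

    @staticmethod
    def _terms(doc):
        for field_name, value in doc.items():
            for token in str(value).lower().split():
                yield (field_name, token)

    def delete(self, doc_id):
        old = self.source.pop(doc_id, None)
        if old is not None:
            for term in self._terms(old):
                self.postings[term].discard(doc_id)

    def index(self, doc_id, doc):
        self.delete(doc_id)               # drop every posting of the old copy
        self.source[doc_id] = dict(doc)
        for term in self._terms(doc):     # re-analyze every field
            self.postings[term].add(doc_id)

    def partial_update(self, doc_id, changes):
        doc = dict(self.source[doc_id])   # 1. read the whole document
        doc.update(changes)               # 2. apply the change
        self.index(doc_id, doc)           # 3. reindex all fields
        return len(doc)                   # number of fields that were reindexed

    def search(self, field_name, token):
        return set(self.postings.get((field_name, token.lower()), set()))


idx = ToyInvertedIndex()
idx.index("doc-1", {"title": "raw event", "host": "web01", "count": 1})
fields_reindexed = idx.partial_update("doc-1", {"count": 2})

# A direct (non-inverted) store can replace the single value in place.
direct_store = {"doc-1": {"title": "raw event", "host": "web01", "count": 1}}
direct_store["doc-1"]["count"] = 2
```

Changing one field in the toy index touches all three fields of the document, while the plain dictionary replaces a single value.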

For aggregations, it is possible to use doc values which are not located in an inverted index but on disk. Doc values have no in-place updates afaik. This is a compromise, but with a RAM disk, this can be moved completely to memory. Moving it to another index which is not on RAM disk might be possible by snapshot/restore but I'm not sure. With RAM-only, there are still tradeoffs, especially when moving and GCing large amounts of heap memory with the JVM.


(Yehosef) #10

[quote="jprante, post:9, topic:35835"]
That's already how Lucene works, see NRT feature https://wiki.apache.org/lucene-java/NearRealtimeSearch
[/quote]

Right, but I'm mixing my update-intensive documents in with my regular insert/update indices. Perhaps we should have a "recent" or "updating" index that is either on a RAM disk or a regular disk-based index with a high refresh interval. Then we can write a script or plugin to migrate the documents to the regular index.

Really, what I think would be great is to have this process but with no sync to disk. Would that be possible to simulate?

But if this process were also happening in memory, wouldn't it also be fast? It wouldn't be as fast, because I'll still have a merge process which you won't have in Mongo, but much faster than having to deal with or simulate a disk.
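Something like the following is what I have in mind for the settings of such an index, sketched in Python as the JSON body of an index creation request. The values are guesses on my part, not recommendations: `index.refresh_interval`, `index.number_of_replicas` and `index.translog.durability` are existing index settings, but the `30s` interval, zero replicas and the helper name `updating_index_settings` are assumptions for illustration.

```python
import json


def updating_index_settings(refresh_interval="30s", async_translog=True):
    """Build a settings body for a write-heavy, loss-tolerant index (illustrative values)."""
    index_settings = {
        "refresh_interval": refresh_interval,  # refresh less often than the default
        "number_of_replicas": 0,               # assumption: data can be rebuilt if lost
    }
    if async_translog:
        # Do not fsync the translog on every request; accepts losing recent writes.
        index_settings["translog"] = {"durability": "async"}
    return {"settings": {"index": index_settings}}


body = json.dumps(updating_index_settings(), indent=2, sort_keys=True)
print(body)
```

The printed body would be sent when creating the "updating" index; the regular index would keep its normal settings.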

Thanks for your insights - they are very helpful.

