The aim of the transaction log in elasticsearch is to make sure
that once you index data into elasticsearch it will be there, without the
need to perform a "commit" on Lucene each time (which would kill
performance). This means that when a recovery happens, either from the gateway
or from another node, the actual index files are recovered first, and then the
transaction log is replayed. When an actual flush (in elasticsearch lingo)
happens, a commit on Lucene is performed and a new transaction log is
started.

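To make the flush/replay cycle described above concrete, here is a minimal in-memory sketch. It is a toy model, not elasticsearch code: the class `ToyShard`, its methods, and the `generation` counter are all hypothetical names made up for illustration, and a plain dict stands in for the committed Lucene index files.

```python
class ToyShard:
    """Toy model of the flush/replay cycle; not Elasticsearch code."""

    def __init__(self):
        self.committed = {}   # stands in for the committed Lucene index files
        self.translog = []    # operations recorded since the last flush
        self.generation = 0   # hypothetical counter, bumped for each new log

    def index(self, doc_id, source):
        # Record the operation in the log; no commit is needed per operation.
        self.translog.append(("index", doc_id, source))

    def delete(self, doc_id):
        self.translog.append(("delete", doc_id, None))

    def recover(self):
        # Recovery: start from the committed files, then replay the log.
        state = dict(self.committed)
        for op, doc_id, source in self.translog:
            if op == "index":
                state[doc_id] = source
            else:
                state.pop(doc_id, None)
        return state

    def flush(self):
        # "Commit": fold the logged operations into the committed state,
        # then start a new, empty transaction log.
        self.committed = self.recover()
        self.translog = []
        self.generation += 1
```

In this model, calling `recover()` before a flush returns the indexed documents even though nothing has been committed, and after `flush()` the log is empty again. That short lifetime is what keeps the log simple.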
I think that what you are after is a different kind of transaction log,
with different characteristics from the current elasticsearch
transaction log. The differences are big enough, I think, that it is not worth
trying to piggyback what you are after on the current ES transaction log. The
current ES transaction log is very simple to implement because of the narrowly
defined task it was created for.

A feature such as the one you are after does make a lot of sense. I believe
that it can be implemented in ES, but it would require a whole new "module",
or it could be implemented on top of ES.

On Thu, Jun 10, 2010 at 11:02 PM, Otis firstname.lastname@example.org wrote:
I'm curious about the Gateway and its transaction log. More
precisely, I am wondering whether one can keep this transaction log
permanently and whether one can configure ES to keep only certain
types of transactions in the log (e.g. keep only doc modifications,
but not additions)?
The use case is a system that indexes some data (say static files in
the file system), then allows one to modify indexed documents, but
doesn't propagate those changes to the original data (say those static
files). Such a system is problematic if one has to reindex the
original data. If that has to be done, all changes (which were
applied only directly to the documents in the index) would be gone.
XA log role:
- If one can keep that transaction log forever, then one could replay
all document "edits" and get the previous state of the index.
- If one can store only updates (or maybe updates + deletions) in the
transaction log, then only those could be re-applied and document
additions can remain in an external indexing application (say an app
that indexes files from a file system).
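The two points above could be sketched roughly as follows, as an edit log kept outside of ES by the application. This is only an illustration under stated assumptions: `record_edit` and `replay_edits` are hypothetical helpers, the log is a JSON-lines file, and a plain dict stands in for the index that the external application has just rebuilt. None of this is an ES API.

```python
import json


def record_edit(log_path, op, doc_id, source=None):
    """Append an update or delete to a permanent, append-only log file."""
    if op not in ("update", "delete"):
        # Additions are not logged; they stay with the external indexer.
        return False
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"op": op, "id": doc_id, "source": source}) + "\n")
    return True


def replay_edits(log_path, index):
    """Re-apply logged edits, in order, to a freshly rebuilt index (a dict)."""
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["op"] == "update":
                index[entry["id"]] = entry["source"]
            else:
                index.pop(entry["id"], None)
    return index
```

After the original data has been reindexed, `replay_edits` would be run once over the rebuilt index to bring back the modifications that were only ever applied to the indexed documents.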
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch