Customizing Directory and IndexWriter behavior via a custom ES plug-in

Hi there,

Over the last couple of years we have customized Apache Lucene (through
its public API) to support branching, tagging and comparing in a concurrent
fashion for our server application. We achieved this with a couple of
custom Directory implementations, exactly one IndexDeletionPolicy and one
MergePolicy implementation. We are currently considering replacing Lucene
with Elasticsearch on the server side. Before we dive into the differences
between the two technologies with respect to search and indexing
functionality (for instance, how to port our custom collectors and how to
replace NDVs), we would like to make sure that this is possible at all.

I've just checked out the source and realized that the registration of the
services is done via various module implementations, and that the
configured service implementations are injected into the constructors. To
keep it simple: is there a way, for example, to create an Elasticsearch
module that forces the underlying IndexWriter to use FooCustomDeletionPolicy
instead of the default KeepOnlyLastDeletionPolicy? I assume that if this is
straightforward, we could reuse the custom implementations for the
Directory and the IndexWriter that we currently use with pure Lucene. After
doing some research I found this 1 thread. Am I close to the
answer/solution?

I should note that we would like to achieve this without forking the public
repository.

Thanks in advance for any feedback.

Cheers,
Akos

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0345efea-3134-488d-b13d-199a24642422%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I know the merge policy is configurable, but I don't know if it is
pluggable. I imagine it'd be pretty simple to make what you need pluggable
if it isn't already. You'd have to send a pull request, but I imagine you
wouldn't have to maintain the fork for more than a release.

Regarding the deletion policy, you can set the class name of your deletion
policy implementation in the setting "index.deletionpolicy.type".
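
To give a rough idea of what a class-name setting like this implies, here
is a minimal sketch in plain Java, with no Elasticsearch or Lucene
dependency. Everything in it is an assumption made for illustration: the
DeletionPolicy interface, the two policy classes and the resolve helper are
hypothetical stand-ins, not Lucene's IndexDeletionPolicy and not the actual
Elasticsearch wiring, which goes through its module and injection
machinery.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a pluggable policy type (NOT Lucene's IndexDeletionPolicy).
interface DeletionPolicy {
    String name();
}

// Hypothetical default, used when the setting is absent.
class KeepOnlyLastPolicy implements DeletionPolicy {
    public String name() { return "keep_only_last"; }
}

// Hypothetical custom implementation, selected by its class name.
class FooCustomDeletionPolicy implements DeletionPolicy {
    public String name() { return "foo_custom"; }
}

class DeletionPolicySettingSketch {
    // Setting key taken from the thread; the lookup logic below is an assumption.
    static final String KEY = "index.deletionpolicy.type";

    // Returns the default policy if the key is unset, otherwise loads the
    // named class reflectively and checks that it has the expected type.
    static DeletionPolicy resolve(Map<String, String> settings) {
        String className = settings.get(KEY);
        if (className == null) {
            return new KeepOnlyLastPolicy();
        }
        try {
            Class<?> clazz = Class.forName(className);
            return clazz.asSubclass(DeletionPolicy.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot load " + className + " for " + KEY, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put(KEY, FooCustomDeletionPolicy.class.getName());
        System.out.println(resolve(settings).name()); // prints "foo_custom"
    }
}
```

The practical consequence of such a lookup is that the custom class has to
be on the node's classpath (for example, shipped in a plugin jar) and needs
a constructor the loader can call.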

For a custom Directory, you have to patch
org.elasticsearch.index.store.IndexStoreModule with your custom index
store. The index store is something like an IndexWriter / Lucene Directory
on steroids. At the moment, it is not possible to add custom index stores
from a plugin (see the fixed enumeration of implementations in
IndexStoreModule).

Jörg


I stand corrected: there is also the setting "index.store.type". By setting
it to a Java class name, you can use that class as the index store
implementation from a plugin.

So, no patching/forking required.
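
A setting that accepts either a built-in short name or a Java class name
could be modelled roughly as follows. This is only a sketch in plain Java
under stated assumptions: the IndexStoreSketch interface, the three store
classes, the short names in the map and the lookup helper are all
hypothetical and are not taken from the IndexStoreModule source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an index store (NOT an Elasticsearch type).
interface IndexStoreSketch {
    String describe();
}

class NioFsStoreSketch implements IndexStoreSketch {
    public String describe() { return "built-in niofs"; }
}

class MmapFsStoreSketch implements IndexStoreSketch {
    public String describe() { return "built-in mmapfs"; }
}

// Hypothetical store that a plugin would put on the classpath.
class BranchingStoreSketch implements IndexStoreSketch {
    public String describe() { return "custom branching store"; }
}

class StoreTypeLookupSketch {
    // Fixed enumeration of short names (illustrative subset, assumed).
    private static final Map<String, Class<? extends IndexStoreSketch>> BUILT_IN =
        new LinkedHashMap<>();
    static {
        BUILT_IN.put("niofs", NioFsStoreSketch.class);
        BUILT_IN.put("mmapfs", MmapFsStoreSketch.class);
    }

    // Short names win; any other value is treated as a class name.
    static Class<? extends IndexStoreSketch> lookup(String storeType) {
        Class<? extends IndexStoreSketch> builtIn = BUILT_IN.get(storeType);
        if (builtIn != null) {
            return builtIn;
        }
        try {
            return Class.forName(storeType).asSubclass(IndexStoreSketch.class);
        } catch (ClassNotFoundException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Unknown index.store.type: " + storeType, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("niofs").getSimpleName());
        System.out.println(lookup(BranchingStoreSketch.class.getName()).getSimpleName());
    }
}
```

With a fallback like this, the enumeration of built-in names can stay fixed
while a plugin still contributes its own implementation by class name.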

Jörg


Awesome. Thanks a lot for the help. I'll give it a try.
