Elasticsearch Change Tracking plugin


(Thomas Peuss) #1

Hi!

I have written a small plugin for Elasticsearch that tracks changes to
indices (inspired by https://github.com/elasticsearch/elasticsearch/issues/1242)
and makes that information available through a REST action.

You can find the code here: https://github.com/derryx/elasticsearch-changes-plugin

Comments are welcome.
Thomas


(David Pilato) #2

Heya,

That's a very interesting plugin! Just like the CouchDB _changes API.
Nice work! I will try to play with it in the next few days.

BTW, I just submitted a pull request [1] to add it to the plugins section [2].

Just a few comments about the plugin: the ZIP file is not available in your
repository.
You should add maven-assembly-plugin to your pom.xml to generate it.
I added some comments in your repo about that [3].

HTH
David.

[1] https://github.com/elasticsearch/elasticsearch.github.com/pull/152
[2] http://www.elasticsearch.org/guide/reference/modules/plugins.html
[3] https://github.com/derryx/elasticsearch-changes-plugin/commit/c7895ca976705df9a9a2a37927023d8a0b282ca5#commitcomment-1017313

In reply to Thomas Peuss's message of Monday, 27 February 2012, 21:12.


(Thomas Peuss) #3

Hi David!

On 27 Feb., 23:51, "David Pilato" da...@pilato.fr wrote:

Heya,

That's a very interesting plugin! Just like the CouchDB _changes API.
Nice work! I will try to play with it in the next few days.

BTW, I just submitted a pull request [1] to add it to the plugins section [2].

Just a few comments about the plugin: the ZIP file is not available in your
repository.
You should add maven-assembly-plugin to your pom.xml to generate it.
I added some comments in your repo about that [3].

I will fix that tonight. Thanks for the hint.

CU
Thomas


(Shay Banon) #4

Heya, looks interesting. I had a quick look at the implementation; note that a shard can move to primary mode, and in that case the tracking code won't run. This (an in-memory circular buffer) is not how I would implement _changes, but I can see where it might fit certain scenarios.
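For readers who have not met the term, the following is a minimal sketch of an in-memory circular buffer of change records in plain Java. It is not the plugin's code: the names `ChangeRingBuffer`, `Change`, `record`, and `since` are hypothetical and chosen only for illustration. It shows the main property of the approach, namely that the oldest changes are overwritten once the buffer is full.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a fixed-size ring buffer of change records.
final class ChangeRingBuffer {
    // One tracked change; all fields are illustrative.
    static final class Change {
        final long seq;
        final String type; // e.g. "index" or "delete"
        final String id;

        Change(long seq, String type, String id) {
            this.seq = seq;
            this.type = type;
            this.id = id;
        }
    }

    private final Change[] slots;
    private long nextSeq = 1; // sequence number of the next recorded change

    ChangeRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.slots = new Change[capacity];
    }

    // Records a change, overwriting the oldest entry once the buffer is full.
    synchronized long record(String type, String id) {
        long seq = nextSeq++;
        slots[(int) (seq % slots.length)] = new Change(seq, type, id);
        return seq;
    }

    // Returns the changes with seq > lastSeen that are still buffered, oldest first.
    synchronized List<Change> since(long lastSeen) {
        long oldest = Math.max(1, nextSeq - slots.length);
        List<Change> out = new ArrayList<>();
        for (long s = Math.max(oldest, lastSeen + 1); s < nextSeq; s++) {
            out.add(slots[(int) (s % slots.length)]);
        }
        return out;
    }

    public static void main(String[] args) {
        ChangeRingBuffer buf = new ChangeRingBuffer(3);
        for (int i = 1; i <= 5; i++) {
            buf.record("index", "doc" + i);
        }
        // Only the three most recent changes (seq 3, 4, 5) are still available.
        for (Change c : buf.since(0)) {
            System.out.println(c.seq + " " + c.type + " " + c.id);
        }
    }
}
```

A client that falls more than `capacity` changes behind silently misses the overwritten entries, which is one reason such a buffer fits only certain scenarios.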

In reply to Thomas Peuss's message of Tuesday, February 28, 2012 at 9:23 AM.


(Thomas Peuss) #5

Hi Shay!

On 29 Feb., 14:51, Shay Banon kim...@gmail.com wrote:

Heya, looks interesting. I had a quick look at the implementation; note that a shard can move to primary mode, and in that case the tracking code won't run. This (an in-memory circular buffer) is not how I would implement _changes, but I can see where it might fit certain scenarios.

First of all, I know that this plugin is far from being finished. I
just followed the mantra "release early". ;-)

How would I notice that a shard moves to primary mode? Is there an
event I can catch?

How would you do it? Suggestions are more than welcome!

This is my first deeper contact with the ES codebase, and it is IMHO
quite hard to understand how objects and services should be used (or
are meant to be used). One thing I do not understand is how I should
contact the other nodes in the cluster to collect their changes.

Another feature that is currently missing, because I do not know
how to implement it, is the following:
the connection is kept open and all detected changes are
pushed out to the client until the client closes the connection. I
have found no notion of a "streaming" response.

CU
Thomas


(Shay Banon) #6

Hey,

You can check the shard routing changes to know when a shard has changed its routing state. To see how to operate on other nodes, look at the implementations of the APIs that already do so (for example, the count API). There is no support for pushing changes from the server to the client; _changes would be implemented as a pull option.

Check the mailing list history; I wrote a bit about what a changes API would do and possibly how it would work.

-shay.banon
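To illustrate what a "pull option" could look like from the client side, here is a rough Java sketch. Everything in it is an assumption made for illustration: `ChangeSource`, `Poller`, and `InMemorySource` are hypothetical names, not Elasticsearch APIs, and the in-memory source merely stands in for a server endpoint. The client remembers the last sequence number it has seen and asks for anything newer on each poll.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of pull-style change consumption.
final class PullDemo {
    static final class Change {
        final long seq;
        final String description;

        Change(long seq, String description) {
            this.seq = seq;
            this.description = description;
        }
    }

    // Stand-in for a server-side changes endpoint; not an Elasticsearch API.
    interface ChangeSource {
        // Returns all changes with seq > lastSeen, ordered by seq.
        List<Change> changesSince(long lastSeen);
    }

    // Client side: each poll fetches only what is newer than the last seen change.
    static final class Poller {
        private final ChangeSource source;
        private long lastSeen = 0;

        Poller(ChangeSource source) {
            this.source = source;
        }

        List<Change> poll() {
            List<Change> batch = source.changesSince(lastSeen);
            if (!batch.isEmpty()) {
                lastSeen = batch.get(batch.size() - 1).seq;
            }
            return batch;
        }

        long lastSeen() {
            return lastSeen;
        }
    }

    // Simple in-memory source used only to make the sketch runnable.
    static final class InMemorySource implements ChangeSource {
        private final List<Change> log = new ArrayList<>();

        void add(String description) {
            log.add(new Change(log.size() + 1, description));
        }

        @Override
        public List<Change> changesSince(long lastSeen) {
            List<Change> out = new ArrayList<>();
            for (Change c : log) {
                if (c.seq > lastSeen) {
                    out.add(c);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource();
        Poller poller = new Poller(source);
        source.add("index doc1");
        source.add("index doc2");
        System.out.println(poller.poll().size()); // first poll sees both changes
        source.add("delete doc1");
        System.out.println(poller.poll().size()); // second poll sees only the new one
        System.out.println(poller.poll().size()); // nothing new
    }
}
```

Because the client carries its own position, the server does not need to hold a connection open per consumer; the cost is some latency between a change and the next poll.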

In reply to Thomas Peuss's message of Wednesday, February 29, 2012 at 5:48 PM.


(Thomas Peuss) #7

Hi Shay!

2012/3/1 Shay Banon kimchy@gmail.com:

You can check the shard routing changes to know when a shard changed its
routing state. Check the different API implementations that go and operate

Is there somewhere I can register a handler for such changes? Or
do you mean I should hook into all shards and choose the right one
when the request comes in?

on other nodes (for example, the count API). There is no support for pushing
changes from the server to the client, _changes would be implemented as a
pull option.

Sure. What I mean is: the client connects and the server keeps the
connection open to push out new data as it becomes available. So it is
like a never-ending HTTP request.

Check the mailing list history, I wrote a bit on what and possibly how a
changes API would work.

I will do that!

CU
Thomas


(Thomas Peuss) #8

Hi Shay!

On Thursday, 1 March 2012 13:27:10 UTC+1, kimchy wrote:

Check the mailing list history, I wrote a bit on what and possibly how
a changes API would work.

You suggest using the translog, right? That is of course a good source of
information for the changes plugin. I will keep that in mind.

CU
Thomas


(Shay Banon) #9

Yes, the main idea is to use the translog to keep track of changes. It's tricky though, since you somehow need to keep the translog around long enough for changes to be pulled, and you need to handle the case of getting "all" changes, which requires going (in order of operation) over the current index state first and then starting to pull from the translog. It's a big feature in terms of work :)
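The two-phase idea for getting "all" changes can be sketched roughly as follows. This is only an illustration under stated assumptions: `FullChangeFeed`, `Op`, and `allChanges` are hypothetical names, the index state is modelled as a map from document id to the sequence number of the operation that last wrote it, and the translog is modelled as a plain list. None of this reflects the actual Elasticsearch translog classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: serve "all" changes from index state plus translog.
final class FullChangeFeed {
    static final class Op {
        final long seq;
        final String type;
        final String id;

        Op(long seq, String type, String id) {
            this.seq = seq;
            this.type = type;
            this.id = id;
        }

        @Override
        public String toString() {
            return seq + ":" + type + ":" + id;
        }
    }

    // indexState: document id -> seq of the operation that last wrote it.
    // stateSeq:   highest seq already reflected in indexState.
    // translog:   retained operations in seq order (older ones may be gone).
    static List<Op> allChanges(Map<String, Long> indexState, long stateSeq, List<Op> translog) {
        List<Op> out = new ArrayList<>();
        // Phase 1: report every document in the current index state.
        for (Map.Entry<String, Long> doc : indexState.entrySet()) {
            out.add(new Op(doc.getValue(), "index", doc.getKey()));
        }
        // Phase 2: continue with translog operations newer than that state.
        for (Op op : translog) {
            if (op.seq > stateSeq) {
                out.add(op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> state = new LinkedHashMap<>();
        state.put("doc1", 1L);
        state.put("doc2", 2L);

        // The translog no longer holds operation 1, so it alone is not enough.
        List<Op> translog = new ArrayList<>();
        translog.add(new Op(2, "index", "doc2"));
        translog.add(new Op(3, "delete", "doc1"));
        translog.add(new Op(4, "index", "doc3"));

        System.out.println(allChanges(state, 2, translog));
        // prints [1:index:doc1, 2:index:doc2, 3:delete:doc1, 4:index:doc3]
    }
}
```

The sketch also hints at the retention problem mentioned above: if the translog had already dropped operations newer than `stateSeq`, phase 2 would have a gap that nothing here could detect or fill.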

In reply to Thomas Peuss's message of Friday, March 2, 2012 at 2:11 PM.


(Thomas Peuss) #10

Hi Shay!

Is there any chance to get notified when a shard changes routing from
replica to primary and vice versa?

CU
Thomas


(Shay Banon) #11

Yes, in 0.19 it's part of the events you already register for (the shard routing changed event).
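How a plugin might react to such an event can be sketched as below. The callback shape is an assumption made up for this example: `RoutingListener` and its `routingChanged` signature are hypothetical and deliberately do not reproduce the Elasticsearch listener interface, and the sketch assumes that changes should be tracked on primary shards only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: keep change tracking in step with shard routing changes.
final class RoutingDemo {
    // Illustrative callback shape only; not the Elasticsearch listener interface.
    interface RoutingListener {
        void routingChanged(int shardId, boolean wasPrimary, boolean isPrimary);
    }

    // Assumption: changes are tracked on primary shards only.
    static final class PrimaryTracker implements RoutingListener {
        private final Set<Integer> tracked = new HashSet<>();

        @Override
        public synchronized void routingChanged(int shardId, boolean wasPrimary, boolean isPrimary) {
            if (isPrimary && !wasPrimary) {
                tracked.add(shardId);    // replica promoted to primary: start tracking
            } else if (wasPrimary && !isPrimary) {
                tracked.remove(shardId); // primary demoted to replica: stop tracking
            }
        }

        synchronized boolean isTracking(int shardId) {
            return tracked.contains(shardId);
        }
    }

    public static void main(String[] args) {
        PrimaryTracker tracker = new PrimaryTracker();
        tracker.routingChanged(0, false, true);
        System.out.println(tracker.isTracking(0)); // true
        tracker.routingChanged(0, true, false);
        System.out.println(tracker.isTracking(0)); // false
    }
}
```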

In reply to Thomas Peuss's message of Monday, March 5, 2012 at 3:34 PM.


(Thomas Peuss) #12

Hi Shay!

2012/3/5 Shay Banon kimchy@gmail.com:

Yes, in 0.19 it's part of the events you already register for (the shard routing
changed event).

Perfect.

Thank you
Thomas

