[ANN] Elasticsearch reindex plugin

Hi,

this is a plugin that wraps some 'reindex' functionality and executes it
on the server side. This could be useful

  • if you want to change index settings that cannot be updated in place
    (such as the shard count => reindex into a new index)
  • or if you want to change some type settings (reindex into the same
    index)
  • or if you want to copy/update only specific data into another index =>
    for this you can specify a query (the default is match_all)

https://github.com/karussell/elasticsearch-reindex
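
To illustrate the copy case: a reindex of this kind boils down to reading
the _source of every hit that matches the query and writing it to the target
index with a bulk request. The following is only a minimal sketch of that
copy step, written in Python for brevity; it is not the plugin's actual code.
The names default_query and hits_to_bulk are made up for this example, and
the _type field reflects the Elasticsearch versions of the time.

```python
import json


def default_query():
    """Query used when the caller supplies none (hypothetical helper)."""
    return {"query": {"match_all": {}}}


def hits_to_bulk(hits, target_index, target_type=None):
    """Turn search hits into a newline-delimited _bulk request body.

    Hypothetical helper: each hit becomes an "index" action line followed
    by its _source document.
    """
    lines = []
    for hit in hits:
        if "_source" not in hit:
            # Without the stored _source there is nothing to copy.
            raise ValueError("hit %s has no _source" % hit.get("_id"))
        action = {"index": {
            "_index": target_index,
            "_type": target_type or hit["_type"],
            "_id": hit["_id"],
        }}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects a newline after the last line as well.
    return "\n".join(lines) + "\n" if lines else ""
```

The resulting string would be sent as the body of a POST to /_bulk on the
target cluster. Reusing the original _id keeps a repeated run from creating
duplicate documents.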

Let me know if you have problems or suggestions!

Regards,
Peter.

--

Hi Peter,

Cool plugin!
I think it's also related to this issue:
https://github.com/elasticsearch/elasticsearch/issues/1077

I suppose that it only works within the same cluster and that _source must
not be disabled, right?

Cheers
David.

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--

Hey David,

yes, _source must not be disabled, and it only works within the same cluster.
But since the code can also be used in a plain Java application (as I was
doing before) or in a river (as you propose in the issue), one can then
reindex into a different cluster too.

Regards,
Peter.

--

Oh yes! I will fork it ;-)

My only concern with the river is that nodes could be incompatible from one
cluster to another.
That's one of the reasons I did not dig into it before.
But now there are some pure REST interfaces, and I can probably use Jest [1],
for example, to fetch content from another cluster (I did not check whether
the scan & scroll API is available in Jest).

Also, perhaps it makes no sense to treat it as a river rather than as an
administrative tool (as you said: in a pure Java application).

Regards

[1] https://github.com/searchbox-io/Jest

--

> I will fork it ;-)

please :) !

> My only concern with the river is that nodes could be incompatible from one
> cluster to another.

Hmm, indeed a valid concern. But how would you add Jest to the instance
that hosts the plugin?

Jest uses elasticsearch under the hood (why?)! See this discussion:
http://elasticsearch-users.115913.n3.nabble.com/ANN-Jest-ElasticSearch-Java-Rest-Client-td4023119.html

Regards,
Peter.

--

Oh. Thanks, I was not aware of that.

So I assume I have to use my own pure REST implementation (based on the SPORE
specification [1]), but scan & scroll is not written yet.
So I have to wait for... what? For myself? WTF ;-)

This way I won't depend on the elasticsearch jar.

[1] https://github.com/dadoonet/spore-elasticsearch

--

Is there a Java implementation of SPORE?

Also, the GET request(s) for scroll should be very simple to 'hack'
together with a simple JSONObject + Apache HttpClient ...
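
As a rough illustration of how little is needed for that, here is a sketch
of the scan & scroll loop over plain HTTP. It is written in Python with the
standard library only (not the JSONObject + HttpClient combination mentioned
above), and it assumes the scan & scroll REST API as it existed around the
time of this thread; search_type=scan was removed in later Elasticsearch
versions. The function names, the fetch parameter, and the default values
are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def http_fetch(method, url, body=None):
    """Send one HTTP request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_hits(base_url, index, query=None, keep_alive="5m", size=50,
                fetch=http_fetch):
    """Yield every hit matching `query` using scan & scroll (sketch only)."""
    query = query or {"query": {"match_all": {}}}
    # Opening request: in scan mode it returns a scroll id but no hits.
    first = fetch(
        "POST",
        "%s/%s/_search?search_type=scan&scroll=%s&size=%d"
        % (base_url, index, keep_alive, size),
        query,
    )
    scroll_id = first["_scroll_id"]
    while True:
        page = fetch(
            "GET",
            "%s/_search/scroll?scroll=%s&scroll_id=%s"
            % (base_url, keep_alive, quote(scroll_id, safe="")),
        )
        hits = page["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        for hit in hits:
            yield hit
        # Always continue with the most recently returned scroll id.
        scroll_id = page.get("_scroll_id", scroll_id)
```

Passing the HTTP call in as fetch keeps the loop independent of any
particular client library, which also makes it easy to try out without a
running cluster.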

But do you know if it is easy to add those dependencies when writing a
plugin? Or is it some Maven magic where I use the full
"dependencies-jar"?

Regards,
Peter.

--

Yes. Not released yet but it will be: https://github.com/nicoo/jspore
It's the one I use for my JUnit tests.
https://github.com/dadoonet/spore-elasticsearch/blob/master/pom.xml#L9

For the RSS River and the other rivers I wrote, it was quite easy to add
dependencies to the plugin ZIP file.
Was that your question?

See: https://github.com/dadoonet/rssriver/blob/master/pom.xml#L140
and
https://github.com/dadoonet/rssriver/blob/master/src/main/assemblies/esplugin.xml

David.

--

Hi David,

I've implemented the external cluster feature (for simplicity just with
JSONObject and HttpClient; I'm not sure whether that is OK regarding
performance/IO). So if you specify searchHost, this more expensive variant
will be used.

The cool thing is that I can now pull data from production servers onto my
local box :) (when making the port public for a short time). I also
introduced a waitInSeconds parameter to avoid high load. Warning: the call
is not yet async and cannot be stopped (unless you shut down the server) ...
probably I should move to the river stuff ... or I'll leave this task to
the reader ;-)
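
For readers wondering what such a wait amounts to, here is a minimal sketch
of a throttled copy loop. It only assumes that waitInSeconds means "pause
this long between batches"; the plugin's real behaviour may differ, and the
function name copy_batches and its parameters are invented for this example
(Python, for brevity).

```python
import time


def copy_batches(batches, write, wait_in_seconds=0, sleep=time.sleep):
    """Write each batch via `write`, pausing between batches to limit load.

    Hypothetical helper: `batches` is any iterable of lists of documents,
    `write` is a callable that indexes one batch into the target.
    """
    total = 0
    for i, batch in enumerate(batches):
        if i > 0 and wait_in_seconds > 0:
            sleep(wait_in_seconds)  # pause before every batch except the first
        write(batch)
        total += len(batch)
    return total
```

A larger wait lowers the load on the source cluster at the cost of a longer
overall run, so the value is a trade-off rather than something with one
right setting.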

Regards,
Peter.

--

Hi,

@peter We were about to discuss implementing something similar, thanks for
it :) I will start to play with it as well.

@both Jest uses ES for the query builder; I just opened an issue to make
that optional.

The scroll API is not yet available. Please open issues for missing parts; we
can prioritize according to requirements.

Best,
Ferhat
www.searchbox.io

--