Knapsack plugin problem


(ganeshbabu) #1

Hi @jprante

I am trying to copy docs from one index to another using the knapsack plugin. I have installed the plugin in our dev environment, which has one master node, one data node, and one client node. I am running Elasticsearch 1.7.3 with knapsack plugin version 1.7.2.0 on the dev server. I have also installed Shield 1.3.2.

I tried the following steps from the documentation:

[esadmin@dayrhebfmd001 ETE_elasticsearch-1.7.3]$ curl -XPUT --user esadmin:XXXX 10.7.147.21:9201/test/test/1 -d '{"key":"value 1"}'
{"_index":"test","_type":"test","_id":"1","_version":1,"created":true}
[esadmin@dayrhebfmd001 ETE_elasticsearch-1.7.3]$ curl -XPUT --user esadmin:XXXXX 10.7.147.21:9201/test/test/2 -d '{"key":"value 2"}'
{"_index":"test","_type":"test","_id":"2","_version":1,"created":true}

But I get the error below when I use the export command:

[esadmin@dayrhebfmd001 ETE_elasticsearch-1.7.3]$ curl -XPOST --user esadmin:XXXXX 10.7.147.21:9201/test/test/_export
{"error":"AuthorizationException[action [org.xbib.elasticsearch.knapsack.export] is unauthorized for user [esadmin]]","status":403}

Could you please help me to resolve this?

Thanks,
Ganeshbabu R


(Jörg Prante) #2

It seems you are using a transport client authentication module, maybe Shield.

The knapsack plugin does not support Shield authentication, because Shield is not open source.


(ganeshbabu) #3

oh no..

I thought this plugin would be perfect for my scenario. I tried snapshot & restore, but I want to restore an index with a new mapping, and that did not happen: the new mappings are not reflected in the restored index. So I thought the knapsack plugin would be a good fit.

Will the knapsack plugin support Shield authentication in the future?

Could you suggest a way to restore an index with new mappings?

Thanks,
Ganeshbabu R


(David Pilato) #4

Maybe this could help: http://david.pilato.fr/blog/2015/05/20/reindex-elasticsearch-with-logstash/


(ganeshbabu) #5

Thanks @dadoonet

I read the documentation. I will definitely try this and let you know how it goes.


(ganeshbabu) #6

Hi @dadoonet

I tried Logstash for copying docs from the dev cluster to the QA cluster, but it takes very long to complete. In the Logstash config file I set scroll to "5m" and size to "500". In the dev cluster we have an index, es_item, of nearly 850 GB, and I am using Logstash to copy its docs from the dev cluster to the QA cluster.

Should I change any of the settings to improve performance?

Please let us know your suggestions.

Thanks
Ganeshbabu R


(ganeshbabu) #7

Hi @jprante

I disabled Shield in my dev environment to use the knapsack plugin, and I tried the _push command to copy docs from the dev cluster to the QA cluster. Below is the command I used:

curl -XPOST '10.7.147.21:9200/ogrds_item/_push?&cluster=remote&host=10.7.147.22&port=9201'

After executing the command, the following line was shown:
{"running":false,"state":{"mode":"push","node_name":"dayrhebfmd001_DEV_MASTER"}}

Could you please tell me why it is showing "running:false"?

Similarly, I tried the _push command in the dev environment to copy docs from the es_item index to the es_itemtest index, and there it showed "running:true".

Please let us know your suggestions.

Thanks,
Ganeshbabu R


(Jörg Prante) #8

running: true denotes that the knapsack service successfully spawned a thread for import/export.

running: false may signal a problem, because the knapsack service could not spawn a thread successfully. The exact reason is not known at the time of the response. There may be exceptions or other messages in the server log.
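
A small sketch of how that check could be scripted, assuming the response has the same JSON shape as the one quoted earlier in this thread. The helper name knapsack_running is hypothetical and not part of the plugin, and the sample response is pasted in rather than fetched; in practice it would be the output of the curl _push call.

```shell
#!/bin/sh
# Hypothetical helper (not part of knapsack): succeeds only if the
# response body passed as $1 contains "running":true.
knapsack_running() {
    printf '%s' "$1" | grep -q '"running"[[:space:]]*:[[:space:]]*true'
}

# Sample response copied from this thread; normally captured from curl.
response='{"running":false,"state":{"mode":"push","node_name":"dayrhebfmd001_DEV_MASTER"}}'

if knapsack_running "$response"; then
    echo "knapsack thread started"
else
    echo "knapsack thread did not start; check the Elasticsearch server log"
fi
```

Note that running: true only says the thread was spawned, not that the copy finished, so the server log is still worth watching afterwards.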


(ganeshbabu) #9

Thanks for your response @jprante

Here are the exceptions in the master ES server log:

[2016-02-02 10:40:16,391][INFO ][BaseTransportClient ] creating transport client, java version 1.8.0_60, effective settings {host=10.7.147.22, port=9201, cluster.name=remote, timeout=30s, client.transport.sniff=true, client.transport.ping_timeout=30s, client.transport.ignore_cluster_name=true, path.plugins=.dontexist}
[2016-02-02 10:40:16,412][INFO ][plugins ] [Living Pharaoh] loaded [], sites []
[2016-02-02 10:40:16,485][INFO ][BaseTransportClient ] transport client settings = {transport.ping_schedule=5s, host=10.7.147.22, port=9201, cluster.name=remote, timeout=30s, client.transport.sniff=true, client.transport.ping_timeout=30s, client.transport.ignore_cluster_name=true, path.plugins=.dontexist, path.home=/opt/esadmin/elasticsearch-1.7.3, config=/opt/esadmin/elasticsearch-1.7.3/config/elasticsearch-dev-master.yml, name=Living Pharaoh, path.logs=/opt/esadmin/elasticsearch-1.7.3/logs, network.server=false, node.client=true, client.type=transport}
[2016-02-02 10:40:16,485][INFO ][BaseTransportClient ] adding custom address for transport client: inet[/10.7.147.22:9201]
[2016-02-02 10:40:46,491][INFO ][client.transport ] [Living Pharaoh] failed to get local cluster state for [#transport#-1][dayrhebfmd001.enterprisenet.org][inet[/10.7.147.22:9201]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[/10.7.147.22:9201]][cluster:monitor/state] request_id [0] timed out after [30001ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-02-02 10:40:46,499][INFO ][BaseTransportClient ] configured addresses to connect = [inet[/10.7.147.22:9201]], waiting for 30 seconds to connect ...
[2016-02-02 10:41:16,495][INFO ][client.transport ] [Living Pharaoh] failed to get local cluster state for [#transport#-1][dayrhebfmd001.enterprisenet.org][inet[/10.7.147.22:9201]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[/10.7.147.22:9201]][cluster:monitor/state] request_id [1] timed out after [30000ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-02-02 10:41:16,499][INFO ][BaseTransportClient ] connected nodes = []
[2016-02-02 10:41:16,503][INFO ][BulkTransportClient ] closing bulk processor...
[2016-02-02 10:41:16,503][INFO ][BulkTransportClient ] shutting down...
[2016-02-02 10:41:16,520][INFO ][BulkTransportClient ] shutting down completed
[2016-02-02 10:41:16,521][INFO ][KnapsackService ] add: plugin.knapsack.export.state -> []
[2016-02-02 10:41:16,521][INFO ][KnapsackService ] update cluster settings: plugin.knapsack.export.state -> [{"mode":"push","node_name":"dayrhebfmd001_DEV_MASTER"}]

Do the node names on the dev and QA servers need to match in order to copy docs?

Is that the reason for this failure?

Please let us know your feedback.

Regards,
Ganeshbabu R


(Jörg Prante) #10

That is a compatibility issue. You run ES 1.7.3 but I released knapsack for ES 1.7.2 (latest 1.x version is 1.7.2.1).

I need to push out a 1.7.3 compatible knapsack plugin.


(ganeshbabu) #11

oh god...

But I tried on the dev server (ES 1.7.3 & knapsack plugin 1.7.2.0) to copy docs from es_item to es_item_test and it worked. Below is the command I used:

curl -XPOST '10.7.147.21:9200/es_item/_push?map={"es_item":"e_item_test"}'
{"running":true,"state":{"mode":"push","node_name":"dayrhebfmd001_DEV_MASTER"}}

I think transferring docs from local to remote is not working.

Thanks,
Ganeshbabu R


(ganeshbabu) #12

Hi @jprante

Will a knapsack plugin for 1.7.3 be available in the near future?

I am asking because we have nearly 850 GB of data (about 205,059,968 docs), and we want to add a new mapping to a new index and copy the docs over from the existing index. I am sure the plugin will save a lot of time.
Reindexing can be done in other ways too, but in our case that would mean a lot of manual work. Bulk loading will take more time.

Please let us know when the knapsack 1.7.3 plugin will be available.

Thanks,
Ganeshbabu R


(Ivan Brusic) #13

The plugin is open source, so you can simply bundle your own version. :)


(Jörg Prante) #14

I pushed a quick update for ES 1.7.3. Please use this link to install the plugin:

http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-knapsack/1.7.3.0/elasticsearch-knapsack-1.7.3.0-plugin.zip


(ganeshbabu) #15

Thanks @jprante

I will definitely try this and let you know how it goes.

Regards,
Ganeshbabu R

