Dumping index is slow as hell


(Attila Bukor) #1

Hey guys,

I needed to migrate an index to a new cluster, and after a lot of hesitation
I decided to give taskrabbit's elasticsearch-dump a try.

I tested it with 10k documents, which worked fine, so I decided to migrate
the real data to the new cluster with the following command:

elasticdump --input=http://oldcluster:9200/my_index \
  --output=http://newcluster:9200/my_index

"my_index" contains ~5 million documents, so I expected it to take a while,
but not this long. It's been running since 10 AM UTC+1 yesterday and it's
migrated only a bit over 1.5 million docs so far - in roughly 28 hours.

When it started, it indexed around 100 docs per second. By the time I went
home from work (around 5 PM UTC+1), it was down to around 30 docs/s, and now
it's around 10 docs/s.
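
For a sense of scale, here is a rough back-of-the-envelope calculation using only the figures quoted above. The constant names are made up for illustration, and the estimate assumes the rate stays at its latest value, which may be optimistic given that it has been falling.

```python
# Figures quoted in the post above; nothing here is measured independently.
TOTAL_DOCS = 5_000_000      # "~5 million documents"
MIGRATED_DOCS = 1_500_000   # "a bit over 1.5 million docs so far"
ELAPSED_HOURS = 28          # "roughly 28 hours"
INITIAL_RATE = 100          # docs/s when the dump started
CURRENT_RATE = 10           # docs/s, latest observed rate


def average_rate(docs, hours):
    """Average throughput in docs per second over the given number of hours."""
    return docs / (hours * 3600)


def hours_to_finish(remaining_docs, docs_per_second):
    """Hours needed to index remaining_docs at a constant rate."""
    return remaining_docs / docs_per_second / 3600


if __name__ == "__main__":
    remaining = TOTAL_DOCS - MIGRATED_DOCS
    avg = average_rate(MIGRATED_DOCS, ELAPSED_HOURS)
    print(f"average so far: {avg:.1f} docs/s")
    print(f"remaining at current rate: {hours_to_finish(remaining, CURRENT_RATE):.1f} h")
    print(f"whole index at initial rate: {hours_to_finish(TOTAL_DOCS, INITIAL_RATE):.1f} h")
```

In other words, the average so far is about 15 docs/s, and at 10 docs/s the remaining ~3.5 million docs would take roughly 97 more hours, compared with about 14 hours for the whole index at the initial 100 docs/s.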

Being a newbie with Elasticsearch, I don't even know how to diagnose the
cause of this slowness. Could you help me with this?

Keep in mind that I'm at work for 2 or 3 more hours today, but after that
I won't have access to the servers until next Monday. Feel free to suggest
anything in the meantime; I will read it and try to reply, but I won't be
able to look into or act on anything until then.

Regards,
Attila Bukor

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/lei4qh%24370%241%40ger.gmane.org.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Not sure about the answer to your question, but if you're on ES 1.0, you
might want to give snapshot/restore a try:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html
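
As a rough sketch of what a snapshot/restore migration might look like, the snippet below builds the four HTTP requests involved: register a repository on the old cluster, take a snapshot, register the same repository on the new cluster, and restore. The repository name, snapshot name, and filesystem location are hypothetical, and the sketch assumes an "fs" repository on a shared path that every node in both clusters can reach. The cluster URLs and index name are taken from the original post.

```python
import json
import urllib.request


def snapshot_requests(old_cluster, new_cluster, repo, location, snapshot, index):
    """Return (method, url, body) tuples for a snapshot-based migration.

    Assumes a shared-filesystem ("fs") repository visible to both clusters.
    """
    repo_body = {"type": "fs", "settings": {"location": location}}
    return [
        # 1. Register the repository on the old cluster.
        ("PUT", f"{old_cluster}/_snapshot/{repo}", repo_body),
        # 2. Snapshot only the index being migrated, blocking until done.
        ("PUT",
         f"{old_cluster}/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": index}),
        # 3. Register the same repository on the new cluster.
        ("PUT", f"{new_cluster}/_snapshot/{repo}", repo_body),
        # 4. Restore the index from the snapshot on the new cluster.
        ("POST", f"{new_cluster}/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": index}),
    ]


def send(method, url, body):
    """Send one JSON request and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical repository name, path, and snapshot name.
    steps = snapshot_requests(
        "http://oldcluster:9200", "http://newcluster:9200",
        repo="my_backup", location="/mount/backups/my_backup",
        snapshot="snapshot_1", index="my_index",
    )
    for method, url, body in steps:
        print(method, url, json.dumps(body))  # call send(...) to execute
```

The script only prints the requests; calling send() on each tuple in order would actually run them against the clusters.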



(Jörg Prante) #3

Have you benchmarked your cluster? How many docs can you index per second
with bulk indexing?

Jörg
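
One way to get a rough bulk-indexing number is to time a single _bulk request and divide the document count by the elapsed time. The sketch below is only an illustration: the function names, the throwaway index name "bench_index", and the type name "doc" are made up, and the `_type` field in the action line reflects the Elasticsearch 1.x era of this thread.

```python
import json
import time
import urllib.request


def build_bulk_body(docs, index, doc_type):
    """Build a newline-delimited _bulk payload with one index action per doc."""
    lines = []
    for doc in docs:
        # Action line, followed by the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def docs_per_second(doc_count, elapsed_seconds):
    """Throughput for doc_count documents indexed in elapsed_seconds."""
    return doc_count / elapsed_seconds


def time_bulk_request(base_url, body, doc_count):
    """POST one bulk payload and return the observed docs per second."""
    request = urllib.request.Request(
        f"{base_url}/_bulk",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()
    return docs_per_second(doc_count, time.perf_counter() - start)


if __name__ == "__main__":
    # Synthetic documents; real documents will be larger and index slower.
    docs = [{"id": i, "title": f"document {i}"} for i in range(1000)]
    body = build_bulk_body(docs, "bench_index", "doc")
    rate = time_bulk_request("http://newcluster:9200", body, len(docs))
    print(f"{rate:.0f} docs/s")
```

A single request is a noisy measurement, so repeating it several times with documents of realistic size would give a more trustworthy figure to compare against the 10 to 100 docs/s seen with the dump.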


