Problem importing wikipedia dump after ES upgrade


#1

Hi everybody,
This morning I upgraded to the new Elasticsearch 5, and I am having trouble importing a Wikipedia dump into ES. I followed the procedure described in this thread http://stackoverflow.com/questions/33630222/indexing-wikipedia-dump-to-elasticsearch-gets-xml-document-structures-must-start , which I had used successfully with ES 2.x. But now the process ends before the indexing finishes. After that I can't run any kind of search in Sense; every query returns this output:

Request failed to get to the server (status code: 0):

If I restart the ES daemon, I still can't query my index, but the output changes:
{ "error": { "root_cause": [], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [], "caused_by": { "type": "illegal_index_shard_state_exception", "reason": "CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED, RELOCATED]", "index_uuid": "8z18WSAtQkW1ONMjHCWEmg", "shard": "4", "index": "wikiparse" } }, "status": 503 }

All of this happens after a clean installation. What can I do to fix that?
Thanks in advance


(Christoph) #2

I did this myself a while back and ran into issues as well. The mappings that you get when you follow this procedure are most likely still 2.x mappings (that's what Wikipedia uses so far, I think). There have been a few changes to the mapping API in 5.0, so unless Wikipedia is moving to 5.x soon, the only option I see is fixing the mappings (and other breaking stuff) by hand. What's wrong with the mappings might be apparent from your logs.


#3

That was a good suggestion. Reading my logs, I understood that my problem was only Java heap space. To index Wikipedia, you only need to change "type":"string" to "type":"text" in the mapping. That is how I solved it.
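
For anyone hitting the same thing, here is a minimal sketch of that mapping change in Python. It assumes the 2.x mapping has already been saved and loaded as a plain dict; the function name `convert_string_fields` and the sample mapping are made up for illustration and are not part of any Elasticsearch client.

```python
import copy


def convert_string_fields(mapping):
    """Return a copy of `mapping` where every "type": "string" becomes "type": "text".

    Hypothetical helper: it only walks nested dicts and lists, and does not
    talk to Elasticsearch at all.
    """
    def walk(node):
        if isinstance(node, dict):
            # Only rewrite when "type" is the literal string "string";
            # a field that happens to be *named* "type" holds a dict and is skipped here.
            if node.get("type") == "string":
                node["type"] = "text"
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    result = copy.deepcopy(mapping)  # leave the caller's mapping untouched
    walk(result)
    return result


# Made-up sample mapping, only to show the shape of the input.
old_mapping = {
    "page": {
        "properties": {
            "title": {"type": "string"},
            "text": {"type": "string", "fields": {"raw": {"type": "string"}}},
            "timestamp": {"type": "date"},
        }
    }
}
new_mapping = convert_string_fields(old_mapping)
```

This only covers the string-to-text change mentioned above. In 5.x, string fields that were not analyzed in 2.x are usually mapped as "keyword" rather than "text", so check your own mapping before relying on a blanket replacement.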

