ES5 index creation hanging

Hello,
We encountered an issue back in 5.0/5.1 where Elasticsearch was unable to create an index if the mapping contained lots of synonyms.
We are still pushing all the synonyms directly via the mapping, and while this worked fine on 5.2, it stopped working on 5.3 and 5.4.

I should add that the synonym list is not that huge; the whole _cluster/state is only 2 MB (we have 2 indices containing the synonym list, so I'd assume it's around 1 MB per index).

Here is the topic we opened back then: "200% CPU - Elasticsearch 5 index creation very slow with a huge synonyms list"

I'm sorry that you're having trouble. We cannot help you unless you're specific about what you're doing, what you expect to happen, and what is actually happening.

Hello @jasontedor, the linked topic has detailed information about the problem. In short, the changes that solved the problem back then no longer seem to work in the latest versions.

Hello @jasontedor, thanks for replying. Do you need an excerpt of the mapping? It's fairly standard, I think: we have some keep words, some synonyms, etc. A few thousand entries (about 50K in total, I think) across multiple analyzers.

  "job_title_keep": {
    "type": "keep",
    "keep_words_case": false,
    "keep_words": %%job_title_keep%%
  },
  "job_title_synonyms": {
    "tokenizer": "whitespace",
    "type": "synonym",
    "synonyms": %%job_title_synonyms%%
  },
  "job_org_keep": {
    "type": "keep",
    "keep_words": %%job_org_keep%%,
    "keep_words_case": false
  },
  "job_org_synonyms": {
    "type": "keep",
    "keep_words": %%job_org_keep%%,
    "keep_words_case": false
  },
  "language_keep": {
    "type": "keep",
    "keep_words": %%language_keep%%,
    "keep_words_case": false
  },
  "degree_keep": {
    "type": "keep",
    "keep_words": %%degree_keep%%,
    "keep_words_case": false
  },

In the above excerpt, we replace each %%...%% placeholder with the corresponding list: either synonyms in Solr format or keep/stop words.
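
To make the substitution step concrete, here is a minimal sketch of how such a template could be rendered. It is not our actual tooling: `render_mapping` and the sample word lists are hypothetical, and it assumes every placeholder is replaced by a JSON array of strings.

```python
import json
import re

# Matches placeholders of the form %%name%% used in the excerpt above.
PLACEHOLDER = re.compile(r"%%(\w+)%%")


def render_mapping(template: str, lists: dict) -> str:
    """Replace each %%name%% with the JSON array stored under `name`."""
    def substitute(match):
        name = match.group(1)
        if name not in lists:
            raise KeyError(f"no word list supplied for placeholder {name!r}")
        # json.dumps takes care of quoting and escaping each entry.
        return json.dumps(lists[name])
    return PLACEHOLDER.sub(substitute, template)


# Shortened template with two of the filters from the excerpt.
template = """{
  "job_title_keep": {
    "type": "keep",
    "keep_words_case": false,
    "keep_words": %%job_title_keep%%
  },
  "job_title_synonyms": {
    "tokenizer": "whitespace",
    "type": "synonym",
    "synonyms": %%job_title_synonyms%%
  }
}"""

# Hypothetical sample data; the real lists hold thousands of entries.
lists = {
    "job_title_keep": ["engineer", "developer"],
    # Solr synonym format: comma-separated equivalents or "a => b" rules.
    "job_title_synonyms": ["dev, developer", "sw engineer => software engineer"],
}

# Parsing the result confirms the rendered text is valid JSON.
filters = json.loads(render_mapping(template, lists))
```

Parsing the rendered text with `json.loads` before sending it to the cluster is a cheap way to catch a missing or malformed list early.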

When the index creation is hung, would you please stack dump the process (use jstack) and use the hot threads API and share the result here?
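
As a rough sketch of how both diagnostics could be captured in one go: the helper names, the default `http://localhost:9200` address, and the `threads=10` value below are assumptions for illustration, not something stated in this thread. `jstack` needs the Elasticsearch JVM's process ID and usually has to run as the same user as that process.

```python
import subprocess
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url="http://localhost:9200", threads=10):
    """Build the URL for the nodes hot threads API (GET /_nodes/hot_threads)."""
    query = urlencode({"threads": threads})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def jstack_command(pid):
    """Build the jstack invocation; -l adds lock information to the dump."""
    return ["jstack", "-l", str(pid)]


def collect(pid, base_url="http://localhost:9200"):
    """Run both diagnostics and return (stack_dump, hot_threads) as text.

    Call this while the index creation is hung. It requires jstack on the
    PATH and a node reachable at base_url (both assumptions).
    """
    dump = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    ).stdout
    with urlopen(hot_threads_url(base_url)) as response:
        hot = response.read().decode("utf-8")
    return dump, hot
```

For example, `collect(12345)` (with the real PID in place of the hypothetical 12345) returns both outputs as strings that can be pasted into a reply.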
