Synonym token filter is time-consuming when the synonym dictionary is big


(Aimerlee) #1

Elasticsearch Version: 5.4.2

I am trying to create an index in which some fields use a self-developed analyzer, and the analyzer uses a big list for its tokenizer and synonyms. I find that the creation takes a long time, around an hour, to finish.
I also tried the 'synonym token filter' feature ( https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html ). When settings.index.analysis.analyzer.synonym.filter has hundreds of elements, the index also takes minutes to create.
I checked the source code and found that the settings are validated first, which takes a long time to process. Here are the related source files:
core/src/main/java/org/elasticsearch/cluster/metadata/MetaDataUpdateSettingsService.java#updateSettings
core/src/main/java/org/elasticsearch/common/settings/AbstractScopedSettings.java#validate
core/src/main/java/org/elasticsearch/common/settings/Setting.java#groupSetting#get
core/src/main/java/org/elasticsearch/common/settings/Settings.java#getByPrefix
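
To illustrate why the number of list elements might matter during validation, here is a minimal Python sketch. It assumes that nested settings are stored as a flat map of dotted keys, with each list element getting its own indexed key, and that a prefix lookup scans every key. The functions `flatten` and `get_by_prefix` and the filter name `my_synonyms` are hypothetical; they only model the idea and are not the Elasticsearch implementation.

```python
def flatten(settings, prefix=""):
    """Flatten nested dicts and lists into dotted keys such as a.b.0, a.b.1."""
    flat = {}
    items = settings.items() if isinstance(settings, dict) else enumerate(settings)
    for key, value in items:
        path = f"{prefix}{key}"
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def get_by_prefix(flat, prefix):
    """Hypothetical prefix lookup: a linear scan over every flattened key."""
    return {k[len(prefix):]: v for k, v in flat.items() if k.startswith(prefix)}


# A synonym filter with 1000 inline rules (filter name is illustrative only).
analysis = {
    "index": {
        "analysis": {
            "filter": {
                "my_synonyms": {
                    "type": "synonym",
                    "synonyms": [f"word{i}, alias{i}" for i in range(1000)],
                }
            }
        }
    }
}

flat = flatten(analysis)
print(len(flat))  # 1001 keys: one for "type" plus one per synonym rule
group = get_by_prefix(flat, "index.analysis.filter.my_synonyms.")
print(len(group))  # 1001
```

Under these assumptions, every prefix lookup costs time proportional to the total number of keys, so repeating it once per key during validation would grow roughly quadratically with the size of the synonym list.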
Is this a problem in Elasticsearch, or am I using this feature improperly? How should I use this feature?
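
One option that may be worth trying is the `synonyms_path` parameter of the synonym token filter, which points to a synonyms file relative to the Elasticsearch config directory instead of listing every rule inline in the index settings. The sketch below only builds the request body for index creation; the names `my_synonyms`, `my_analyzer`, and the file `analysis/synonyms.txt` are placeholders, and whether this actually shortens the validation step on 5.4.2 is an assumption that has not been measured here.

```python
import json


def synonym_settings(synonyms_path):
    """Build an index-creation body whose synonym filter reads rules from a file."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        # Placeholder filter name; the rules live in a file on
                        # each node, not in the settings themselves.
                        "my_synonyms": {
                            "type": "synonym",
                            "synonyms_path": synonyms_path,
                        }
                    },
                    "analyzer": {
                        # Placeholder analyzer name.
                        "my_analyzer": {
                            "tokenizer": "standard",
                            "filter": ["lowercase", "my_synonyms"],
                        }
                    },
                }
            }
        }
    }


body = synonym_settings("analysis/synonyms.txt")
print(json.dumps(body, indent=2))
```

With this layout the settings contain a single short entry for the filter regardless of how many synonym rules exist. The file has to be present under the config directory on every node that may hold the index.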

